Guidelines for Ansible Amazon AWS module development
The Ansible AWS collection (on Galaxy, source code repository) is maintained by the Ansible AWS Working Group. For further information see the AWS working group community page. If you are planning to contribute AWS modules to Ansible then getting in touch with the working group is a good way to start, especially because a similar module may already be under development.
Requirements
Python Compatibility
AWS content in Ansible 2.9 and 1.x collection releases supported Python 2.7 and newer.
Starting with the 2.0 releases of both collections, Python 2.7 support ended in accordance with AWS' end of Python 2.7 support. Contributions to both collections that target the 2.0 or later collection releases can be written to use Python 3.6+ syntax.
SDK Version Support
Starting with the 2.0 releases of both collections, it is generally the policy to support the versions of botocore and boto3 that were released 12 months prior to the most recent major collection release, following semantic versioning (for example, 2.0.0, 3.0.0).
Features and functionality that require newer versions of the SDK can be contributed provided they are noted in the module documentation:
DOCUMENTATION = '''
---
module: ec2_vol
options:
  throughput:
    description:
      - Volume throughput in MB/s.
      - This parameter is only valid for gp3 volumes.
      - Valid range is from 125 to 1000.
      - Requires at least botocore version 1.19.27.
    type: int
    version_added: 1.4.0
'''
and handled in the module code using the botocore_at_least helper method:
if module.params.get('throughput'):
    if not module.botocore_at_least("1.19.27"):
        module.fail_json(msg="botocore >= 1.19.27 is required to set the throughput for a volume")
Starting with the 4.0 releases of both collections, all support for the original boto SDK has been dropped. AWS Modules must be written using the botocore and boto3 SDKs.
Maintaining existing modules
Changelogs
A changelog fragment must be added to any PR that changes functionality or fixes a bug. More information about changelog fragments can be found in the Making your PR merge-worthy section of the Ansible Development Cycle documentation.
Breaking Changes
Changes that are likely to break existing playbooks using the AWS collections should be avoided, should only be made in a major release, and where practical should be preceded by a deprecation cycle of at least one full major release. Deprecations may be backported to the stable branches.
For example:
- A deprecation added in release 3.0.0 may be removed in release 4.0.0.
- A deprecation added in release 1.2.0 may be removed in release 3.0.0.
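As an illustrative sketch (the purge parameter and message here are hypothetical, not from a real module), such a deprecation might be announced from module code using the standard module.deprecate() call:

# Hypothetical example: warn that the default value of 'purge' will change
# in the next major release, rather than changing the behavior immediately.
if module.params.get('purge') is None:
    module.deprecate(
        "The default value of 'purge' will change from True to False in release 4.0.0. "
        "Set 'purge' explicitly to silence this warning.",
        version='4.0.0', collection_name='community.aws')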
Breaking changes include:
- Removing a parameter.
- Making a parameter required.
- Updating the default value of a parameter.
- Changing or removing an existing return value.
Adding new features
Try to keep backward compatibility with versions of boto3/botocore that are at least a year old. This means that if you want to implement functionality that uses a new feature of boto3/botocore, it should only fail if that feature is explicitly used, with a message stating the missing feature and minimum required version of botocore. (Feature support is usually defined in botocore and then used by boto3)
module = AnsibleAWSModule(
    argument_spec=argument_spec,
    ...
)

if module.params.get('scope') == 'managed':
    module.require_botocore_at_least('1.23.23', reason='to list managed rules')
Release policy and backporting merged PRs
All amazon.aws and community.aws PRs must be merged to the main branch first. After a PR has been accepted and merged to the main branch it can be backported to the stable branches. The main branch is a staging location for the next major version (X+1) of the collections and may include breaking changes.
General backport policy:
- New features, deprecations and minor changes can be backported to the latest stable release.
- Bugfixes can be backported to the 2 latest stable releases.
- Security fixes should be backported to at least the 2 latest stable releases.
- Where necessary, additional CI related changes may be introduced to older stable branches to ensure CI continues to function.
The simplest mechanism for backporting PRs is by adding the backport-Y label to a PR. Once the PR has been merged the patchback bot will attempt to automatically create a backport PR.
Creating new AWS modules
When writing a new module it is important to think about the scope of the module. In general, try to do one thing and do it well.
Where the Amazon APIs provide a distinction between dependent resources, such as S3 buckets and S3 objects, this is often a good divider between modules. Additionally, resources which have a many-to-many relationship with another resource, such as IAM managed policies and IAM roles, are often best managed by two separate modules.
While it’s possible to write an s3 module which manages all things related to S3, thoroughly testing and maintaining such a module is difficult. Similarly, while it would be possible to write a module that manages the base EC2 security group resource, and a second module to manage the rules on the security group, this would be contrary to what users of the module might anticipate.
There is no hard and fast right answer, but it’s important to think about it, and Amazon have often done this work for you when designing their APIs.
Naming your module
Module names should include the name of the resource being managed and be prefixed with the AWS API that the module is based on. Where examples of a prefix don’t already exist a good rule of thumb is to use whatever client name you use with boto3 as a starting point.
Unless something is a well known abbreviation of a major component of AWS (for example, VPC or ELB) avoid further abbreviating names and don’t create new abbreviations independently.
Where an AWS API primarily manages a single resource, the module managing this resource can be named as just the name of the API. However, consider using instance or cluster for clarity if Amazon refers to them using these names.
Examples:
- ec2_instance
- s3_object (previously named aws_s3, but is primarily for manipulating S3 objects)
- elb_classic_lb (previously ec2_elb_lb, but is part of the ELB API, not EC2)
- networkfirewall_rule_group
- networkfirewall (while this could be called networkfirewall_firewall, the second firewall is redundant and the API is focused around creating these firewall resources)
Note: Prior to the collections being split from Ansible Core, it was common to use aws_ as a prefix to disambiguate services with a generic name, such as aws_secret. This is no longer necessary, and the aws_ prefix is reserved for services with a very broad effect where referencing the AWS API might cause confusion. For example, aws_region_info, which connects to EC2 but provides global information about the regions enabled in an account for all services.
Use boto3 and AnsibleAWSModule
All new AWS modules must use boto3/botocore and AnsibleAWSModule.

AnsibleAWSModule greatly simplifies exception handling and library management, reducing the amount of boilerplate code. If you cannot use AnsibleAWSModule as a base, you must document the reason and request an exception to this rule.
Importing botocore and boto3
The ansible_collections.amazon.aws.plugins.module_utils.ec2 and ansible_collections.amazon.aws.plugins.module_utils.core modules both automatically import boto3 and botocore. If boto3 is missing from the system then the variable HAS_BOTO3 will be set to False. Normally, this means that modules don't need to import boto3 directly. There is no need to check HAS_BOTO3 when using AnsibleAWSModule as the module does that check:
from ansible_collections.amazon.aws.plugins.module_utils.core import AnsibleAWSModule

try:
    import botocore
except ImportError:
    pass  # handled by AnsibleAWSModule
or:
from ansible.module_utils.basic import AnsibleModule
from ansible_collections.amazon.aws.plugins.module_utils.ec2 import HAS_BOTO3

try:
    import botocore
except ImportError:
    pass  # handled by imported HAS_BOTO3

def main():
    if not HAS_BOTO3:
        module.fail_json(msg='boto3 and botocore are required for this module')
Supporting Module Defaults
The existing AWS modules support using module_defaults for common authentication parameters. To do the same for your new module, add an entry for it in meta/runtime.yml. These entries take the form of:

action_groups:
  aws:
    ...
    aws_example_module
Module behavior
To reduce the chance of breaking changes occurring when new features are added, the module should avoid modifying the resource attribute when a parameter is not explicitly set in a task.
By convention, when a parameter is explicitly set in a task, the module should set the resource attribute to match what was set in the task. In some cases, such as tags or associations, it can be helpful to add an additional parameter which can change the behavior from replacing the existing values to appending to them. However, the default behavior should still be to replace rather than append.
See the Dealing with tags section for an example with tags and purge_tags.
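As a minimal sketch of this convention (the tags and purge_tags parameters follow the collection convention described above, but current_tags and apply_tags here are hypothetical helpers):

tags = module.params.get('tags')              # None means the task didn't set it
purge_tags = module.params.get('purge_tags')  # by convention, defaults to True

if tags is not None:
    desired_tags = dict(tags)
    if not purge_tags:
        # Additive behavior: keep existing tags that the task didn't mention
        merged = dict(current_tags)
        merged.update(desired_tags)
        desired_tags = merged
    apply_tags(connection, resource_id, desired_tags)  # hypothetical helper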
Connecting to AWS
AnsibleAWSModule provides the resource and client helper methods for obtaining boto3 connections. These handle some of the more esoteric connection options, such as security tokens and boto profiles. If using the basic AnsibleModule then you should use get_aws_connection_info and then boto3_conn to connect to AWS, as these handle the same range of connection options. These helpers also check for missing profiles or a region not set when it needs to be, so you don't have to.

An example of connecting to ec2 is shown below. Note that unlike boto, there is no NoAuthHandlerFound exception. Instead, an AuthFailure exception will be thrown when you use the connection. To ensure that authorization, parameter validation and permissions errors are all caught, you should catch ClientError and BotoCoreError exceptions with every boto3 connection call.
See exception handling:
module.client('ec2')
or for the higher level ec2 resource:
module.resource('ec2')
An example of the older style connection used for modules based on AnsibleModule rather than AnsibleAWSModule:
region, ec2_url, aws_connect_params = get_aws_connection_info(module, boto3=True)
connection = boto3_conn(module, conn_type='client', resource='ec2', region=region, endpoint=ec2_url, **aws_connect_params)
Common Documentation Fragments for Connection Parameters
There are four common documentation fragments that should be included into almost all AWS modules:
- aws - contains the common boto3 connection parameters
- ec2 - contains the common region parameter required for many AWS modules
- boto3 - contains the minimum requirements for the collection
- tags - contains the common tagging parameters used by many AWS modules
These fragments should be used rather than re-documenting these properties to ensure consistency and that the more esoteric connection options are documented. For example:
DOCUMENTATION = '''
module: my_module
# some lines omitted here
extends_documentation_fragment:
  - amazon.aws.aws
  - amazon.aws.ec2
  - amazon.aws.boto3
'''
Handling exceptions
You should wrap any boto3 or botocore call in a try block. If an exception is thrown, then there are a number of possibilities for handling it.
- Catch the general ClientError or look for a specific error code with is_boto3_error_code.
- Use aws_module.fail_json_aws() to report the module failure in a standard way.
- Retry using AWSRetry.
- Use fail_json() to report the failure without using ansible_collections.amazon.aws.plugins.module_utils.core.
- Do something custom in the case where you know how to handle the exception.
For more information on botocore exception handling see the botocore error documentation.
Using is_boto3_error_code
To use ansible_collections.amazon.aws.plugins.module_utils.core.is_boto3_error_code to catch a single AWS error code, call it in place of ClientError in your except clauses. In this example, only the InvalidGroup.NotFound error code will be caught, and any other error will be raised for handling elsewhere in the program.
try:
    info = connection.describe_security_groups(**kwargs)
except is_boto3_error_code('InvalidGroup.NotFound'):
    info = None  # the group doesn't exist; handle that case as appropriate

if info:
    do_something(info)  # do something with the info that was successfully returned
Using fail_json_aws()
In the AnsibleAWSModule there is a special method, module.fail_json_aws(), for nice reporting of exceptions. Call this on your exception and it will report the error together with a traceback for use in Ansible verbose mode.

You should use the AnsibleAWSModule for all new modules, unless not possible.
from ansible_collections.amazon.aws.plugins.module_utils.core import AnsibleAWSModule

# Set up module parameters
# module params code here

# Connect to AWS
# connection code here

# Make a call to AWS
name = module.params.get('name')
try:
    result = connection.describe_frooble(FroobleName=name)
except (botocore.exceptions.BotoCoreError, botocore.exceptions.ClientError) as e:
    module.fail_json_aws(e, msg="Couldn't obtain frooble %s" % name)
Note that it should normally be acceptable to catch all normal exceptions here, however if you expect anything other than botocore exceptions you should test everything works as expected.
If you need to perform an action based on the error boto3 returned, use the error code and the is_boto3_error_code() helper.
# Make a call to AWS
name = module.params.get('name')
try:
    result = connection.describe_frooble(FroobleName=name)
except is_boto3_error_code('FroobleNotFound'):
    workaround_failure()  # This is an error that we can work around
except (botocore.exceptions.BotoCoreError, botocore.exceptions.ClientError) as e:  # pylint: disable=duplicate-except
    module.fail_json_aws(e, msg="Couldn't obtain frooble %s" % name)
Using fail_json() and avoiding ansible_collections.amazon.aws.plugins.module_utils.core
Boto3 provides lots of useful information when an exception is thrown so pass this to the user along with the message.
import traceback

from ansible.module_utils.common.dict_transformations import camel_dict_to_snake_dict
from ansible.module_utils.ec2 import HAS_BOTO3

try:
    import botocore
except ImportError:
    pass  # caught by imported HAS_BOTO3

# Connect to AWS
# connection code here

# Make a call to AWS
name = module.params.get('name')
try:
    result = connection.describe_frooble(FroobleName=name)
except botocore.exceptions.ClientError as e:
    module.fail_json(msg="Couldn't obtain frooble %s: %s" % (name, str(e)),
                     exception=traceback.format_exc(),
                     **camel_dict_to_snake_dict(e.response))
Note: we use str(e) rather than e.message as the latter doesn't work with Python 3.
If you need to perform an action based on the error boto3 returned, use the error code.
# Make a call to AWS
name = module.params.get('name')
try:
    result = connection.describe_frooble(FroobleName=name)
except botocore.exceptions.ClientError as e:
    if e.response['Error']['Code'] == 'FroobleNotFound':
        workaround_failure()  # This is an error that we can work around
    else:
        module.fail_json(msg="Couldn't obtain frooble %s: %s" % (name, str(e)),
                         exception=traceback.format_exc(),
                         **camel_dict_to_snake_dict(e.response))
except botocore.exceptions.BotoCoreError as e:
    module.fail_json(msg="Couldn't obtain frooble %s: %s" % (name, str(e)),
                     exception=traceback.format_exc())
API throttling (rate limiting) and pagination
For methods that return a lot of results, boto3 often provides paginators. If the method you're calling has NextToken or Marker parameters, you should probably check whether a paginator exists (the top of each boto3 service reference page has a link to Paginators, if the service has any). To use paginators, obtain a paginator object, call paginator.paginate with the appropriate arguments and then call build_full_result.
Any time that you are calling the AWS API a lot, you may experience API throttling, and there is an AWSRetry decorator that can be used to ensure backoff. Because exception handling could interfere with the retry working properly (as AWSRetry needs to catch throttling exceptions to work correctly), you'd need to provide a backoff function and then put exception handling around the backoff function.

You can use exponential_backoff or jittered_backoff strategies - see the cloud module_utils (/lib/ansible/module_utils/cloud.py) and the AWS Architecture blog for more details.
The combination of these two approaches is then:
@AWSRetry.jittered_backoff(retries=5, delay=5)
def describe_some_resource_with_backoff(client, **kwargs):
    paginator = client.get_paginator('describe_some_resource')
    return paginator.paginate(**kwargs).build_full_result()['SomeResource']

def describe_some_resource(client, module):
    filters = ansible_dict_to_boto3_filter_list(module.params['filters'])
    try:
        return describe_some_resource_with_backoff(client, Filters=filters)
    except botocore.exceptions.ClientError as e:
        module.fail_json_aws(e, msg="Could not describe some resource")
Prior to Ansible 2.10, if the underlying describe_some_resources API call threw a ResourceNotFound exception, AWSRetry would take this as a cue to retry until it is not thrown (this is so that when creating a resource, we can just retry until it exists). This default was changed and it is now necessary to explicitly request this behaviour. This can be done by using the catch_extra_error_codes argument on the decorator.
@AWSRetry.jittered_backoff(retries=5, delay=5, catch_extra_error_codes=['ResourceNotFound'])
def describe_some_resource_retry_missing(client, **kwargs):
    return client.describe_some_resource(ResourceName=kwargs['name'])['Resources']

def describe_some_resource(client, module):
    name = module.params.get('name')
    try:
        return describe_some_resource_retry_missing(client, name=name)
    except (botocore.exceptions.BotoCoreError, botocore.exceptions.ClientError) as e:
        module.fail_json_aws(e, msg="Could not describe resource %s" % name)
To make use of AWSRetry easier, it can be wrapped around any call from a client returned by AnsibleAWSModule. To add retries to a client, create the client with a retry decorator:

module.client('ec2', retry_decorator=AWSRetry.jittered_backoff(retries=10))

Any calls from that client can be made to use the decorator passed at call-time using the aws_retry argument. By default, no retries are used.
ec2 = module.client('ec2', retry_decorator=AWSRetry.jittered_backoff(retries=10))
ec2.describe_instances(InstanceIds=['i-123456789'], aws_retry=True)

# equivalent with normal AWSRetry
@AWSRetry.jittered_backoff(retries=10)
def describe_instances(client, **kwargs):
    return client.describe_instances(**kwargs)

describe_instances(module.client('ec2'), InstanceIds=['i-123456789'])
The call will be retried the specified number of times, so the calling functions don’t need to be wrapped in the backoff decorator.
You can also make the retries, delay and max_delay parameters used by AWSRetry.jittered_backoff customizable using module params. You can take a look at the cloudformation module for an example.
- To make all Amazon modules uniform, prefix the module param with backoff_, so retries becomes backoff_retries and likewise with backoff_delay and backoff_max_delay (a sketch of this wiring is shown below).
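A sketch of that wiring, assuming the AnsibleAWSModule import shown earlier (the defaults and the choice of the cloudformation client here are illustrative, not copied from the cloudformation module):

argument_spec = dict(
    backoff_retries=dict(type='int', default=10),
    backoff_delay=dict(type='int', default=3),
    backoff_max_delay=dict(type='int', default=30),
)
module = AnsibleAWSModule(argument_spec=argument_spec)

# Build the retry decorator from the user-supplied module params
retry_decorator = AWSRetry.jittered_backoff(
    retries=module.params.get('backoff_retries'),
    delay=module.params.get('backoff_delay'),
    max_delay=module.params.get('backoff_max_delay'),
)
client = module.client('cloudformation', retry_decorator=retry_decorator)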
Returning Values
When you make a call using boto3, you will probably get back some useful information that you should return in the module. As well as information related to the call itself, you will also have some response metadata. It is OK to return this to the user as well as they may find it useful.
Boto3 returns most keys in CamelCase. Ansible adopts Python standards for naming variables and their usage.
There is a useful helper function called camel_dict_to_snake_dict that allows for an easy conversion of the boto3 response to snake_case. It resides in module_utils/common/dict_transformations.
You should use this helper function and avoid changing the names of values returned by Boto3. E.g. if boto3 returns a value called ‘SecretAccessKey’ do not change it to ‘AccessKey’.
There is an optional parameter, ignore_list, which is used to avoid converting a sub-tree of a dict. This is particularly useful for tags, where keys are case-sensitive.
# Make a call to AWS
resource = connection.aws_call()
# Convert resource response to snake_case
snaked_resource = camel_dict_to_snake_dict(resource, ignore_list=['Tags'])
# Return the resource details to the user without modifying tags
module.exit_json(changed=True, some_resource=snaked_resource)
Note: The returned key representing the details of the specific resource (some_resource above) should be a sensible approximation of the resource name. For example, volume for ec2_vol, volumes for ec2_vol_info.
Info modules
Info modules that can return information on multiple resources should return a list of dictionaries, with each dictionary containing information about a particular resource (for example, security_groups in ec2_group_info).

In cases where the _info module only returns information on a singular resource (for example, ec2_tag_info), a singular dictionary should be returned as opposed to a list of dictionaries.

In cases where the _info module returns no instances, an empty list ([]) should be returned.
Keys in the returned dictionaries should follow the guidelines above and use snake_case. If a return value can be used as a parameter for its corresponding main module, the key should match either the parameter name itself, or an alias of that parameter.
The following is an example of improper return values from a sample info module:
"security_groups": {
{
"description": "Created by ansible integration tests",
"group_id": "sg-050dba5c3520cba71",
"group_name": "ansible-test-87988625-unknown5c5f67f3ad09-icmp-1",
"ip_permissions": [],
"ip_permissions_egress": [],
"owner_id": "721066863947",
"tags": [
{
"Key": "Tag_One"
"Value": "Tag_One_Value"
},
],
"vpc_id": "vpc-0cbc2380a326b8a0d"
}
}
The sample output above shows a few mistakes in the sample security group info module:
- security_groups is a dict of dicts, not a list of dicts.
- tags appears to be directly returned from boto3, since they're a list of dicts.
The following is what the sample output would look like, with the mistakes corrected.
"security_groups": [
{
"description": "Created by ansible integration tests",
"group_id": "sg-050dba5c3520cba71",
"group_name": "ansible-test-87988625-unknown5c5f67f3ad09-icmp-1",
"ip_permissions": [],
"ip_permissions_egress": [],
"owner_id": "721066863947",
"tags": {
"Tag_One": "Tag_One_Value",
},
"vpc_id": "vpc-0cbc2380a326b8a0d"
}
]
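One way an info module might assemble that corrected structure, sketched using the helper functions described later in this document (the describe call and result keys are illustrative):

response = connection.describe_security_groups(aws_retry=True)

security_groups = []
for group in response['SecurityGroups']:
    sg = camel_dict_to_snake_dict(group)
    # Replace the boto3-style tag list with a simple key/value dict
    sg['tags'] = boto3_tag_list_to_ansible_dict(group.get('Tags', []))
    security_groups.append(sg)

module.exit_json(changed=False, security_groups=security_groups)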
Deprecating return values
If changes need to be made to current return values, the new/"correct" keys should be returned in addition to the existing keys to preserve compatibility with existing playbooks. A deprecation should be added to the return values being replaced, initially placed at least 2 years out, on the 1st of a month.
For example:
# Deprecate old `iam_user` return key to be replaced by `user` introduced on 2022-04-10
module.deprecate("The 'iam_user' return key is deprecated and will be replaced by 'user'. Both values are returned for now.",
                 date='2024-05-01', collection_name='community.aws')
Dealing with IAM JSON policy
If your module accepts IAM JSON policies then set the type to ‘json’ in the module spec. For example:
argument_spec.update(
    dict(
        policy=dict(required=False, default=None, type='json'),
    )
)
Note that AWS is unlikely to return the policy in the same order that it was submitted. Therefore, use the compare_policies helper function which handles this variance. compare_policies takes two dictionaries, recursively sorts and makes them hashable for comparison, and returns True if they are different.
import json

from ansible_collections.amazon.aws.plugins.module_utils.ec2 import compare_policies

# some lines skipped here

# Get the policy from AWS
current_policy = json.loads(aws_object.get_policy())
user_policy = json.loads(module.params.get('policy'))

# Compare the user submitted policy to the current policy ignoring order
if compare_policies(user_policy, current_policy):
    # Update the policy
    aws_object.set_policy(user_policy)
else:
    # Nothing to do
    pass
Helper functions
Along with the connection functions in Ansible ec2.py module_utils, there are some other useful functions detailed below.
camel_dict_to_snake_dict
boto3 returns results in a dict. The keys of the dict are in CamelCase format. In keeping with Ansible format, this function will convert the keys to snake_case.
camel_dict_to_snake_dict takes an optional parameter called ignore_list which is a list of keys not to convert (this is usually useful for the tags dict, whose child keys should remain with case preserved).
Another optional parameter is reversible. By default, HTTPEndpoint is converted to http_endpoint, which would then be converted by snake_dict_to_camel_dict to HttpEndpoint. Passing reversible=True converts HTTPEndpoint to h_t_t_p_endpoint, which converts back to HTTPEndpoint.
snake_dict_to_camel_dict
snake_dict_to_camel_dict converts snake cased keys to camel case. By default, because it was first introduced for ECS purposes, this converts to dromedaryCase. An optional parameter called capitalize_first, which defaults to False, can be used to convert to CamelCase.
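A quick illustration of both helpers, based on the behavior described above (the sample dicts are illustrative):

from ansible.module_utils.common.dict_transformations import (
    camel_dict_to_snake_dict,
    snake_dict_to_camel_dict,
)

camel_dict_to_snake_dict({'HTTPEndpoint': 'enabled'})
# -> {'http_endpoint': 'enabled'}

camel_dict_to_snake_dict({'HTTPEndpoint': 'enabled'}, reversible=True)
# -> {'h_t_t_p_endpoint': 'enabled'}

snake_dict_to_camel_dict({'launch_time': 'now'})
# -> {'launchTime': 'now'}

snake_dict_to_camel_dict({'launch_time': 'now'}, capitalize_first=True)
# -> {'LaunchTime': 'now'}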
ansible_dict_to_boto3_filter_list
Converts an Ansible dict of filters to a boto3 friendly list of dicts. This is useful for any boto3 _info modules.
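For example (the filter name and value are illustrative; the output shape follows the boto3 Filters convention):

from ansible_collections.amazon.aws.plugins.module_utils.ec2 import ansible_dict_to_boto3_filter_list

filters = ansible_dict_to_boto3_filter_list({'vpc-id': 'vpc-0cbc2380a326b8a0d'})
# -> [{'Name': 'vpc-id', 'Values': ['vpc-0cbc2380a326b8a0d']}]
connection.describe_security_groups(Filters=filters)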
boto_exception
Pass an exception returned from boto or boto3, and this function will consistently get the message from the exception.

Deprecated: use AnsibleAWSModule's fail_json_aws instead.
boto3_tag_list_to_ansible_dict
Converts a boto3 tag list to an Ansible dict. Boto3 returns tags as a list of dicts containing keys called 'Key' and 'Value' by default. These key names can be overridden when calling the function. For example, if you have already camel_cased your list of tags you may want to pass lowercase key names instead, in other words, 'key' and 'value'.
This function converts the list in to a single dict where the dict key is the tag key and the dict value is the tag value.
ansible_dict_to_boto3_tag_list
Opposite of above. Converts an Ansible dict to a boto3 tag list of dicts. You can again override the key names used if 'Key' and 'Value' are not suitable.
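A short round-trip illustration of the two tag helpers (sample tag values are illustrative):

from ansible_collections.amazon.aws.plugins.module_utils.ec2 import (
    ansible_dict_to_boto3_tag_list,
    boto3_tag_list_to_ansible_dict,
)

boto3_tag_list_to_ansible_dict([{'Key': 'Env', 'Value': 'dev'}])
# -> {'Env': 'dev'}

ansible_dict_to_boto3_tag_list({'Env': 'dev'})
# -> [{'Key': 'Env', 'Value': 'dev'}]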
get_ec2_security_group_ids_from_names
Pass this function a list of security group names or combination of security group names and IDs and this function will return a list of IDs. You should also pass the VPC ID if known because security group names are not necessarily unique across VPCs.
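A sketch of typical usage (the mixed names/IDs list and the vpc_id value are illustrative):

from ansible_collections.amazon.aws.plugins.module_utils.ec2 import get_ec2_security_group_ids_from_names

group_ids = get_ec2_security_group_ids_from_names(
    ['my-group-name', 'sg-0123456789abcdef0'],
    connection,  # an ec2 client
    vpc_id=module.params.get('vpc_id'),
)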
compare_policies
Pass two dicts of policies to check if there are any meaningful differences and returns true if there are. This recursively sorts the dicts and makes them hashable before comparison.
This method should be used any time policies are being compared so that a change in order doesn’t result in unnecessary changes.
Integration Tests for AWS Modules
All new AWS modules should include integration tests to ensure that any changes in AWS APIs that affect the module are detected. At a minimum this should cover the key API calls and check the documented return values are present in the module result.
For general information on running the integration tests see the Integration Tests page of the Module Development Guide, especially the section on configuration for cloud tests.
The integration tests for your module should be added in test/integration/targets/MODULE_NAME.

You must also have an aliases file in test/integration/targets/MODULE_NAME/aliases. This file serves two purposes. First, it indicates that it's an AWS test, causing the test framework to make AWS credentials available during the test run. Second, it puts the test in a test group, causing it to be run in the continuous integration build.
Tests for new modules should be added to the cloud/aws group. In general just copy an existing aliases file such as the aws_s3 tests aliases file.
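For instance, a minimal aliases file putting the tests in that group could contain just the line:

cloud/aws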
Custom SDK versions for Integration Tests
By default integration tests will run against the earliest supported version of the AWS SDK. The current supported versions can be found in tests/integration/constraints.txt and should not be updated. Where a module needs access to a later version of the SDK, this can be installed by depending on the setup_botocore_pip role and setting the botocore_version variable in the meta/main.yml file for your tests.
dependencies:
  - role: setup_botocore_pip
    vars:
      botocore_version: "1.20.24"
Creating EC2 instances in Integration Tests
When started, the integration tests will be passed aws_region as an extra var. Any resources created should be created in this region; this includes EC2 instances. Since AMIs are region specific, there is a role which can be included which will query the APIs for an AMI to use and set the ec2_ami_id fact. This role can be included by adding the setup_ec2_facts role as a dependency in the meta/main.yml file for your tests.
dependencies:
  - role: setup_ec2_facts
The ec2_ami_id fact can then be used in the tests.
- name: Create launch configuration 1
  community.aws.ec2_lc:
    name: '{{ resource_prefix }}-lc1'
    image_id: '{{ ec2_ami_id }}'
    assign_public_ip: yes
    instance_type: '{{ ec2_instance_type }}'
    security_groups: '{{ sg.group_id }}'
    volumes:
      - device_name: /dev/xvda
        volume_size: 10
        volume_type: gp2
        delete_on_termination: true
To improve test result reproducibility across regions, tests should use this role and the fact it provides to choose an AMI to use.
Resource naming in Integration Tests
AWS has a range of limitations for the name of resources. Where possible, resource names should include a string which makes the resource names unique to the test.
The ansible-test tool used for running the integration tests provides two helpful extra vars, resource_prefix and tiny_prefix, which are unique to the test set and should generally be used as part of the name. resource_prefix will generate a prefix based on the host the test is being run on. Sometimes this may result in a resource name that exceeds the character limit allowed by AWS. In these cases, tiny_prefix will provide a 12-character randomly generated prefix.
AWS Credentials for Integration Tests
The testing framework handles running the test with appropriate AWS credentials, these are made available to your test in the following variables:
- aws_region
- aws_access_key
- aws_secret_key
- security_token
So all invocations of AWS modules in the test should set these parameters. To avoid duplicating these for every call, it’s preferable to use module_defaults. For example:
- name: set connection information for aws modules and run tasks
  module_defaults:
    group/aws:
      aws_access_key: "{{ aws_access_key }}"
      aws_secret_key: "{{ aws_secret_key }}"
      security_token: "{{ security_token | default(omit) }}"
      region: "{{ aws_region }}"
  block:
    - name: Do Something
      ec2_instance:
        ... params ...
    - name: Do Something Else
      ec2_instance:
        ... params ...
AWS Permissions for Integration Tests
As explained in the Integration Test guide there are defined IAM policies in mattclay/aws-terminator that contain the necessary permissions to run the AWS integration test.
If your module interacts with a new service or otherwise requires new permissions, tests will fail when you submit a pull request and the Ansibullbot will tag your PR as needing revision. We do not automatically grant additional permissions to the roles used by the continuous integration builds. You will need to raise a Pull Request against mattclay/aws-terminator to add them.
If your PR has test failures, check carefully to be certain the failure is only due to the missing permissions. If you've ruled out other sources of failure, add a comment with the ready_for_review tag and explain that it's due to missing permissions.
Your pull request cannot be merged until the tests are passing. If your pull request is failing due to missing permissions, you must collect the minimum IAM permissions required to run the tests.
There are two ways to figure out which IAM permissions you need for your PR to pass:
1. Start with the most permissive IAM policy, run the tests to collect information about which resources your tests actually use, then construct a policy based on that output. This approach only works on modules that use AnsibleAWSModule.
2. Start with the least permissive IAM policy, run the tests to discover a failure, add permissions for the resource that addresses that failure, then repeat. If your module uses AnsibleModule instead of AnsibleAWSModule, you must use this approach.
To start with the most permissive IAM policy:
1. Create an IAM policy that allows all actions (set Action and Resource to *; an example policy is shown after this list).
2. Run your tests locally with this policy. On AnsibleAWSModule-based modules, the debug_botocore_endpoint_logs option is automatically set to yes, so you should see a list of AWS ACTIONS after the PLAY RECAP showing all the permissions used. If your tests use a boto/AnsibleModule module, you must start with the least permissive policy (see below).
3. Modify your policy to allow only the actions your tests use. Restrict account, region, and prefix where possible. Wait a few minutes for your policy to update.
4. Run the tests again with a user or role that allows only the new policy.
5. If the tests fail, troubleshoot (see tips below), modify the policy, run the tests again, and repeat the process until the tests pass with a restrictive policy.
6. Open a pull request proposing the minimum required policy to the CI policies.
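For reference, the fully permissive starting policy from step 1 is the standard IAM allow-everything document:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "*",
            "Resource": "*"
        }
    ]
}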
To start from the least permissive IAM policy:
1. Run the integration tests locally with no IAM permissions.
2. Examine the error when the tests reach a failure.
   - If the error message indicates the action used in the request, add the action to your policy.
   - If the error message does not indicate the action used in the request:
     - Usually the action is a CamelCase version of the method name - for example, for an ec2 client the method describe_security_groups correlates to the action ec2:DescribeSecurityGroups.
     - Refer to the documentation to identify the action.
   - If the error message indicates the resource ARN used in the request, limit the action to that resource.
   - If the error message does not indicate the resource ARN used:
     - Determine if the action can be restricted to a resource by examining the documentation.
     - If the action can be restricted, use the documentation to construct the ARN and add it to the policy.
3. Add the action or resource that caused the failure to an IAM policy. Wait a few minutes for your policy to update.
4. Run the tests again with this policy attached to your user or role.
5. If the tests still fail at the same place with the same error you will need to troubleshoot (see tips below). If the first test passes, repeat steps 2 and 3 for the next error. Repeat the process until the tests pass with a restrictive policy.
6. Open a pull request proposing the minimum required policy to the CI policies.
Troubleshooting IAM policies
- When you make changes to a policy, wait a few minutes for the policy to update before re-running the tests.
- Use the policy simulator to verify that each action (limited by resource when applicable) in your policy is allowed.
- If you're restricting actions to certain resources, replace resources temporarily with *. If the tests pass with wildcard resources, there is a problem with the resource definition in your policy.
- If the initial troubleshooting above doesn't provide any more insight, AWS may be using additional undisclosed resources and actions.
- Examine the AWS FullAccess policy for the service for clues.
- Re-read the AWS documentation, especially the list of Actions, Resources and Condition Keys for the various AWS services.
- Look at the cloudonaut documentation as a troubleshooting cross-reference.
- Use a search engine.
- Ask in the #ansible-aws chat channel (using Matrix at ansible.im or using IRC at irc.libera.chat).
Unsupported Integration tests
There are a limited number of reasons why it may not be practical to run integration tests for a module within CI. Where these apply, you should add the keyword unsupported to the aliases file in test/integration/targets/MODULE_NAME/aliases.
Some cases where tests should be marked as unsupported:
1. The tests take longer than 10 or 15 minutes to complete.
2. The tests create expensive resources.
3. The tests create inline policies.
4. The tests require the existence of external resources.
5. The tests manage Account level security policies such as the password policy or AWS Organizations.
Where one of these reasons apply you should open a pull request proposing the minimum required policy to the unsupported test policies.
Unsupported integration tests will not be automatically run by CI. However, the necessary policies should be available so that the tests can be manually run by someone performing a PR review or writing a patch.