aws_glue_job – Manage an AWS Glue job
New in version 2.6.
Synopsis
- Manage an AWS Glue job. See https://aws.amazon.com/glue/ for details.
Requirements
The following requirements must be met on the host that executes this module.
- boto
- boto3
- python >= 2.6
Parameters
Parameter | Choices/Defaults | Comments |
---|---|---|
allocated_capacity | | The number of AWS Glue data processing units (DPUs) to allocate to this job. From 2 to 100 DPUs can be allocated; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. |
aws_access_key | | AWS access key. If not set, the value of the AWS_ACCESS_KEY_ID, AWS_ACCESS_KEY, or EC2_ACCESS_KEY environment variable is used. aliases: ec2_access_key, access_key |
aws_secret_key | | AWS secret key. If not set, the value of the AWS_SECRET_ACCESS_KEY, AWS_SECRET_KEY, or EC2_SECRET_KEY environment variable is used. aliases: ec2_secret_key, secret_key |
command_name | Default: "glueetl" | The name of the job command. This must be 'glueetl'. |
command_script_location (required) | | The S3 path to a script that executes a job. |
connections | | A list of Glue connections used for this job. |
default_arguments | | A dict of default arguments for this job. You can specify arguments here that your own job-execution script consumes, as well as arguments that AWS Glue itself consumes. |
description | | Description of the job being defined. |
ec2_url | | URL to use to connect to EC2 or your Eucalyptus cloud (by default the module uses EC2 endpoints). Ignored for modules where region is required. Must be specified for all other modules if region is not used. If not set, the value of the EC2_URL environment variable, if any, is used. |
max_concurrent_runs | | The maximum number of concurrent runs allowed for the job. The default is 1. An error is returned when this threshold is reached. The maximum value you can specify is controlled by a service limit. |
max_retries | | The maximum number of times to retry this job if it fails. |
name (required) | | The name you assign to this job definition. It must be unique in your account. |
profile (added in 1.6) | | Uses a boto profile. Only works with boto >= 2.24.0. |
region | | The AWS region to use. If not specified, the value of the AWS_REGION or EC2_REGION environment variable, if any, is used. See http://docs.aws.amazon.com/general/latest/gr/rande.html#ec2_region aliases: aws_region, ec2_region |
role (required) | | The name or ARN of the IAM role associated with this job. |
security_token (added in 1.6) | | AWS STS security token. If not set, the value of the AWS_SECURITY_TOKEN or EC2_SECURITY_TOKEN environment variable is used. aliases: access_token |
state (required) | Choices: present, absent | Create or delete the AWS Glue job. |
timeout | | The job timeout in minutes. |
validate_certs (boolean, added in 1.5) | | When set to "no", SSL certificates will not be validated for boto versions >= 2.6.0. |
Notes
- If parameters are not set within the module, the following environment variables can be used in decreasing order of precedence: AWS_URL or EC2_URL, AWS_ACCESS_KEY_ID or AWS_ACCESS_KEY or EC2_ACCESS_KEY, AWS_SECRET_ACCESS_KEY or AWS_SECRET_KEY or EC2_SECRET_KEY, AWS_SECURITY_TOKEN or EC2_SECURITY_TOKEN, AWS_REGION or EC2_REGION.
- Ansible uses the boto configuration file (typically ~/.boto) if no credentials are provided. See https://boto.readthedocs.io/en/latest/boto_config_tut.html
- AWS_REGION or EC2_REGION can typically be used to specify the AWS region, when required, but this can also be configured in the boto config file.
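As a rough sketch of the alternative to environment variables, the region and the boto profile can be set directly on the task with the region and profile parameters from the table above. The region value and profile name here are hypothetical placeholders, not values taken from this page.

```yaml
# Sketch only: us-east-1 and my-boto-profile are hypothetical placeholders.
- aws_glue_job:
    name: my-glue-job
    command_script_location: s3bucket/script.py
    role: my-iam-role
    region: us-east-1          # takes the place of AWS_REGION / EC2_REGION
    profile: my-boto-profile   # boto profile supplying the credentials
    state: present
```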
Examples
# Note: These examples do not set authentication details, see the AWS Guide for details.

# Create an AWS Glue job
- aws_glue_job:
    command_script_location: s3bucket/script.py
    name: my-glue-job
    role: my-iam-role
    state: present

# Delete an AWS Glue job
- aws_glue_job:
    name: my-glue-job
    state: absent
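The optional parameters can be combined in a single task. The following is a minimal sketch, assuming the documented types (a list for connections, a dict for default_arguments). The connection name, description, and argument values are hypothetical, and --job-language is used only as an example of an argument that AWS Glue itself consumes.

```yaml
# Sketch only: all names and values below are hypothetical placeholders.
- aws_glue_job:
    name: my-glue-job
    description: Nightly ETL job
    role: my-iam-role
    command_script_location: s3bucket/script.py
    allocated_capacity: 10        # DPUs, from 2 to 100
    connections:
      - my-glue-connection
    default_arguments:
      "--job-language": python    # assumed to be consumed by AWS Glue itself
      "--my-script-option": some-value   # consumed by your own script
    max_concurrent_runs: 1
    max_retries: 2
    timeout: 60                   # minutes
    state: present
```

Quoting the keys under default_arguments keeps the leading dashes from being misread by the YAML parser.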
Return Values
Common return values are documented separately. The following fields are unique to this module:
Key | Returned | Description |
---|---|---|
allocated_capacity (integer) | when state is present | The number of AWS Glue data processing units (DPUs) allocated to runs of this job. From 2 to 100 DPUs can be allocated; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. Sample: 10 |
command (complex) | when state is present | The JobCommand that executes this job. |
command.name (string) | when state is present | The name of the job command. Sample: glueetl |
command.script_location (string) | when state is present | Specifies the S3 path to a script that executes a job. Sample: mybucket/myscript.py |
connections (dictionary) | when state is present | The connections used for this job. Sample: { Connections: [ 'list', 'of', 'connections' ] } |
created_on (string) | when state is present | The time and date that this job definition was created. Sample: 2018-04-21T05:19:58.326000+00:00 |
default_arguments (dictionary) | when state is present | The default arguments for this job, specified as name-value pairs. Sample: { 'mykey1': 'myvalue1' } |
description (string) | when state is present | Description of the job being defined. Sample: My first Glue job |
execution_property (complex) | always | An ExecutionProperty specifying the maximum number of concurrent runs allowed for this job. |
execution_property.max_concurrent_runs (integer) | when state is present | The maximum number of concurrent runs allowed for the job. The default is 1. An error is returned when this threshold is reached. The maximum value you can specify is controlled by a service limit. Sample: 1 |
job_name (string) | always | The name of the AWS Glue job. Sample: my-glue-job |
last_modified_on (string) | when state is present | The last point in time when this job definition was modified. Sample: 2018-04-21T05:19:58.326000+00:00 |
max_retries (integer) | when state is present | The maximum number of times to retry this job after a JobRun fails. Sample: 5 |
name (string) | when state is present | The name assigned to this job definition. Sample: my-glue-job |
role (string) | when state is present | The name or ARN of the IAM role associated with this job. Sample: my-iam-role |
timeout (integer) | when state is present | The job timeout in minutes. Sample: 300 |
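The return values can be captured with register and read in later tasks. This is a sketch under stated assumptions: the variable name glue_job is hypothetical, and max_concurrent_runs is assumed to be nested under execution_property, as the table layout suggests.

```yaml
# Sketch only: glue_job is a hypothetical variable name.
- aws_glue_job:
    name: my-glue-job
    command_script_location: s3bucket/script.py
    role: my-iam-role
    state: present
  register: glue_job

# Assumes max_concurrent_runs is nested under execution_property.
- debug:
    msg: "Job {{ glue_job.name }} allows {{ glue_job.execution_property.max_concurrent_runs }} concurrent run(s)"
```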
Status
- This module is not guaranteed to have a backwards compatible interface. [preview]
- This module is maintained by the Ansible Community. [community]
Authors
- Rob White (@wimnat)