community.aws.glue_job module – Manage an AWS Glue job
Note
This module is part of the community.aws collection (version 10.0.0).
You might already have this collection installed if you are using the ansible package.
It is not included in ansible-core.
To check whether it is installed, run ansible-galaxy collection list.
To install it, use: ansible-galaxy collection install community.aws.
You need further requirements to be able to use this module; see Requirements for details.
To use it in a playbook, specify: community.aws.glue_job.
New in community.aws 1.0.0
Synopsis
- Manage an AWS Glue job. See https://aws.amazon.com/glue/ for details. 
- Prior to release 5.0.0 this module was called community.aws.aws_glue_job. The usage did not change.
Aliases: aws_glue_job
Requirements
The below requirements are needed on the host that executes this module.
- python >= 3.6 
- boto3 >= 1.34.0 
- botocore >= 1.34.0 
Parameters
| Parameter | Comments | 
|---|---|
| AWS access key ID. See the AWS documentation for more information about access tokens https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys. The aws_access_key and profile options are mutually exclusive. The aws_access_key_id alias was added in release 5.1.0 for consistency with the AWS botocore SDK. | |
| The number of AWS Glue data processing units (DPUs) to allocate to this Job. From 2 to 100 DPUs can be allocated; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. | |
| The location of a CA Bundle to use when validating SSL certificates. | |
| A dictionary to modify the botocore configuration. Parameters can be found in the AWS documentation https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html#botocore.config.Config. | |
| The name of the job command. This must be ‘glueetl’. Default:  | |
| Python version being used to execute a Python shell job. AWS currently supports  | |
| The S3 path to a script that executes a job. Required when state=present. | |
| A list of Glue connections used for this job. | |
| Use a  The  Choices: 
 | |
| A dict of default arguments for this job. You can specify arguments here that your own job-execution script consumes, as well as arguments that AWS Glue itself consumes. | |
| Description of the job being defined. | |
| URL to connect to instead of the default AWS endpoints. While this can be used to connect to other AWS-compatible services, the amazon.aws and community.aws collections are only tested against AWS. | |
| Glue version determines the versions of Apache Spark and Python that AWS Glue supports. | |
| The maximum number of concurrent runs allowed for the job. The default is 1. An error is returned when this threshold is reached. The maximum value you can specify is controlled by a service limit. | |
| The maximum number of times to retry this job if it fails. | |
| The name you assign to this job definition. It must be unique in your account. | |
| The number of workers of a defined workerType that are allocated when a job runs. | |
| A named AWS profile to use for authentication. See the AWS documentation for more information about named profiles https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-profiles.html. The profile option is mutually exclusive with the aws_access_key, aws_secret_key and session_token options. | |
| If  If the  Tag keys beginning with  Choices: 
 | |
| The AWS region to use. For global services such as IAM, Route53 and CloudFront, region is ignored. See the Amazon AWS documentation for more information http://docs.aws.amazon.com/general/latest/gr/rande.html#ec2_region. | |
| The name or ARN of the IAM role associated with this job. Required when state=present. | |
| AWS secret access key. See the AWS documentation for more information about access tokens https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys. The secret_key and profile options are mutually exclusive. The aws_secret_access_key alias was added in release 5.1.0 for consistency with the AWS botocore SDK. | |
| AWS STS session token for use with temporary credentials. See the AWS documentation for more information about access tokens https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys. The session_token and profile options are mutually exclusive. | |
| Create or delete the AWS Glue job. Choices: present, absent. | |
| A dictionary representing the tags to be applied to the resource. | |
| The job timeout in minutes. | |
| When set to false, SSL certificates will not be validated for communication with the AWS APIs. Setting validate_certs=false is strongly discouraged; as an alternative, consider setting aws_ca_bundle instead. Choices: false, true. | |
| The type of predefined worker that is allocated when a job runs. Support for instance types  Choices: 
 | 
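The sizing and retry options described in the table can be combined in a single task. The following is a minimal sketch, not a definitive configuration: the option names glue_version, worker_type, number_of_workers, max_retries and timeout are assumptions inferred from the descriptions above, and every value (bucket, role, version, worker type) is a placeholder.

```yaml
# Sketch only: option names and values are assumptions; adjust to your environment.
- name: Create a Glue job with explicit worker sizing
  community.aws.glue_job:
    name: my-glue-job
    role: my-iam-role
    command_script_location: "s3://s3bucket/script.py"
    glue_version: "3.0"      # assumed value; selects the Spark and Python versions
    worker_type: G.1X        # assumed predefined worker type
    number_of_workers: 2     # workers of the chosen worker type
    max_retries: 1           # retry once if the job fails
    timeout: 60              # job timeout in minutes
    state: present
```

Quoting the Glue version keeps YAML from reading it as a floating-point number.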
Notes
- Support for tags and purge_tags was added in release 2.2.0. 
- Caution: For modules, environment variables and configuration files are read from the Ansible ‘host’ context and not the ‘controller’ context. As such, files may need to be explicitly copied to the ‘host’. For lookup and connection plugins, environment variables and configuration files are read from the Ansible ‘controller’ context and not the ‘host’ context. 
- The AWS SDK (boto3) that Ansible uses may also read defaults for credentials and other settings, such as the region, from its configuration files in the Ansible ‘host’ context (typically ~/.aws/credentials). See https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html for more information.
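To illustrate the notes on credentials and tagging, a task might select a named profile explicitly and apply tags in the same call. This is a hedged sketch: the profile name, region, and tag values are hypothetical placeholders, and the comment on purge_tags reflects its assumed behavior rather than anything stated above.

```yaml
# Sketch only: profile, region and tag values are hypothetical placeholders.
- name: Create a tagged Glue job using a named profile
  community.aws.glue_job:
    name: my-glue-job
    role: my-iam-role
    command_script_location: "s3://s3bucket/script.py"
    profile: my-profile      # mutually exclusive with the access key options
    region: us-east-1
    tags:
      Environment: dev
    purge_tags: false        # assumed: leave existing tags not listed here in place
    state: present
```

Because modules read credentials in the 'host' context, the named profile must exist on the machine that executes the task, not only on the controller.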
Examples
# Note: These examples do not set authentication details, see the AWS Guide for details.
# Create an AWS Glue job
- community.aws.glue_job:
    command_script_location: "s3://s3bucket/script.py"
    default_arguments:
      "--extra-py-files": s3://s3bucket/script-package.zip
      "--TempDir": "s3://s3bucket/temp/"
    name: my-glue-job
    role: my-iam-role
    state: present
# Delete an AWS Glue job
- community.aws.glue_job:
    name: my-glue-job
    state: absent
Return Values
Common return values are documented here, the following are the fields unique to this module:
| Key | Description | 
|---|---|
| The number of AWS Glue data processing units (DPUs) allocated to runs of this job. From 2 to 100 DPUs can be allocated; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. Returned: when state is present | |
| The JobCommand that executes this job. Returned: when state is present | |
| The name of the job command. Returned: when state is present | |
| Specifies the Python version. Returned: when state is present | |
| Specifies the S3 path to a script that executes a job. Returned: when state is present | |
| The connections used for this job. Returned: when state is present | |
| The time and date that this job definition was created. Returned: when state is present | |
| The default arguments for this job, specified as name-value pairs. Returned: when state is present | |
| Description of the job being defined. Returned: when state is present | |
| An ExecutionProperty specifying the maximum number of concurrent runs allowed for this job. Returned: always | |
| The maximum number of concurrent runs allowed for the job. The default is 1. An error is returned when this threshold is reached. The maximum value you can specify is controlled by a service limit. Returned: when state is present | |
| Glue version. Returned: when state is present | |
| The name of the AWS Glue job. Returned: always | |
| The last point in time when this job definition was modified. Returned: when state is present | |
| The maximum number of times to retry this job after a JobRun fails. Returned: when state is present | |
| The name assigned to this job definition. Returned: when state is present | |
| The name or ARN of the IAM role associated with this job. Returned: when state is present | |
| The job timeout in minutes. Returned: when state is present | |
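One way to inspect these return values is to register the task result and print it. The sketch below assumes a hypothetical variable name, glue_job_result, and prints the whole result rather than relying on any individual key name.

```yaml
# Sketch only: glue_job_result is a hypothetical variable name.
- name: Create an AWS Glue job and capture the result
  community.aws.glue_job:
    name: my-glue-job
    role: my-iam-role
    command_script_location: "s3://s3bucket/script.py"
    state: present
  register: glue_job_result

- name: Show everything the module returned
  ansible.builtin.debug:
    var: glue_job_result
```

Printing the full result first is a simple way to confirm which fields are present before referencing them in later tasks.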
