AWS Batch services.



Execute a job on AWS Batch.


Create an AWS Batch compute environment.

Module Contents

class airflow.providers.amazon.aws.operators.batch.BatchOperator(*, job_name, job_definition, job_queue, container_overrides=None, array_properties=None, ecs_properties_override=None, eks_properties_override=None, node_overrides=None, share_identifier=None, scheduling_priority_override=None, parameters=None, retry_strategy=None, job_id=None, waiters=None, max_retries=4200, status_retries=None, aws_conn_id=None, region_name=None, tags=None, wait_for_completion=True, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), poll_interval=30, awslogs_enabled=False, awslogs_fetch_interval=timedelta(seconds=30), submit_job_timeout=None, **kwargs)

Bases: airflow.models.BaseOperator

Execute a job on AWS Batch.

See also

For more information on how to use this operator, take a look at the guide: Submit a new AWS Batch job

  • job_name (str) – the name for the job that will run on AWS Batch (templated)

  • job_definition (str) – the job definition name on AWS Batch

  • job_queue (str) – the queue name on AWS Batch

  • container_overrides (dict | None) – the containerOverrides parameter for boto3 (templated)

  • ecs_properties_override (dict | None) – the ecsPropertiesOverride parameter for boto3 (templated)

  • eks_properties_override (dict | None) – the eksPropertiesOverride parameter for boto3 (templated)

  • node_overrides (dict | None) – the nodeOverrides parameter for boto3 (templated)

  • share_identifier (str | None) – The share identifier for the job. Don’t specify this parameter if the job queue doesn’t have a scheduling policy.

  • scheduling_priority_override (int | None) – The scheduling priority for the job. Jobs with a higher scheduling priority are scheduled before jobs with a lower scheduling priority. This overrides any scheduling priority in the job definition.

  • array_properties (dict | None) – the arrayProperties parameter for boto3

  • parameters (dict | None) – the parameters for boto3 (templated)

  • job_id (str | None) – the job ID, usually unknown (None) until the submit_job operation gets the jobId defined by AWS Batch

  • waiters (Any | None) – a BatchWaiters object (see note below); if None, polling is used with max_retries and status_retries.

  • max_retries (int) – number of exponential back-off retries (default 4200, about 48 hours); polling is only used when waiters is None

  • status_retries (int | None) – number of HTTP retries to get job status (10 if None); polling is only used when waiters is None

  • aws_conn_id (str | None) – connection id of AWS credentials / region name. If None, the default boto3 credential strategy will be used.

  • region_name (str | None) – region name to use in AWS Hook. Overrides the region_name in connection (if provided)

  • tags (dict | None) – collection of tags to apply to the AWS Batch job submission; if None, no tags are submitted

  • deferrable (bool) – Run operator in the deferrable mode.

  • awslogs_enabled (bool) – Specifies whether logs from CloudWatch should be printed (default: False). If it is an array job, only the logs of the first task will be printed.

  • awslogs_fetch_interval (datetime.timedelta) – The interval at which CloudWatch logs are fetched (default: 30 seconds).

  • poll_interval (int) – (Deferrable mode only) Time in seconds to wait between polling.

  • submit_job_timeout (int | None) – Execution timeout in seconds for submitted batch job.
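The parameters above can be collected into a plain dictionary before the operator is constructed. The following is a minimal sketch, not the provider's own example: the helper name `build_batch_operator_kwargs`, the job, queue, and definition names, and the command are all hypothetical placeholders, and `task_id` is the usual BaseOperator argument passed through `**kwargs`.

```python
from datetime import timedelta


def build_batch_operator_kwargs(run_date: str) -> dict:
    """Assemble keyword arguments for BatchOperator (every value is a placeholder)."""
    return {
        "task_id": "submit_batch_job",
        "job_name": f"nightly-report-{run_date}",
        "job_definition": "example-job-definition",
        "job_queue": "example-job-queue",
        # Passed to boto3 as containerOverrides.
        "container_overrides": {
            "command": ["python", "report.py", "--date", run_date],
            "environment": [{"name": "STAGE", "value": "dev"}],
        },
        "tags": {"team": "data"},
        "wait_for_completion": True,
        # Print CloudWatch logs while the job runs, fetched every 30 seconds.
        "awslogs_enabled": True,
        "awslogs_fetch_interval": timedelta(seconds=30),
    }


kwargs = build_batch_operator_kwargs("2024-01-01")

# Inside a DAG file, with the Amazon provider installed:
#   from airflow.providers.amazon.aws.operators.batch import BatchOperator
#   submit = BatchOperator(**kwargs)
```

Keeping the arguments in a function makes them easy to unit test without an Airflow installation; the operator itself is only created in the DAG file.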


Note: Any custom waiters must return a waiter for these calls:

    waiter = waiters.get_waiter("JobExists")
    waiter = waiters.get_waiter("JobRunning")
    waiter = waiters.get_waiter("JobComplete")
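To illustrate the contract in the note, here is a rough sketch of an object that answers `get_waiter` for the three required names. `SketchWaiter`, `SketchWaiters`, and the fake job table are hypothetical and exist only for this example; this is not the BatchWaiters implementation, and the operator may call other methods on a real waiters object that this sketch does not provide.

```python
from typing import Callable, Dict

REQUIRED_WAITERS = ("JobExists", "JobRunning", "JobComplete")


class SketchWaiter:
    """Hypothetical waiter: re-runs a check until it passes or attempts run out."""

    def __init__(self, name: str, check: Callable[[str], bool], max_attempts: int = 3):
        self.name = name
        self._check = check
        self._max_attempts = max_attempts

    def wait(self, job_id: str) -> None:
        for _ in range(self._max_attempts):
            if self._check(job_id):
                return
        raise TimeoutError(f"{self.name} not reached for job {job_id}")


class SketchWaiters:
    """Hypothetical container that must know all three required waiter names."""

    def __init__(self, checks: Dict[str, Callable[[str], bool]]):
        missing = [name for name in REQUIRED_WAITERS if name not in checks]
        if missing:
            raise ValueError(f"missing waiters: {missing}")
        self._checks = checks

    def get_waiter(self, name: str) -> SketchWaiter:
        return SketchWaiter(name, self._checks[name])


# A fake status table stands in for real AWS Batch job descriptions.
fake_status = {"job-123": "SUCCEEDED"}
waiters = SketchWaiters(
    {
        "JobExists": lambda job_id: job_id in fake_status,
        "JobRunning": lambda job_id: fake_status.get(job_id) in {"RUNNING", "SUCCEEDED", "FAILED"},
        "JobComplete": lambda job_id: fake_status.get(job_id) in {"SUCCEEDED", "FAILED"},
    }
)
for name in REQUIRED_WAITERS:
    waiters.get_waiter(name).wait("job-123")  # returns without raising
```

In practice a BatchWaiters instance would normally be passed as `waiters`; the sketch only shows why all three names have to resolve.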

ui_color = '#c3dae0'
arn: str | None = None
template_fields: Sequence[str] = ('job_id', 'job_name', 'job_definition', 'job_queue', 'container_overrides', 'array_properties',...
job_id = None
container_overrides = None
ecs_properties_override = None
eks_properties_override = None
node_overrides = None
share_identifier = None
scheduling_priority_override = None
array_properties = None
retry_strategy = None
waiters = None
wait_for_completion = True
deferrable = True
poll_interval = 30
awslogs_enabled = False
submit_job_timeout = None
max_retries = 4200
status_retries = None
aws_conn_id = None
region_name = None
property hook

execute(context)

Submit and monitor an AWS Batch job.



execute_complete(context, event=None)

on_kill()

Override this method to clean up subprocesses when a task instance gets killed.

Any use of the threading, subprocess or multiprocessing module within an operator needs to be cleaned up, or it will leave ghost processes behind.


submit_job(context)

Submit an AWS Batch job.

monitor_job(context)

Monitor an AWS Batch job.

This can raise an exception or an AirflowTaskTimeout if the task was created with execution_timeout.

class airflow.providers.amazon.aws.operators.batch.BatchCreateComputeEnvironmentOperator(compute_environment_name, environment_type, state, compute_resources, unmanaged_v_cpus=None, service_role=None, tags=None, poll_interval=30, max_retries=None, aws_conn_id=None, region_name=None, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), **kwargs)

Bases: airflow.models.BaseOperator

Create an AWS Batch compute environment.

See also

For more information on how to use this operator, take a look at the guide: Create an AWS Batch compute environment

  • compute_environment_name (str) – Name of the AWS batch compute environment (templated).

  • environment_type (str) – Type of the compute-environment.

  • state (str) – State of the compute-environment.

  • compute_resources (dict) – Details about the resources managed by the compute-environment (templated). For more details, see the boto3 create_compute_environment documentation.

  • unmanaged_v_cpus (int | None) – Maximum number of vCPU for an unmanaged compute environment. This parameter is only supported when the type parameter is set to UNMANAGED.

  • service_role (str | None) – IAM role that allows Batch to make calls to other AWS services on your behalf (templated).

  • tags (dict | None) – Tags that you apply to the compute-environment to help you categorize and organize your resources.

  • poll_interval (int) – How long to wait, in seconds, between two polls of the environment status. Only useful when deferrable is True.

  • max_retries (int | None) – How many times to poll for the environment status. Only useful when deferrable is True.

  • aws_conn_id (str | None) – Connection ID of AWS credentials / region name. If None, credential boto3 strategy will be used.

  • region_name (str | None) – Region name to use in AWS Hook. Overrides the region_name in connection if provided.

  • deferrable (bool) – If True, the operator will wait asynchronously for the environment to be created. This mode requires aiobotocore module to be installed. (default: False)
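As a rough illustration of how these parameters fit together, the sketch below assembles arguments for a managed Fargate compute environment. The helper `build_compute_environment_kwargs`, the environment name, and the subnet and security group IDs are hypothetical placeholders; the keys inside `compute_resources` follow the boto3 computeResources structure.

```python
from typing import List


def build_compute_environment_kwargs(
    name: str, subnets: List[str], security_groups: List[str]
) -> dict:
    """Assemble keyword arguments for BatchCreateComputeEnvironmentOperator (placeholder values)."""
    return {
        "task_id": "create_compute_environment",
        "compute_environment_name": name,
        "environment_type": "MANAGED",
        "state": "ENABLED",
        # Forwarded to boto3 as computeResources.
        "compute_resources": {
            "type": "FARGATE",
            "maxvCpus": 10,
            "subnets": subnets,
            "securityGroupIds": security_groups,
        },
        "tags": {"team": "data"},
        # Only relevant when deferrable is True.
        "poll_interval": 30,
        "max_retries": 120,
    }


env_kwargs = build_compute_environment_kwargs(
    "example-env", ["subnet-aaaa1111"], ["sg-bbbb2222"]
)

# Inside a DAG file, with the Amazon provider installed:
#   from airflow.providers.amazon.aws.operators.batch import (
#       BatchCreateComputeEnvironmentOperator,
#   )
#   create_env = BatchCreateComputeEnvironmentOperator(**env_kwargs)
```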

template_fields: Sequence[str] = ('compute_environment_name', 'compute_resources', 'service_role')
unmanaged_v_cpus = None
service_role = None
poll_interval = 30
max_retries = 120
aws_conn_id = None
region_name = None
deferrable = True
property hook

Create and return a BatchClientHook.


execute(context)

Create an AWS Batch compute environment.

execute_complete(context, event=None)
