airflow.providers.google.cloud.operators.ray

This module contains a Google Cloud Ray Job operators.

Attributes

TERMINAL_STATUSES

Exceptions

OperationFailedException

Custom exception to handle failing operations on Jobs.

Classes

RayJobBaseOperator

Base class for Jobs on Ray operators.

RaySubmitJobOperator

Submit and execute Job on Ray cluster.

RayStopJobOperator

Stop Job on Ray cluster.

RayDeleteJobOperator

Delete Job on Ray cluster in a terminal state and all of its associated data.

RayGetJobInfoOperator

Get the latest status and other information associated with a Job on Ray cluster.

RayListJobsOperator

List all jobs along with their status and other information.

Module Contents

airflow.providers.google.cloud.operators.ray.TERMINAL_STATUSES[source]
exception airflow.providers.google.cloud.operators.ray.OperationFailedException[source]

Bases: Exception

Custom exception to handle failing operations on Jobs.

class airflow.providers.google.cloud.operators.ray.RayJobBaseOperator(gcp_conn_id='google_cloud_default', impersonation_chain=None, *args, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Base class for Jobs on Ray operators.

Parameters:
  • gcp_conn_id (str) – The connection ID to use connecting to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('gcp_conn_id', 'impersonation_chain')[source]
gcp_conn_id = 'google_cloud_default'[source]
impersonation_chain = None[source]
property hook: airflow.providers.google.cloud.hooks.ray.RayJobHook[source]
class airflow.providers.google.cloud.operators.ray.RaySubmitJobOperator(cluster_address, entrypoint, get_job_logs=False, wait_for_job_done=False, runtime_env=None, metadata=None, submission_id=None, entrypoint_num_cpus=None, entrypoint_num_gpus=None, entrypoint_memory=None, entrypoint_resources=None, submit_job_timeout=60 * 30, *args, **kwargs)[source]

Bases: RayJobBaseOperator

Submit and execute Job on Ray cluster.

When a job is submitted, it runs once to completion or failure. Retries or different runs with different parameters should be handled by the submitter. Jobs are bound to the lifetime of a Ray cluster, so if the cluster goes down, all running jobs on that cluster will be terminated.

Parameters:
  • cluster_address (str) – Required. Either (1) the address of the Ray cluster, or (2) the HTTP address of the dashboard server on the head node, e.g. “http://<head-node-ip>:8265”. In case (1) it must be specified as an address that can be passed to ray.init(), e.g. a Ray Client address (ray://<head_node_host>:10001), or “auto”, or “localhost:<port>”.

  • entrypoint (str) – Required. The shell command to run for this job.

  • get_job_logs (bool | None) – If set to True, the operator will wait until the end of Job execution and output the logs.

  • wait_for_job_done (bool | None) – If set to True, the operator will wait until the end of Job execution. Please note, that if the Job will fail during execution and this parameter is set to False, there will be no indication of the failure.

  • submission_id (str | None) – A unique ID for this job.

  • runtime_env (dict[str, Any] | None) – The runtime environment to install and run this job in.

  • metadata (dict[str, str] | None) – Arbitrary data to store along with this job.

  • entrypoint_num_cpus (int | float | None) – The quantity of CPU cores to reserve for the execution of the entrypoint command, separately from any tasks or actors launched by it. Defaults to 0.

  • entrypoint_num_gpus (int | float | None) – The quantity of GPUs to reserve for the execution of the entrypoint command, separately from any tasks or actors launched by it. Defaults to 0.

  • entrypoint_memory (int | None) – The quantity of memory to reserve for the execution of the entrypoint command, separately from any tasks or actors launched by it. Defaults to 0.

  • entrypoint_resources (dict[str, float] | None) – The quantity of custom resources to reserve for the execution of the entrypoint command, separately from any tasks or actors launched by it.

  • gcp_conn_id – The connection ID to use connecting to Google Cloud.

  • impersonation_chain – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str][source]
cluster_address[source]
get_job_logs = False[source]
wait_for_job_done = False[source]
entrypoint[source]
runtime_env = None[source]
metadata = None[source]
submission_id = None[source]
entrypoint_num_cpus = None[source]
entrypoint_num_gpus = None[source]
entrypoint_memory = None[source]
entrypoint_resources = None[source]
submit_job_timeout = 1800[source]
execute(context)[source]

Derive when creating an operator.

The main method to execute the task. Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.ray.RayStopJobOperator(cluster_address, job_id, *args, **kwargs)[source]

Bases: RayJobBaseOperator

Stop Job on Ray cluster.

Parameters:
  • cluster_address (str) – Required. Either (1) the address of the Ray cluster, or (2) the HTTP address of the dashboard server on the head node, e.g. “http://<head-node-ip>:8265”. In case (1) it must be specified as an address that can be passed to ray.init(), e.g. a Ray Client address (ray://<head_node_host>:10001), or “auto”, or “localhost:<port>”.

  • job_id (str) – Required. The job ID or submission ID for the job to be stopped.

  • gcp_conn_id – The connection ID to use connecting to Google Cloud.

  • impersonation_chain – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str][source]
cluster_address[source]
job_id[source]
execute(context)[source]

Derive when creating an operator.

The main method to execute the task. Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.ray.RayDeleteJobOperator(cluster_address, job_id, *args, **kwargs)[source]

Bases: RayJobBaseOperator

Delete Job on Ray cluster in a terminal state and all of its associated data.

If the job is not already in a terminal state, raises an error. This does not delete the job logs from disk. Submitting a job with the same submission ID as a previously deleted job is not supported and may lead to unexpected behavior.

Parameters:
  • cluster_address (str) – Required. Either (1) the address of the Ray cluster, or (2) the HTTP address of the dashboard server on the head node, e.g. “http://<head-node-ip>:8265”. In case (1) it must be specified as an address that can be passed to ray.init(), e.g. a Ray Client address (ray://<head_node_host>:10001), or “auto”, or “localhost:<port>”.

  • job_id (str) – Required. The job ID or submission ID for the job to be stopped.

  • gcp_conn_id – The connection ID to use connecting to Google Cloud.

  • impersonation_chain – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str][source]
cluster_address[source]
job_id[source]
execute(context)[source]

Derive when creating an operator.

The main method to execute the task. Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.ray.RayGetJobInfoOperator(cluster_address, job_id, *args, **kwargs)[source]

Bases: RayJobBaseOperator

Get the latest status and other information associated with a Job on Ray cluster.

Parameters:
  • cluster_address (str) – Required. Either (1) the address of the Ray cluster, or (2) the HTTP address of the dashboard server on the head node, e.g. “http://<head-node-ip>:8265”. In case (1) it must be specified as an address that can be passed to ray.init(), e.g. a Ray Client address (ray://<head_node_host>:10001), or “auto”, or “localhost:<port>”.

  • job_id (str) – Required. The job ID or submission ID for the job to be stopped.

  • gcp_conn_id – The connection ID to use connecting to Google Cloud.

  • impersonation_chain – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str][source]
cluster_address[source]
job_id[source]
execute(context)[source]

Derive when creating an operator.

The main method to execute the task. Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.ray.RayListJobsOperator(cluster_address, *args, **kwargs)[source]

Bases: RayJobBaseOperator

List all jobs along with their status and other information.

Lists all jobs that have ever run on the cluster, including jobs that are currently running and jobs that are no longer running.

Parameters:
  • cluster_address (str) – Required. Either (1) the address of the Ray cluster, or (2) the HTTP address of the dashboard server on the head node, e.g. “http://<head-node-ip>:8265”. In case (1) it must be specified as an address that can be passed to ray.init(), e.g. a Ray Client address (ray://<head_node_host>:10001), or “auto”, or “localhost:<port>”.

  • gcp_conn_id – The connection ID to use connecting to Google Cloud.

  • impersonation_chain – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str][source]
cluster_address[source]
execute(context)[source]

Derive when creating an operator.

The main method to execute the task. Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

Was this entry helpful?