airflow.providers.google.cloud.operators.ray
This module contains Google Cloud Ray Job operators.
Attributes
Exceptions
- OperationFailedException: Custom exception to handle failing operations on Jobs.
Classes
- RayJobBaseOperator: Base class for Jobs on Ray operators.
- RaySubmitJobOperator: Submit and execute Job on Ray cluster.
- RayStopJobOperator: Stop Job on Ray cluster.
- RayDeleteJobOperator: Delete Job on Ray cluster in a terminal state and all of its associated data.
- RayGetJobInfoOperator: Get the latest status and other information associated with a Job on Ray cluster.
- RayListJobsOperator: List all jobs along with their status and other information.
Module Contents
- exception airflow.providers.google.cloud.operators.ray.OperationFailedException
Bases: Exception
Custom exception to handle failing operations on Jobs.
- class airflow.providers.google.cloud.operators.ray.RayJobBaseOperator(gcp_conn_id='google_cloud_default', impersonation_chain=None, *args, **kwargs)
Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
Base class for Jobs on Ray operators.
- Parameters:
gcp_conn_id (str) – The connection ID to use when connecting to Google Cloud.
impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
- template_fields: collections.abc.Sequence[str] = ('gcp_conn_id', 'impersonation_chain')
- property hook: airflow.providers.google.cloud.hooks.ray.RayJobHook
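The grant rules described for impersonation_chain can be easier to follow when written out. The sketch below is only an illustration of that description: the helper name `required_token_creator_grants` and the account strings are hypothetical and are not part of the provider. It lists which identity must hold the Service Account Token Creator role on which service account.

```python
from collections.abc import Sequence
from typing import Union


def required_token_creator_grants(
    originating_account: str,
    impersonation_chain: Union[str, Sequence[str], None],
) -> list[tuple[str, str]]:
    """Return (service_account, member) pairs.

    Each pair means: `member` needs the Service Account Token Creator
    role on `service_account`. Hypothetical helper, for illustration only.
    """
    if impersonation_chain is None:
        return []
    # A plain string is treated as a chain with a single account.
    if isinstance(impersonation_chain, str):
        chain = [impersonation_chain]
    else:
        chain = list(impersonation_chain)

    grants = []
    member = originating_account
    for account in chain:
        # Each account grants the role to the identity directly before it.
        grants.append((account, member))
        member = account
    return grants


# The last account in the chain is the one impersonated in the request.
print(required_token_creator_grants("origin", ["sa-a", "sa-b", "sa-c"]))
```

With a three-account chain, the first account grants the role to the originating account, and each later account grants it to the account before it.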
- class airflow.providers.google.cloud.operators.ray.RaySubmitJobOperator(cluster_address, entrypoint, get_job_logs=False, wait_for_job_done=False, runtime_env=None, metadata=None, submission_id=None, entrypoint_num_cpus=None, entrypoint_num_gpus=None, entrypoint_memory=None, entrypoint_resources=None, submit_job_timeout=60 * 30, *args, **kwargs)
Bases: RayJobBaseOperator
Submit and execute Job on Ray cluster.
When a job is submitted, it runs once to completion or failure. Retries or different runs with different parameters should be handled by the submitter. Jobs are bound to the lifetime of a Ray cluster, so if the cluster goes down, all running jobs on that cluster will be terminated.
- Parameters:
cluster_address (str) – Required. Either (1) the address of the Ray cluster, or (2) the HTTP address of the dashboard server on the head node, e.g. “http://<head-node-ip>:8265”. In case (1) it must be specified as an address that can be passed to ray.init(), e.g. a Ray Client address (ray://<head_node_host>:10001), or “auto”, or “localhost:<port>”.
entrypoint (str) – Required. The shell command to run for this job.
get_job_logs (bool | None) – If set to True, the operator will wait until the end of Job execution and output the logs.
wait_for_job_done (bool | None) – If set to True, the operator will wait until the end of Job execution. Note that if this parameter is set to False and the Job fails during execution, the failure will not be indicated.
submission_id (str | None) – A unique ID for this job.
runtime_env (dict[str, Any] | None) – The runtime environment to install and run this job in.
metadata (dict[str, str] | None) – Arbitrary data to store along with this job.
entrypoint_num_cpus (int | float | None) – The quantity of CPU cores to reserve for the execution of the entrypoint command, separately from any tasks or actors launched by it. Defaults to 0.
entrypoint_num_gpus (int | float | None) – The quantity of GPUs to reserve for the execution of the entrypoint command, separately from any tasks or actors launched by it. Defaults to 0.
entrypoint_memory (int | None) – The quantity of memory to reserve for the execution of the entrypoint command, separately from any tasks or actors launched by it. Defaults to 0.
entrypoint_resources (dict[str, float] | None) – The quantity of custom resources to reserve for the execution of the entrypoint command, separately from any tasks or actors launched by it.
gcp_conn_id – The connection ID to use when connecting to Google Cloud.
impersonation_chain – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
- template_fields: collections.abc.Sequence[str]
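As a rough sketch of how the parameters above might be combined, the example below collects keyword arguments for RaySubmitJobOperator and instantiates the operator in a separate function. The helper names, the task ID, the script name, and the runtime_env and metadata values are all illustrative assumptions. The operator import is deferred so that the argument-building part runs without Airflow; instantiation assumes a Google provider version that ships this module.

```python
from typing import Any


def submit_job_kwargs(cluster_address: str, entrypoint: str) -> dict[str, Any]:
    """Collect keyword arguments for RaySubmitJobOperator (values are illustrative)."""
    return {
        "cluster_address": cluster_address,
        "entrypoint": entrypoint,
        # Per the parameter description, a failure is not indicated when
        # wait_for_job_done is False, so this sketch waits for completion.
        "wait_for_job_done": True,
        "get_job_logs": True,
        # Assumed example values; adjust to the job's real needs.
        "runtime_env": {"env_vars": {"EXAMPLE_FLAG": "1"}},
        "metadata": {"owner": "example-team"},
        "entrypoint_num_cpus": 1,
    }


def make_submit_task(task_id: str, **operator_kwargs: Any):
    """Instantiate the operator; requires the Google provider to be installed."""
    # Imported lazily so the helper above stays usable without Airflow.
    from airflow.providers.google.cloud.operators.ray import RaySubmitJobOperator

    return RaySubmitJobOperator(task_id=task_id, **operator_kwargs)


kwargs = submit_job_kwargs("http://<head-node-ip>:8265", "python my_script.py")
# Inside a DAG definition one might then call:
# submit = make_submit_task("submit_ray_job", **kwargs)
```

Because a submitted job runs once, retries with different parameters would be expressed by calling the helper again with new arguments rather than by the job itself.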
- class airflow.providers.google.cloud.operators.ray.RayStopJobOperator(cluster_address, job_id, *args, **kwargs)
Bases: RayJobBaseOperator
Stop Job on Ray cluster.
- Parameters:
cluster_address (str) – Required. Either (1) the address of the Ray cluster, or (2) the HTTP address of the dashboard server on the head node, e.g. “http://<head-node-ip>:8265”. In case (1) it must be specified as an address that can be passed to ray.init(), e.g. a Ray Client address (ray://<head_node_host>:10001), or “auto”, or “localhost:<port>”.
job_id (str) – Required. The job ID or submission ID for the job to be stopped.
gcp_conn_id – The connection ID to use when connecting to Google Cloud.
impersonation_chain – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
- template_fields: collections.abc.Sequence[str]
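The cluster_address parameter accepts several forms, which are easy to mix up. The following sketch sorts a value into the forms named in the parameter description. The function name and the returned labels are hypothetical, and the check is a plain string test rather than anything the operators are known to perform.

```python
def classify_cluster_address(address: str) -> str:
    """Label a cluster_address by the forms listed in the parameter description.

    Hypothetical helper: the labels are made up for this illustration.
    """
    if address == "auto":
        return "auto"
    if address.startswith("http://"):
        # HTTP address of the dashboard server on the head node.
        return "dashboard"
    if address.startswith("ray://"):
        # Ray Client address, as accepted by ray.init().
        return "ray_client"
    # Anything else is treated as a host:port style address.
    return "host_port"


for example in ("http://10.0.0.1:8265", "ray://head-node:10001", "auto", "localhost:6379"):
    print(example, "->", classify_cluster_address(example))
```

The addresses and ports in the loop are placeholders; only the 8265 and 10001 ports come from the examples in the parameter description.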
- class airflow.providers.google.cloud.operators.ray.RayDeleteJobOperator(cluster_address, job_id, *args, **kwargs)
Bases: RayJobBaseOperator
Delete Job on Ray cluster in a terminal state and all of its associated data.
If the job is not already in a terminal state, raises an error. This does not delete the job logs from disk. Submitting a job with the same submission ID as a previously deleted job is not supported and may lead to unexpected behavior.
- Parameters:
cluster_address (str) – Required. Either (1) the address of the Ray cluster, or (2) the HTTP address of the dashboard server on the head node, e.g. “http://<head-node-ip>:8265”. In case (1) it must be specified as an address that can be passed to ray.init(), e.g. a Ray Client address (ray://<head_node_host>:10001), or “auto”, or “localhost:<port>”.
job_id (str) – Required. The job ID or submission ID for the job to be deleted.
gcp_conn_id – The connection ID to use when connecting to Google Cloud.
impersonation_chain – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
- template_fields: collections.abc.Sequence[str]
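Since deletion raises an error for a job that is not in a terminal state, a pipeline may want to check the status before deleting. The sketch below assumes the status names used by Ray's Jobs API (PENDING, RUNNING, STOPPED, SUCCEEDED, FAILED), of which the last three are terminal. How the operators on this page expose the status is not described here, so the helpers take a plain string and their names are hypothetical.

```python
# Terminal statuses as named by Ray's Jobs API (an assumption of this sketch).
TERMINAL_STATUSES = frozenset({"SUCCEEDED", "FAILED", "STOPPED"})


def is_terminal(status: str) -> bool:
    """Return True if the job status is one from which deletion is allowed."""
    return status.upper() in TERMINAL_STATUSES


def next_action(status: str) -> str:
    """Pick a follow-up step: delete terminal jobs, stop the others first."""
    return "delete" if is_terminal(status) else "stop_first"


for status in ("RUNNING", "SUCCEEDED"):
    print(status, "->", next_action(status))
```

In a DAG, a decision like this could route between a RayStopJobOperator task and a RayDeleteJobOperator task, for example through a branching task.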
- class airflow.providers.google.cloud.operators.ray.RayGetJobInfoOperator(cluster_address, job_id, *args, **kwargs)
Bases: RayJobBaseOperator
Get the latest status and other information associated with a Job on Ray cluster.
- Parameters:
cluster_address (str) – Required. Either (1) the address of the Ray cluster, or (2) the HTTP address of the dashboard server on the head node, e.g. “http://<head-node-ip>:8265”. In case (1) it must be specified as an address that can be passed to ray.init(), e.g. a Ray Client address (ray://<head_node_host>:10001), or “auto”, or “localhost:<port>”.
job_id (str) – Required. The job ID or submission ID for the job to retrieve information about.
gcp_conn_id – The connection ID to use when connecting to Google Cloud.
impersonation_chain – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
- template_fields: collections.abc.Sequence[str]
- class airflow.providers.google.cloud.operators.ray.RayListJobsOperator(cluster_address, *args, **kwargs)
Bases: RayJobBaseOperator
List all jobs along with their status and other information.
Lists all jobs that have ever run on the cluster, including jobs that are currently running and jobs that are no longer running.
- Parameters:
cluster_address (str) – Required. Either (1) the address of the Ray cluster, or (2) the HTTP address of the dashboard server on the head node, e.g. “http://<head-node-ip>:8265”. In case (1) it must be specified as an address that can be passed to ray.init(), e.g. a Ray Client address (ray://<head_node_host>:10001), or “auto”, or “localhost:<port>”.
gcp_conn_id – The connection ID to use when connecting to Google Cloud.
impersonation_chain – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
- template_fields: collections.abc.Sequence[str]