airflow.providers.alibaba.cloud.operators.analyticdb_spark¶
Module Contents¶
Classes¶
- AnalyticDBSparkBaseOperator – Abstract base class that defines how users develop AnalyticDB Spark.
- AnalyticDBSparkSQLOperator – Submits a Spark SQL application to the underlying cluster; wraps the AnalyticDB Spark REST API.
- AnalyticDBSparkBatchOperator – Submits a Spark batch application to the underlying cluster; wraps the AnalyticDB Spark REST API.
- class airflow.providers.alibaba.cloud.operators.analyticdb_spark.AnalyticDBSparkBaseOperator(*, adb_spark_conn_id='adb_spark_default', region=None, polling_interval=0, **kwargs)[source]¶
Bases:
airflow.models.BaseOperator
Abstract base class that defines how users develop AnalyticDB Spark.
- execute(context)[source]¶
Derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.
- poll_for_termination(app_id)[source]¶
Poll for Spark application termination.
- Parameters
app_id (str) – ID of the Spark application to monitor
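A concrete operator built on this base class is expected to submit an application in execute() and then wait for it with poll_for_termination(). The sketch below illustrates that pattern only; the submit_application() helper is hypothetical and stands in for whatever submission call a real operator makes, while poll_for_termination and the constructor arguments are the ones documented above.

```python
from __future__ import annotations

from airflow.providers.alibaba.cloud.operators.analyticdb_spark import (
    AnalyticDBSparkBaseOperator,
)


class MyAnalyticDBSparkOperator(AnalyticDBSparkBaseOperator):
    """Minimal sketch of a concrete operator derived from the abstract base class."""

    def __init__(self, *, cluster_id: str, rg_name: str, **kwargs):
        super().__init__(**kwargs)
        self.cluster_id = cluster_id
        self.rg_name = rg_name

    def submit_application(self) -> str:
        # Placeholder: a real operator would submit the Spark application here
        # (e.g. via the AnalyticDB Spark hook) and return its application ID.
        raise NotImplementedError

    def execute(self, context):
        app_id = self.submit_application()
        # Block until the application reaches a terminal state; the polling
        # cadence is governed by the polling_interval constructor argument.
        self.poll_for_termination(app_id)
```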
- class airflow.providers.alibaba.cloud.operators.analyticdb_spark.AnalyticDBSparkSQLOperator(*, sql, conf=None, driver_resource_spec=None, executor_resource_spec=None, num_executors=None, name=None, cluster_id, rg_name, **kwargs)[source]¶
Bases:
AnalyticDBSparkBaseOperator
Submits a Spark SQL application to the underlying cluster; wraps the AnalyticDB Spark REST API.
- Parameters
sql (str) – The SQL query to execute.
conf (dict[Any, Any] | None) – Spark configuration properties.
driver_resource_spec (str | None) – The resource specifications of the Spark driver.
executor_resource_spec (str | None) – The resource specifications of each Spark executor.
num_executors (int | str | None) – number of executors to launch for this application.
name (str | None) – name of this application.
cluster_id (str) – The cluster ID of AnalyticDB MySQL 3.0 Data Lakehouse.
rg_name (str) – The name of the resource group in the AnalyticDB MySQL 3.0 Data Lakehouse cluster.
- template_fields: collections.abc.Sequence[str] = ('spark_params',)[source]¶
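A minimal usage sketch follows, assuming an Airflow connection named adb_spark_default already exists; the DAG id, cluster ID, and resource group name are placeholders, and only constructor arguments documented above are used.

```python
from __future__ import annotations

import datetime

from airflow import DAG
from airflow.providers.alibaba.cloud.operators.analyticdb_spark import (
    AnalyticDBSparkSQLOperator,
)

with DAG(
    dag_id="adb_spark_sql_example",  # hypothetical DAG id
    start_date=datetime.datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    run_sql = AnalyticDBSparkSQLOperator(
        task_id="run_spark_sql",
        sql="SHOW DATABASES;",
        cluster_id="amv-example-cluster",  # placeholder AnalyticDB MySQL cluster ID
        rg_name="spark",                   # placeholder resource group name
        driver_resource_spec="medium",     # assumed valid resource specification
        executor_resource_spec="medium",
        num_executors=2,
    )
```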
- class airflow.providers.alibaba.cloud.operators.analyticdb_spark.AnalyticDBSparkBatchOperator(*, file, class_name=None, args=None, conf=None, jars=None, py_files=None, files=None, driver_resource_spec=None, executor_resource_spec=None, num_executors=None, archives=None, name=None, cluster_id, rg_name, **kwargs)[source]¶
Bases:
AnalyticDBSparkBaseOperator
Submits a Spark batch application to the underlying cluster; wraps the AnalyticDB Spark REST API.
- Parameters
file (str) – path of the file containing the application to execute.
class_name (str | None) – name of the application Java/Spark main class.
args (collections.abc.Sequence[str | int | float] | None) – application command line arguments.
conf (dict[Any, Any] | None) – Spark configuration properties.
jars (collections.abc.Sequence[str] | None) – jars to be used in this application.
py_files (collections.abc.Sequence[str] | None) – python files to be used in this application.
files (collections.abc.Sequence[str] | None) – files to be used in this application.
driver_resource_spec (str | None) – The resource specifications of the Spark driver.
executor_resource_spec (str | None) – The resource specifications of each Spark executor.
num_executors (int | str | None) – number of executors to launch for this application.
archives (collections.abc.Sequence[str] | None) – archives to be used in this application.
name (str | None) – name of this application.
cluster_id (str) – The cluster ID of AnalyticDB MySQL 3.0 Data Lakehouse.
rg_name (str) – The name of the resource group in the AnalyticDB MySQL 3.0 Data Lakehouse cluster.
- template_fields: collections.abc.Sequence[str] = ('spark_params',)[source]¶
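A minimal usage sketch for the batch operator, again with placeholder identifiers; the application file path and main class name are illustrative assumptions, and only constructor arguments documented above are used.

```python
from __future__ import annotations

import datetime

from airflow import DAG
from airflow.providers.alibaba.cloud.operators.analyticdb_spark import (
    AnalyticDBSparkBatchOperator,
)

with DAG(
    dag_id="adb_spark_batch_example",  # hypothetical DAG id
    start_date=datetime.datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    spark_pi = AnalyticDBSparkBatchOperator(
        task_id="spark_pi",
        file="local:///tmp/spark-examples.jar",          # placeholder path to the application jar
        class_name="org.apache.spark.examples.SparkPi",  # illustrative main class
        args=["10"],
        cluster_id="amv-example-cluster",                # placeholder AnalyticDB MySQL cluster ID
        rg_name="spark",                                 # placeholder resource group name
        num_executors=2,
    )
```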