airflow.providers.apache.spark.hooks.spark_pipelines¶
Exceptions¶
SparkPipelinesException | Exception raised when the spark-pipelines command fails.
Classes¶
SparkPipelinesHook | Hook for interacting with Spark Declarative Pipelines via the spark-pipelines CLI.
Module Contents¶
- exception airflow.providers.apache.spark.hooks.spark_pipelines.SparkPipelinesException[source]¶
Bases: airflow.providers.common.compat.sdk.AirflowException
Exception raised when the spark-pipelines command fails.
- class airflow.providers.apache.spark.hooks.spark_pipelines.SparkPipelinesHook(pipeline_spec=None, pipeline_command='run', **kwargs)[source]¶
Bases: airflow.providers.apache.spark.hooks.spark_submit.SparkSubmitHook
Hook for interacting with Spark Declarative Pipelines via the spark-pipelines CLI.
Extends SparkSubmitHook to leverage existing connection management while providing pipeline-specific functionality.
Two connection modes are supported:
- Legacy spark-submit-style (spark/yarn/k8s connection types): invokes the spark-pipelines launcher with --master, --deploy-mode, and the rest of the standard cluster-manager flags assembled by SparkSubmitHook.
- Spark Connect (spark_connect connection type, Spark 4.x+): sets SPARK_REMOTE from the connection's sc:// URI and invokes the Connect-native pyspark.pipelines.cli Python module directly. The cluster-manager flags are not emitted: the Connect-native CLI rejects them with SparkException: Remote cannot be specified with master and/or deploy mode, and the spark-pipelines bash launcher itself starts a JVM SparkContext that collides with the Connect daemon's gRPC port.
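The dispatch between the two modes described above can be sketched as follows. This is a simplified illustration, not the hook's actual implementation: the function name build_pipeline_command, the --spec flag handling, and the example host values are assumptions made for the sketch.

```python
import os


def build_pipeline_command(conn_type, host, pipeline_command="run", pipeline_spec=None):
    """Illustrative sketch: return (env, argv) for a given connection mode."""
    env = dict(os.environ)
    if conn_type == "spark_connect":
        # Spark Connect mode: export SPARK_REMOTE and call the Connect-native
        # CLI module directly. No --master/--deploy-mode flags are emitted,
        # since the Connect-native CLI rejects them.
        env["SPARK_REMOTE"] = host  # e.g. "sc://spark-connect-server:15002"
        argv = ["python", "-m", "pyspark.pipelines.cli", pipeline_command]
    else:
        # Legacy spark-submit-style mode: the spark-pipelines launcher takes
        # the standard cluster-manager flags assembled by SparkSubmitHook.
        argv = ["spark-pipelines", pipeline_command, "--master", host]
    if pipeline_spec:
        # --spec is shown here as an assumed way to pass the pipeline spec file.
        argv += ["--spec", pipeline_spec]
    return env, argv


# Connect mode: SPARK_REMOTE set, no cluster-manager flags.
env, argv = build_pipeline_command("spark_connect", "sc://localhost:15002")

# Legacy mode: launcher invoked with --master.
env2, argv2 = build_pipeline_command("spark", "yarn", pipeline_spec="pipeline.yml")
```

In Connect mode the sketch leaves argv free of --master and --deploy-mode entirely, mirroring why the hook bypasses the bash launcher: the launcher would start its own JVM SparkContext alongside the Connect daemon.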
- Parameters: