airflow.providers.databricks.sensors.databricks_partition

This module contains Databricks sensors.

Module Contents

Classes

DatabricksPartitionSensor

Sensor to detect the presence of table partitions in Databricks.

class airflow.providers.databricks.sensors.databricks_partition.DatabricksPartitionSensor(*, databricks_conn_id=DatabricksSqlHook.default_conn_name, http_path=None, sql_warehouse_name=None, session_configuration=None, http_headers=None, catalog='', schema='default', table_name, partitions, partition_operator='=', handler=fetch_all_handler, client_parameters=None, **kwargs)[source]

Bases: airflow.sensors.base.BaseSensorOperator

Sensor to detect the presence of table partitions in Databricks.

Parameters
  • databricks_conn_id (str) – Reference to Databricks connection id (templated), defaults to DatabricksSqlHook.default_conn_name.

  • sql_warehouse_name (str | None) – Optional name of Databricks SQL warehouse. If not specified, http_path must be provided as described below. Defaults to None.

  • http_path (str | None) – Optional string specifying HTTP path of Databricks SQL warehouse or All Purpose cluster. If not specified, it should be either specified in the Databricks connection’s extra parameters, or sql_warehouse_name must be specified.

  • session_configuration – An optional dictionary of Spark session parameters. If not specified, it can be specified in the Databricks connection’s extra parameters. Defaults to None.

  • http_headers (list[tuple[str, str]] | None) – An optional list of (k, v) pairs that will be set as HTTP headers on every request. (templated).

  • catalog (str) – An optional initial catalog to use. Requires Databricks Runtime version 9.0+ (templated), defaults to ""

  • schema (str) – An optional initial schema to use. Requires Databricks Runtime version 9.0+ (templated), defaults to “default”

  • table_name (str) – Name of the table whose partitions are checked.

  • partitions (dict) – Name of the partitions to check. Example: {"date": "2023-01-03", "name": ["abc", "def"]}

  • partition_operator (str) – Optional comparison operator for partitions, such as >=.

  • handler (Callable[[Any], Any]) – Handler for DbApiHook.run() to return results, defaults to fetch_all_handler

  • client_parameters (dict[str, Any] | None) – Additional parameters passed to the Databricks SQL connector.
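To illustrate how the partitions dict and partition_operator fit together, here is a simplified, hypothetical sketch (not the provider's actual implementation) of turning {"date": "2023-01-03", "name": ["abc", "def"]} into a SQL predicate; the function name is illustrative only:

```python
def build_partition_predicate(partitions: dict, partition_operator: str = "=") -> str:
    """Hypothetical sketch: render a partitions dict as a SQL WHERE predicate."""
    clauses = []
    for column, value in partitions.items():
        if isinstance(value, (list, tuple)):
            # A list of values becomes an IN clause.
            quoted = ", ".join(f"'{v}'" for v in value)
            clauses.append(f"{column} IN ({quoted})")
        elif isinstance(value, (int, float)):
            # Numeric values are compared without quoting.
            clauses.append(f"{column} {partition_operator} {value}")
        else:
            # Strings are quoted and compared with the chosen operator.
            clauses.append(f"{column} {partition_operator} '{value}'")
    return " AND ".join(clauses)

print(build_partition_predicate({"date": "2023-01-03", "name": ["abc", "def"]}))
# date = '2023-01-03' AND name IN ('abc', 'def')
```

This also shows why partition_operator matters: with partition_operator=">=", a numeric partition such as {"year": 2023} would match 2023 and later partitions rather than only an exact value.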

template_fields: collections.abc.Sequence[str] = ('databricks_conn_id', 'catalog', 'schema', 'table_name', 'partitions', 'http_headers')[source]
template_ext: collections.abc.Sequence[str] = ('.sql',)[source]
template_fields_renderers[source]
poke(context)[source]

Check the table partitions and return the results.
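Conceptually, poke() queries the table with the partition predicate and reports success when at least one row comes back. A minimal sketch of that truthiness contract, assuming a fetch callable standing in for the SQL warehouse call (all names here are illustrative, not the provider's internals):

```python
from typing import Any, Callable

def poke_sketch(fetch_rows: Callable[[str], list[Any]], table_name: str, predicate: str) -> bool:
    """Hypothetical sketch of the poke contract: true iff matching rows exist."""
    sql = f"SELECT 1 FROM {table_name} WHERE {predicate} LIMIT 1"
    rows = fetch_rows(sql)
    # The sensor succeeds when the partition query returns any rows.
    return bool(rows)

# Stub standing in for a warehouse that has the partition:
print(poke_sketch(lambda sql: [(1,)], "sales.events", "date = '2023-01-03'"))
# Stub standing in for a warehouse that does not:
print(poke_sketch(lambda sql: [], "sales.events", "date = '2099-01-01'"))
```

As with any BaseSensorOperator, a falsy poke() result means the sensor reschedules and tries again until it succeeds or times out.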
