airflow.providers.databricks.utils.openlineage

Attributes

log

Functions

emit_openlineage_events_for_databricks_queries(...[, ...])

Emit OpenLineage events for executed Databricks queries.

Module Contents

airflow.providers.databricks.utils.openlineage.log
airflow.providers.databricks.utils.openlineage.emit_openlineage_events_for_databricks_queries(query_ids, query_source_namespace, task_instance, hook=None, additional_run_facets=None, additional_job_facets=None)

Emit OpenLineage events for executed Databricks queries.

Metadata retrieval from Databricks is attempted only if a DatabricksSqlHook is provided. If metadata is available, execution details such as start time, end time, execution status, error messages, and SQL text are included in the events. If no metadata is found, the function defaults to using the Airflow task instance’s state and the current timestamp.

Note that the START and COMPLETE events for each query are emitted at the same time. If Databricks can be queried for query execution metadata, the event times correspond to the actual query execution times.
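The fallback described above can be sketched in plain Python. This is an illustration only, not the provider's implementation: `QueryMetadata`, `resolve_event_details`, the status value "FINISHED", and the task state "success" are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class QueryMetadata:
    """Hypothetical container for execution details fetched from Databricks."""

    start_time: datetime
    end_time: datetime
    status: str  # assumed status values, e.g. "FINISHED" or "FAILED"
    error_message: Optional[str] = None
    sql_text: Optional[str] = None


def resolve_event_details(metadata, task_state, now=None):
    """Return (start_time, end_time, succeeded) for a single query.

    Hypothetical helper: prefers Databricks metadata when present and
    otherwise falls back to the task state and the current timestamp.
    """
    if metadata is not None:
        # Metadata found: event times follow the actual query execution.
        return metadata.start_time, metadata.end_time, metadata.status == "FINISHED"
    # No metadata: both events share the current timestamp, and the
    # outcome is taken from the Airflow task instance state (assumed "success").
    now = now or datetime.now(timezone.utc)
    return now, now, task_state == "success"
```

In this sketch the start and end times are resolved together, mirroring the note that both events for a query are emitted at the same moment, after the query has run.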

Args:

query_ids: A list of Databricks query IDs to emit events for.
query_source_namespace: The namespace to be included in ExternalQueryRunFacet.
task_instance: The Airflow task instance that ran these queries.
hook: A hook instance used to retrieve query metadata if available.
additional_run_facets: Additional run facets to include in OpenLineage events.
additional_job_facets: Additional job facets to include in OpenLineage events.
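A call might look like the following minimal sketch. The parameter names come from the signature above; the helpers `build_emit_kwargs` and `emit_for_queries`, and any namespace string passed in, are hypothetical and exist only for illustration. The import is done lazily so the snippet loads even where the Databricks provider is not installed.

```python
def build_emit_kwargs(query_ids, query_source_namespace, task_instance, hook=None):
    """Hypothetical helper: collect the keyword arguments for the call."""
    return {
        "query_ids": list(query_ids),
        "query_source_namespace": query_source_namespace,
        "task_instance": task_instance,
        # Without a hook, no metadata retrieval from Databricks is attempted.
        "hook": hook,
        "additional_run_facets": None,
        "additional_job_facets": None,
    }


def emit_for_queries(**kwargs):
    """Hypothetical wrapper around the documented function."""
    # Lazy import: requires the Databricks provider package at call time.
    from airflow.providers.databricks.utils.openlineage import (
        emit_openlineage_events_for_databricks_queries,
    )

    return emit_openlineage_events_for_databricks_queries(**kwargs)
```

Passing a DatabricksSqlHook as `hook` is what enables the metadata lookup; leaving it as None means the events rely on the task instance state and the current timestamp.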
