Troubleshooting
Attention
OpenLineage is under active development. Before troubleshooting, upgrade to the latest provider and client versions and verify that the issue still occurs; it may already have been resolved in a newer release.
Update the provider and its core dependencies
Enable debugging
Inspect events locally
Perform a quick check of your setup
Check for reported bugs in OpenLineage provider and OpenLineage client
Before asking for help anywhere, gather all the information listed below that will help diagnose the issue
If you are still facing the issue, please open an issue on the provider repository.
If all else fails, you can try asking for help on the OpenLineage Slack channel (but remember that it is a community channel, not a support channel, and people are volunteering their time to help you).
1. Upgrade the provider and its core dependencies
Upgrade the OpenLineage provider and the OpenLineage client. To learn how the two differ, see OpenLineage provider vs client.
pip install --upgrade apache-airflow-providers-openlineage openlineage-python
Then verify the versions in use are the latest available:
pip show apache-airflow-providers-openlineage openlineage-python | cat
2. Enable debug settings
Enable debug logs by setting the logging level to DEBUG for both Airflow and the OpenLineage client:
In Airflow, set the logging_level configuration option to DEBUG, for example by exporting the environment variable AIRFLOW__LOGGING__LOGGING_LEVEL=DEBUG.
The OpenLineage client should automatically pick up the logging level from Airflow, but you can also set it explicitly by exporting the environment variable OPENLINEAGE_LOG_LEVEL=DEBUG.
Enable Debug Mode so that the DebugFacet (additional diagnostic info) is attached to events. It can drastically increase the size of the events and logs, so this should only be used temporarily.
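As a minimal sketch of the settings above, the following Python snippet lists which debug-related environment variables are still missing from the current environment. The first two variable names come from this guide; AIRFLOW__OPENLINEAGE__DEBUG_MODE is an assumption derived from Airflow's AIRFLOW__{SECTION}__{KEY} naming convention, and the helper name missing_debug_settings is hypothetical. Verify the names against the configuration reference for your provider version.

```python
import os

# Names from this guide, except AIRFLOW__OPENLINEAGE__DEBUG_MODE, which is
# assumed from Airflow's AIRFLOW__{SECTION}__{KEY} naming convention.
DEBUG_ENV = {
    "AIRFLOW__LOGGING__LOGGING_LEVEL": "DEBUG",
    "OPENLINEAGE_LOG_LEVEL": "DEBUG",
    "AIRFLOW__OPENLINEAGE__DEBUG_MODE": "true",
}


def missing_debug_settings(environ=None):
    """Return the debug variables that are unset or differ from the expected value."""
    environ = os.environ if environ is None else environ
    return {
        name: expected
        for name, expected in DEBUG_ENV.items()
        if environ.get(name, "").lower() != expected.lower()
    }


# Print the export statements that would still be needed in this shell.
for name, value in missing_debug_settings().items():
    print(f"export {name}={value}")
```

This only inspects environment variables; values set in airflow.cfg are not visible to it. Remember to remove the Debug Mode setting once the investigation is finished.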
3. Inspect events locally
With debug logs enabled, raw OpenLineage events are logged before they are emitted. Check the logs for OpenLineage events.
You can also use a simple transport, such as ConsoleTransport to print events to task logs or FileTransport to save events to JSON files, e.g.
[openlineage]
transport = {"type": "console"}
Or run Marquez locally to inspect whether events are emitted and received.
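Once events are saved to a file, a short script can show which jobs emitted which event types. The sketch below assumes the events were written one JSON object per line (newline-delimited JSON); if your transport writes one file per event, adapt the reading logic. It relies only on the eventType and job.name fields of the OpenLineage event model, and the function name summarize_events is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_events(path):
    """Count events per (job name, eventType) in a newline-delimited JSON file."""
    counts = Counter()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between events
        event = json.loads(line)
        job = event.get("job", {}).get("name", "<unknown job>")
        counts[(job, event.get("eventType", "<no eventType>"))] += 1
    return counts
```

A job that has a START event but no matching COMPLETE or FAIL event is a useful lead when narrowing down where emission stops.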
4. Perform a quick check of your setup
Review the provider and client documentation; configuration options or behavior may have changed since you last set things up.
Configuration present: Ensure a working transport is configured. See Transport.
Disabled settings: Verify you did not disable the integration globally via Disabled or selectively via Disabled for operators or Selective Enable policy.
Extraction precedence: If inputs/outputs are missing, remember the order described in Extraction precedence.
Custom extractors registration: If using custom extractors, confirm they are registered via Extractors and importable by both Scheduler and Workers.
Environment variables: For legacy environments, note the backwards-compatibility env vars in Backwards Compatibility (e.g., OPENLINEAGE_URL), but prefer Airflow config.
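Part of the checklist above can be automated for deployments that configure the integration through environment variables. The following is a sketch under stated assumptions: the AIRFLOW__OPENLINEAGE__* names are derived from Airflow's AIRFLOW__{SECTION}__{KEY} convention, OPENLINEAGE_DISABLED is assumed to be one of the legacy variables, the set of truthy strings is a guess, and check_setup is a hypothetical helper. Values set in airflow.cfg are not inspected.

```python
import json
import os

TRUTHY = {"1", "true", "t", "yes"}  # assumed spellings of "enabled"


def check_setup(environ=None):
    """Return human-readable findings about OpenLineage-related env variables."""
    environ = os.environ if environ is None else environ
    findings = []

    disabled = [
        name
        for name in ("AIRFLOW__OPENLINEAGE__DISABLED", "OPENLINEAGE_DISABLED")
        if environ.get(name, "").strip().lower() in TRUTHY
    ]
    if disabled:
        findings.append("integration disabled via " + ", ".join(disabled))

    transport = environ.get("AIRFLOW__OPENLINEAGE__TRANSPORT", "").strip()
    if transport:
        try:
            parsed = json.loads(transport)
        except json.JSONDecodeError as exc:
            findings.append(f"transport is not valid JSON: {exc.msg}")
        else:
            if not isinstance(parsed, dict) or "type" not in parsed:
                findings.append('transport JSON has no "type" key')
    elif environ.get("AIRFLOW__OPENLINEAGE__CONFIG_PATH"):
        pass  # transport is expected to come from the referenced config file
    elif environ.get("OPENLINEAGE_URL"):
        findings.append("only legacy OPENLINEAGE_URL is set; prefer Airflow config")
    else:
        findings.append("no transport found in environment variables")
    return findings
```

An empty list means nothing suspicious was found in the environment; it does not prove that the transport actually works.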
5. Check for common symptoms and fixes
No events emitted at all:
Ensure the provider is installed and at a supported Airflow version (see provider “Requirements”).
Check Disabled is not set to true.
If using selective enablement, verify Selective Enable and that the DAG/task is enabled via enable_lineage.
Confirm the OpenLineage plugin/listener is loaded in Scheduler/Worker logs.
Events emitted but not received by backend
Validate Transport or Config Path. See “Transport setup” in the configuration section and “Configuration precedence”.
Test with ConsoleTransport to rule out backend/network issues.
Verify network connectivity, auth configuration, and endpoint values.
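To separate network problems from configuration problems, it can help to test plain TCP reachability of the backend from the same host or container that runs the task. The sketch below uses only the Python standard library; the function names are hypothetical, and it assumes an HTTP(S) endpoint URL such as the one configured for your transport.

```python
import socket
from urllib.parse import urlsplit


def endpoint_host_port(url):
    """Extract (host, port) from an HTTP(S) endpoint URL, applying scheme defaults."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not an http(s) URL: {url!r}")
    return parts.hostname, parts.port or (443 if parts.scheme == "https" else 80)


def can_connect(url, timeout=3.0):
    """Return True if a TCP connection to the endpoint can be opened."""
    host, port = endpoint_host_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, DNS failure, or timeout
        return False
```

A successful connection only shows that the host and port are reachable. It does not validate TLS, authentication, or the endpoint path, so those still need to be checked separately.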
Inputs/Outputs missing
Review Extraction precedence and ensure either custom Extractor or Operator OpenLineage methods are implemented.
For methods, follow best practices: import OpenLineage-related objects inside the OpenLineage methods, not at module top level; avoid heavy work in execute that you need in _on_start.
For SQL-like operators, ensure relevant job IDs or runtime metadata are available to enrich lineage in _on_complete.
Custom Extractor not working
Confirm it’s listed under Extractors (or env var equivalent) and importable from both Scheduler and Workers.
Avoid cyclical imports: import from Airflow only within _execute_extraction/extract_on_complete/extract_on_failure, and guard type-only imports with typing.TYPE_CHECKING.
Unit test the Extractor to validate OperatorLineage contents; mock external calls. See example tests referenced in Custom Extractors.
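The importability requirement can be checked directly by running a small script on both a Scheduler and a Worker. This is a sketch with hypothetical helper names; it assumes the extractor list is a semicolon-separated string of full import paths, so confirm the expected format in the Extractors documentation before relying on it.

```python
import importlib


def check_importable(dotted_path):
    """Try to import 'package.module.ClassName'; return (ok, detail)."""
    module_name, _, attr = dotted_path.strip().rpartition(".")
    if not module_name:
        return False, f"{dotted_path!r} is not a full dotted path"
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # ImportError, or any error raised at import time
        return False, f"cannot import {module_name}: {exc!r}"
    if not hasattr(module, attr):
        return False, f"{module_name} has no attribute {attr}"
    return True, "ok"


def check_extractors(setting):
    """Check each entry of a semicolon-separated extractor list (assumed format)."""
    paths = [p for p in setting.split(";") if p.strip()]
    return {p.strip(): check_importable(p) for p in paths}
```

If a path imports on the Scheduler but fails on a Worker, the two environments most likely differ in installed packages or in PYTHONPATH. A cyclical import usually shows up here as an ImportError raised while the extractor module itself is loading.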
Custom Run Facets not present
Register functions via Custom Run Facets.
Function signature must accept TaskInstance and TaskInstanceState and return dict[str, RunFacet] or None.
Avoid duplicate facet keys across functions; duplicates lead to non-deterministic selection.
Functions execute on START and COMPLETE/FAIL.
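Duplicate facet keys are easy to miss when several functions are registered. The sketch below shows one way to detect them in a unit test. The three facet functions are hypothetical, and plain dicts stand in for RunFacet objects so that the example runs without Airflow installed; in a real test you would pass your registered functions together with a mocked task instance and state.

```python
from collections import defaultdict


def find_duplicate_facet_keys(functions, task_instance=None, ti_state=None):
    """Map each facet key returned by more than one function to those functions' names."""
    producers = defaultdict(list)
    for func in functions:
        facets = func(task_instance, ti_state) or {}  # None means "no facets"
        for key in facets:
            producers[key].append(func.__name__)
    return {key: names for key, names in producers.items() if len(names) > 1}


# Hypothetical facet functions; plain dicts stand in for RunFacet objects.
def team_facet(task_instance, ti_state):
    return {"team": {"name": "data-platform"}}


def owner_facet(task_instance, ti_state):
    return {"team": {"name": "analytics"}, "owner": {"email": "owner@example.com"}}


def empty_facet(task_instance, ti_state):
    return None


print(find_duplicate_facet_keys([team_facet, owner_facet, empty_facet]))
```

Here both team_facet and owner_facet produce the key "team", so the helper reports that key along with the names of the two functions.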
Spark jobs missing parent linkage or transport settings
If any spark.openlineage.parent* or spark.openlineage.transport* properties are explicitly set in the Spark job config, the integration will not override them.
If supported by your Operator, enable Automatic Parent Job Information Injection and Automatic Transport Information Injection.
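Because explicitly set properties take precedence, it is worth checking a job's Spark configuration for them when injected values seem to be ignored. A minimal sketch, assuming the configuration is available as a plain dict of property names to values; the function name is hypothetical.

```python
BLOCKING_PREFIXES = {
    "parent": "spark.openlineage.parent",
    "transport": "spark.openlineage.transport",
}


def explicit_openlineage_properties(spark_conf):
    """Group explicitly set Spark properties by the injection they would block."""
    found = {kind: [] for kind in BLOCKING_PREFIXES}
    for key in sorted(spark_conf):
        for kind, prefix in BLOCKING_PREFIXES.items():
            if key.startswith(prefix):
                found[kind].append(key)
    return found
```

A non-empty list under "parent" or "transport" means the corresponding values in the job config are used as-is, and automatic injection will not replace them.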
Very large event payloads or serialization failures
If Include Full Task Info is enabled, events may become large; consider disabling or trimming task parameters.
Disable Source Code can reduce payloads for Python/Bash operators that include source code by default.
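To find out what makes an event large, you can measure each facet of a captured event. The sketch below assumes the event has already been loaded into a Python dict (for example from a file or a log line) and relies only on the run.facets and job.facets locations of the OpenLineage event model; the function name is hypothetical.

```python
import json


def facet_sizes(event):
    """Return (location, facet name, serialized bytes) tuples, largest first."""
    sizes = []
    for location in ("run", "job"):
        facets = (event.get(location) or {}).get("facets") or {}
        for name, body in facets.items():
            size = len(json.dumps(body).encode("utf-8"))
            sizes.append((location, name, size))
    return sorted(sizes, key=lambda item: item[2], reverse=True)
```

The first entries of the result indicate which facets dominate the payload, which helps in deciding which of the options above is worth changing.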
6. Check for open bugs and issues in the provider and the client
Search the issue trackers of both the provider and the client for open bugs that match your symptoms.
7. Gather crucial information
Airflow scheduler logs (with log level set to DEBUG, see Step 2 above)
Airflow worker (task) logs (with log level set to DEBUG, see Step 2 above)
OpenLineage events (with Debug Mode enabled)
Airflow version, OpenLineage provider version and OpenLineage client version
Details on any custom deployment/environment modifications
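The version details from the list above can be collected with a short script run in the same environment as Airflow. This sketch uses only the standard library; the distribution name apache-airflow is an assumption about how Airflow was installed (the other two names come from the pip commands in Step 1), and the helper names are hypothetical.

```python
import platform
from importlib import metadata

PACKAGES = (
    "apache-airflow",  # assumed distribution name for Airflow itself
    "apache-airflow-providers-openlineage",
    "openlineage-python",
)


def installed_version(package):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=PACKAGES):
    """Build a short plain-text report suitable for pasting into an issue."""
    lines = [
        f"Python: {platform.python_version()}",
        f"Platform: {platform.platform()}",
    ]
    for package in packages:
        lines.append(f"{package}: {installed_version(package) or 'not installed'}")
    return "\n".join(lines)


print(environment_report())
```

Run it on both the Scheduler and a Worker if they use different images or virtual environments, since version mismatches between the two are a common source of confusing behavior.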
8. Open an issue on the provider repository
If you are still facing the issue, please open an issue on the provider repository and include all the information gathered in the previous step, together with a simple example that reproduces the issue. Do not paste your entire codebase; try to come up with minimal code that demonstrates the problem, as this increases the chances of the bug getting fixed quickly.
9. Ask for help on the OpenLineage Slack channel
If all else fails, you can try asking for help on the OpenLineage Slack channel (but remember that it is a community channel, not a support channel, and people are volunteering their time to help you).