Advanced logging configuration¶
Not all configuration options are available from the airflow.cfg file. The config file describes
how to configure logging for tasks, because the logs generated by tasks are not only written to separate
files by default but also have to be accessible via the webserver.
By default, standard Airflow component logs are written to the $AIRFLOW_HOME/logs directory, but you
can customize this by overriding the Python logger configuration with a custom logging configuration object.
You can also create and use a logging configuration for specific operators and tasks.
Some configuration options require that the logging config class be overridden. You can do this by copying Airflow's default configuration and modifying it to suit your needs.
The default configuration can be found in the airflow_local_settings.py template, which shows the loggers and handlers used by default.
See Configuring local settings for details on how to configure local settings.
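If you only want to see which loggers and handlers the template defines, one quick way (assuming Airflow is importable in your current environment) is to print them from the shipped default configuration:
# Locate the airflow_local_settings.py template on disk and list the logger and
# handler names it defines (assumes Airflow is installed in this environment).
from airflow.config_templates import airflow_local_settings
from airflow.config_templates.airflow_local_settings import DEFAULT_LOGGING_CONFIG

print(airflow_local_settings.__file__)
print(sorted(DEFAULT_LOGGING_CONFIG["loggers"]))
print(sorted(DEFAULT_LOGGING_CONFIG["handlers"]))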
Apart from the custom loggers and handlers configurable there via airflow.cfg,
logging in Airflow follows the usual Python logging convention:
Python objects log to loggers whose names follow the <package>.<module_name> convention.
You can read more about the standard Python logging classes (Loggers, Handlers, Formatters) in the Python logging documentation.
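For example, the standard module-level logger pattern produces names that match this convention; the module path used below (my_package/my_module.py) is purely hypothetical:
# Hypothetical file my_package/my_module.py: here __name__ is "my_package.my_module",
# so records from this logger can be targeted by a "my_package.my_module" entry
# in a logging configuration.
import logging

logger = logging.getLogger(__name__)


def do_work():
    logger.info("doing work")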
Create a custom logging class¶
Configuring your logging classes can be done via the logging_config_class option in the airflow.cfg file.
This configuration should specify the import path to a configuration compatible with logging.config.dictConfig().
If the file containing your configuration is not already on Python's import path, set the PYTHONPATH
environment variable so that it can be imported.
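For reference, a dictConfig()-compatible configuration is just a plain dictionary in the format understood by the standard library. The minimal, Airflow-agnostic sketch below only illustrates that shape; the steps that follow build the real object by deep-copying Airflow's defaults instead:
import logging.config

# Generic illustration of the dictConfig format only; Airflow's real default
# configuration is much larger and should be deep-copied as shown in the steps below.
EXAMPLE_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {"simple": {"format": "%(asctime)s %(levelname)s - %(message)s"}},
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "simple"},
    },
    "loggers": {
        "my_package": {"handlers": ["console"], "level": "INFO", "propagate": False},
    },
}

logging.config.dictConfig(EXAMPLE_CONFIG)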
Follow the steps below to enable a custom logging config class:

1. Start by setting the PYTHONPATH environment variable to a known directory, e.g. ~/airflow/:

    export PYTHONPATH=~/airflow/

2. Create a directory to store the config file, e.g. ~/airflow/config.

3. Create a file called ~/airflow/config/log_config.py with the following contents:

    from copy import deepcopy

    from airflow.config_templates.airflow_local_settings import DEFAULT_LOGGING_CONFIG

    LOGGING_CONFIG = deepcopy(DEFAULT_LOGGING_CONFIG)

4. At the end of the file, add code to modify the default dictionary configuration (a sketch follows these steps).

5. Update $AIRFLOW_HOME/airflow.cfg to contain:

    [logging]
    logging_config_class = log_config.LOGGING_CONFIG

    You can also use logging_config_class together with remote logging if you just plan to extend/update
    the configuration with remote logging enabled. In that case the deep-copied dictionary will contain the remote logging
    configuration generated for you, and your modifications will apply after the remote logging configuration has
    been added:

    [logging]
    remote_logging = True
    logging_config_class = log_config.LOGGING_CONFIG

6. Restart the application.
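As an illustration of step 4, a minimal sketch of such a modification, appended to the log_config.py created above, could look like this; it assumes the default configuration keeps its usual "airflow.task" logger and "console" handler entries and simply adjusts their levels:
# Appended to ~/airflow/config/log_config.py, after the deepcopy above.
# Assumes the default template defines an "airflow.task" logger and a "console"
# handler, as the shipped airflow_local_settings.py template does.
LOGGING_CONFIG["loggers"]["airflow.task"]["level"] = "DEBUG"  # more verbose task logs
LOGGING_CONFIG["handlers"]["console"]["level"] = "WARNING"    # quieter console output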
See Modules Management for details on how Python and Airflow manage modules.
Note
You can override the way both the standard component logs and the “task” logs are handled.
Custom logger for Operators, Hooks and Tasks¶
You can create custom logging handlers and apply them to specific Operators, Hooks and tasks. By default, the Operator
and Hook loggers are children of the airflow.task logger: they follow the naming conventions
airflow.task.operators.<package>.<module_name> and airflow.task.hooks.<package>.<module_name>, respectively. After
creating a custom logging class,
you can assign specific loggers to them.
Example of custom logging for the SQLExecuteQueryOperator and the HttpHook:
from copy import deepcopy

from pydantic.utils import deep_update

from airflow.config_templates.airflow_local_settings import DEFAULT_LOGGING_CONFIG

LOGGING_CONFIG = deep_update(
    deepcopy(DEFAULT_LOGGING_CONFIG),
    {
        "loggers": {
            "airflow.task.operators.airflow.providers.common.sql.operators.sql.SQLExecuteQueryOperator": {
                "handlers": ["task"],
                "level": "DEBUG",
                "propagate": True,
            },
            "airflow.task.hooks.airflow.providers.http.hooks.http.HttpHook": {
                "handlers": ["task"],
                "level": "WARNING",
                "propagate": False,
            },
        }
    },
)
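In the example above, propagate follows standard Python logging semantics: with propagate set to True, records from the operator’s logger are also passed to the handlers of its parent airflow.task logger, while setting it to False (as for the HttpHook entry) stops records from bubbling up beyond the handlers listed for that logger.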
You can also set a custom logger name for a Dag’s task with the logger_name attribute. This can be useful if multiple tasks
use the same Operator, but you want to disable logging for some of them.
Example of custom logger name:
# In your Dag file
SQLExecuteQueryOperator(..., logger_name="sql.big_query")

# In your custom `log_config.py`
LOGGING_CONFIG = deep_update(
    deepcopy(DEFAULT_LOGGING_CONFIG),
    {
        "loggers": {
            "airflow.task.operators.sql.big_query": {
                "handlers": ["task"],
                "level": "WARNING",
                "propagate": True,
            },
        }
    },
)
If you want to limit the size of the task logs, you can add the handlers.task.max_bytes parameter.
Example of limiting the size of task logs:
from copy import deepcopy

from pydantic.utils import deep_update

from airflow.config_templates.airflow_local_settings import DEFAULT_LOGGING_CONFIG

LOGGING_CONFIG = deep_update(
    deepcopy(DEFAULT_LOGGING_CONFIG),
    {
        "handlers": {
            "task": {"max_bytes": 104857600, "backup_count": 1}  # 100 MB per file, keep 1 rotated log.
        }
    },
)