Object Storage State Store Backend

The default state store backend is MetastoreStateBackend, which persists task and asset state in the Airflow metadata database via the API Server’s Execution API. For larger values, you may want to store state on object storage directly from the task instead.

To store task and asset state in object storage, set state_store_backend in the [workers] section to airflow.providers.common.io.state_store.backend.StateStoreObjectStorageBackend, and set state_store_objectstorage_path in the [common.io] section to the desired base location. The connection ID is taken from the user part of the URL, for example state_store_objectstorage_path = s3://conn_id@mybucket/task-state/.
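To make the URL convention concrete, the sketch below splits a base path into its parts using only the Python standard library. It is an illustration, not the provider's own parsing code; the function name split_state_store_path is hypothetical.

```python
from __future__ import annotations

from urllib.parse import urlsplit


def split_state_store_path(url: str) -> tuple[str, str | None, str, str]:
    """Split a base path into (scheme, conn_id, bucket, prefix).

    Hypothetical helper for illustration; the backend may parse differently.
    """
    parts = urlsplit(url)
    # The user part of the URL (before "@") carries the connection ID.
    # It is None when the URL has no user part, e.g. file:///...
    conn_id = parts.username
    # Read the host from netloc directly so its case is preserved.
    bucket = parts.netloc.rpartition("@")[2]
    return parts.scheme, conn_id, bucket, parts.path


print(split_state_store_path("s3://conn_id@mybucket/task-state/"))
print(split_state_store_path("file:///var/airflow/task-state/"))
```

For the file:// URL there is no user part, so the connection ID comes back as None and the whole location lives in the path component.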

Task state is stored under <dag_id>/<run_id>/<task_id>/<map_index>/<key> and asset state under assets/<asset_identifier>/<key> beneath the configured base path.
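The layout can be sketched as two small path builders. This is a minimal illustration of the documented layout only: the function names are hypothetical, and how the backend escapes special characters in run IDs or asset identifiers is not covered here.

```python
def task_state_path(base: str, dag_id: str, run_id: str, task_id: str,
                    map_index: int, key: str) -> str:
    """Build <base>/<dag_id>/<run_id>/<task_id>/<map_index>/<key>."""
    # Strip a trailing slash from the base so segments join with one "/".
    return "/".join([base.rstrip("/"), dag_id, run_id, task_id, str(map_index), key])


def asset_state_path(base: str, asset_identifier: str, key: str) -> str:
    """Build <base>/assets/<asset_identifier>/<key>."""
    return "/".join([base.rstrip("/"), "assets", asset_identifier, key])


base = "s3://conn_id@mybucket/task-state/"
# Example values below are made up; -1 is used here as the map index of an
# unmapped task.
print(task_state_path(base, "etl", "run_1", "load", -1, "job_id"))
print(asset_state_path(base, "my_asset", "watermark"))
```

None of the path segments depends on the attempt number, so a retried task resolves to the same location as its earlier attempt.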

By default (state_store_objectstorage_threshold = 0), all serialized values are offloaded to object storage. Set state_store_objectstorage_threshold to a positive number of bytes to offload only values whose serialized size meets or exceeds the threshold; anything smaller is stored in the Airflow metadata database.
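The threshold rule amounts to a single size comparison. The sketch below restates it in Python under the assumption that the comparison is made on the length of the serialized bytes; the function name choose_destination is hypothetical and is not part of the provider.

```python
def choose_destination(serialized: bytes, threshold: int = 0) -> str:
    """Return where a value would be stored under the documented rule."""
    # "Meets or exceeds": a value exactly at the threshold is offloaded.
    # With threshold = 0 every value qualifies, including an empty one.
    if len(serialized) >= threshold:
        return "object storage"
    return "metadata database"


one_mib = 1048576
print(choose_destination(b"small"))                      # default threshold of 0
print(choose_destination(b"small", threshold=one_mib))
print(choose_destination(b"x" * one_mib, threshold=one_mib))
```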

Optionally set state_store_objectstorage_compression to an fsspec-supported compression algorithm such as gzip or snappy to compress values before writing.

The following example stores all task and asset state in S3, compressed with gzip:

[workers]
state_store_backend = airflow.providers.common.io.state_store.backend.StateStoreObjectStorageBackend

[common.io]
state_store_objectstorage_path = s3://conn_id@mybucket/task-state/
state_store_objectstorage_compression = gzip

To offload only values of 1 MiB (1048576 bytes) or larger:

[workers]
state_store_backend = airflow.providers.common.io.state_store.backend.StateStoreObjectStorageBackend

[common.io]
state_store_objectstorage_path = s3://conn_id@mybucket/task-state/
state_store_objectstorage_threshold = 1048576

Using the local filesystem (useful for development):

[workers]
state_store_backend = airflow.providers.common.io.state_store.backend.StateStoreObjectStorageBackend

[common.io]
state_store_objectstorage_path = file:///var/airflow/task-state/

Note

Compression requires the relevant library to be installed in your Python environment. For example, snappy requires python-snappy. gzip and bz2 work out of the box because both ship with the Python standard library.
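One way to check an environment before enabling compression is to test whether the backing module can be imported. The sketch below is an assumption-laden illustration: the REQUIRED_MODULE mapping and the function name are hypothetical, and it relies on python-snappy installing a module named snappy.

```python
import importlib.util

# Hypothetical mapping from compression name to the module it needs.
REQUIRED_MODULE = {
    "gzip": "gzip",      # standard library
    "bz2": "bz2",        # standard library
    "snappy": "snappy",  # provided by the python-snappy package
}


def compression_available(name: str) -> bool:
    """Return True if the module backing this compression can be located."""
    module = REQUIRED_MODULE.get(name)
    if module is None:
        return False
    # find_spec locates the module without importing it.
    return importlib.util.find_spec(module) is not None


for name in ("gzip", "bz2", "snappy"):
    print(name, compression_available(name))
```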

Note

expires_at is not enforced by this backend. Values written to object storage persist indefinitely until explicitly deleted. Use your object storage provider’s lifecycle policies (e.g. S3 lifecycle rules, GCS object lifecycle) to automatically expire old state.
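As an illustration of the lifecycle approach on S3, the sketch below builds an expiration rule for the task-state/ prefix used in the examples on this page. The rule ID, the 30-day retention, and the helper names are assumptions chosen for the example; applying the rule needs boto3 and suitable bucket permissions.

```python
def build_expiration_rule(prefix: str, days: int) -> dict:
    """Build an S3 lifecycle configuration that expires objects under a prefix."""
    return {
        "Rules": [
            {
                "ID": "expire-airflow-state",   # hypothetical rule name
                "Filter": {"Prefix": prefix},   # only objects under this prefix
                "Status": "Enabled",
                "Expiration": {"Days": days},   # delete objects older than this
            }
        ]
    }


def apply_rule(bucket: str, config: dict) -> None:
    """Apply the configuration with boto3 (third-party, imported lazily)."""
    import boto3

    # This call replaces the bucket's existing lifecycle configuration,
    # so merge in any rules that should be kept before calling it.
    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )


config = build_expiration_rule("task-state/", days=30)
print(config)
# apply_rule("mybucket", config)  # uncomment to apply against a real bucket
```

Pick a retention period longer than the longest interval over which a task may still need to read its state, since the backend itself will not recreate expired objects.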

Note

Task state paths are keyed on (dag_id, run_id, task_id, map_index) and are stable across task retries. This makes this backend suitable for operators that use ResumableJobMixin to reconnect to external jobs after a retry.
