Task and Asset Store Overview

Added in version 3.3.

Airflow has always modeled tasks as stateless, idempotent units of work. A growing class of workloads, however, requires some data to be persisted outside of a task’s return value, such as a submitted job ID that must survive a worker crash, a watermark that advances run by run, or a row counter exposed for observability. Task store and Asset store fill that gap without touching the XCom or Variable systems.

Task and Asset Store

Task store and Asset store are two key/value stores for persisting data such as a job ID, watermark, or row count. They differ in what they are scoped to:

| Store | Scope | Default lifetime | Primary use case |
|---|---|---|---|
| Task store | A single task instance (dag_id + run_id + task_id + map_index) | Configurable retention; cleared on task success when clear_on_success = True | Survive retries, track in-flight jobs, checkpoint progress within a run, resume progress from checkpoint set by a past run |
| Asset store | An asset (independent of any particular run) | Persists indefinitely; removed only when the asset is deactivated | Cross-run watermarks, incremental-load cursors, per-asset metadata |

Both stores accept JSON-serializable values. Values can be stored in the default metastore backend or offloaded through a custom worker-side backend.
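To make the two scopes concrete, here is a minimal in-memory sketch. It does not use Airflow's API: `TaskInstanceKey`, `ScopedStore`, and the asset name `orders_table` are hypothetical names invented for this illustration. The sketch only models how a task-instance scope differs from an asset scope and why values need to round-trip through JSON.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Hashable


@dataclass(frozen=True)
class TaskInstanceKey:
    """Hypothetical scope key: dag_id + run_id + task_id + map_index."""
    dag_id: str
    run_id: str
    task_id: str
    map_index: int = -1


@dataclass
class ScopedStore:
    """Illustrative key/value store; not an Airflow class."""
    _data: dict = field(default_factory=dict)

    def set(self, scope: Hashable, key: str, value: Any) -> None:
        # json.dumps raises TypeError for values that are not JSON-serializable.
        self._data.setdefault(scope, {})[key] = json.dumps(value)

    def get(self, scope: Hashable, key: str, default: Any = None) -> Any:
        raw = self._data.get(scope, {}).get(key)
        return default if raw is None else json.loads(raw)

    def clear(self, scope: Hashable) -> None:
        # Models the "cleared on task success" behaviour for one scope.
        self._data.pop(scope, None)


task_store = ScopedStore()
asset_store = ScopedStore()

run_1 = TaskInstanceKey("etl", "run_1", "load")
run_2 = TaskInstanceKey("etl", "run_2", "load")

# Task-scoped value: visible only to the task instance that wrote it.
task_store.set(run_1, "job_id", "job-123")
# Asset-scoped value: keyed by the asset, so no run is involved at all.
asset_store.set("orders_table", "watermark", {"last_id": 4200})

seen_by_run_2 = task_store.get(run_2, "job_id")
watermark = asset_store.get("orders_table", "watermark")

# Simulate clear_on_success = True after run_1's task succeeds.
task_store.clear(run_1)
after_success = task_store.get(run_1, "job_id")
```

In this model, `seen_by_run_2` is `None` because the key includes `run_id`, while the asset watermark is readable from anywhere that knows the asset.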

When to use Task and Asset Store

Use this table to choose the right mechanism for your use case.

| Mechanism | When to use it |
|---|---|
| XCom | Pass data between tasks within a single Dag run (e.g. the output of one task consumed by a downstream task), or read data that another Dag run has already persisted. XComs are cleared on retry, so they should NOT be used to persist a task's own state across retries or across runs. |
| Variables | Deployment-wide or installation-wide configuration that changes infrequently and is set by operators rather than by tasks themselves. |
| Task store | Data that must survive a worker crash or persist across retries within the same run. An external job ID written before a long-running job completes is a typical use case for task store. |
| Asset store | Data that must persist across asset events, or while an asset is being watched, and that is logically owned by an asset rather than a task. For example, a watermark that advances each time a file lands in an object store. |
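The job-ID case for task store can be sketched as follows. This is an illustration under stated assumptions, not Airflow code: a plain `dict` stands in for the task store of one task instance, and `submit_job` and `run_task` are hypothetical helpers. The point is the ordering: the job ID is persisted before the task starts waiting, so a retry resumes the existing job instead of submitting a duplicate.

```python
import itertools

_job_counter = itertools.count(1)
submitted = []  # every submission made to the (hypothetical) external system


def submit_job() -> str:
    """Stand-in for submitting work to an external system."""
    job_id = f"job-{next(_job_counter)}"
    submitted.append(job_id)
    return job_id


def run_task(store: dict, crash_while_waiting: bool = False) -> str:
    """One try of a task; `store` stands in for this task instance's task store."""
    job_id = store.get("job_id")
    if job_id is None:
        job_id = submit_job()
        store["job_id"] = job_id  # persist before the long wait begins
    if crash_while_waiting:
        raise RuntimeError("worker crashed while waiting for the job")
    return job_id


store = {}

# First try: the job is submitted, then the worker dies.
try:
    run_task(store, crash_while_waiting=True)
except RuntimeError:
    pass

# Retry: the stored job ID is found, so nothing is resubmitted.
result = run_task(store)
```

After the retry, `submitted` holds a single entry, which is the behaviour an XCom-based approach cannot provide because XComs are cleared on retry.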

Note

If your current implementation already leverages an XCom-based pattern successfully, there’s no need to migrate to task store. Task store is meant to solve problems that XCom was never designed for.
