airflow.timetables.base

Classes

`PartitionMapperInfo`	JSON-serializable snapshot of one asset's partition mapper attributes.
`DataInterval`	A data interval for a DagRun to operate over.
`TimeRestriction`	Restriction on when a DAG can be scheduled for a run.
`DagRunInfo`	Information to schedule a DagRun.
`Timetable`	Protocol that all Timetable classes are expected to implement.

Functions

compute_rollup_fingerprint(timetable)

Return the rollup-definition fingerprint for timetable.

Module Contents

class airflow.timetables.base.PartitionMapperInfo[source]

Bases: TypedDict

JSON-serializable snapshot of one asset’s partition mapper attributes.

Stored as DagModel.partition_mapper_info (a list of these) so the UI can resolve mapper attributes without deserializing the timetable on each request. Either name, uri, or both identify the asset; Asset.ref(name=...) omits uri and Asset.ref(uri=...) omits name.

is_rollup: bool[source]

name: NotRequired[str][source]

uri: NotRequired[str][source]
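
The shape described above can be illustrated with a small stand-in. This is a minimal sketch, not Airflow's own class: `MapperInfoSketch` and the sample asset values are hypothetical, and `total=False` on a subclass is used here in place of `NotRequired` so that the snippet runs on older Python versions.

```python
import json
from typing import TypedDict


class _MapperInfoBase(TypedDict):
    # Always present in every entry.
    is_rollup: bool


class MapperInfoSketch(_MapperInfoBase, total=False):
    # Hypothetical stand-in for PartitionMapperInfo.
    # total=False makes these keys optional, like NotRequired[str].
    name: str
    uri: str


# One entry per asset: the first is identified by name only,
# the second by uri only (sample values are made up).
infos = [
    MapperInfoSketch(is_rollup=False, name="orders"),
    MapperInfoSketch(is_rollup=True, uri="s3://bucket/events"),
]

# A TypedDict is a plain dict at runtime, so it survives a JSON round trip.
encoded = json.dumps(infos)
decoded = json.loads(encoded)
```

Because each entry is an ordinary dict, a list of them can be stored and read back without touching the timetable itself.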

class airflow.timetables.base.DataInterval[source]

Bases: NamedTuple

A data interval for a DagRun to operate over.

Both start and end MUST be “aware”, i.e. contain timezone information.

start: pendulum.DateTime[source]

end: pendulum.DateTime[source]

classmethod exact(at)[source]

Represent an “interval” containing only an exact time.
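
A minimal sketch of what a zero-length interval looks like, assuming `exact` simply uses the same instant for both ends. `DataIntervalSketch` is a hypothetical stand-in built on the standard library `datetime` rather than `pendulum.DateTime`.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class DataIntervalSketch(NamedTuple):
    # Hypothetical stand-in for DataInterval; both fields must be aware.
    start: datetime
    end: datetime

    @classmethod
    def exact(cls, at: datetime) -> "DataIntervalSketch":
        # Assumption: an "exact" interval starts and ends at the same instant.
        return cls(start=at, end=at)


at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
point = DataIntervalSketch.exact(at)
```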

class airflow.timetables.base.TimeRestriction[source]

Bases: NamedTuple

Restriction on when a DAG can be scheduled for a run.

Specifically, the run must not be earlier than earliest, nor later than latest. If catchup is False, the run must also not be earlier than the current time, i.e. “missed” schedules are not backfilled.

These values are generally set on the DAG or task’s start_date, end_date, and catchup arguments.

Both earliest and latest, if not None, are inclusive; a DAG run can happen exactly at either point of time. They are guaranteed to be aware (i.e. contain timezone information) for TimeRestriction instances created by Airflow.
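
The rules above can be sketched as a single predicate. This is an illustration only: `TimeRestrictionSketch` and `is_run_allowed` are hypothetical names, and the helper is not part of Airflow.

```python
from datetime import datetime, timezone
from typing import NamedTuple, Optional


class TimeRestrictionSketch(NamedTuple):
    # Hypothetical stand-in for TimeRestriction.
    earliest: Optional[datetime]
    latest: Optional[datetime]
    catchup: bool


def is_run_allowed(run_at: datetime, restriction: TimeRestrictionSketch, now: datetime) -> bool:
    """Return True if a run at run_at satisfies the restriction."""
    # Both bounds are inclusive, so only strictly-outside values are rejected.
    if restriction.earliest is not None and run_at < restriction.earliest:
        return False
    if restriction.latest is not None and run_at > restriction.latest:
        return False
    # Without catchup, "missed" schedules in the past are not backfilled.
    if not restriction.catchup and run_at < now:
        return False
    return True


utc = timezone.utc
now = datetime(2024, 6, 1, tzinfo=utc)
with_catchup = TimeRestrictionSketch(
    earliest=datetime(2024, 1, 1, tzinfo=utc),
    latest=datetime(2024, 12, 31, tzinfo=utc),
    catchup=True,
)
no_catchup = with_catchup._replace(catchup=False)
```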

earliest: pendulum.DateTime | None[source]

latest: pendulum.DateTime | None[source]

catchup: bool[source]

class airflow.timetables.base.DagRunInfo[source]

Bases: NamedTuple

Information to schedule a DagRun.

Instances of this will be returned by timetables when they are asked to schedule a DagRun creation.

run_after: pendulum.DateTime[source]

The earliest time this DagRun is created and its tasks scheduled.

This MUST be “aware”, i.e. contain timezone information.

data_interval: DataInterval | None[source]

The data interval for this DagRun to operate over.

partition_date: pendulum.DateTime | None = None[source]

partition_key: str | None = None[source]

classmethod exact(at)[source]

Represent a run on an exact time.

classmethod interval(start, end)[source]

Represent a run on a continuous schedule.

In such a schedule, each data interval starts right after the previous one ends, and each run is scheduled right after the interval ends. This applies to all schedules prior to AIP-39 except @once and None.
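
A minimal sketch of the two constructors, using hypothetical stand-in types. It assumes that `exact` uses one instant for the run time and for both interval ends, and that `interval` sets `run_after` to the interval's end, as the description of a continuous schedule suggests.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple, Optional


class DataIntervalSketch(NamedTuple):
    start: datetime
    end: datetime


class DagRunInfoSketch(NamedTuple):
    # Hypothetical stand-in for DagRunInfo (partition fields omitted).
    run_after: datetime
    data_interval: Optional[DataIntervalSketch]

    @classmethod
    def exact(cls, at: datetime) -> "DagRunInfoSketch":
        # Assumption: the run time and both interval ends are the same instant.
        return cls(run_after=at, data_interval=DataIntervalSketch(at, at))

    @classmethod
    def interval(cls, start: datetime, end: datetime) -> "DagRunInfoSketch":
        # The run is scheduled right after its data interval ends.
        return cls(run_after=end, data_interval=DataIntervalSketch(start, end))


day = timedelta(days=1)
midnight = datetime(2024, 1, 1, tzinfo=timezone.utc)

# Two consecutive runs on a continuous daily schedule:
# the second interval starts exactly where the first one ends.
first = DagRunInfoSketch.interval(midnight, midnight + day)
second = DagRunInfoSketch.interval(first.data_interval.end, first.data_interval.end + day)
once = DagRunInfoSketch.exact(midnight)
```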

property logical_date: pendulum.DateTime | None[source]

Infer the logical date to represent a DagRun.

This replaces execution_date in Airflow 2.1 and prior. The idea is essentially the same, just a different name.

class airflow.timetables.base.Timetable[source]

Bases: Protocol

Protocol that all Timetable classes are expected to implement.

description: str = ''[source]

Human-readable description of the timetable.

For example, this can produce something like 'At 21:30, only on Friday' from the cron expression '30 21 * * 5'. This is used in the webserver UI.

periodic: bool = True[source]

Whether this timetable runs periodically.

This defaults to and should generally be True, but some special setups like schedule=None and "@once" set it to False.

can_be_scheduled: bool = True[source]

Whether this timetable can actually schedule runs in an automated manner.

This defaults to and should generally be True (including non-periodic execution types like @once and data-triggered timetables), but NullTimetable sets this to False.

run_ordering: collections.abc.Sequence[str] = ('data_interval_end', 'logical_date')[source]

How runs triggered from this timetable should be ordered in UI.

This should be a list of field names on the DAG run object.

active_runs_limit: int | None = None[source]

Maximum number of runs that can be active at one time for a DAG.

This is called during DAG initialization, and the return value is used as the DAG’s default max_active_runs. This should generally return None, but there are good reasons to limit DAG run parallelism in some cases, such as for ContinuousTimetable.

asset_condition: airflow.serialization.definitions.assets.SerializedAssetBase[source]

The asset condition that triggers a DAG using this timetable.

partitioned: bool = False[source]

Whether this timetable considers asset partitions.

This is True for timetables that switch scheduling to use partitions instead of the traditional logic based on logical dates and data intervals.

partitioned_at_runtime: bool = False[source]

Whether this timetable defers partition selection to task runtime.

True for PartitionedAtRuntime; downstream code can branch on this flag instead of using isinstance.

get_partition_mapper(*, name='', uri='')[source]

Return the partition mapper for the asset identified by name or uri.

Only called by the scheduler when partitioned is True. The default implementation raises NotImplementedError; timetables that set partitioned = True must override this.

iter_partition_dagrun_infos(*, earliest, latest)[source]

Yield one DagRunInfo per partition whose partition_date lies in [earliest, latest] (both inclusive).

The iteration granularity follows the timetable’s own partition cadence (e.g. one tick per hour for CronPartitionTimetable("0 * * * *")), so a sub-day window yields only the partitions inside it rather than every partition of the surrounding calendar day.

Only called for partitioned timetables (partitioned is True). The default implementation raises NotImplementedError; timetables that set partitioned = True must override this.
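
The window semantics can be sketched for an hourly cadence as follows. This is an illustration under stated assumptions: `iter_hourly_partition_dates` is a hypothetical helper that yields bare datetimes rather than DagRunInfo objects, and it hard-codes a top-of-the-hour tick instead of evaluating a cron expression.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator


def iter_hourly_partition_dates(earliest: datetime, latest: datetime) -> Iterator[datetime]:
    """Yield each top-of-the-hour tick inside [earliest, latest], both inclusive."""
    # Align to the first tick at or after the lower bound.
    tick = earliest.replace(minute=0, second=0, microsecond=0)
    if tick < earliest:
        tick += timedelta(hours=1)
    # The upper bound is inclusive, so a tick equal to latest is still yielded.
    while tick <= latest:
        yield tick
        tick += timedelta(hours=1)


utc = timezone.utc
window_start = datetime(2024, 1, 1, 9, 30, tzinfo=utc)
window_end = datetime(2024, 1, 1, 12, 0, tzinfo=utc)
ticks = list(iter_hourly_partition_dates(window_start, window_end))
```

For this sub-day window the sketch yields only the ticks at 10:00, 11:00, and 12:00, not every hour of the surrounding day.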

localize_partition_datetime(dt)[source]

Re-interpret dt’s wall-clock reading as a moment in this timetable’s timezone.

The base implementation treats the timetable as UTC: the wall-clock is kept as-is and the result is simply a timezone-aware UTC instant (a no-op for already-UTC inputs). Timetables with a local timezone (e.g. CronMixin subclasses) override this to re-localize the wall-clock to their own timezone before converting to UTC, preserving sub-day precision for narrow windows on sub-daily schedules.

Used by apply_partition_date_window() to convert user-supplied partition_date filter bounds without truncating the time component.
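
The difference between the base behavior and a timezone-aware override can be sketched with the standard library. Both functions below are hypothetical illustrations, not Airflow's implementation, and a fixed UTC+9 offset stands in for a real local timezone.

```python
from datetime import datetime, timedelta, timezone


def localize_as_utc(dt: datetime) -> datetime:
    # Base behavior as described: keep the wall-clock reading, label it UTC.
    return dt.replace(tzinfo=timezone.utc)


def localize_in_zone(dt: datetime, tz: timezone) -> datetime:
    # Override behavior as described: read the wall clock in the timetable's
    # own timezone, then convert the resulting instant to UTC.
    return dt.replace(tzinfo=tz).astimezone(timezone.utc)


wall_clock = datetime(2024, 3, 10, 9, 30)  # user-supplied filter bound
plus_nine = timezone(timedelta(hours=9))   # stand-in for a local timezone

as_utc = localize_as_utc(wall_clock)
as_local = localize_in_zone(wall_clock, plus_nine)
```

The same wall-clock reading maps to 09:30 UTC in the first case and 00:30 UTC in the second, and in both cases the minutes are preserved.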

resolve_partition_date(partition_key)[source]

Decode partition_key into the period-start datetime it represents.

Returns the timezone-aware datetime that was used to format partition_key when the timetable originally created the run, or None when no temporal meaning can be derived from the key. None is returned without decoding when partition_key is None, when this timetable is not partitioned, or when it defers partition selection to runtime (partitioned_at_runtime).

Partitioned timetables whose keys carry a temporal structure override _decode_partition_date():

CronPartitionTimetable parses the key with strptime using its key_format and localizes with its timezone.
PartitionedAssetTimetable delegates to each asset’s partition mapper; when the mappers agree on the same instant it is returned, otherwise None is returned.

Parameters:

partition_key (str | None) – The partition key string to decode, or None.

Returns:

The period-start datetime, or None if not resolvable.

Raises:

InvalidPartitionKeyError – When partition_key is syntactically invalid for this timetable’s key format (e.g. strptime fails).

Return type:

datetime.datetime | None
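
A minimal sketch of the strptime-based decoding described for cron-style keys. Everything here is a stand-in: `resolve_partition_date_sketch`, `InvalidKeySketch`, and the `"%Y-%m-%dT%H"` key format are assumptions made for illustration, not Airflow's actual names or defaults.

```python
from datetime import datetime, timezone
from typing import Optional


class InvalidKeySketch(ValueError):
    """Hypothetical stand-in for InvalidPartitionKeyError."""


def resolve_partition_date_sketch(
    partition_key: Optional[str],
    *,
    key_format: str = "%Y-%m-%dT%H",  # assumed format, for illustration only
    partitioned: bool = True,
    tz: timezone = timezone.utc,
) -> Optional[datetime]:
    # No key, or a non-partitioned timetable: nothing to decode.
    if partition_key is None or not partitioned:
        return None
    try:
        naive = datetime.strptime(partition_key, key_format)
    except ValueError as err:
        # A key that does not match the format is reported, not swallowed.
        raise InvalidKeySketch(f"cannot parse {partition_key!r}") from err
    # Attach the timetable's timezone so the result is aware.
    return naive.replace(tzinfo=tz)


resolved = resolve_partition_date_sketch("2024-01-01T05")
```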

property partition_mapper_info: list[PartitionMapperInfo][source]

JSON-serializable per-asset partition mapper attributes.

Empty list for timetables without asset-level partition mappers (the default, including non-partitioned timetables and cron-driven partitioned timetables). Asset-driven partitioned timetables override this with one entry per asset (or asset ref) — see PartitionMapperInfo.

classmethod deserialize(data)[source]

Deserialize a timetable from data.

This is called when a serialized DAG is deserialized. data will be whatever was returned by serialize during DAG serialization. The default implementation constructs the timetable without any arguments.

serialize()[source]

Serialize the timetable for JSON encoding.

This is called during DAG serialization to store timetable information in the database. This should return a JSON-serializable dict that will be fed into deserialize when the DAG is deserialized. The default implementation returns an empty dict.
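
The serialize and deserialize contract can be sketched as a round trip. `EveryNHoursSketch` is a hypothetical class that does not implement the rest of the Timetable protocol; it only shows that whatever `serialize` returns must be JSON-serializable and sufficient for `deserialize` to rebuild the object.

```python
import json
from typing import Any, Dict


class EveryNHoursSketch:
    """Hypothetical timetable-like class with one constructor argument."""

    def __init__(self, hours: int = 1) -> None:
        self.hours = hours

    def serialize(self) -> Dict[str, Any]:
        # Return only JSON-serializable values.
        return {"hours": self.hours}

    @classmethod
    def deserialize(cls, data: Dict[str, Any]) -> "EveryNHoursSketch":
        # data is whatever serialize() returned earlier.
        return cls(hours=data["hours"])


original = EveryNHoursSketch(hours=6)
stored = json.dumps(original.serialize())  # e.g. written to the database
restored = EveryNHoursSketch.deserialize(json.loads(stored))
```

A timetable with constructor arguments needs to override both methods, since the default implementations described above carry no state.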

validate()[source]

Validate the timetable is correctly specified.

Override this method to provide run-time validation, performed when a DAG is put into a dagbag. The default implementation does nothing.

Raises:

AirflowTimetableInvalid – On validation failure.

property summary: str[source]

A short summary for the timetable.

This is used to display the timetable in the web UI. A cron expression timetable, for example, can use this to display the expression. The default implementation returns the timetable’s type name.

property type_name: str[source]

This is primarily intended for filtering dags based on timetable type.

For built-in timetables (defined in airflow.timetables or airflow.sdk.definitions.timetables), this returns the class name only. For custom timetables (user-defined via plugins), this returns the full import path to avoid confusion between multiple implementations with the same class name.

For example, built-in timetables return: "NullTimetable" or "CronDataIntervalTimetable" while custom timetables return the full path: "my_company.timetables.CustomTimetable"
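
The naming rule can be sketched as a function of a class's module and name. This is an assumption about how such a rule could be expressed, not Airflow's implementation; `type_name_sketch` and `BUILTIN_PACKAGES` are hypothetical names.

```python
BUILTIN_PACKAGES = ("airflow.timetables", "airflow.sdk.definitions.timetables")


def type_name_sketch(module: str, class_name: str) -> str:
    """Return the bare class name for built-ins, the full path otherwise."""
    for package in BUILTIN_PACKAGES:
        # Match the package itself or any module nested under it.
        if module == package or module.startswith(package + "."):
            return class_name
    # Custom timetables keep their import path to avoid name clashes.
    return f"{module}.{class_name}"


builtin_name = type_name_sketch("airflow.timetables.simple", "NullTimetable")
custom_name = type_name_sketch("my_company.timetables", "CustomTimetable")
```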

abstract infer_manual_data_interval(*, run_after)[source]

When a DAG run is manually triggered, infer a data interval for it.

This is used for e.g. manually-triggered runs, where run_after would be when the user triggers the run. The default implementation raises NotImplementedError.

abstract next_dagrun_info(*, last_automated_data_interval, restriction)[source]

Provide information to schedule the next DagRun.

The default implementation raises NotImplementedError.

Parameters:

last_automated_data_interval (DataInterval | None) – The data interval of the associated DAG’s last scheduled or backfilled run (manual runs not considered). This is only None when the Dag is being scheduled for the first time, which happens when the Dag processor first parses the Dag – before any Dag run exists.
restriction (TimeRestriction) – Restriction to apply when scheduling the DAG run. See documentation of TimeRestriction for details.

Returns:

Information on when the next DagRun can be scheduled. None means a DagRun will not happen. This does not mean no more runs will be scheduled ever again for this DAG; the timetable can return a DagRunInfo object when asked at another time.

Return type:

DagRunInfo | None
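
A minimal sketch of how an implementation might use its two arguments, for a fixed daily cadence. All types are hypothetical stand-ins built on the standard library, the function is written at module level rather than as a method, and the `catchup` flag is carried but deliberately ignored to keep the example short.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple, Optional


class DataIntervalSketch(NamedTuple):
    start: datetime
    end: datetime


class TimeRestrictionSketch(NamedTuple):
    earliest: Optional[datetime]
    latest: Optional[datetime]
    catchup: bool  # not used in this simplified sketch


class DagRunInfoSketch(NamedTuple):
    run_after: datetime
    data_interval: DataIntervalSketch


STEP = timedelta(days=1)


def next_dagrun_info_sketch(
    *,
    last_automated_data_interval: Optional[DataIntervalSketch],
    restriction: TimeRestrictionSketch,
) -> Optional[DagRunInfoSketch]:
    if last_automated_data_interval is not None:
        # Continue right where the previous automated run ended.
        start = last_automated_data_interval.end
    elif restriction.earliest is None:
        # First scheduling attempt without a start date: nothing to schedule.
        return None
    else:
        start = restriction.earliest
    # latest is inclusive, so only a start strictly past it stops scheduling.
    if restriction.latest is not None and start > restriction.latest:
        return None
    end = start + STEP
    return DagRunInfoSketch(run_after=end, data_interval=DataIntervalSketch(start, end))


utc = timezone.utc
jan1 = datetime(2024, 1, 1, tzinfo=utc)
open_ended = TimeRestrictionSketch(earliest=jan1, latest=None, catchup=True)
first = next_dagrun_info_sketch(last_automated_data_interval=None, restriction=open_ended)
second = next_dagrun_info_sketch(
    last_automated_data_interval=first.data_interval, restriction=open_ended
)
```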

generate_run_id(*, run_type, run_after, data_interval, **extra)[source]

Generate a unique run ID.

Parameters:

run_type (airflow.utils.types.DagRunType) – The type of DAG run.
run_after (pendulum.DateTime) – The datetime before which the Dag cannot run.
data_interval (DataInterval | None) – The data interval of the DAG run.

next_dagrun_info_v2(*, last_dagrun_info, restriction)[source]

Provide information to schedule the next DagRun.

The default implementation raises NotImplementedError.

Parameters:

last_dagrun_info (DagRunInfo | None) – The DagRunInfo object of the Dag’s last scheduled or backfilled run.
restriction (TimeRestriction) – Restriction to apply when scheduling the Dag run. See documentation of TimeRestriction for details.

Returns:

Information on when the next DagRun can be scheduled. None means a DagRun should not be created. This does not mean no more runs will be scheduled ever again for this Dag; the timetable can return a DagRunInfo object when asked at another time.

Return type:

DagRunInfo | None

next_run_info_from_dag_model(*, dag_model)[source]

run_info_from_dag_run(*, dag_run)[source]

airflow.timetables.base.compute_rollup_fingerprint(timetable)[source]

Return the rollup-definition fingerprint for timetable.

The fingerprint is a dict[str, Any] mapping "{name}|{uri}" to the JSON-encoded partition mapper for each partitioned asset reachable from the timetable’s asset_condition. Keys are inserted in sorted order so the dict is stable across Python runs.

Non-partitioned timetables (timetable.partitioned is False) return an empty dict. The scheduler stamps this on AssetPartitionDagRun at creation time and compares it on the next tick; only mapper / window changes trigger cleanup of a stale partition Dag run, leaving unrelated Dag edits untouched.

Both the creation side (assets/manager.py) and the cleanup side (jobs/scheduler_job_runner.py) call this helper to guarantee the two fingerprints are computed by identical logic.
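
The shape of the fingerprint and the comparison it enables can be sketched as follows. This is an illustration under stated assumptions: `fingerprint_sketch` is a hypothetical function that takes already-extracted `(name, uri, mapper)` tuples instead of walking a timetable's asset_condition, and the mapper dictionaries are made-up sample data.

```python
import json
from typing import Any, Dict, List, Tuple


def fingerprint_sketch(mappers: List[Tuple[str, str, Dict[str, Any]]]) -> Dict[str, Any]:
    """Map "{name}|{uri}" to a JSON-encoded mapper, with keys in sorted order."""
    entries = {
        f"{name}|{uri}": json.dumps(mapper, sort_keys=True)
        for name, uri, mapper in mappers
    }
    # Insert keys in sorted order so the result is stable across runs.
    return {key: entries[key] for key in sorted(entries)}


daily = {"type": "daily", "window": 1}   # sample mapper data
weekly = {"type": "weekly", "window": 7}

stamped = fingerprint_sketch([("orders", "s3://b/orders", daily), ("events", "", weekly)])
same_later = fingerprint_sketch([("events", "", weekly), ("orders", "s3://b/orders", daily)])
changed = fingerprint_sketch([("orders", "s3://b/orders", weekly), ("events", "", weekly)])
```

Comparing `stamped` with `same_later` reports no change even though the assets were listed in a different order, while `changed` differs because one mapper definition was edited.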