airflow.timetables.trigger

Attributes

log

Classes

DeltaTriggerTimetable

Timetable that triggers DAG runs according to a time delta.

CronTriggerTimetable

Timetable that triggers DAG runs according to a cron expression.

MultipleCronTriggerTimetable

Timetable that triggers Dag runs according to multiple cron expressions.

CronPartitionTimetable

Timetable that triggers Dag runs according to a cron expression.

Module Contents

airflow.timetables.trigger.log[source]
class airflow.timetables.trigger.DeltaTriggerTimetable(delta, *, interval=datetime.timedelta())[source]

Bases: airflow.timetables._delta.DeltaMixin, _TriggerTimetable

Timetable that triggers DAG runs according to a time delta.

This is different from DeltaDataIntervalTimetable, where the delta value specifies the data interval of a DAG run. With this timetable, the data intervals are specified independently. Also for the same reason, this timetable kicks off a DAG run immediately at the start of the period, instead of needing to wait for one data interval to pass.

Parameters:
  • delta (datetime.timedelta | dateutil.relativedelta.relativedelta) – How much time to wait between each run.

  • interval (datetime.timedelta | dateutil.relativedelta.relativedelta) – The data interval of each run. Default is 0.
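The scheduling rule can be sketched with plain datetime arithmetic (an illustrative re-implementation, not the actual Airflow code): each run fires `delta` after the previous one, and its data interval is the span of length `interval` ending at the run date.

```python
from datetime import datetime, timedelta

def next_delta_run(last_run: datetime, delta: timedelta, interval: timedelta):
    """Sketch of DeltaTriggerTimetable's rule: the next run fires `delta`
    after the previous one, with a data interval of length `interval`
    ending at the run date."""
    run_date = last_run + delta
    data_interval = (run_date - interval, run_date)
    return run_date, data_interval

# A daily trigger whose data interval covers the hour before each run.
run, (start, end) = next_delta_run(
    datetime(2024, 1, 1), delta=timedelta(days=1), interval=timedelta(hours=1)
)
# run == 2024-01-02 00:00, interval == (2024-01-01 23:00, 2024-01-02 00:00)
```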

classmethod deserialize(data)[source]

Deserialize a timetable from data.

This is called when a serialized DAG is deserialized. data will be whatever was returned by serialize during DAG serialization. The default implementation constructs the timetable without any arguments.

serialize()[source]

Serialize the timetable for JSON encoding.

This is called during DAG serialization to store timetable information in the database. This should return a JSON-serializable dict that will be fed into deserialize when the DAG is deserialized. The default implementation returns an empty dict.
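The serialize/deserialize contract can be sketched with a hypothetical delta-based timetable (the class and dict keys here are illustrative, not Airflow's internal format): `serialize` must emit a JSON-serializable dict, and `deserialize` receives exactly that dict back.

```python
from datetime import timedelta

class SketchTimetable:
    """Illustration of the serialize/deserialize round-trip contract."""

    def __init__(self, delta: timedelta):
        self.delta = delta

    def serialize(self) -> dict:
        # Must return a JSON-serializable dict.
        return {"delta": self.delta.total_seconds()}

    @classmethod
    def deserialize(cls, data: dict) -> "SketchTimetable":
        # Receives exactly what serialize() returned during DAG serialization.
        return cls(delta=timedelta(seconds=data["delta"]))

restored = SketchTimetable.deserialize(SketchTimetable(timedelta(days=1)).serialize())
# restored.delta == timedelta(days=1)
```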

class airflow.timetables.trigger.CronTriggerTimetable(cron, *, timezone, interval=datetime.timedelta(), run_immediately=False)[source]

Bases: airflow.timetables._cron.CronMixin, _TriggerTimetable

Timetable that triggers DAG runs according to a cron expression.

This is different from CronDataIntervalTimetable, where the cron expression specifies the data interval of a DAG run. With this timetable, the data intervals are specified independently from the cron expression. Also for the same reason, this timetable kicks off a DAG run immediately at the start of the period (similar to POSIX cron), instead of needing to wait for one data interval to pass.

Don’t pass @once in here; use OnceTimetable instead.

Parameters:
  • cron (str) – cron string that defines when to run

  • timezone (str | pendulum.tz.timezone.Timezone | pendulum.tz.timezone.FixedTimezone) – Which timezone to use to interpret the cron string

  • interval (datetime.timedelta | dateutil.relativedelta.relativedelta) – timedelta that defines the data interval start. Default 0.

run_immediately controls, if no start_time is given to the DAG, when the first run of the DAG should be scheduled. It has no effect if there already exist runs for this DAG.

  • If True, immediately run the most recent possible DAG run.

  • If False, wait to run until the next scheduled time in the future.

  • If passed a timedelta, will run the most recent possible DAG run if that run’s data_interval_end is within timedelta of now.

  • If None, the timedelta is calculated as 10% of the time between the most recent past scheduled time and the next scheduled time. E.g. if running every hour, this would run the previous time if less than 6 minutes had passed since the previous run time; otherwise it would wait until the next hour.
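The run_immediately=None heuristic can be illustrated with plain datetime arithmetic (a sketch of the documented 10% rule, not the actual Airflow code):

```python
from datetime import datetime, timedelta

def should_run_previous(now: datetime, prev_scheduled: datetime,
                        next_scheduled: datetime) -> bool:
    """Sketch of run_immediately=None: run the previous scheduled time
    if less than 10% of the schedule period has passed since it."""
    threshold = (next_scheduled - prev_scheduled) * 0.1
    return now - prev_scheduled < threshold

# Hourly schedule -> the threshold works out to 6 minutes.
prev_t = datetime(2024, 1, 1, 12, 0)
next_t = datetime(2024, 1, 1, 13, 0)
print(should_run_previous(datetime(2024, 1, 1, 12, 5), prev_t, next_t))   # True
print(should_run_previous(datetime(2024, 1, 1, 12, 10), prev_t, next_t))  # False
```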

classmethod deserialize(data)[source]

Deserialize a timetable from data.

This is called when a serialized DAG is deserialized. data will be whatever was returned by serialize during DAG serialization. The default implementation constructs the timetable without any arguments.

serialize()[source]

Serialize the timetable for JSON encoding.

This is called during DAG serialization to store timetable information in the database. This should return a JSON-serializable dict that will be fed into deserialize when the DAG is deserialized. The default implementation returns an empty dict.

class airflow.timetables.trigger.MultipleCronTriggerTimetable(*crons, timezone, interval=datetime.timedelta(), run_immediately=False)[source]

Bases: airflow.timetables.base.Timetable

Timetable that triggers Dag runs according to multiple cron expressions.

This combines multiple CronTriggerTimetable instances underneath, and triggers a Dag run whenever one of the timetables wants to trigger a run.

At most one run is triggered for any given time, even if more than one timetable fires at that time.
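The combination rule can be sketched as follows (an illustrative reduction, not the actual implementation): each underlying timetable proposes its next fire time, the earliest wins, and identical times collapse into a single run.

```python
from datetime import datetime

def next_combined_run(candidates: list[datetime]) -> datetime:
    """Sketch of MultipleCronTriggerTimetable's merge rule: take the
    earliest candidate across the underlying timetables; duplicate
    times yield only one run."""
    return min(set(candidates))

# Two crons both fire at 12:00 and a third at 12:30 -> one run at 12:00.
run = next_combined_run([
    datetime(2024, 1, 1, 12, 0),
    datetime(2024, 1, 1, 12, 0),
    datetime(2024, 1, 1, 12, 30),
])
```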

description = ''[source]

Human-readable description of the timetable.

For example, this can produce something like 'At 21:30, only on Friday' from the cron expression '30 21 * * 5'. This is used in the webserver UI.

classmethod deserialize(data)[source]

Deserialize a timetable from data.

This is called when a serialized DAG is deserialized. data will be whatever was returned by serialize during DAG serialization. The default implementation constructs the timetable without any arguments.

serialize()[source]

Serialize the timetable for JSON encoding.

This is called during DAG serialization to store timetable information in the database. This should return a JSON-serializable dict that will be fed into deserialize when the DAG is deserialized. The default implementation returns an empty dict.

property summary: str[source]

A short summary for the timetable.

This is used to display the timetable in the web UI. A cron expression timetable, for example, can use this to display the expression. The default implementation returns the timetable’s type name.

infer_manual_data_interval(*, run_after)[source]

When a DAG run is manually triggered, infer a data interval for it.

This is used for e.g. manually-triggered runs, where run_after would be when the user triggers the run. The default implementation raises NotImplementedError.

next_dagrun_info(*, last_automated_data_interval, restriction)[source]

Provide information to schedule the next DagRun.

The default implementation raises NotImplementedError.

Parameters:
  • last_automated_data_interval (airflow.timetables.base.DataInterval | None) – The data interval of the associated DAG's last scheduled or backfilled run (manual runs not considered).

  • restriction (TimeRestriction) – Restriction to apply when scheduling the DAG run.

Returns:

Information on when the next DagRun can be scheduled. None means a DagRun will not happen. This does not mean no more runs will be scheduled ever again for this DAG; the timetable can return a DagRunInfo object when asked at another time.

Return type:

airflow.timetables.base.DagRunInfo | None

class airflow.timetables.trigger.CronPartitionTimetable(cron, *, timezone, run_offset=None, run_immediately=False, key_format='%Y-%m-%dT%H:%M:%S')[source]

Bases: CronTriggerTimetable

Timetable that triggers Dag runs according to a cron expression.

Creates runs for partition keys.

The cron expression determines the sequence of run dates, and the partition dates are derived from those according to run_offset. The partition key is then formatted from the partition date using key_format.

A run_offset of 1 means the partition date will be one cron interval after the run date; a negative offset means the partition date will be prior to the run date.

Parameters:
  • cron (str) – cron string that defines when to run

  • timezone (str | pendulum.tz.timezone.Timezone | pendulum.tz.timezone.FixedTimezone) – Which timezone to use to interpret the cron string

  • run_offset (int | datetime.timedelta | dateutil.relativedelta.relativedelta | None) – Integer offset that determines which partition date to run for. The partition key will be derived from the partition date.

  • key_format (str) – How to translate the partition date into a string partition key.

run_immediately controls, if no start_time is given to the Dag, when the first run of the Dag should be scheduled. It has no effect if there already exist runs for this Dag.

  • If True, immediately run the most recent possible Dag run.

  • If False, wait to run until the next scheduled time in the future.

  • If passed a timedelta, will run the most recent possible Dag run if that run’s data_interval_end is within timedelta of now.

  • If None, the timedelta is calculated as 10% of the time between the most recent past scheduled time and the next scheduled time. E.g. if running every hour, this would run the previous time if less than 6 minutes had passed since the previous run time; otherwise it would wait until the next hour.

# todo: AIP-76 talk about how we can have auto-reprocessing of partitions
# todo: AIP-76 we could allow a tuple of integer + time-based
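Under the simplifying assumption of an hourly cron, the partition-key derivation can be sketched like this (illustrative only; the real timetable steps by actual cron intervals, which need not be a fixed timedelta):

```python
from datetime import datetime, timedelta

def partition_key(run_date: datetime, run_offset: int,
                  cron_interval: timedelta,
                  key_format: str = "%Y-%m-%dT%H:%M:%S") -> str:
    """Sketch: the partition date is the run date shifted by `run_offset`
    cron intervals, then formatted with `key_format`."""
    partition_date = run_date + run_offset * cron_interval
    return partition_date.strftime(key_format)

# With run_offset=-1 on an hourly cron, the 13:00 run processes the 12:00 partition.
key = partition_key(datetime(2024, 1, 1, 13, 0), run_offset=-1,
                    cron_interval=timedelta(hours=1))
# key == "2024-01-01T12:00:00"
```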

partitioned = True[source]

Whether this timetable considers asset partitions.

This is True for timetables that switch scheduling to use partitions instead of the traditional logic based on logical dates and data intervals.

classmethod deserialize(data)[source]

Deserialize a timetable from data.

This is called when a serialized DAG is deserialized. data will be whatever was returned by serialize during DAG serialization. The default implementation constructs the timetable without any arguments.

serialize()[source]

Serialize the timetable for JSON encoding.

This is called during DAG serialization to store timetable information in the database. This should return a JSON-serializable dict that will be fed into deserialize when the DAG is deserialized. The default implementation returns an empty dict.

next_dagrun_info_v2(*, last_dagrun_info, restriction)[source]

Provide information to schedule the next DagRun.

The default implementation raises NotImplementedError.

Parameters:
Returns:

Information on when the next DagRun can be scheduled. None means a DagRun should not be created. This does not mean no more runs will be scheduled ever again for this Dag; the timetable can return a DagRunInfo object when asked at another time.

Return type:

airflow.timetables.base.DagRunInfo | None

generate_run_id(*, run_type, run_after, data_interval, **extra)[source]

Generate a unique run ID.

Parameters:
  • run_type (airflow.utils.types.DagRunType) – The type of DAG run.

  • run_after (pendulum.DateTime) – The datetime before which the Dag cannot run.

  • data_interval (airflow.timetables.base.DataInterval | None) – The data interval of the DAG run.
