Executable Bundle Spec
This document specifies the on-disk format of a build artifact produced by an
Airflow native-executable SDK (Go, Rust, C++, Zig, …) and consumed by
ExecutableCoordinator at deployment time.
The goal is a single, language-agnostic bundle shape so that scheduler, worker, and UI behave identically regardless of which compiled SDK produced the DAG.
Bundle-spec version: 1.0.
Container
A bundle is the compiled executable itself, with a fixed-format footer
appended after the binary’s normal end-of-file. The executable remains
directly runnable; the footer is data that follows the last byte the OS
loader cares about and is invisible to exec(). There is no enclosing
archive.
A bundle file therefore has three regions, in order from offset 0:
1. The native executable (ELF / Mach-O / PE), including any code-signing structures the platform appends.
2. The primary DAG source file, embedded verbatim (UTF-8). MAY have length 0.
3. The build-time manifest (airflow-metadata.yaml content, UTF-8).
The file ends with a fixed 64-byte trailer that locates regions (2) and (3), carries an integrity hash of the binary region, and identifies the file as a bundle. See Trailer Layout.
Filenames follow OS conventions for executables: no extension on Linux/macOS,
.exe on Windows. The scanner identifies bundles by the trailer’s magic,
not by the filename.
The complete bundle file regions are:
[0, source_start) native binary (must be non-empty)
[source_start, metadata_start) embedded source (may be zero length)
[metadata_start, file_size-64) build-time manifest
[file_size-64, file_size) 64-byte trailer
where metadata_start = file_size - 64 - metadata_len and
source_start = metadata_start - source_len.
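As a worked instance of the arithmetic above, the following sketch derives the four byte ranges from the file size and the two lengths stored in the trailer. The function name region_bounds and the dictionary keys are illustrative only and are not defined by this spec.

```python
TRAILER_SIZE = 64  # fixed trailer length from this spec


def region_bounds(file_size: int, source_len: int, metadata_len: int) -> dict:
    """Return half-open (start, end) byte ranges for each bundle region."""
    metadata_start = file_size - TRAILER_SIZE - metadata_len
    source_start = metadata_start - source_len
    if source_start <= 0:
        # The binary region [0, source_start) must be non-empty.
        raise ValueError("lengths leave no room for a native binary")
    return {
        "binary": (0, source_start),
        "source": (source_start, metadata_start),
        "metadata": (metadata_start, file_size - TRAILER_SIZE),
        "trailer": (file_size - TRAILER_SIZE, file_size),
    }


# Hypothetical sizes: 1000-byte binary, 200-byte source, 100-byte manifest.
bounds = region_bounds(1000 + 200 + 100 + 64, source_len=200, metadata_len=100)
print(bounds["source"])    # (1000, 1200)
print(bounds["metadata"])  # (1200, 1300)
```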
Reference Implementation
The Python script below is a minimal reference for appending the source,
manifest, and trailer to a compiled executable; use it as a starting point
when building your own packer. Language SDKs are encouraged to integrate
trailer packing into their build process to streamline the experience for
SDK users. The Go SDK’s airflow-go-pack is a good example.
#!/usr/bin/env python3
import hashlib
import pathlib
import shutil
import struct
BINARY = pathlib.Path(...) # Path to the compiled executable.
OUTPUT = pathlib.Path(...) # Where to put the processed executable.
SOURCE = b"..." # Source code to embed.
METADATA = b"..." # UTF-8-encoded YAML metadata.
# SHA-256 covers the binary region only: bytes [0, source_start).
binary_sha256 = hashlib.sha256(BINARY.read_bytes()).digest()
trailer = struct.pack(
    "<III 32s 12s 8s",
    len(SOURCE),    # source_len
    len(METADATA),  # metadata_len
    1,              # footer_ver
    binary_sha256,
    bytes(12),      # reserved
    b"AFBNDL01",    # magic
)
assert len(trailer) == 64
shutil.copy(BINARY, OUTPUT)
with OUTPUT.open("ab") as fh:
    # Append the source region, metadata region, and trailer in a single
    # write, as required under Build Pipeline Ordering.
    fh.write(SOURCE + METADATA + trailer)
OUTPUT.chmod(0o755)
Trailer Layout
The last 64 bytes of a conforming bundle are the trailer. All multi-byte integers are little-endian.
Offset         Field          Type      Description
bytes  0..3    source_len     uint32    length of the source region in bytes
bytes  4..7    metadata_len   uint32    length of the metadata region in bytes
bytes  8..11   footer_ver     uint32    currently 1
bytes 12..43   binary_sha256  32 bytes  SHA-256 of the binary region [0, source_start)
bytes 44..55   reserved       12 bytes  MUST be zero
bytes 56..63   magic          8 bytes   ASCII "AFBNDL01"
The magic is the byte sequence 0x41 0x46 0x42 0x4E 0x44 0x4C 0x30 0x31
("AFBNDL01"). The trailing 01 is the footer-format version repeated
in ASCII so a human can identify a bundle at a glance
(tail -c 8 ./mybundle | xxd); the binary footer_ver field is the
authoritative source of truth for parsing.
binary_sha256 is the SHA-256 digest computed over the binary region
only — bytes [0, source_start). The hash field sits inside the trailer
and therefore cannot cover the bytes it occupies; it provides integrity
(the binary region has not been truncated, corrupted, or naively edited
between packing and exec) rather than authenticity
(see Code Signing for how authenticity layers on top).
Reader algorithm:
1. Open the file. Seek to EOF - 64. Read 64 bytes.
2. Compare bytes 56..63 against "AFBNDL01". If different, the file is not a bundle; the scanner MUST ignore it.
3. Parse footer_ver. If unknown, fail with a versioning error.
4. Compute metadata_start = file_size - 64 - metadata_len and source_start = metadata_start - source_len.
5. Validate that source_start > 0, i.e. the stored lengths fit inside the file and the implied binary region [0, source_start) is non-empty.
6. Compute SHA-256 over the binary region [0, source_start) and compare to binary_sha256. Mismatch is a hard failure: like a file that fails the magic check, the file is skipped, and the scanner logs an error. The result MAY be cached by (path, inode, mtime, size) so the runtime does not re-hash on every exec; a cache miss (file replaced, mtime bumped) triggers re-verification.
7. Read metadata_len bytes from metadata_start for the manifest.
8. Read source_len bytes from source_start for the source view. If source_len == 0, no source is embedded; the UI displays “(source not available)”.
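The reader steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the coordinator's actual code: the names read_bundle, NotABundle, and BundleError are hypothetical, and treating a file shorter than 64 bytes as "not a bundle" is an assumption the spec does not state explicitly.

```python
import hashlib
import os
import struct

TRAILER = struct.Struct("<III32s12s8s")  # 64 bytes, little-endian
MAGIC = b"AFBNDL01"
KNOWN_FOOTER_VERSIONS = {1}


class NotABundle(Exception):
    """The file does not carry the bundle magic; a scanner would ignore it."""


class BundleError(Exception):
    """The file carries the magic but fails validation."""


def read_bundle(path):
    """Return (manifest_bytes, source_bytes) for a verified bundle."""
    file_size = os.path.getsize(path)
    if file_size < TRAILER.size:  # assumption: too short to hold a trailer
        raise NotABundle(path)
    with open(path, "rb") as fh:
        fh.seek(file_size - TRAILER.size)
        fields = TRAILER.unpack(fh.read(TRAILER.size))
        source_len, metadata_len, footer_ver, digest, _reserved, magic = fields
        if magic != MAGIC:
            raise NotABundle(path)
        if footer_ver not in KNOWN_FOOTER_VERSIONS:
            raise BundleError(f"unknown footer_ver {footer_ver}")
        metadata_start = file_size - TRAILER.size - metadata_len
        source_start = metadata_start - source_len
        if source_start <= 0:
            raise BundleError("binary region is empty or lengths exceed the file")
        # Hash the binary region [0, source_start) in 1 MiB chunks.
        fh.seek(0)
        sha = hashlib.sha256()
        remaining = source_start
        while remaining:
            chunk = fh.read(min(1 << 20, remaining))
            if not chunk:
                raise BundleError("file shrank while hashing")
            sha.update(chunk)
            remaining -= len(chunk)
        if sha.digest() != digest:
            raise BundleError("binary_sha256 mismatch")
        # The file position is now source_start; source and metadata are contiguous.
        source = fh.read(source_len)
        manifest = fh.read(metadata_len)
    return manifest, source
```

Hashing in chunks keeps memory use flat for large binaries; a scanner would catch NotABundle silently and log BundleError before skipping the file.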
Source comes before metadata so a future footer_ver MAY introduce
additional trailing blobs (e.g. signed checksums, compressed deps) by
extending the trailer rather than inserting between existing blobs.
airflow-metadata.yaml schema
The metadata region carries the same YAML manifest documented previously,
produced at build time from a static scan of the DAG source. A
machine-readable JSON Schema is published at
airflow-metadata.schema.json for use by build tooling, validators,
and editors.
airflow_bundle_metadata_version: "1.0"
sdk:
  language: go
  version: "0.1.0"
  supervisor_schema_version: "2026-06-16"
source: example.go
dags:
  example_dag:
    tasks:
      - extract
      - transform
      - load
  another_dag:
    tasks:
      - run
Top-level keys:
airflow_bundle_metadata_version (string, required)
    The bundle-spec version this manifest conforms to. Currently "1.0".
sdk (mapping, required)
    Identifies the SDK that produced the bundle.
    - language (string, required): lower-case source-language identifier (e.g. go, rust, cpp, zig).
    - version (string, required): SDK version used at build time.
    - supervisor_schema_version (string, required): dated AIP-72 supervisor wire-schema version the bundle was compiled against, in YYYY-MM-DD format (e.g. "2026-06-16"). The coordinator passes this value to the supervisor so it can downgrade outbound messages / upgrade inbound messages to a shape the bundle understands. The value MUST resolve against the supervisor’s schema bundle; the coordinator validates it lazily when matching a bundle to a task at task-execution time, and an unknown version causes that bundle to be skipped.
source (string, required)
    Original filename of the primary DAG source file (e.g. example.go). The file’s bytes live in the source region of the bundle, not at this path; this field is a display name the Airflow UI uses to label the source-view panel and pick a syntax-highlighting mode from the extension.
dags (mapping, required)
    Mapping of dag_id to a DAG entry. Every dag_id the bundle exposes MUST appear here. The scanner uses these keys to match a DAG parsing or task-execution request to the bundle that owns it.
DAG entry fields:
tasks (list of strings, required)
    Static list of task_ids declared in the DAG. Empty lists are permitted but discouraged.
Unrecognized top-level or DAG-entry keys MUST be ignored by the consumer so that future SDK versions can extend the manifest without breaking older runtimes.
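The field rules above can be checked with a short validator. The sketch below assumes the metadata region has already been parsed into a Python dict by a YAML library of your choice; validate_manifest is a hypothetical helper, not part of any SDK, and the published JSON Schema remains the authoritative definition.

```python
import re

DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def validate_manifest(doc) -> list:
    """Return a list of problems; an empty list means the manifest is acceptable."""
    if not isinstance(doc, dict):
        return ["manifest must be a mapping"]
    errors = []

    def require(mapping, key, typ, prefix=""):
        value = mapping.get(key)
        if not isinstance(value, typ):
            errors.append(f"{prefix}{key}: required {typ.__name__} is missing or has the wrong type")
            return None
        return value

    require(doc, "airflow_bundle_metadata_version", str)
    require(doc, "source", str)
    sdk = require(doc, "sdk", dict)
    if sdk is not None:
        require(sdk, "language", str, "sdk.")
        require(sdk, "version", str, "sdk.")
        schema = require(sdk, "supervisor_schema_version", str, "sdk.")
        if schema is not None and not DATE_RE.fullmatch(schema):
            errors.append("sdk.supervisor_schema_version: expected YYYY-MM-DD")
    dags = require(doc, "dags", dict)
    for dag_id, entry in (dags or {}).items():
        tasks = entry.get("tasks") if isinstance(entry, dict) else None
        if not isinstance(tasks, list) or not all(isinstance(t, str) for t in tasks):
            errors.append(f"dags.{dag_id}.tasks: required list of strings")
    # Unrecognized keys are deliberately not reported: consumers MUST ignore them.
    return errors


manifest = {
    "airflow_bundle_metadata_version": "1.0",
    "sdk": {"language": "go", "version": "0.1.0", "supervisor_schema_version": "2026-06-16"},
    "source": "example.go",
    "dags": {"example_dag": {"tasks": ["extract", "transform", "load"]}},
    "some_future_key": True,  # ignored, as the spec requires
}
print(validate_manifest(manifest))  # []
```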
Examples
Go bundle:
example
├── ELF/Mach-O/PE executable
├── source region: contents of example.go
├── metadata region: airflow-metadata.yaml (source: example.go)
└── trailer (64 B): lengths + binary_sha256 + AFBNDL01 magic
Rust bundle:
pipeline
├── ELF/Mach-O/PE executable
├── source region: contents of main.rs
├── metadata region: airflow-metadata.yaml (source: main.rs)
└── trailer (64 B): lengths + binary_sha256 + AFBNDL01 magic
The bundle is one file. ./example runs the binary; the appended data
is invisible to exec().
Build Pipeline Ordering
The footer is appended after the executable is otherwise complete. Producers that perform additional post-build steps MUST observe the following order:
1. Strip debug symbols before appending the footer. Strip implementations operate on the binary’s defined end and either leave trailing data intact or truncate it; do not rely on either behaviour.
2. Compute binary_sha256 over the on-disk bytes as they stand immediately before the append. At that moment the whole file is the binary region; nothing has been written past its OS-defined end yet, so the digest matches what the reader will recompute over [0, source_start) after the append.
3. Append <source><metadata><trailer> in a single write so a partially written file fails the magic or hash check rather than appearing as a half-valid bundle.
Code Signing
The bundle format itself does not require OS-level code signing.
binary_sha256 provides integrity against truncation, in-flight
corruption, and naive tampering, and Airflow’s threat model treats
executables_root as Deployment-Manager-controlled — authenticity
(signed by a trusted identity) is a deployment-time concern rather than a
bundle-format one.
Compressors such as UPX are NOT supported. They rewrite the file end-to-end, destroying both the trailer and the hash invariant.
Determinism: the trailer is byte-identical for byte-identical inputs, so a
deterministic build plus a canonical (sorted-key) manifest serialization
yields a byte-identical bundle file (and therefore a stable
binary_sha256).
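One way to obtain a canonical, sorted-key manifest is sketched below. It emits JSON, which YAML 1.2 parsers also accept; this relies on the assumption that the consumer's YAML parser handles JSON-style flow syntax, which this spec does not state. The helper name canonical_manifest is hypothetical.

```python
import json


def canonical_manifest(doc: dict) -> bytes:
    """Serialize the manifest with sorted mapping keys and fixed formatting."""
    # sort_keys orders mapping keys only; list order (e.g. tasks) is preserved.
    text = json.dumps(doc, sort_keys=True, indent=2, ensure_ascii=False)
    return (text + "\n").encode("utf-8")


a = canonical_manifest({"source": "example.go", "dags": {"b": {"tasks": []}, "a": {"tasks": []}}})
b = canonical_manifest({"dags": {"a": {"tasks": []}, "b": {"tasks": []}}, "source": "example.go"})
print(a == b)  # True
```

Because the output depends only on the manifest's content and not on insertion order, two builds from identical inputs produce identical metadata bytes.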
Deployment Layout
Bundle files are placed as-is in any of the directories configured as the
executables_root kwarg on the ExecutableCoordinator entry under
[sdk] coordinators. The scanner walks each directory recursively
and considers only regular files whose executable bit is set for the
invoking user; files without the executable bit are skipped without reading
their trailer. For each candidate it reads the last 64 bytes and treats files
whose magic matches "AFBNDL01" as bundles. Matched files are then
SHA-256-verified per the reader algorithm; a mismatch demotes the file back
to “ignored, with an error log.” Files without the magic are silently
ignored, so non-bundle files (READMEs, dotfiles) MAY share the directory
without interfering with the scan.
/opt/airflow/executable-bundles/
├── example
├── team-a/
│ └── pipeline
└── analytics
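The discovery pass described above might look like the following sketch. It covers only the candidate filter (regular file, executable bit, trailing magic); SHA-256 verification per the reader algorithm would follow for each yielded path. The function name find_candidate_bundles is hypothetical, and following symlinks via os.stat is an assumption the spec does not address.

```python
import os
import stat

MAGIC = b"AFBNDL01"
TRAILER_SIZE = 64


def find_candidate_bundles(root):
    """Yield executable regular files under root whose last 8 bytes are the bundle magic."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)  # assumption: symlinks are followed
                if not stat.S_ISREG(st.st_mode) or not os.access(path, os.X_OK):
                    continue  # skipped without reading the trailer
                if st.st_size < TRAILER_SIZE:
                    continue  # too short to hold a trailer
                with open(path, "rb") as fh:
                    fh.seek(-TRAILER_SIZE, os.SEEK_END)
                    trailer = fh.read(TRAILER_SIZE)
            except OSError:
                continue  # unreadable or vanished mid-scan
            if trailer[56:64] == MAGIC:
                yield path
```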
At task-execution time the runtime execs the bundle file directly with the
coordinator arguments (--comm=<addr> / --logs=<addr>). No extraction,
no transient cache directory, no chmod-after-extract step is required: the
file is already a runnable executable with the appropriate permission bits
preserved by the build pipeline. The integrity check runs at scan/discovery
time and is cached by (path, inode, mtime, size), so the exec hot path
does not re-hash.
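A verification cache keyed by (path, inode, mtime, size) could be sketched as below. The names cache_key and is_verified are hypothetical, verify stands for whatever callable performs the SHA-256 check, and using nanosecond mtime is a choice made here rather than something the spec mandates.

```python
import os

_verified = {}  # (path, inode, mtime_ns, size) -> bool


def cache_key(path):
    """Build the cache key from the file's current stat result."""
    st = os.stat(path)
    return (os.path.abspath(path), st.st_ino, st.st_mtime_ns, st.st_size)


def is_verified(path, verify):
    """Return the cached result, calling verify(path) only on a cache miss."""
    key = cache_key(path)
    if key not in _verified:
        # File is new, replaced, or modified since the last check: re-verify.
        _verified[key] = bool(verify(path))
    return _verified[key]
```

Entries for replaced files are never looked up again under this keying, so a long-running process would want to evict stale keys; that is omitted here for brevity.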
The compiled executable MUST honor the SDK coordinator protocol —
--comm=<host:port> / --logs=<host:port> socket-based IPC.
See ExecutableCoordinator for the consumer-side coordinator.
Inspection
Because the bundle is a single executable rather than an archive,
inspecting the embedded source and manifest requires a small CLI rather
than an off-the-shelf unzip. The Go SDK’s airflow-go-pack tool
provides an inspect subcommand that dumps both regions; equivalent
helpers are expected from each language’s packer.
Compatibility and Versioning
- The current bundle-spec format version is 1.0 (airflow_bundle_metadata_version); the current trailer format version is 1 (footer_ver = 1).
- Backward-incompatible bundle-spec changes increment the major component of airflow_bundle_metadata_version and are gated behind an explicit opt-in on the consumer side.
- New optional manifest fields MAY be added in minor versions and MUST be ignored by older consumers.
- New trailer-format versions append fields after binary_sha256 (consuming the reserved region) or extend the trailer with additional trailing blobs ahead of the magic. Older readers MUST reject unknown footer_ver rather than guessing.
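A consumer-side version gate following these rules might look like the sketch below. The name check_versions is hypothetical, and rejecting any major version other than 1 is an assumption: the explicit opt-in mechanism mentioned above is not modeled.

```python
SUPPORTED_MAJOR = 1
KNOWN_FOOTER_VERSIONS = {1}


def check_versions(manifest_version: str, footer_ver: int) -> None:
    """Raise ValueError if this reader must not accept the given versions."""
    if footer_ver not in KNOWN_FOOTER_VERSIONS:
        # Older readers reject unknown trailer formats rather than guessing.
        raise ValueError(f"unknown footer_ver {footer_ver}")
    major_text, _, _minor = manifest_version.partition(".")
    if not major_text.isdigit() or int(major_text) != SUPPORTED_MAJOR:
        raise ValueError(f"unsupported bundle-spec version {manifest_version!r}")


check_versions("1.0", 1)  # accepted
check_versions("1.3", 1)  # accepted: a newer minor only adds fields that are ignored
```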