Executable Bundle Spec

This document specifies the on-disk format of a build artifact produced by an Airflow native-executable SDK (Go, Rust, C++, Zig, …) and consumed by ExecutableCoordinator at deployment time.

The goal is a single, language-agnostic bundle shape so that scheduler, worker, and UI behave identically regardless of which compiled SDK produced the DAG.

Bundle-spec version: 1.0.

Container

A bundle is the compiled executable itself, with a fixed-format footer appended after the binary’s normal end-of-file. The executable remains directly runnable; the footer is data that follows the last byte the OS loader cares about and is invisible to exec(). There is no enclosing archive.

A bundle file therefore has three regions, in order from offset 0:

  1. The native executable (ELF / Mach-O / PE), including any code-signing structures the platform appends.

  2. The primary DAG source file, embedded verbatim (UTF-8). MAY have length 0.

  3. The build-time manifest (airflow-metadata.yaml content, UTF-8).

The file ends with a fixed 64-byte trailer that locates regions (2) and (3), carries an integrity hash of the binary region, and identifies the file as a bundle. See Trailer Layout.

Filenames follow OS conventions for executables: no extension on Linux/macOS, .exe on Windows. The scanner identifies bundles by the trailer’s magic, not by the filename.

The complete bundle file regions are:

[0,            source_start)    native binary (must be non-empty)
[source_start, metadata_start)  embedded source (may be zero length)
[metadata_start, file_size-64)  build-time manifest
[file_size-64, file_size)       64-byte trailer

where metadata_start = file_size - 64 - metadata_len and source_start = metadata_start - source_len.

Reference Implementation

Below is a simple implementation to append the trailer with Python as a reference when building your own packer. A language SDK is encouraged to integrate trailer-packing into the build process to streamline the experience for SDK users. Go SDK’s airflow-go-pack is a good example.

#!/usr/bin/env python3

import hashlib
import shutil
import struct

BINARY = pathlib.Path(...)  # Path to the compiled executable.
OUTPUT = pathlib.Path(...)  # Where to put the processed executable.
SOURCE = b"..."  # Source code to embed.
METADATA = b"..."  # UTF-8-encoded YAML metadata.

# SHA-256 covers the binary region only: bytes [0, source_start).
binary_sha256 = hashlib.sha256(BINARY.read_bytes()).digest()

trailer = struct.pack(
    "<III 32s 12s 8s",
    len(SOURCE),  # source_len
    len(METADATA),  # metadata_len
    1,  # footer_ver
    binary_sha256,
    bytes(12),  # reserved
    b"AFBNDL01",  # magic
)
assert len(trailer) == 64

shutil.copy(BINARY, OUTPUT)
with OUTPUT.open("ab") as fh:
    fh.write(SOURCE)  # Embedded source region.
    fh.write(METADATA)  # Metadata region.
    fh.write(trailer)
OUTPUT.chmod(0o755)

Trailer Layout

The last 64 bytes of a conforming bundle are the trailer. All multi-byte integers are little-endian.

bytes  0..3    source_len     uint32     length of the source region in bytes
bytes  4..7    metadata_len   uint32     length of the metadata region in bytes
bytes  8..11   footer_ver     uint32     currently 1
bytes 12..43   binary_sha256  32 bytes   SHA-256 of the binary region [0, source_start)
bytes 44..55   reserved       12 bytes   MUST be zero
bytes 56..63   magic          8 bytes    ASCII "AFBNDL01"

The magic is the byte sequence 0x41 0x46 0x42 0x4E 0x44 0x4C 0x30 0x31 ("AFBNDL01"). The trailing 01 is the footer-format version repeated in ASCII so a human can identify a bundle at a glance (tail -c 8 ./mybundle | xxd); the binary footer_ver field is the authoritative source of truth for parsing.

binary_sha256 is the SHA-256 digest computed over the binary region only — bytes [0, source_start). The hash field sits inside the trailer and therefore cannot cover the bytes it occupies; it provides integrity (the binary region has not been truncated, corrupted, or naively edited between packing and exec) rather than authenticity (see Code Signing for how authenticity layers on top).

Reader algorithm:

  1. Open the file. Seek to EOF - 64. Read 64 bytes.

  2. Compare bytes 56..63 against "AFBNDL01". If different, the file is not a bundle; the scanner MUST ignore it.

  3. Parse footer_ver. If unknown, fail with a versioning error.

  4. Compute metadata_start = filesize - 64 - metadata_len and source_start = metadata_start - source_len.

  5. Validate source_start >= 0 and that the implied binary region ([0, source_start)) is non-empty.

  6. Compute SHA-256 over the binary region [0, source_start) and compare to binary_sha256. Mismatch is a hard failure handled identically to a magic-check failure: the scanner logs and skips the file. The result MAY be cached by (path, inode, mtime, size) so the runtime does not re-hash on every exec; a cache miss (file replaced, mtime bumped) triggers re-verification.

  7. Read metadata_len bytes from metadata_start for the manifest.

  8. Read source_len bytes from source_start for the source view. If source_len == 0, no source is embedded; the UI displays “(source not available)”.

Source comes before metadata so a future footer_ver MAY introduce additional trailing blobs (e.g. signed checksums, compressed deps) by extending the trailer rather than inserting between existing blobs.

airflow-metadata.yaml schema

The metadata region carries the same YAML manifest documented previously, produced at build time from a static scan of the DAG source. A machine-readable JSON Schema is published at airflow-metadata.schema.json for use by build tooling, validators, and editors.

airflow_bundle_metadata_version: "1.0"
sdk:
  language: go
  version: "0.1.0"
  supervisor_schema_version: "2026-06-16"
source: example.go
dags:
  example_dag:
    tasks:
      - extract
      - transform
      - load
  another_dag:
    tasks:
      - run

Top-level keys:

airflow_bundle_metadata_version (string, required)

The bundle-spec version this manifest conforms to. Currently "1.0".

sdk (mapping, required)

Identifies the SDK that produced the bundle.

  • language (string, required): lower-case source-language identifier (e.g. go, rust, cpp, zig).

  • version (string, required): SDK version used at build time.

  • supervisor_schema_version (string, required): dated AIP-72 supervisor wire-schema version the bundle was compiled against, in YYYY-MM-DD format (e.g. "2026-06-16"). The coordinator passes this value to the supervisor so it can downgrade outbound messages / upgrade inbound messages to a shape the bundle understands. The value MUST resolve against the supervisor’s schema bundle; the coordinator validates it lazily when matching a bundle to a task at task-execution time, and an unknown version causes that bundle to be skipped.

source (string, required)

Original filename of the primary DAG source file (e.g. example.go). The file’s bytes live in the source region of the bundle, not at this path; this field is a display name the Airflow UI uses to label the source-view panel and pick a syntax-highlighting mode from the extension.

dags (mapping, required)

Mapping of dag_id to a DAG entry. Every dag_id the bundle exposes MUST appear here. The scanner uses these keys to match a DAG parsing or task-execution request to the bundle that owns it.

DAG entry fields:

tasks (list of strings, required)

Static list of task_ids declared in the DAG. Empty lists are permitted but discouraged.

Unrecognized top-level or DAG-entry keys MUST be ignored by the consumer so that future SDK versions can extend the manifest without breaking older runtimes.

Examples

Go bundle:

example
├── ELF/Mach-O/PE executable
├── source region:   contents of example.go
├── metadata region: airflow-metadata.yaml (source: example.go)
└── trailer (64 B):  lengths + binary_sha256 + AFBNDL01 magic

Rust bundle:

pipeline
├── ELF/Mach-O/PE executable
├── source region:   contents of main.rs
├── metadata region: airflow-metadata.yaml (source: main.rs)
└── trailer (64 B):  lengths + binary_sha256 + AFBNDL01 magic

The bundle is one file. ./example runs the binary; the appended data is invisible to exec().

Build Pipeline Ordering

The footer is appended after the executable is otherwise complete. Producers that perform additional post-build steps MUST observe the following order:

  • Strip debug symbols before appending the footer. Strip implementations operate on the binary’s defined end and either leave trailing data intact or truncate it; do not rely on either behaviour.

  • Compute binary_sha256 over the on-disk bytes as they stand immediately before the append. At that moment the whole file is the binary region; nothing has been written past its OS-defined end yet, so the digest matches what the reader will recompute over [0, source_start) after the append.

  • Append <source><metadata><trailer> in a single write so a partially written file fails the magic or hash check rather than appearing as a half-valid bundle.

Code Signing

The bundle format itself does not require OS-level code signing. binary_sha256 provides integrity against truncation, in-flight corruption, and naive tampering, and Airflow’s threat model treats executables_root as Deployment-Manager-controlled — authenticity (signed by a trusted identity) is a deployment-time concern rather than a bundle-format one.

Compressors such as UPX are NOT supported. They rewrite the file end-to-end, destroying both the trailer and the hash invariant.

Determinism: the trailer is byte-identical for byte-identical inputs, so a deterministic build plus a canonical (sorted-key) manifest serialization yields a byte-identical bundle file (and therefore a stable binary_sha256).

Deployment Layout

Bundle files are placed as-is in any of the directories configured as the executables_root kwarg on the ExecutableCoordinator entry under [sdk] coordinators. The scanner walks each directory recursively and considers only regular files whose executable bit is set for the invoking user; files without the executable bit are skipped without reading their trailer. For each candidate it reads the last 64 bytes and treats files whose magic matches "AFBNDL01" as bundles. Matched files are then SHA-256-verified per the reader algorithm; a mismatch demotes the file back to “ignored, with an error log.” Files without the magic are silently ignored, so non-bundle files (READMEs, dotfiles) MAY share the directory without interfering with the scan.

/opt/airflow/executable-bundles/
├── example
├── team-a/
│   └── pipeline
└── analytics

At task-execution time the runtime execs the bundle file directly with the coordinator arguments (--comm=<addr> / --logs=<addr>). No extraction, no transient cache directory, no chmod-after-extract step is required: the file is already a runnable executable with the appropriate permission bits preserved by the build pipeline. The integrity check runs at scan/discovery time and is cached by (path, inode, mtime, size), so the exec hot path does not re-hash.

The compiled executable MUST honor the SDK coordinator protocol — --comm=<host:port> / --logs=<host:port> socket-based IPC.

See ExecutableCoordinator for the consumer-side coordinator.

Inspection

Because the bundle is a single executable rather than an archive, inspecting the embedded source and manifest requires a small CLI rather than an off-the-shelf unzip. The Go SDK’s airflow-go-pack tool provides an inspect subcommand that dumps both regions; equivalent helpers are expected from each language’s packer.

Compatibility and Versioning

  • The current bundle-spec format version is 1.0 (airflow_bundle_metadata_version); the current trailer format version is 1 (footer_ver = 1).

  • Backward-incompatible bundle-spec changes increment the major component of airflow_bundle_metadata_version and are gated behind an explicit opt-in on the consumer side.

  • New optional manifest fields MAY be added in minor versions and MUST be ignored by older consumers.

  • New trailer-format versions append fields after binary_sha256 (consuming the reserved region) or extend the trailer with additional trailing blobs ahead of the magic. Older readers MUST reject unknown footer_ver rather than guessing.

Was this entry helpful?