airflow.providers.common.sql.config

Attributes

TABLE_PROVIDERS

Classes

ConnectionConfig

Configuration for DataFusion object store connections.

FormatType

Supported data formats.

StorageType

Storage types for DataFusion.

DataSourceConfig

Configuration for an input data source.

Module Contents

class airflow.providers.common.sql.config.ConnectionConfig[source]

Configuration for DataFusion object store connections.

conn_id: str[source]
credentials: dict[str, Any][source]
extra_config: dict[str, Any][source]
class airflow.providers.common.sql.config.FormatType[source]

Bases: str, enum.Enum

Supported data formats.

PARQUET = 'parquet'[source]
CSV = 'csv'[source]
AVRO = 'avro'[source]
ICEBERG = 'iceberg'[source]
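
Because FormatType derives from both str and enum.Enum, its members can be looked up by value and compared directly with plain strings. The snippet below is a minimal sketch that re-declares the enum locally from the members listed above, so it runs without Airflow installed; it is a stand-in, not the provider's source.

```python
import enum


class FormatType(str, enum.Enum):
    """Local stand-in mirroring the documented members."""

    PARQUET = "parquet"
    CSV = "csv"
    AVRO = "avro"
    ICEBERG = "iceberg"


# Lookup by value returns the canonical member.
fmt = FormatType("csv")

# Since str is a base class, a member compares equal to its plain string value.
is_csv = fmt == "csv"

# A value that is not a member raises ValueError, so typos fail early.
try:
    FormatType("orc")
    unknown_accepted = True
except ValueError:
    unknown_accepted = False
```

The same pattern applies to any str-based enum, which is why a plain string such as 'parquet' can be normalized into a member before use.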
airflow.providers.common.sql.config.TABLE_PROVIDERS: frozenset[str][source]
class airflow.providers.common.sql.config.StorageType[source]

Bases: str, enum.Enum

Storage types for DataFusion.

S3 = 's3'[source]
LOCAL = 'local'[source]
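
DataSourceConfig is documented as inferring storage_type from the URI. One plausible way to do that is to map the URI scheme to a StorageType member. The sketch below assumes that bare paths and file:// URIs count as local and that only the s3 scheme maps to S3; infer_storage_type is a hypothetical helper, and the provider's actual rules may differ.

```python
import enum
from urllib.parse import urlparse


class StorageType(str, enum.Enum):
    """Local stand-in mirroring the documented members."""

    S3 = "s3"
    LOCAL = "local"


def infer_storage_type(uri: str) -> StorageType:
    """Hypothetical helper: map a URI scheme to a StorageType."""
    scheme = urlparse(uri).scheme.lower()
    if scheme == "s3":
        return StorageType.S3
    # Assumption: a missing scheme or "file" means the local filesystem.
    if scheme in ("", "file"):
        return StorageType.LOCAL
    raise ValueError(f"Unsupported storage scheme: {scheme!r}")


s3_type = infer_storage_type("s3://bucket/data/events.parquet")
local_type = infer_storage_type("/tmp/data/events.csv")
```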
class airflow.providers.common.sql.config.DataSourceConfig[source]

Configuration for an input data source.

File-based formats (parquet, csv, avro) require uri and infer storage_type automatically.

Catalog-managed formats (iceberg, and in the future delta, etc.) do not require uri or storage_type; they use conn_id and format-specific keys in options (e.g. catalog_table_name for Iceberg).

Parameters:
  • conn_id – The connection ID to use for accessing the data source.

  • uri – The URI of the data source (e.g., a file path or an S3 URI). Not required for catalog-managed formats.

  • format – The format of the data (e.g., ‘parquet’, ‘csv’, ‘iceberg’).

  • table_name – The name to register the table under in DataFusion.

  • db_name – The namespace for a table provider (e.g., Iceberg uses it to look up the table in its catalog).

  • storage_type – The type of storage (automatically inferred from URI). Not used for catalog-managed formats.

  • options – Additional options for the data source, e.g. partition columns for any file-based data source, or catalog_table_name for Iceberg.

conn_id: str[source]
table_name: str[source]
uri: str = ''[source]
format: str = ''[source]
db_name: str | None = None[source]
storage_type: StorageType | None = None[source]
options: dict[str, Any][source]
property is_table_provider: bool[source]

Whether this format is catalog-managed (no object store needed).

__post_init__()[source]
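
The rules described for DataSourceConfig (file-based formats need a uri, catalog-managed formats do not) can be illustrated with a small stand-in dataclass. This is a sketch under stated assumptions, not the provider's implementation: DataSourceConfigSketch is a hypothetical name, the contents of TABLE_PROVIDERS are assumed to be {"iceberg"}, the storage inference is simplified to a prefix check, and the connection IDs and option values are made up.

```python
import enum
from dataclasses import dataclass, field
from typing import Any, Optional


class StorageType(str, enum.Enum):
    S3 = "s3"
    LOCAL = "local"


# Assumption: this page does not list the real contents of TABLE_PROVIDERS.
TABLE_PROVIDERS = frozenset({"iceberg"})


@dataclass
class DataSourceConfigSketch:
    """Hypothetical stand-in with the documented fields and defaults."""

    conn_id: str
    table_name: str
    uri: str = ""
    format: str = ""
    db_name: Optional[str] = None
    storage_type: Optional[StorageType] = None
    options: dict[str, Any] = field(default_factory=dict)

    @property
    def is_table_provider(self) -> bool:
        # Catalog-managed formats need no object store.
        return self.format.lower() in TABLE_PROVIDERS

    def __post_init__(self) -> None:
        if self.is_table_provider:
            return  # uri and storage_type are not used
        if not self.uri:
            raise ValueError(f"uri is required for format {self.format!r}")
        if self.storage_type is None:
            # Simplified inference: anything that is not s3:// counts as local.
            is_s3 = self.uri.startswith("s3://")
            self.storage_type = StorageType.S3 if is_s3 else StorageType.LOCAL


# File-based source: uri is required, storage_type is filled in.
parquet_source = DataSourceConfigSketch(
    conn_id="aws_default",
    table_name="sales",
    uri="s3://bucket/sales/",
    format="parquet",
)

# Catalog-managed source: no uri, table located through options.
iceberg_source = DataSourceConfigSketch(
    conn_id="iceberg_default",
    table_name="orders",
    format="iceberg",
    db_name="analytics",
    options={"catalog_table_name": "analytics.orders"},
)
```

Running the validation in __post_init__ means a misconfigured source fails at construction time rather than later, when the table is registered.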
