airflow.providers.common.ai.utils.sql_validation

SQL safety validation for LLM-generated queries.

Uses an allowlist approach: only explicitly permitted statement types pass. This is safer than a denylist because any statement type that is not on the list, including ones nobody thought to exclude (INSERT, UPDATE, MERGE, TRUNCATE, COPY, etc.), is blocked by default.
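
The difference between the two policies can be illustrated with a toy sketch. It classifies a statement by its first keyword only, which is a simplification made for brevity and not how this module works (the module parses SQL with sqlglot). The names ALLOWED, DENIED, leading_keyword, allowlist_permits, and denylist_permits are hypothetical.

```python
# Toy contrast between an allowlist and a denylist.
# Assumption: statements are classified by their first keyword only.

ALLOWED = {"SELECT"}                             # everything else is rejected
DENIED = {"INSERT", "UPDATE", "DELETE", "DROP"}  # everything else is accepted


def leading_keyword(sql: str) -> str:
    """Return the upper-cased first word of the statement, or '' if empty."""
    parts = sql.strip().split(None, 1)
    return parts[0].upper() if parts else ""


def allowlist_permits(sql: str) -> bool:
    """Pass only statements whose keyword was explicitly permitted."""
    return leading_keyword(sql) in ALLOWED


def denylist_permits(sql: str) -> bool:
    """Pass any statement whose keyword was not explicitly forbidden."""
    return leading_keyword(sql) not in DENIED


if __name__ == "__main__":
    for stmt in ("SELECT 1", "DROP TABLE t", "TRUNCATE TABLE t", "MERGE INTO t USING s ON t.id = s.id"):
        print(stmt, allowlist_permits(stmt), denylist_permits(stmt))
```

In this sketch TRUNCATE and MERGE slip past the denylist because nobody listed them, while the allowlist rejects both without needing to know they exist.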

Attributes

DEFAULT_ALLOWED_TYPES

READ_ONLY_METADATA_TYPES

Exceptions

SQLSafetyError

Generated SQL failed safety validation.

Functions

resolve_sqlglot_dialect(dialect_name)

Normalize a SQLAlchemy dialect name to a sqlglot dialect.

validate_sql(sql, *[, allowed_types, dialect, ...])

Parse SQL and verify all statements are in the allowed types list.

Module Contents

airflow.providers.common.ai.utils.sql_validation.resolve_sqlglot_dialect(dialect_name)[source]

Normalize a SQLAlchemy dialect name to a sqlglot dialect.

Returns None (dialect-agnostic parsing) for empty, non-string, or unknown inputs, so a bad dialect value never breaks SQL validation.

Parameters:

dialect_name (str | None) – A SQLAlchemy dialect_name (e.g. "postgresql").

Returns:

The matching sqlglot dialect (e.g. "postgres"), or None.

Return type:

str | None
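
The fall-back-to-None behaviour described above could be sketched roughly as follows. This is not the module's implementation: the mapping table and the name normalize_dialect are hypothetical, and only the "postgresql" to "postgres" pair comes from this page. The sketch uses an exact-match lookup; whether the real function also normalizes case or whitespace is not stated here.

```python
from typing import Optional

# Hypothetical lookup table. Only "postgresql" -> "postgres" is taken from
# the documentation; the other entries are illustrative assumptions.
_SQLALCHEMY_TO_SQLGLOT = {
    "postgresql": "postgres",
    "mysql": "mysql",
    "sqlite": "sqlite",
}


def normalize_dialect(dialect_name: object) -> Optional[str]:
    """Map a SQLAlchemy dialect name to a sqlglot one, or return None.

    None signals dialect-agnostic parsing, so empty, non-string, and
    unknown inputs never raise.
    """
    if not isinstance(dialect_name, str) or not dialect_name:
        return None
    return _SQLALCHEMY_TO_SQLGLOT.get(dialect_name)


if __name__ == "__main__":
    for value in ("postgresql", "", None, 42, "nosuchdb"):
        print(repr(value), "->", repr(normalize_dialect(value)))
```

Returning None instead of raising keeps a misconfigured connection from turning into a validation failure; the SQL is then parsed without dialect-specific rules.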

airflow.providers.common.ai.utils.sql_validation.DEFAULT_ALLOWED_TYPES: tuple[type[sqlglot.exp.Expr], ...][source]
airflow.providers.common.ai.utils.sql_validation.READ_ONLY_METADATA_TYPES: tuple[type[sqlglot.exp.Expr], ...][source]
exception airflow.providers.common.ai.utils.sql_validation.SQLSafetyError[source]

Bases: Exception

Generated SQL failed safety validation.

airflow.providers.common.ai.utils.sql_validation.validate_sql(sql, *, allowed_types=None, dialect=None, allow_multiple_statements=False, allow_read_only_metadata=False)[source]

Parse SQL and verify all statements are in the allowed types list.

By default, only a single SELECT-family statement is allowed. Multi-statement SQL (separated by semicolons) is rejected unless allow_multiple_statements=True, because multi-statement inputs can hide dangerous operations after a benign SELECT.

Returns parsed statements on success, raises SQLSafetyError on violation.
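
Because a rejection is reported by raising, callers that handle LLM output usually want to capture the message instead of letting the exception propagate. A minimal sketch of such a wrapper is shown below; try_validate is a hypothetical helper, not part of this module, and it takes the validator and exception type as arguments so that it can be exercised without Airflow installed.

```python
from typing import Any, Callable, Optional, Tuple


def try_validate(
    sql: str,
    validate: Callable[..., list],
    error_type: type,
    **options: Any,
) -> Tuple[Optional[list], Optional[str]]:
    """Run a validator and return (statements, None) or (None, message).

    `validate` is expected to behave like validate_sql: return the parsed
    statements on success and raise `error_type` on a violation.
    """
    try:
        return validate(sql, **options), None
    except error_type as exc:
        return None, str(exc)


# In an Airflow environment (assumed to be installed), a call might look like:
#
#   from airflow.providers.common.ai.utils.sql_validation import (
#       SQLSafetyError,
#       validate_sql,
#   )
#   statements, error = try_validate(
#       "SELECT 1", validate_sql, SQLSafetyError, dialect="postgres"
#   )
```

The returned message can then be logged or fed back to the model for another attempt; what to do with it is left to the caller.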

Parameters:
  • sql (str) – SQL string to validate.

  • allowed_types (tuple[type[sqlglot.exp.Expr], ...] | None) – Tuple of sqlglot expression types to permit. Defaults to (Select, Union, Intersect, Except). When supplied, the caller takes full control of the allow-list and allow_read_only_metadata is ignored.

  • dialect (str | None) – SQL dialect for parsing (postgres, mysql, etc.).

  • allow_multiple_statements (bool) – Whether to allow multiple semicolon-separated statements. Default False.

  • allow_read_only_metadata (bool) – Also permit read-only metadata statements (DESCRIBE/SHOW) on top of the default read-only allow-list. Ignored when allowed_types is supplied. Note that SHOW is parsed as a metadata statement only when a dialect that supports it is given. Default False.

Returns:

List of parsed sqlglot Expression objects.

Raises:

SQLSafetyError – If the SQL is empty, contains disallowed statement types, or has multiple statements when not permitted.

Return type:

list[sqlglot.exp.Expr]
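
The way the keyword arguments interact can be modelled without sqlglot. The sketch below works on statement type names that are assumed to have been parsed already, uses strings as stand-ins for the sqlglot expression classes, and uses SafetyError as a stand-in for SQLSafetyError. All names in it are hypothetical, and the order of the checks is an assumption; only the rules themselves come from the parameter descriptions above.

```python
from typing import Optional, Sequence, Tuple


class SafetyError(Exception):
    """Stand-in for SQLSafetyError in this sketch."""


# String stand-ins for the sqlglot expression classes named in the docs.
DEFAULT_TYPES = ("Select", "Union", "Intersect", "Except")
METADATA_TYPES = ("Describe", "Show")


def effective_allowed_types(
    allowed_types: Optional[Sequence[str]] = None,
    allow_read_only_metadata: bool = False,
) -> Tuple[str, ...]:
    """Resolve the allow-list; an explicit allowed_types always wins."""
    if allowed_types is not None:
        return tuple(allowed_types)  # allow_read_only_metadata is ignored
    if allow_read_only_metadata:
        return DEFAULT_TYPES + METADATA_TYPES
    return DEFAULT_TYPES


def check_statement_types(
    statement_types: Sequence[str],
    *,
    allowed_types: Optional[Sequence[str]] = None,
    allow_multiple_statements: bool = False,
    allow_read_only_metadata: bool = False,
) -> list:
    """Apply the documented rules to already-parsed statement type names."""
    if not statement_types:
        raise SafetyError("SQL is empty")
    if len(statement_types) > 1 and not allow_multiple_statements:
        raise SafetyError("multiple statements are not permitted")
    allowed = effective_allowed_types(allowed_types, allow_read_only_metadata)
    for name in statement_types:
        if name not in allowed:
            raise SafetyError(f"statement type {name!r} is not allowed")
    return list(statement_types)
```

The point of the model is the precedence rule: once allowed_types is supplied, a Describe statement is rejected unless the caller listed it, regardless of allow_read_only_metadata.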
