airflow.providers.common.ai.utils.sql_validation

SQL safety validation for LLM-generated queries.

Uses an allowlist approach: only explicitly permitted statement types pass. This is safer than a denylist because any statement type that is not explicitly permitted (INSERT, UPDATE, MERGE, TRUNCATE, COPY, etc.), including new or unexpected ones, is blocked by default.
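The difference between the two approaches can be sketched with a toy example. This is an illustration only, not the module's implementation: it classifies statements by their first keyword, whereas the module works on parsed sqlglot expressions. The names ALLOWED, DENIED, first_keyword, passes_allowlist and passes_denylist are hypothetical.

```python
# Toy illustration only: real validation should parse the SQL rather than
# look at the first keyword. All names here are hypothetical.
ALLOWED = {"SELECT"}
DENIED = {"INSERT", "UPDATE", "DELETE", "DROP"}


def first_keyword(sql: str) -> str:
    """Return the upper-cased first word of the statement, or "" if empty."""
    stripped = sql.strip()
    return stripped.split(None, 1)[0].upper() if stripped else ""


def passes_denylist(sql: str) -> bool:
    # Anything not explicitly forbidden gets through.
    return first_keyword(sql) not in DENIED


def passes_allowlist(sql: str) -> bool:
    # Only explicitly permitted statements get through.
    return first_keyword(sql) in ALLOWED


for sql in ("SELECT 1", "DROP TABLE t", "MERGE INTO t USING s ON t.id = s.id"):
    print(f"{sql!r}: denylist={passes_denylist(sql)}, allowlist={passes_allowlist(sql)}")
```

MERGE was never added to the denylist in this sketch, so it passes that check, while the allowlist rejects it without anyone having to anticipate it.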

Attributes

DEFAULT_ALLOWED_TYPES

Exceptions

SQLSafetyError

Generated SQL failed safety validation.

Functions

validate_sql(sql, *[, allowed_types, dialect, ...])

Parse SQL and verify all statements are in the allowed types list.

Module Contents

airflow.providers.common.ai.utils.sql_validation.DEFAULT_ALLOWED_TYPES: tuple[type[sqlglot.exp.Expression], ...]

exception airflow.providers.common.ai.utils.sql_validation.SQLSafetyError

Bases: Exception

Generated SQL failed safety validation.

airflow.providers.common.ai.utils.sql_validation.validate_sql(sql, *, allowed_types=None, dialect=None, allow_multiple_statements=False)

Parse SQL and verify all statements are in the allowed types list.

By default, only a single SELECT-family statement is allowed. Multi-statement SQL (separated by semicolons) is rejected unless allow_multiple_statements=True, because multi-statement inputs can hide dangerous operations after a benign SELECT.

Returns the parsed statements on success and raises SQLSafetyError on violation.

Parameters:
  • sql (str) – SQL string to validate.

  • allowed_types (tuple[type[sqlglot.exp.Expression], ...] | None) – Tuple of sqlglot expression types to permit. Defaults to (Select, Union, Intersect, Except).

  • dialect (str | None) – SQL dialect for parsing (postgres, mysql, etc.).

  • allow_multiple_statements (bool) – Whether to allow multiple semicolon-separated statements. Default False.

Returns:

List of parsed sqlglot Expression objects.

Raises:

SQLSafetyError – If the SQL is empty, contains disallowed statement types, or has multiple statements when not permitted.

Return type:

list[sqlglot.exp.Expression]
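A minimal usage sketch, assuming the provider package is installed. The wrapper check_generated_sql is a hypothetical helper, not part of the module; it only calls validate_sql and catches SQLSafetyError as documented above.

```python
from __future__ import annotations


def check_generated_sql(sql: str, dialect: str | None = None) -> tuple[bool, str]:
    """Hypothetical helper: return (is_safe, detail) for an LLM-generated query."""
    # Imported lazily so this sketch can be loaded without the provider installed.
    from airflow.providers.common.ai.utils.sql_validation import (
        SQLSafetyError,
        validate_sql,
    )

    try:
        # Defaults: default allowed types, single statement only.
        statements = validate_sql(sql, dialect=dialect)
    except SQLSafetyError as err:
        # Documented causes: empty SQL, a disallowed statement type,
        # or multiple statements when not permitted.
        return False, str(err)
    return True, f"{len(statements)} statement(s) accepted"
```

Going by the description above, a call such as check_generated_sql("SELECT id FROM users", dialect="postgres") should be accepted, while "SELECT 1; DROP TABLE users" should be rejected under the defaults because it contains more than one statement.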
