airflow.providers.common.ai.hooks.langchain

Hook for LangChain integration with Airflow connections.

Classes

LangChainHook

Bridge an Airflow connection to LangChain chat and embedding models.

Module Contents

class airflow.providers.common.ai.hooks.langchain.LangChainHook(llm_conn_id=None, embed_conn_id=None, llm_model=None, embed_model=None, **kwargs)

Bases: airflow.providers.common.compat.sdk.BaseHook

Bridge an Airflow connection to LangChain chat and embedding models.

The hook resolves credentials (API key, optional base URL) from the Airflow connection and returns LangChain model objects via two universal entry-point functions:

  • langchain.chat_models.init_chat_model() dispatches to the right chat-model vendor based on the model identifier.

  • langchain.embeddings.init_embeddings() dispatches to the right embedding-model vendor based on the model identifier.

Both identifiers use the provider:name format (e.g. "openai:gpt-4o", "openai:text-embedding-3-small").

Only OpenAI-compatible providers, meaning those whose model constructors accept an api_key and an optional base_url (OpenAI itself, Anthropic, Groq, Mistral AI chat, Ollama, DeepSeek, …), work with this hook’s credential surface. Providers with bespoke auth (AWS Bedrock, Google Vertex AI / GenAI, Azure OpenAI, Cohere, HuggingFace) reject these kwargs; per-vendor subclasses can be added later, mirroring the pydantic-ai pattern.
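To illustrate the provider:name format, the sketch below splits an identifier into its two parts. It is not the hook's or LangChain's code: split_model_id is a hypothetical helper, and splitting on the first colon only is an assumption made here so that model names which themselves contain a colon stay intact.

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider:name' into (provider, name).

    Hypothetical helper for illustration; not part of the hook's API.
    """
    # partition() splits on the first ":" only, so "ollama:llama3:8b"
    # keeps "llama3:8b" as the model name.
    provider, sep, name = model_id.partition(":")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider:name', got {model_id!r}")
    return provider, name


print(split_model_id("openai:gpt-4o"))      # ('openai', 'gpt-4o')
print(split_model_id("ollama:llama3:8b"))   # ('ollama', 'llama3:8b')
```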

Connection fields:

  • password: API key passed as api_key= to the model constructor.

  • host: Optional base URL passed as base_url= (custom endpoints, Ollama, vLLM).

  • extra JSON: {"model": "openai:gpt-4o", "embed_model": "openai:text-embedding-3-small"} – default chat and embedding model identifiers.
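The field mapping above can be sketched in plain Python. This is an illustration under stated assumptions, not the hook's implementation: ConnectionFields is a hypothetical stand-in for an Airflow connection, and omitting api_key or base_url when the corresponding field is empty is an assumption about how unset fields are handled. The key and URL are placeholder values.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConnectionFields:
    """Hypothetical stand-in for the fields read from an Airflow connection."""

    password: Optional[str] = None
    host: Optional[str] = None
    extra: str = "{}"


def credential_kwargs(conn: ConnectionFields) -> dict:
    """Map connection fields to model-constructor keyword arguments."""
    kwargs = {}
    if conn.password:
        kwargs["api_key"] = conn.password   # password -> api_key=
    if conn.host:
        kwargs["base_url"] = conn.host      # host -> base_url= (optional)
    return kwargs


def default_models(conn: ConnectionFields) -> dict:
    """Read the default chat and embedding model identifiers from extra JSON."""
    extra = json.loads(conn.extra or "{}")
    return {"model": extra.get("model"), "embed_model": extra.get("embed_model")}


conn = ConnectionFields(
    password="sk-example",                  # placeholder API key
    host="http://localhost:11434/v1",       # placeholder custom endpoint
    extra='{"model": "openai:gpt-4o", "embed_model": "openai:text-embedding-3-small"}',
)
kwargs = credential_kwargs(conn)
models = default_models(conn)
```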

Parameters:
  • llm_conn_id (str | None) – Airflow connection ID for the LLM provider. Falls back to default_conn_name ("langchain_default") if not provided.

  • embed_conn_id (str | None) – Optional separate Airflow connection ID for the embedding provider. Falls back to llm_conn_id when not provided, so the common case of one provider for both chat and embeddings needs only a single hook instance.

  • llm_model (str | None) – Chat model identifier in provider:name format (e.g. "openai:gpt-4o", "anthropic:claude-3-7-sonnet"). Overrides extra["model"] on the connection.

  • embed_model (str | None) – Embedding model identifier in provider:name format (e.g. "openai:text-embedding-3-small"). Overrides extra["embed_model"] on the connection.
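The fallback and override rules for these four parameters can be summarized in a short sketch. The helper names resolve_conn_ids and resolve_model are hypothetical, and treating None as "not provided" is an assumption; the hook's actual resolution code may differ in detail.

```python
from typing import Optional

DEFAULT_CONN_NAME = "langchain_default"  # value of default_conn_name


def resolve_conn_ids(
    llm_conn_id: Optional[str] = None,
    embed_conn_id: Optional[str] = None,
) -> tuple[str, str]:
    """Return (chat connection ID, embedding connection ID) after fallbacks."""
    llm = llm_conn_id if llm_conn_id is not None else DEFAULT_CONN_NAME
    # The embedding connection falls back to the (resolved) chat connection.
    embed = embed_conn_id if embed_conn_id is not None else llm
    return llm, embed


def resolve_model(explicit: Optional[str], extra: dict, key: str) -> Optional[str]:
    """An explicit constructor argument overrides the value in connection extra."""
    return explicit if explicit is not None else extra.get(key)


extra = {"model": "openai:gpt-4o", "embed_model": "openai:text-embedding-3-small"}

print(resolve_conn_ids())                              # both use the default
print(resolve_conn_ids("my_llm"))                      # embeddings reuse my_llm
print(resolve_model("anthropic:claude-3-7-sonnet", extra, "model"))
print(resolve_model(None, extra, "embed_model"))
```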

conn_name_attr = 'llm_conn_id'
default_conn_name = 'langchain_default'
conn_type = 'langchain'
hook_name = 'LangChain'
llm_conn_id
embed_conn_id
llm_model = None
embed_model = None

static get_ui_field_behaviour()

Return custom field behaviour for the Airflow connection form.

get_chat_model()

Return a LangChain chat model configured from the Airflow connection.

Dispatch is delegated to init_chat_model, which picks the right vendor class based on the provider:name prefix in the model id.
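A minimal usage sketch, assuming the provider package and the matching LangChain vendor package are installed and that a connection named langchain_default exists. The function name ask_chat_model is hypothetical; invoke() and the content attribute are the standard LangChain chat-model interface rather than anything specific to this hook.

```python
def ask_chat_model(prompt: str, conn_id: str = "langchain_default"):
    """Send one prompt through the chat model configured on an Airflow connection."""
    # Imported lazily so this module can be loaded where the provider
    # package is not installed (for example, during DAG linting).
    from airflow.providers.common.ai.hooks.langchain import LangChainHook

    # llm_model overrides extra["model"] on the connection.
    hook = LangChainHook(llm_conn_id=conn_id, llm_model="openai:gpt-4o")
    chat_model = hook.get_chat_model()

    # invoke() returns a message object; its content holds the reply.
    response = chat_model.invoke(prompt)
    return response.content
```

Calling ask_chat_model("Summarize this log line: ...") inside a task would resolve the API key from the connection at run time, so no credentials appear in DAG code.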

get_embedding_model()

Return a LangChain embedding model configured from the Airflow connection.

Dispatch is delegated to init_embeddings, which picks the right vendor class based on the provider:name prefix in the model id. Uses embed_conn_id if set; otherwise falls back to llm_conn_id.
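A sketch of the separate-connection case, assuming the provider package is installed. The function name embed_text and the connection ID my_embeddings are hypothetical; embed_query() is the standard LangChain embeddings method for embedding a single string.

```python
def embed_text(text: str, embed_conn_id: str = "my_embeddings") -> list:
    """Embed one string using a dedicated embedding connection."""
    # Imported lazily so this module can be loaded without the provider installed.
    from airflow.providers.common.ai.hooks.langchain import LangChainHook

    hook = LangChainHook(
        llm_conn_id="langchain_default",
        embed_conn_id=embed_conn_id,  # hypothetical connection ID
        embed_model="openai:text-embedding-3-small",
    )
    embeddings = hook.get_embedding_model()

    # embed_query() returns the embedding vector as a list of floats.
    return embeddings.embed_query(text)
```

Leaving embed_conn_id unset on the hook would make the embedding model reuse the credentials from llm_conn_id.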
