LlamaIndex LlamaIndexRetrievalOperator
Load a persisted LlamaIndex index and run similarity search. Designed to
sit between LlamaIndexEmbeddingOperator (which builds the index) and
LLMOperator (which synthesises an answer from the retrieved chunks).
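The operator passes the embedding model directly to
load_index_from_storage(..., embed_model=...) and does not mutate the
global LlamaIndex Settings object. The embedding model must match the one
used when the index was originally built; a different model produces query
vectors that are not comparable to the stored ones.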
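The loading step described above can be sketched with plain LlamaIndex calls. This is a minimal illustration, not the operator's actual source: the helper names retrieve_chunks and to_chunk_dicts are hypothetical, the sketch only handles a local persist directory, and it assumes llama-index-core is installed. The embedding model is passed as a keyword argument, so the global Settings object is never touched.

```python
from typing import Any


def to_chunk_dicts(hits: list[Any]) -> list[dict[str, Any]]:
    """Flatten NodeWithScore-like objects into plain, serialisable dicts."""
    return [
        {
            "text": hit.node.get_content(),
            "score": hit.score,
            "metadata": dict(hit.node.metadata),
            "node_id": hit.node.node_id,
        }
        for hit in hits
    ]


def retrieve_chunks(
    query: str, persist_dir: str, embed_model: Any, top_k: int = 5
) -> dict[str, Any]:
    """Hypothetical helper: load a persisted index and run similarity search."""
    # Imported lazily so this module can be parsed without LlamaIndex installed.
    from llama_index.core import StorageContext, load_index_from_storage

    storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
    # embed_model is forwarded to the index constructor; Settings is not mutated.
    index = load_index_from_storage(storage_context, embed_model=embed_model)
    hits = index.as_retriever(similarity_top_k=top_k).retrieve(query)
    return {"query": query, "chunks": to_chunk_dicts(hits)}
```

Passing a different embed_model here than the one used at build time would still load the index, but the similarity scores would not be meaningful.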
Basic usage
@dag(schedule=None, tags=["example"])
def example_llamaindex_retrieve():
    """Load a persisted index and run similarity search."""
    retrieve = LlamaIndexRetrievalOperator(
        task_id="retrieve",
        query="{{ params.query }}",
        index_persist_dir="/opt/airflow/data/library_index",
        embed_model="text-embedding-3-small",
        llm_conn_id="llamaindex_default",
        top_k=5,
    )
    retrieve
query is a templated field, so DAG-run params, XCom values, and Variables
can all be rendered into it with Jinja at runtime.
Cloud-persisted indexes
index_persist_dir accepts the same local-path-or-URI shape as
LlamaIndexEmbeddingOperator.persist_dir. Pass persist_conn_id to point at
the Airflow connection that holds cloud credentials. The operator raises
FileNotFoundError with a clear “did you run LlamaIndexEmbeddingOperator first?”
message when the path is missing.
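For a local path, the missing-index check described above might look roughly like the following. This is a sketch under assumptions, not the operator's code: ensure_index_exists is a hypothetical name, the exact error wording is illustrative, and storage URIs would need a filesystem-specific existence check instead of pathlib.

```python
from pathlib import Path


def ensure_index_exists(index_persist_dir: str) -> Path:
    """Hypothetical pre-flight check for a local persist directory."""
    path = Path(index_persist_dir)
    if not path.is_dir():
        # Fail early with a hint about the upstream task that builds the index.
        raise FileNotFoundError(
            f"No persisted index found at {index_persist_dir!r}. "
            "Did you run LlamaIndexEmbeddingOperator first?"
        )
    return path
```

Failing before any index loading starts keeps the error close to its likely cause, which is usually an upstream embedding task that has not run yet.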
Bring-your-own embedding model
Same shape as LlamaIndexEmbeddingOperator: embed_model accepts either a
string model name (OpenAI via the hook) or a pre-built BaseEmbedding
instance for non-OpenAI vendors. See the BYO example in
LlamaIndex LlamaIndexEmbeddingOperator.
Parameters
| Parameter | Description |
|---|---|
| query | The query string. Templated. |
| index_persist_dir | Local path or storage URI pointing at the persisted index. Templated. |
| persist_conn_id | Cloud credentials connection ID for index_persist_dir. |
| embed_model | String model name OR pre-built BaseEmbedding instance. |
| llm_conn_id | Airflow connection ID used when embed_model is a string. |
| … | Optional separate connection ID for the embedding provider. Falls back to … |
| top_k | Number of top similarity results to return (default 5). |
Output
Returns a dict with:
{
    "query": str,
    "chunks": [
        {
            "text": str,
            "score": float,
            "metadata": dict,
            "node_id": str,
        },
        ...
    ],
}
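A downstream task can turn this dict into a prompt context. The sketch below is one possible way to do that, assuming every score is a float; build_context, min_score, and the sample data are made up for illustration and are not part of the operator.

```python
from typing import Any


def build_context(result: dict[str, Any], min_score: float = 0.0) -> str:
    """Join retrieved chunk texts, best score first, into one context string."""
    chunks = [c for c in result["chunks"] if c["score"] >= min_score]
    chunks.sort(key=lambda c: c["score"], reverse=True)
    return "\n\n".join(c["text"] for c in chunks)


# Made-up sample shaped like the operator's return value.
sample = {
    "query": "What is Airflow?",
    "chunks": [
        {"text": "B", "score": 0.2, "metadata": {}, "node_id": "n2"},
        {"text": "A", "score": 0.9, "metadata": {}, "node_id": "n1"},
    ],
}
context = build_context(sample, min_score=0.5)
```

Filtering on score before joining keeps weakly related chunks out of the prompt; the threshold is a tuning choice that depends on the embedding model.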