Google Cloud Generative AI Operators¶

The Google Cloud Generative AI Operators ecosystem is anchored by the Gemini family of multimodal models, which provide interfaces for generating and processing diverse inputs like text, images, and audio. By leveraging these foundation models, developers can securely prompt, tune, and ground AI using their own proprietary data. This capability enables the construction of versatile applications, ranging from custom chatbots and code assistants to automated content summarization tools.

Interacting with Generative AI on Vertex AI¶

The Google Cloud VertexAI extends Vertex AI with powerful foundation models capable of generating text, images, and other modalities. It provides access to Google Gemini family of multimodal models and other pre-trained generative models through a unified API, SDK, and console. Developers can prompt, tune, and ground these models using their own data to build applications such as chat bots, content creation tools, code assistants, and summarization systems. With Vertex AI, you can securely integrate generative capabilities into enterprise workflows, monitor usage, evaluate model quality, and deploy models at scale — all within the same managed ML platform.

To generate text embeddings you can use GenAIGenerateEmbeddingsOperator. The operator returns the model’s response in XCom under model_response key.

tests/system/google/cloud/gen_ai/example_gen_ai_generative_model.py[source]

generate_embeddings_task = GenAIGenerateEmbeddingsOperator(
    task_id="generate_embeddings_task",
    project_id=PROJECT_ID,
    location=REGION,
    contents=CONTENTS,
    model=TEXT_EMBEDDING_MODEL,
)

To generate content with a generative model you can use GenAIGenerateContentOperator. The operator returns the model’s response in XCom under model_response key.

tests/system/google/cloud/gen_ai/example_gen_ai_generative_model.py[source]

generate_content_task = GenAIGenerateContentOperator(
    task_id="generate_content_task",
    project_id=PROJECT_ID,
    contents=CONTENTS,
    location=REGION_GLOBAL,
    generation_config=GENERATION_CONFIG_CREATE_CONTENT,
    model=MULTIMODAL_MODEL,
)

To run a supervised fine tuning job you can use GenAISupervisedFineTuningTrainOperator. The operator returns the tuned model’s endpoint name in XCom under tuned_model_endpoint_name key.

tests/system/google/cloud/gen_ai/example_gen_ai_generative_model_tuning.py[source]

sft_train_task = GenAISupervisedFineTuningTrainOperator(
    task_id="sft_train_task",
    project_id=PROJECT_ID,
    location=REGION,
    source_model=SOURCE_MODEL,
    training_dataset=TRAIN_DATASET,
    tuning_job_config=TUNING_JOB_CONFIG,
)

You can also use supervised fine tuning job for video tasks (training and tracking):

tests/system/google/cloud/gen_ai/example_gen_ai_generative_model_tuning.py[source]

sft_video_task = GenAISupervisedFineTuningTrainOperator(
    task_id="sft_train_video_task",
    project_id=PROJECT_ID,
    location=REGION,
    source_model=SOURCE_MODEL,
    training_dataset=TRAIN_VIDEO_DATASET,
    tuning_job_config=TUNING_JOB_VIDEO_MODEL_CONFIG,
)

To calculates the number of input tokens before sending a request to the Gemini API you can use: GenAICountTokensOperator. The operator returns the total tokens in XCom under total_tokens key.

tests/system/google/cloud/gen_ai/example_gen_ai_generative_model.py[source]

count_tokens_task = GenAICountTokensOperator(
    task_id="count_tokens_task",
    project_id=PROJECT_ID,
    contents=CONTENTS,
    location=REGION_GLOBAL,
    model=MULTIMODAL_MODEL,
)

To create cached content you can use GenAICreateCachedContentOperator. The operator returns the cached content resource name in XCom under cached_content key.

tests/system/google/cloud/gen_ai/example_gen_ai_generative_model.py[source]

create_cached_content_task = GenAICreateCachedContentOperator(
    task_id="create_cached_content_task",
    project_id=PROJECT_ID,
    location=REGION_GLOBAL,
    model=CACHED_MODEL,
    cached_content_config=CACHED_CONTENT_CONFIG,
)

To generate a response from cached content you can use GenAIGenerateContentOperator. The operator returns the cached content response in XCom under model_response key.

tests/system/google/cloud/gen_ai/example_gen_ai_generative_model.py[source]

generate_from_cached_content_task = GenAIGenerateContentOperator(
    task_id="generate_from_cached_content_task",
    project_id=PROJECT_ID,
    location=REGION_GLOBAL,
    contents=["What are the papers about?"],
    generation_config={
        "cached_content": create_cached_content_task.output,
    },
    model=CACHED_MODEL,
)

Interacting with Gemini Batch API¶

The Gemini Batch API is designed to process large volumes of requests asynchronously at 50% of the standard cost. The target turnaround time is 24 hours, but in majority of cases, it is much quicker. Use Batch API for large-scale, non-urgent tasks such as data pre-processing or running evaluations where an immediate response is not required.

Create batch job¶

To create batch job via Batch API you can use GenAIGeminiCreateBatchJobOperator. The operator returns the job name in XCom under job_name key.

Two option of input source is allowed: inline requests, file.

If you use inline requests take a look at this example: