airflow.providers.amazon.aws.hooks.sagemaker_unified_studio

This module contains the Amazon SageMaker Unified Studio Notebook hook.

Classes

SageMakerNotebookHook

Interact with SageMaker Unified Studio Workflows.

Module Contents

class airflow.providers.amazon.aws.hooks.sagemaker_unified_studio.SageMakerNotebookHook(execution_name, input_config=None, domain_id=None, project_id=None, output_config=None, domain_region=None, compute=None, termination_condition=None, tags=None, waiter_delay=10, waiter_max_attempts=1440, *args, **kwargs)

Bases: airflow.providers.common.compat.sdk.BaseHook

Interact with SageMaker Unified Studio Workflows.

This hook provides a wrapper around the SageMaker Workflows Notebook Execution API.

Examples:
from airflow.providers.amazon.aws.hooks.sagemaker_unified_studio import SageMakerNotebookHook

notebook_hook = SageMakerNotebookHook(
    execution_name="notebook_execution",
    domain_id="dzd-example123456",
    project_id="example123456",
    input_config={"input_path": "path/to/notebook.ipynb", "input_params": {"param1": "value1"}},
    output_config={"output_uri": "folder/output/location/prefix", "output_formats": ["NOTEBOOK"]},
    domain_region="us-east-1",
    waiter_delay=10,
    waiter_max_attempts=1440,
)
Parameters:
  • execution_name (str) – The name of the notebook job to be executed. This is the same as task_id.

  • domain_id (str | None) – The domain ID for Amazon SageMaker Unified Studio. Optional. If not provided, the SDK attempts to resolve it from the environment.

  • project_id (str | None) – The project ID for Amazon SageMaker Unified Studio. Optional. If not provided, the SDK attempts to resolve it from the environment.

  • input_config (dict | None) – Configuration for the input file. Example: {"input_path": "folder/input/notebook.ipynb", "input_params": {"param1": "value1"}}

  • output_config (dict | None) – Configuration for the output format. It should include an output_formats parameter to specify the output format. Example: {"output_formats": ["NOTEBOOK"]}

  • domain_region (str | None) – The AWS region for the domain. If not provided, the default AWS region will be used.

  • compute (dict | None) –

    Compute configuration to use for the notebook execution. This attribute is required if the execution runs on remote compute. Example:

    {
        "instance_type": "ml.c5.xlarge",
        "image_details": {
            "image_name": "sagemaker-distribution-prod",
            "image_version": "3",
            "ecr_uri": "123456123456.dkr.ecr.us-west-2.amazonaws.com/ImageName:latest",
        },
    }
    

  • termination_condition (dict | None) – Conditions that, when met, terminate the remote execution. Example: {"MaxRuntimeInSeconds": 3600}

  • tags (dict | None) – Tags to associate with the remote execution runs. Example: {"md_analytics": "logs"}

  • waiter_delay (int) – Interval in seconds to check the task execution status.

  • waiter_max_attempts (int) – Maximum number of status checks to make before returning FAILED.
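As a rough guide to how waiter_delay and waiter_max_attempts interact, the sketch below computes an approximate upper bound on total polling time as their product. It assumes one status check per interval and ignores the time each check itself takes. The helper name max_wait_seconds is hypothetical and is not part of the hook.

```python
from datetime import timedelta


def max_wait_seconds(waiter_delay: int = 10, waiter_max_attempts: int = 1440) -> int:
    """Approximate upper bound on polling time, in seconds.

    Hypothetical helper, not part of the provider. Assumes one status
    check per ``waiter_delay`` seconds and ignores per-check latency.
    """
    if waiter_delay < 0 or waiter_max_attempts < 0:
        raise ValueError("waiter_delay and waiter_max_attempts must be non-negative")
    return waiter_delay * waiter_max_attempts


if __name__ == "__main__":
    budget = max_wait_seconds()  # uses the documented defaults
    print(budget, timedelta(seconds=budget))  # 14400 4:00:00
```

With the documented defaults (10 seconds, 1440 attempts), this budget works out to 14400 seconds, or four hours.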

execution_name
domain_id = None
project_id = None
domain_region = None
input_config
output_config
compute = None
termination_condition
tags
waiter_delay = 10
waiter_max_attempts = 1440
start_notebook_execution()
wait_for_execution_completion(execution_id, context)
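
The page does not document the internals of wait_for_execution_completion, so the following is only a generic sketch of the polling pattern that waiter_delay and waiter_max_attempts describe. It is not the hook's implementation. The function poll_until_done, the get_status callable, and the status names are all assumptions made for illustration.

```python
import time
from typing import Callable


def poll_until_done(
    get_status: Callable[[], str],
    waiter_delay: float = 10,
    waiter_max_attempts: int = 1440,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll ``get_status`` until a terminal state or attempts run out.

    Hypothetical sketch: the terminal status names below are assumed,
    not taken from the SageMaker Unified Studio API.
    """
    terminal = {"COMPLETED", "FAILED", "STOPPED"}
    for _ in range(waiter_max_attempts):
        status = get_status()
        if status in terminal:
            return status
        sleep(waiter_delay)  # wait before the next status check
    # Attempts exhausted without reaching a terminal state.
    return "FAILED"


if __name__ == "__main__":
    # Simulated status source; a real caller would query the execution API.
    statuses = iter(["IN_PROGRESS", "IN_PROGRESS", "COMPLETED"])
    result = poll_until_done(lambda: next(statuses), waiter_delay=0, sleep=lambda _: None)
    print(result)  # COMPLETED
```

Passing sleep as a parameter keeps the sketch testable without real delays.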