airflow.providers.amazon.aws.hooks.sagemaker_unified_studio
This module contains the Amazon SageMaker Unified Studio Notebook hook.
Classes
SageMakerNotebookHook | Interact with SageMaker Unified Studio Workflows.
Module Contents
- class airflow.providers.amazon.aws.hooks.sagemaker_unified_studio.SageMakerNotebookHook(execution_name, input_config=None, output_config=None, compute=None, termination_condition=None, tags=None, waiter_delay=10, waiter_max_attempts=1440, *args, **kwargs)
Bases: airflow.hooks.base.BaseHook
Interact with SageMaker Unified Studio Workflows.
This hook provides a wrapper around the SageMaker Workflows Notebook Execution API.
- Examples:
from airflow.providers.amazon.aws.hooks.sagemaker_unified_studio import SageMakerNotebookHook

notebook_hook = SageMakerNotebookHook(
    input_config={"input_path": "path/to/notebook.ipynb", "input_params": {"param1": "value1"}},
    output_config={"output_uri": "folder/output/location/prefix", "output_formats": ["NOTEBOOK"]},
    execution_name="notebook_execution",
    waiter_delay=10,
    waiter_max_attempts=1440,
)
- Parameters:
execution_name (str) – The name of the notebook job to be executed. This is the same as the task_id.
input_config (dict | None) – Configuration for the input file. Example: {'input_path': 'folder/input/notebook.ipynb', 'input_params': {'param1': 'value1'}}
output_config (dict | None) – Configuration for the output format. It should include an output_formats key to specify the output format. Example: {'output_formats': ['NOTEBOOK']}
compute (dict | None) – Compute configuration to use for the notebook execution. Required if the execution runs on remote compute. Example: {"instance_type": "ml.m5.large", "volume_size_in_gb": 30, "volume_kms_key_id": "", "image_uri": "string", "container_entrypoint": ["string"]}
termination_condition (dict | None) – Conditions that, when met, terminate the remote execution. Example: {"MaxRuntimeInSeconds": 3600}
tags (dict | None) – Tags to associate with the remote execution runs. Example: {"md_analytics": "logs"}
waiter_delay (int) – Interval, in seconds, between checks of the task execution status.
waiter_max_attempts (int) – Maximum number of status checks to make before returning FAILED.
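The parameters above can be collected into a single set of keyword arguments. The following is a minimal sketch, not the provider's own code: all values are placeholders borrowed from the examples above, the helper max_wait is hypothetical, and it assumes the hook makes one status check per waiter_delay interval, so the product of the two waiter settings is only an approximate upper bound on the polling time. Whether a partial compute dictionary is accepted is also an assumption.

```python
from datetime import timedelta

# Keyword arguments mirroring the documented constructor parameters.
# Every value is a placeholder taken from the parameter examples.
hook_kwargs = {
    "execution_name": "notebook_execution",
    "input_config": {
        "input_path": "folder/input/notebook.ipynb",
        "input_params": {"param1": "value1"},
    },
    "output_config": {"output_formats": ["NOTEBOOK"]},
    # Assumption: a subset of the documented compute keys is sufficient.
    "compute": {"instance_type": "ml.m5.large", "volume_size_in_gb": 30},
    "termination_condition": {"MaxRuntimeInSeconds": 3600},
    "tags": {"md_analytics": "logs"},
    "waiter_delay": 10,
    "waiter_max_attempts": 1440,
}


def max_wait(waiter_delay: int, waiter_max_attempts: int) -> timedelta:
    """Hypothetical helper: approximate upper bound on total polling time."""
    return timedelta(seconds=waiter_delay * waiter_max_attempts)


budget = max_wait(hook_kwargs["waiter_delay"], hook_kwargs["waiter_max_attempts"])
print(budget)  # 4:00:00 with the default waiter settings
```

With Airflow and the Amazon provider installed, these arguments could be passed as SageMakerNotebookHook(**hook_kwargs). Under the assumption stated above, the default waiter settings allow roughly four hours of polling, which is longer than the one-hour MaxRuntimeInSeconds used in the termination_condition example.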