Amazon SageMaker Unified Studio¶
Amazon SageMaker Unified Studio is a unified development experience that brings together AWS data, analytics, artificial intelligence (AI), and machine learning (ML) services. It provides a single interface for building, deploying, executing, and monitoring end-to-end workflows, which helps teams collaborate and supports agile development.
Airflow provides operators to orchestrate Notebooks, Querybooks, and Visual ETL jobs within SageMaker Unified Studio Workflows.
Prerequisite Tasks¶
To use these operators, complete the following steps:
Create a SageMaker Unified Studio domain and project, following the instructions in the AWS documentation.
Within your project:
* Navigate to the “Compute > Workflow environments” tab and click “Create” to create a new MWAA environment.
* Create a Notebook, Querybook, or Visual ETL job and save it to your project.
Operators¶
Create an Amazon SageMaker Unified Studio Workflow¶
To create an Amazon SageMaker Unified Studio workflow that orchestrates your notebook, querybook, and visual ETL runs, use SageMakerNotebookOperator.
amazon/tests/system/amazon/aws/example_sagemaker_unified_studio.py
from airflow.providers.amazon.aws.operators.sagemaker_unified_studio import SageMakerNotebookOperator

# Path to the .ipynb, .sqlnb, or .vetl file in your project.
notebook_path = "test_notebook.ipynb"

run_notebook = SageMakerNotebookOperator(
    task_id="run-notebook",
    input_config={"input_path": notebook_path, "input_params": {}},
    output_config={"output_formats": ["NOTEBOOK"]},  # optional
    compute={
        "instance_type": "ml.m5.large",
        "volume_size_in_gb": 30,
    },  # optional
    termination_condition={"max_runtime_in_seconds": 600},  # optional
    tags={},  # optional
    wait_for_completion=True,  # optional
    waiter_delay=5,  # optional
    deferrable=False,  # optional
    executor_config={  # optional
        # mock_mwaa_environment_params is defined outside this excerpt.
        "overrides": {"containerOverrides": {"environment": mock_mwaa_environment_params}}
    },
)
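When a DAG runs several notebooks, querybooks, or visual ETL jobs, it can help to assemble the operator arguments in one place. The following is a minimal sketch, not part of the provider: `build_notebook_task_kwargs` and `SUPPORTED_SUFFIXES` are hypothetical names, and the suffix check is an assumption based only on the file types named in the example comment above. Whether the operator itself validates the extension is not stated here. The sketch uses only the standard library, so it runs without Airflow installed.

```python
from pathlib import PurePosixPath
from typing import Any, Optional

# File types named in the example comment: notebooks, querybooks, visual ETL jobs.
# Hypothetical constant, not defined by the Amazon provider.
SUPPORTED_SUFFIXES = {".ipynb", ".sqlnb", ".vetl"}


def build_notebook_task_kwargs(
    task_id: str,
    input_path: str,
    input_params: Optional[dict[str, Any]] = None,
    max_runtime_in_seconds: Optional[int] = None,
) -> dict[str, Any]:
    """Hypothetical helper: assemble keyword arguments for SageMakerNotebookOperator."""
    suffix = PurePosixPath(input_path).suffix
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(
            f"{input_path!r} must end in one of {sorted(SUPPORTED_SUFFIXES)}"
        )

    kwargs: dict[str, Any] = {
        "task_id": task_id,
        "input_config": {
            "input_path": input_path,
            "input_params": input_params or {},
        },
    }
    # Pass the optional termination condition only when a limit was requested.
    if max_runtime_in_seconds is not None:
        kwargs["termination_condition"] = {
            "max_runtime_in_seconds": max_runtime_in_seconds
        }
    return kwargs


notebook_kwargs = build_notebook_task_kwargs(
    "run-notebook", "test_notebook.ipynb", max_runtime_in_seconds=600
)
# In a DAG file, the result would be unpacked into the operator:
# run_notebook = SageMakerNotebookOperator(**notebook_kwargs)
```

Keeping the arguments in a plain dictionary means the same helper can be reused for each task in a DAG, and the path check fails at DAG-parse time rather than after a run has been submitted.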