Amazon SageMaker Unified Studio

Amazon SageMaker Unified Studio is a unified development experience that brings together AWS data, analytics, artificial intelligence (AI), and machine learning (ML) services. It provides a single interface in which to build, deploy, execute, and monitor end-to-end workflows, which helps teams collaborate and supports agile development.

Airflow provides operators to orchestrate Notebooks, Querybooks, and Visual ETL jobs within SageMaker Unified Studio Workflows.

Prerequisite Tasks

To use these operators, you must do a few things:

  • Create a SageMaker Unified Studio domain and project, following the instructions in the AWS documentation.

  • Within your project:

      • Navigate to the “Compute > Workflow environments” tab, and click “Create” to create a new MWAA environment.

      • Create a Notebook, Querybook, or Visual ETL job and save it to your project.

Operators

Create an Amazon SageMaker Unified Studio Workflow

To create an Amazon SageMaker Unified Studio workflow that orchestrates your notebook, querybook, and visual ETL runs, use SageMakerNotebookOperator.

amazon/tests/system/amazon/aws/example_sagemaker_unified_studio.py

from airflow.providers.amazon.aws.operators.sagemaker_unified_studio import SageMakerNotebookOperator

# Path to the .ipynb, .sqlnb, or .vetl file in your project.
notebook_path = "test_notebook.ipynb"

run_notebook = SageMakerNotebookOperator(
    task_id="run-notebook",
    input_config={"input_path": notebook_path, "input_params": {}},
    output_config={"output_formats": ["NOTEBOOK"]},  # optional
    compute={
        "instance_type": "ml.m5.large",
        "volume_size_in_gb": 30,
    },  # optional
    termination_condition={"max_runtime_in_seconds": 600},  # optional
    tags={},  # optional
    wait_for_completion=True,  # optional
    waiter_delay=5,  # optional
    deferrable=False,  # optional
    executor_config={  # optional; mock_mwaa_environment_params is defined elsewhere in the example file
        "overrides": {"containerOverrides": {"environment": mock_mwaa_environment_params}}
    },
)
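
Because the same operator runs Notebooks, Querybooks, and Visual ETL jobs, it can help to check the file extension before a task is defined. The following is a minimal sketch, not part of the provider: build_notebook_task_kwargs and SUPPORTED_SUFFIXES are hypothetical names introduced here for illustration, and the only facts taken from the example above are the three file extensions and the shape of the input_config and termination_condition arguments.

```python
from pathlib import PurePosixPath

# File types that the example above says the operator can run.
SUPPORTED_SUFFIXES = {
    ".ipynb": "Notebook",
    ".sqlnb": "Querybook",
    ".vetl": "Visual ETL job",
}


def build_notebook_task_kwargs(task_id, input_path, input_params=None, max_runtime_in_seconds=600):
    """Hypothetical helper: validate input_path and return operator keyword arguments."""
    suffix = PurePosixPath(input_path).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(
            f"{input_path!r} is not a supported artifact; "
            f"expected one of {sorted(SUPPORTED_SUFFIXES)}"
        )
    return {
        "task_id": task_id,
        # Copy the parameters so later changes by the caller do not leak into the task.
        "input_config": {"input_path": input_path, "input_params": dict(input_params or {})},
        "termination_condition": {"max_runtime_in_seconds": max_runtime_in_seconds},
    }


kwargs = build_notebook_task_kwargs("run-notebook", "test_notebook.ipynb")
```

In a DAG file with the Amazon provider installed, the resulting dictionary could be passed on as SageMakerNotebookOperator(**kwargs); the optional arguments shown in the example above can be added to it in the same way.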

Reference
