Google Cloud Storage to Azure Blob Storage transfer

Google Cloud Storage and Azure Blob Storage are object stores commonly used for data lakes and file exchange. This guide describes copying objects from GCS into an Azure Blob container.

Install the optional dependency when using this operator:

pip install 'apache-airflow-providers-microsoft-azure[google]'

Prerequisite Tasks

To use these operators, you must do a few things:

Operator

Use GCSToAzureBlobStorageOperator to list the objects under a GCS prefix and upload them to an Azure Blob container, using blob_prefix as the base path for the destination blob names. keep_directory_structure and flatten_structure behave the same way as in GCSToS3Operator; when both apply, flatten_structure takes precedence. Object keys ending with / (the folder markers created by the GCS console) are not copied.
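The key mapping described above can be illustrated with a small standalone sketch. It is not the operator's implementation: destination_blob_name is a hypothetical helper, it assumes that flattening keeps only the final path component of the object key, and it does not model keep_directory_structure (when not flattening, it keeps the full object key under blob_prefix).

```python
import posixpath
from typing import Optional


def destination_blob_name(
    object_key: str,
    blob_prefix: str,
    flatten_structure: bool = False,
) -> Optional[str]:
    """Hypothetical helper: map a GCS object key to an Azure blob name.

    Returns None for keys that should be skipped.
    """
    # Keys ending with "/" are GCS console folder markers and are not copied.
    if object_key.endswith("/"):
        return None
    # Assumption: flattening keeps only the last path component of the key.
    relative = posixpath.basename(object_key) if flatten_structure else object_key
    # Blob names always use "/" separators, so avoid os.path here.
    base = blob_prefix.strip("/")
    return f"{base}/{relative}" if base else relative


keys = ["exports/daily/", "exports/daily/a.csv", "exports/daily/eu/b.csv"]
flattened = [destination_blob_name(k, "imports/daily", flatten_structure=True) for k in keys]
nested = [destination_blob_name(k, "imports/daily") for k in keys]
```

In this sketch the folder marker exports/daily/ maps to None and is skipped, and with flatten_structure=True both files land directly under imports/daily. Check the operator's own parameter documentation for how the prefix is treated when the directory structure is kept.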

Example:

copy_gcs_to_azure = GCSToAzureBlobStorageOperator(
    task_id="gcs_to_azure_blob",
    gcs_bucket="my-gcs-bucket",
    prefix="exports/daily/",
    container_name="my-container",
    blob_prefix="imports/daily",
    gcp_conn_id="google_cloud_default",
    wasb_conn_id="wasb_default",
    replace=True,
)

Reference
