airflow.providers.microsoft.azure.transfers.s3_to_wasb

Module Contents

Classes

S3ToAzureBlobStorageOperator

Operator to move data from an AWS S3 bucket to Microsoft Azure Blob Storage.

exception airflow.providers.microsoft.azure.transfers.s3_to_wasb.TooManyFilesToMoveException(number_of_files)[source]

Bases: Exception

Custom exception thrown when attempting to move multiple files from S3 to a single Azure Blob.

exception airflow.providers.microsoft.azure.transfers.s3_to_wasb.InvalidAzureBlobParameters[source]

Bases: Exception

Custom exception raised when neither a blob_prefix nor a blob_name is passed to the operator.

exception airflow.providers.microsoft.azure.transfers.s3_to_wasb.InvalidKeyComponents[source]

Bases: Exception

Custom exception raised when neither a full_path nor a file_name and prefix pair is provided to _create_key.
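The conditions behind the three exceptions above can be sketched in plain Python. This is an illustration under stated assumptions, not the provider's implementation: the classes are local stand-ins that only mirror the documented names, check_destination and create_key are hypothetical helpers, and joining prefix and file name with "/" is an assumption.

```python
# Local stand-ins that mirror the documented exception names so the sketch
# runs without Airflow installed. They are NOT the provider's own classes.
class TooManyFilesToMoveException(Exception):
    def __init__(self, number_of_files):
        super().__init__(
            f"{number_of_files} files cannot be moved to a single Azure Blob."
        )


class InvalidAzureBlobParameters(Exception):
    pass


class InvalidKeyComponents(Exception):
    pass


def check_destination(files, blob_prefix=None, blob_name=None):
    """Hypothetical helper: validate the destination settings for a transfer."""
    # At least one of blob_prefix / blob_name must be supplied.
    if not blob_prefix and not blob_name:
        raise InvalidAzureBlobParameters("Provide blob_prefix or blob_name.")
    # A single explicit blob name cannot receive several files.
    if blob_name and len(files) > 1:
        raise TooManyFilesToMoveException(len(files))


def create_key(full_path=None, prefix=None, file_name=None):
    """Hypothetical stand-in for _create_key; the '/' join is an assumption."""
    if full_path:
        return full_path
    if prefix and file_name:
        return f"{prefix.rstrip('/')}/{file_name}"
    raise InvalidKeyComponents("Provide full_path, or both prefix and file_name.")
```

In this sketch, a single file with a blob_name passes validation, while two files with a blob_name raise the stand-in TooManyFilesToMoveException.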

class airflow.providers.microsoft.azure.transfers.s3_to_wasb.S3ToAzureBlobStorageOperator(*, aws_conn_id='aws_default', wasb_conn_id='wasb_default', s3_bucket, container_name, s3_prefix=None, s3_key=None, blob_prefix=None, blob_name=None, create_container=False, replace=False, s3_verify=False, s3_extra_args=None, wasb_extra_args=None, **kwargs)[source]

Bases: airflow.models.BaseOperator

Operator to move data from an AWS S3 bucket to Microsoft Azure Blob Storage.

A similar class moves data in the opposite direction, from Microsoft Azure Blob Storage to an AWS S3 bucket. It lives in the airflow/providers/amazon/aws/transfers/azure_blob_to_s3.py file.

The source can be given either as an explicit S3 key or as a prefix that contains the files to transfer to Azure Blob Storage. The same holds for the destination: either an explicit blob name or a blob prefix under which the files are stored can be provided.

Parameters
  • aws_conn_id (str) – ID for the AWS S3 connection to use.

  • wasb_conn_id (str) – ID for the Azure Blob Storage connection to use.

  • s3_bucket (str) – The name of the AWS S3 bucket that an object (or objects) is transferred from. (templated)

  • container_name (str) – The name of the Azure Storage Blob container that an object (or objects) is transferred to. (templated)

  • s3_prefix (str | None) – Prefix string that filters any S3 objects that begin with this prefix. (templated)

  • s3_key (str | None) – An explicit S3 key (object) to be transferred. (templated)

  • blob_prefix (str | None) – Prefix string that provides the path in the Azure Storage Blob container that an object (or objects) is moved to. (templated)

  • blob_name (str | None) – An explicit blob name that an object would be transferred to. This can only be used if a single file is being moved. If there are multiple files in an S3 bucket that are to be moved to a single Azure blob, an exception will be raised. (templated)

  • create_container (bool) – True if the container should be created when it does not already exist, False otherwise.

  • replace (bool) – If True, a blob that already exists in the container is overwritten. If False, an existing blob is NOT overwritten.

  • s3_verify (bool) –

    Whether or not to verify SSL certificates for the S3 connection. By default, botocore verifies SSL certificates, but the signature above defaults this parameter to False. You can provide the following values:

    • False: do not validate SSL certificates. SSL will still be used (unless use_ssl is False), but SSL certificates will not be verified.

    • path/to/cert/bundle.pem: the filename of the CA cert bundle to use. You can specify this argument if you want to use a different CA cert bundle than the one used by botocore.

  • s3_extra_args (dict | None) – kwargs to pass to S3Hook.

  • wasb_extra_args (dict | None) – kwargs to pass to WasbHook.
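As an illustration of how the parameters above combine, the sketch below prepares keyword arguments for two transfers: one driven by prefixes and one for a single object. The bucket, container, key, and task ID values are hypothetical, and build_transfer_task is a helper invented for this example. It imports the operator lazily, so the configuration can be read without Airflow, but calling it requires an Airflow environment with the relevant provider packages installed.

```python
# Hypothetical values throughout: bucket, container, prefixes and task IDs.

# Prefix-based transfer: every S3 object under s3_prefix is written under blob_prefix.
PREFIX_TRANSFER = {
    "task_id": "s3_prefix_to_blob_prefix",
    "s3_bucket": "example-source-bucket",
    "s3_prefix": "exports/2024/",
    "container_name": "example-container",
    "blob_prefix": "imports/2024/",
    "create_container": True,  # create the container if it is missing
    "replace": False,          # leave existing blobs untouched
}

# Single-object transfer: one explicit S3 key goes to one explicit blob name.
SINGLE_OBJECT_TRANSFER = {
    "task_id": "s3_key_to_blob_name",
    "s3_bucket": "example-source-bucket",
    "s3_key": "exports/2024/report.csv",
    "container_name": "example-container",
    "blob_name": "imports/report.csv",
    "replace": True,           # overwrite the blob if it already exists
}


def build_transfer_task(task_kwargs):
    """Instantiate the operator from a dict of keyword arguments."""
    # Imported here so the dictionaries above can be inspected without Airflow.
    from airflow.providers.microsoft.azure.transfers.s3_to_wasb import (
        S3ToAzureBlobStorageOperator,
    )

    return S3ToAzureBlobStorageOperator(**task_kwargs)
```

Inside a DAG definition, build_transfer_task(PREFIX_TRANSFER) would then create the task. The connection IDs are omitted here, so the defaults shown in the signature (aws_default and wasb_default) apply.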

template_fields: collections.abc.Sequence[str] = ('s3_bucket', 'container_name', 's3_prefix', 's3_key', 'blob_prefix', 'blob_name')[source]

s3_hook()[source]

Create and return an S3Hook.

wasb_hook()[source]

Create and return a WasbHook.

execute(context)[source]

Run the transfer logic when the operator is executed as a task.

get_files_to_move()[source]

Determine the list of files that need to be moved, and return their names.

move_file(file_name)[source]

Move file from S3 to Azure Blob storage.
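One way to picture how s3_prefix and replace could interact when choosing files is the plain-Python sketch below. It is not the operator's code. select_files_to_move is a hypothetical function, and it assumes that existing blob names are expressed relative to the destination prefix, so that they can be compared directly with S3 key names relative to s3_prefix.

```python
def select_files_to_move(s3_keys, s3_prefix, existing_blobs, replace):
    """Hypothetical sketch: choose which files under s3_prefix to transfer."""
    # Keep only keys under the prefix and express them relative to it.
    candidates = [
        key[len(s3_prefix):].lstrip("/")
        for key in s3_keys
        if key.startswith(s3_prefix)
    ]
    # Drop empty names, e.g. a key that equals the prefix itself.
    candidates = [name for name in candidates if name]

    if replace:
        # Existing blobs are overwritten, so every candidate is moved.
        return candidates

    # Without replace, skip names that already exist at the destination.
    already_present = set(existing_blobs)
    return [name for name in candidates if name not in already_present]


keys = ["data/a.csv", "data/b.csv", "logs/c.txt"]
to_move = select_files_to_move(keys, "data/", ["a.csv"], replace=False)
```

Ending the prefix with a slash, as in "data/", avoids also matching sibling keys such as "database/x.csv".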
