airflow.providers.microsoft.azure.transfers.gcs_to_wasb
This module contains the Google Cloud Storage to Azure Blob Storage operator.
Classes
GCSToAzureBlobStorageOperator – Synchronizes objects from a Google Cloud Storage bucket to Azure Blob Storage.
Module Contents
- class airflow.providers.microsoft.azure.transfers.gcs_to_wasb.GCSToAzureBlobStorageOperator(*, gcs_bucket, container_name, blob_prefix='', prefix=None, gcp_conn_id='google_cloud_default', google_impersonation_chain=None, wasb_conn_id='wasb_default', replace=False, keep_directory_structure=True, flatten_structure=False, match_glob=None, gcp_user_project=None, create_container=False, **kwargs)
Bases: airflow.providers.common.compat.sdk.BaseOperator

Synchronizes objects from a Google Cloud Storage bucket to Azure Blob Storage.
Note
When flatten_structure=True, it takes precedence over keep_directory_structure. For example, with flatten_structure=True, folder/subfolder/file.txt becomes file.txt regardless of the keep_directory_structure setting.

Objects whose names end with / (GCS console folder markers) and keys that become an empty destination path after flatten_structure are skipped.

See also
For more information on how to use this operator, take a look at the guide: Operator
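
The naming rules in the note above can be sketched in plain Python. This is only an illustration under stated assumptions, not the provider's implementation: destination_blob_name is a hypothetical helper, and using the full object name as the relative path when flatten_structure is False is an assumption the reference does not spell out.

```python
from __future__ import annotations

import posixpath


def destination_blob_name(
    object_name: str,
    *,
    blob_prefix: str = "",
    prefix: str | None = None,
    keep_directory_structure: bool = True,
    flatten_structure: bool = False,
) -> str | None:
    """Return the destination blob name, or None when the object is skipped.

    Hypothetical helper that mirrors the documented rules only.
    """
    # Names ending with "/" are GCS console folder markers and are skipped.
    if object_name.endswith("/"):
        return None

    base = blob_prefix
    if flatten_structure:
        # flatten_structure takes precedence: keep only the file name.
        relative = posixpath.basename(object_name)
    else:
        # Assumption: the full object name is used as the relative path.
        relative = object_name
        if not keep_directory_structure and prefix:
            # Documented rule: append prefix to blob_prefix.
            base = posixpath.join(blob_prefix, prefix)

    # Keys that end up with an empty destination path are skipped.
    if not relative:
        return None
    return posixpath.join(base, relative)


# The example from the note: the directory part is dropped when flattening.
print(destination_blob_name("folder/subfolder/file.txt", flatten_structure=True))
```

With flatten_structure=True the result does not depend on keep_directory_structure, which matches the precedence described in the note.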
- Parameters:
gcs_bucket (str) – The GCS bucket to list objects from. (templated)
prefix (str | None) – Prefix to filter object names under the bucket. (templated)
gcp_conn_id (str) – Airflow connection ID for Google Cloud.
google_impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional impersonation chain for GCP credentials.
gcp_user_project (str | None) – Requester-pays billing project for GCS requests, if required.
match_glob (str | None) – Optional glob filter for object names (requires apache-airflow-providers-google>=10.3.0).
container_name (str) – Azure Blob container to upload into. (templated)
blob_prefix (str) – Base blob path for uploaded objects. (templated)
wasb_conn_id (str) – Airflow connection ID for Azure Blob Storage.
replace (bool) – If True, overwrite existing blobs (overwrite=True on upload) and upload all listed objects. If False, skip objects that already exist under blob_prefix with the same relative path and pass overwrite=False on upload.
keep_directory_structure (bool) – When False and prefix is set (and flatten_structure is False), append prefix to blob_prefix.
flatten_structure (bool) – If True, upload each object using only its file name under blob_prefix. Takes precedence over keep_directory_structure.
create_container (bool) – If True, create the container when missing before upload.
- template_fields: collections.abc.Sequence[str] = ('gcs_bucket', 'prefix', 'blob_prefix', 'container_name', 'google_impersonation_chain',...
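
A minimal sketch of how the operator might be configured, using only keyword arguments from the signature above. The bucket, container, prefix, and task names are placeholder values, and TRANSFER_KWARGS and build_transfer_task are hypothetical names introduced for illustration. The import is deferred into the function so the snippet can be loaded without the provider package installed; in a real DAG file it would normally sit at module level.

```python
# Keyword arguments from the documented signature; every value is a placeholder.
TRANSFER_KWARGS = {
    "task_id": "gcs_to_azure_blob",
    "gcs_bucket": "example-gcs-bucket",
    "prefix": "exports/",
    "container_name": "example-container",
    "blob_prefix": "backup",
    "gcp_conn_id": "google_cloud_default",
    "wasb_conn_id": "wasb_default",
    "replace": False,  # skip blobs that already exist under blob_prefix
    "flatten_structure": False,
    "create_container": True,  # create the container if it is missing
}


def build_transfer_task():
    """Instantiate the operator with the placeholder arguments above."""
    # Deferred import: requires apache-airflow-providers-microsoft-azure.
    from airflow.providers.microsoft.azure.transfers.gcs_to_wasb import (
        GCSToAzureBlobStorageOperator,
    )

    return GCSToAzureBlobStorageOperator(**TRANSFER_KWARGS)
```

Because gcs_bucket, prefix, blob_prefix, and container_name are listed in template_fields, their values could also be Jinja templates rendered at run time.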