airflow.providers.google.suite.hooks.drive

Hook for Google Drive service.

Classes

GoogleDriveHook

Hook for the Google Drive APIs.

Module Contents

class airflow.providers.google.suite.hooks.drive.GoogleDriveHook(api_version='v3', gcp_conn_id='google_cloud_default', impersonation_chain=None)

Bases: airflow.providers.google.common.hooks.base_google.GoogleBaseHook

Hook for the Google Drive APIs.

Parameters:
  • api_version (str) – API version used (for example v3).

  • gcp_conn_id (str) – The connection ID to use when fetching connection info.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or a chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, each identity in the list must grant the Service Account Token Creator IAM role to the directly preceding identity, with the first account in the list granting this role to the originating account.
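Example: a minimal sketch of constructing the hook with the parameters listed above. The helper names build_hook_kwargs and make_drive_hook, and any service account addresses passed to them, are hypothetical. The Airflow import is deferred so the helper module can be loaded without Airflow installed.

```python
def build_hook_kwargs(impersonation_chain=None):
    """Collect the constructor arguments documented for GoogleDriveHook."""
    return {
        "api_version": "v3",
        "gcp_conn_id": "google_cloud_default",
        # Either a single service account (str), a chain of accounts
        # (sequence of str), or None for no impersonation.
        "impersonation_chain": impersonation_chain,
    }


def make_drive_hook(impersonation_chain=None):
    """Create a GoogleDriveHook (requires the Airflow Google provider)."""
    # Imported lazily so that this module loads even without Airflow.
    from airflow.providers.google.suite.hooks.drive import GoogleDriveHook

    return GoogleDriveHook(**build_hook_kwargs(impersonation_chain))
```

With a chain, the last account in the sequence is the one impersonated in the request, as described for impersonation_chain.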

api_version = 'v3'

get_conn()

Retrieve the connection to Google Drive.

Returns:

Google Drive services object.

Return type:

Any
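Example: a hedged sketch of using the object returned by get_conn(). It assumes the returned service object exposes the Drive v3 files().list() call of the Google API Python client. The helper name list_file_names is hypothetical, and hook is any GoogleDriveHook instance.

```python
def list_file_names(hook, page_size=10):
    """Return the names of up to page_size files visible to the connection."""
    service = hook.get_conn()
    # Assumption: `service` is a Drive v3 client, so files().list() is available.
    response = (
        service.files()
        .list(pageSize=page_size, fields="files(id, name)")
        .execute()
    )
    return [item["name"] for item in response.get("files", [])]
```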

get_media_request(file_id)

Return a get_media HTTP request for a Google Drive object.

Parameters:

file_id (str) – The Google Drive file id

Returns:

The get_media request for the given file.

Return type:

googleapiclient.http.HttpRequest
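Example: a minimal sketch of executing the returned request directly. It assumes that calling execute() on a get_media request yields the raw file content, which is practical only for small files. The helper name read_small_file is hypothetical.

```python
def read_small_file(hook, file_id):
    """Fetch the content of a small Drive file in a single request."""
    request = hook.get_media_request(file_id=file_id)
    # Assumption: execute() on a get_media request returns the file bytes.
    return request.execute()
```

For larger files, chunked transfer through download_file is likely the better fit.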

exists(folder_id, file_name, drive_id=None, *, include_trashed=True)

Check to see if a file exists within a Google Drive folder.

Parameters:
  • folder_id (str) – The id of the Google Drive folder in which the file resides

  • file_name (str) – The name of a file in Google Drive

  • drive_id (str | None) – Optional. The id of the shared Google Drive in which the file resides.

  • include_trashed (bool) – Whether to include objects in the trash. Defaults to True, as in the Google API.

Returns:

True if the file exists, False otherwise

Return type:

bool
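Example: a short sketch of checking for a file while ignoring anything in the trash. The helper name file_is_present is hypothetical. include_trashed is keyword-only in the signature above, so it is passed by name.

```python
def file_is_present(hook, folder_id, file_name, drive_id=None):
    """Return True if a non-trashed file with this name is in the folder."""
    return hook.exists(
        folder_id=folder_id,
        file_name=file_name,
        drive_id=drive_id,
        include_trashed=False,  # skip files that sit in the trash
    )
```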

get_file_id(folder_id, file_name, drive_id=None, *, include_trashed=True)

Return the file id of a Google Drive file.

Parameters:
  • folder_id (str) – The id of the Google Drive folder in which the file resides

  • file_name (str) – The name of a file in Google Drive

  • drive_id (str | None) – Optional. The id of the shared Google Drive in which the file resides.

  • include_trashed (bool) – Whether to include objects in the trash. Defaults to True, as in the Google API.

Returns:

Metadata of the Google Drive file, including its id, if the file exists; otherwise an empty result

Return type:

dict
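Example: a defensive sketch for reading the id from the result. It assumes the returned dict carries the id under an "id" key and is empty (or None) when no file matches. That key name is an assumption, not confirmed by this page. The helper name lookup_file_id is hypothetical.

```python
def lookup_file_id(hook, folder_id, file_name, drive_id=None):
    """Return the id of the named file, or None if nothing matches."""
    metadata = hook.get_file_id(
        folder_id=folder_id, file_name=file_name, drive_id=drive_id
    )
    # Treat both None and an empty dict as "not found".
    if not metadata:
        return None
    # Assumption: the id is stored under the "id" key.
    return metadata.get("id")
```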

upload_file(local_location, remote_location, chunk_size=100 * 1024 * 1024, resumable=False, folder_id='root', show_full_target_path=True)

Upload a file that is available locally to a Google Drive service.

Parameters:
  • local_location (str) – The path where the file is available.

  • remote_location (str) – The path where the file will be sent.

  • chunk_size (int) – The file is uploaded in chunks of this many bytes. Only used if resumable=True. Pass -1 to upload the file as a single chunk. Note that Google App Engine has a 5 MB limit on request size, so in that environment never set the chunk size larger than 5 MB, or to -1.

  • resumable (bool) – True if this is a resumable upload. False means upload in a single request.

  • folder_id (str) – The base/root folder id for remote_location (part of the drive URL of a folder).

  • show_full_target_path (bool) – If True, the full target file path is shown in the logs.

Returns:

File ID

Return type:

str
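Example: a minimal sketch of a resumable upload with a 5 MiB chunk size, keeping the full target path out of the logs. The helper name upload_report and the paths passed to it are hypothetical.

```python
FIVE_MIB = 5 * 1024 * 1024


def upload_report(hook, local_path, remote_path, folder_id="root"):
    """Upload a local file in 5 MiB chunks and return the new file id."""
    return hook.upload_file(
        local_location=local_path,
        remote_location=remote_path,
        chunk_size=FIVE_MIB,  # only used because resumable=True
        resumable=True,
        folder_id=folder_id,
        show_full_target_path=False,  # do not log the full target path
    )
```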

download_file(file_id, file_handle, chunk_size=100 * 1024 * 1024)

Download a file from Google Drive.

Parameters:
  • file_id (str) – The id of the file.

  • file_handle (IO) – The file handle to which the content is written.

  • chunk_size (int) – File will be downloaded in chunks of this many bytes.
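Example: a short sketch that downloads a file into an in-memory buffer. Any writable binary handle, such as a file opened in "wb" mode, should work the same way. The helper name download_to_memory is hypothetical.

```python
import io


def download_to_memory(hook, file_id, chunk_size=10 * 1024 * 1024):
    """Download a Drive file and return its content as bytes."""
    buffer = io.BytesIO()
    # The hook writes the downloaded chunks into the supplied handle.
    hook.download_file(file_id=file_id, file_handle=buffer, chunk_size=chunk_size)
    return buffer.getvalue()
```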
