airflow.providers.google.cloud.operators.dlp

Google Cloud DLP operators that let you perform basic operations with Cloud DLP.

Module Contents

Classes

CloudDLPCancelDLPJobOperator

Starts asynchronous cancellation on a long-running DlpJob.

CloudDLPCreateDeidentifyTemplateOperator

Creates a deidentify template to reuse frequently-used configurations for content, images, and storage.

CloudDLPCreateDLPJobOperator

Creates a new job to inspect storage or calculate risk metrics.

CloudDLPCreateInspectTemplateOperator

Creates an InspectTemplate to reuse frequently-used configurations for content, images, and storage.

CloudDLPCreateJobTriggerOperator

Creates a job trigger to run DLP actions, such as scanning storage for sensitive info, on a set schedule.

CloudDLPCreateStoredInfoTypeOperator

Creates a pre-built stored infoType to be used for inspection.

CloudDLPDeidentifyContentOperator

De-identifies potentially sensitive info from a content item; limits input size and output size.

CloudDLPDeleteDeidentifyTemplateOperator

Deletes a DeidentifyTemplate.

CloudDLPDeleteDLPJobOperator

Deletes a long-running DlpJob.

CloudDLPDeleteInspectTemplateOperator

Deletes an InspectTemplate.

CloudDLPDeleteJobTriggerOperator

Deletes a job trigger.

CloudDLPDeleteStoredInfoTypeOperator

Deletes a stored infoType.

CloudDLPGetDeidentifyTemplateOperator

Gets a DeidentifyTemplate.

CloudDLPGetDLPJobOperator

Gets the latest state of a long-running DlpJob.

CloudDLPGetInspectTemplateOperator

Gets an InspectTemplate.

CloudDLPGetDLPJobTriggerOperator

Gets a job trigger.

CloudDLPGetStoredInfoTypeOperator

Gets a stored infoType.

CloudDLPInspectContentOperator

Finds potentially sensitive info in content; limits input size, processing time, and output size.

CloudDLPListDeidentifyTemplatesOperator

Lists DeidentifyTemplates.

CloudDLPListDLPJobsOperator

Lists DlpJobs that match the specified filter in the request.

CloudDLPListInfoTypesOperator

Returns a list of the sensitive information types that the DLP API supports.

CloudDLPListInspectTemplatesOperator

Lists InspectTemplates.

CloudDLPListJobTriggersOperator

Lists job triggers.

CloudDLPListStoredInfoTypesOperator

Lists stored infoTypes.

CloudDLPRedactImageOperator

Redacts potentially sensitive info from an image; limits input size, processing time, and output size.

CloudDLPReidentifyContentOperator

Re-identifies content that has been de-identified.

CloudDLPUpdateDeidentifyTemplateOperator

Updates the DeidentifyTemplate.

CloudDLPUpdateInspectTemplateOperator

Updates the InspectTemplate.

CloudDLPUpdateJobTriggerOperator

Updates a job trigger.

CloudDLPUpdateStoredInfoTypeOperator

Updates the stored infoType by creating a new version.

class airflow.providers.google.cloud.operators.dlp.CloudDLPCancelDLPJobOperator(*, dlp_job_id, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Starts asynchronous cancellation on a long-running DlpJob.

See also

For more information on how to use this operator, take a look at the guide: Canceling a Job

Parameters
  • dlp_job_id (str) – ID of the DLP job resource to be cancelled.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. If set to None or missing, the default project_id from the Google Cloud connection is used.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('dlp_job_id', 'project_id', 'gcp_conn_id', 'impersonation_chain')
execute(context)

Executes the operator. This is the main method to override when deriving a new operator.

The context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.
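
A minimal sketch of how the parameters above might be assembled for this operator. The task ID, job ID, and service account names are hypothetical, and the helper functions are illustrative rather than part of the provider. The Airflow import is deferred into the factory function so that the keyword arguments can be inspected without Airflow installed.

```python
from typing import Any, Dict, Optional, Sequence, Union


def cancel_job_kwargs(
    dlp_job_id: str,
    project_id: Optional[str] = None,
    impersonation_chain: Union[str, Sequence[str], None] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for CloudDLPCancelDLPJobOperator."""
    kwargs: Dict[str, Any] = {
        "task_id": "cancel_dlp_job",  # hypothetical task ID
        "dlp_job_id": dlp_job_id,
    }
    # Omitting project_id leaves the default in place, so the project
    # configured on the Google Cloud connection is used.
    if project_id is not None:
        kwargs["project_id"] = project_id
    if impersonation_chain is not None:
        kwargs["impersonation_chain"] = impersonation_chain
    return kwargs


def make_cancel_operator(kwargs: Dict[str, Any]):
    """Instantiate the operator (needs apache-airflow-providers-google)."""
    # Imported lazily so the helper above works without Airflow installed.
    from airflow.providers.google.cloud.operators.dlp import (
        CloudDLPCancelDLPJobOperator,
    )

    return CloudDLPCancelDLPJobOperator(**kwargs)


# Hypothetical job ID and service accounts, for illustration only.
example = cancel_job_kwargs(
    "i-0000000000",
    impersonation_chain=[
        "first@my-project.iam.gserviceaccount.com",
        "second@my-project.iam.gserviceaccount.com",
    ],
)
```

Because dlp_job_id is listed in template_fields, a Jinja expression could be passed in place of the literal job ID. When impersonation_chain is a sequence, the last account in the list is the one impersonated in the request.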

class airflow.providers.google.cloud.operators.dlp.CloudDLPCreateDeidentifyTemplateOperator(*, organization_id=None, project_id=PROVIDE_PROJECT_ID, deidentify_template=None, template_id=None, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Creates a deidentify template to reuse frequently-used configurations for content, images, and storage.

See also

For more information on how to use this operator, take a look at the guide: De-Identification Template

Parameters
  • organization_id (str | None) – (Optional) The organization ID. Required if the parent resource is an organization.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.

  • deidentify_template (dict | google.cloud.dlp_v2.types.DeidentifyTemplate | None) – (Optional) The DeidentifyTemplate to create.

  • template_id (str | None) – (Optional) The template ID.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('organization_id', 'project_id', 'deidentify_template', 'template_id', 'gcp_conn_id',...
execute(context)

Executes the operator. This is the main method to override when deriving a new operator.

The context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.
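
The deidentify_template parameter accepts a plain dict in place of a DeidentifyTemplate object. The sketch below builds one such dict that replaces every finding with a fixed placeholder, using the field names of the DLP API's DeidentifyConfig message. The helper names, task ID, template ID, and placeholder text are assumptions made for illustration.

```python
from typing import Any, Dict


def replace_template(placeholder: str) -> Dict[str, Any]:
    """Build a DeidentifyTemplate dict that swaps findings for a placeholder."""
    return {
        "deidentify_config": {
            "info_type_transformations": {
                "transformations": [
                    {
                        # No info_types listed here, so the transformation
                        # is not restricted to particular infoTypes.
                        "primitive_transformation": {
                            "replace_config": {
                                "new_value": {"string_value": placeholder}
                            }
                        }
                    }
                ]
            }
        }
    }


def create_template_kwargs(template_id: str, placeholder: str) -> Dict[str, Any]:
    """Collect keyword arguments for CloudDLPCreateDeidentifyTemplateOperator."""
    return {
        "task_id": "create_deidentify_template",  # hypothetical task ID
        "template_id": template_id,
        "deidentify_template": replace_template(placeholder),
    }


# Hypothetical template ID and placeholder text.
example = create_template_kwargs("my-deid-template", "[REDACTED]")
```

The resulting dict can be unpacked into the operator constructor, for example CloudDLPCreateDeidentifyTemplateOperator(**example), inside a DAG file where the provider package is installed.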

class airflow.providers.google.cloud.operators.dlp.CloudDLPCreateDLPJobOperator(*, project_id=PROVIDE_PROJECT_ID, inspect_job=None, risk_job=None, job_id=None, retry=DEFAULT, timeout=None, metadata=(), wait_until_finished=True, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Creates a new job to inspect storage or calculate risk metrics.

See also

For more information on how to use this operator, take a look at the guide: Creating Job

Parameters
  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. If set to None or missing, the default project_id from the Google Cloud connection is used.

  • inspect_job (dict | google.cloud.dlp_v2.types.InspectJobConfig | None) – (Optional) The configuration for the inspect job.

  • risk_job (dict | google.cloud.dlp_v2.types.RiskAnalysisJobConfig | None) – (Optional) The configuration for the risk job.

  • job_id (str | None) – (Optional) The job ID.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • wait_until_finished (bool) – (Optional) If True, the operator keeps polling the job state until it is DONE.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('project_id', 'inspect_job', 'risk_job', 'job_id', 'gcp_conn_id', 'impersonation_chain')
execute(context)

Executes the operator. This is the main method to override when deriving a new operator.

The context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.
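
As a rough illustration of the inspect_job parameter, the sketch below builds an InspectJobConfig-shaped dict that scans objects in Cloud Storage for a list of infoTypes. The field names follow the DLP API's InspectJobConfig, StorageConfig, and InspectConfig messages. The bucket URL, task ID, and helper names are hypothetical.

```python
from typing import Any, Dict, List


def gcs_inspect_job(file_set_url: str, info_types: List[str]) -> Dict[str, Any]:
    """Build an InspectJobConfig dict for a Cloud Storage scan."""
    return {
        "storage_config": {
            "cloud_storage_options": {"file_set": {"url": file_set_url}}
        },
        "inspect_config": {
            "info_types": [{"name": name} for name in info_types]
        },
    }


def create_job_kwargs(
    inspect_job: Dict[str, Any], wait_until_finished: bool = True
) -> Dict[str, Any]:
    """Collect keyword arguments for CloudDLPCreateDLPJobOperator."""
    return {
        "task_id": "create_dlp_job",  # hypothetical task ID
        "inspect_job": inspect_job,
        # False returns right after the job is created instead of
        # polling until the job state is DONE.
        "wait_until_finished": wait_until_finished,
    }


# Hypothetical bucket; the infoType names are built-in DLP detectors.
example = create_job_kwargs(
    gcs_inspect_job("gs://my-bucket/*", ["EMAIL_ADDRESS", "PHONE_NUMBER"]),
    wait_until_finished=False,
)
```

A risk analysis job would pass risk_job instead of inspect_job; this sketch covers only the inspection case.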

class airflow.providers.google.cloud.operators.dlp.CloudDLPCreateInspectTemplateOperator(*, organization_id=None, project_id=PROVIDE_PROJECT_ID, inspect_template=None, template_id=None, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Creates an InspectTemplate to reuse frequently-used configurations for content, images, and storage.

See also

For more information on how to use this operator, take a look at the guide: Creating Template

Parameters
  • organization_id (str | None) – (Optional) The organization ID. Required if the parent resource is an organization.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.

  • inspect_template (google.cloud.dlp_v2.types.InspectTemplate | None) – (Optional) The InspectTemplate to create.

  • template_id (str | None) – (Optional) The template ID.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('organization_id', 'project_id', 'inspect_template', 'template_id', 'gcp_conn_id',...
execute(context)

Executes the operator. This is the main method to override when deriving a new operator.

The context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.dlp.CloudDLPCreateJobTriggerOperator(*, project_id=PROVIDE_PROJECT_ID, job_trigger=None, trigger_id=None, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Creates a job trigger to run DLP actions, such as scanning storage for sensitive info, on a set schedule.

See also

For more information on how to use this operator, take a look at the guide: Creating Job Trigger

Parameters
  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. If set to None or missing, the default project_id from the Google Cloud connection is used.

  • job_trigger (dict | google.cloud.dlp_v2.types.JobTrigger | None) – (Optional) The JobTrigger to create.

  • trigger_id (str | None) – (Optional) The JobTrigger ID.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('project_id', 'job_trigger', 'trigger_id', 'gcp_conn_id', 'impersonation_chain')
execute(context)

Executes the operator. This is the main method to override when deriving a new operator.

The context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.
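
The job_trigger parameter can likewise be given as a dict. The sketch below wraps an inspect job in a JobTrigger-shaped dict with a recurring schedule, using the field names of the DLP API's JobTrigger and Schedule messages. The helper names, trigger ID, task ID, and the inspect job contents are assumptions for illustration.

```python
from datetime import timedelta
from typing import Any, Dict


def scheduled_trigger(inspect_job: Dict[str, Any], period: timedelta) -> Dict[str, Any]:
    """Build a JobTrigger dict that reruns inspect_job once per period."""
    return {
        "inspect_job": inspect_job,
        "triggers": [
            {
                "schedule": {
                    # The schedule period is expressed as a duration in seconds.
                    "recurrence_period_duration": {
                        "seconds": int(period.total_seconds())
                    }
                }
            }
        ],
        "status": "HEALTHY",
    }


def create_trigger_kwargs(trigger_id: str, job_trigger: Dict[str, Any]) -> Dict[str, Any]:
    """Collect keyword arguments for CloudDLPCreateJobTriggerOperator."""
    return {
        "task_id": "create_job_trigger",  # hypothetical task ID
        "trigger_id": trigger_id,
        "job_trigger": job_trigger,
    }


# Hypothetical inspect job and trigger ID.
inspect_job = {"inspect_config": {"info_types": [{"name": "EMAIL_ADDRESS"}]}}
example = create_trigger_kwargs(
    "daily-scan", scheduled_trigger(inspect_job, timedelta(days=1))
)
```

Deriving the seconds from a timedelta keeps the schedule readable; passing the literal number of seconds would work equally well.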

class airflow.providers.google.cloud.operators.dlp.CloudDLPCreateStoredInfoTypeOperator(*, organization_id=None, project_id=PROVIDE_PROJECT_ID, config=None, stored_info_type_id=None, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Creates a pre-built stored infoType to be used for inspection.

See also

For more information on how to use this operator, take a look at the guide: Create Stored Info-Type

Parameters
  • organization_id (str | None) – (Optional) The organization ID. Required if the parent resource is an organization.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.

  • config (google.cloud.dlp_v2.types.StoredInfoTypeConfig | None) – (Optional) The config for the StoredInfoType.

  • stored_info_type_id (str | None) – (Optional) The StoredInfoType ID.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('organization_id', 'project_id', 'config', 'stored_info_type_id', 'gcp_conn_id', 'impersonation_chain')
execute(context)

Executes the operator. This is the main method to override when deriving a new operator.

The context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.dlp.CloudDLPDeidentifyContentOperator(*, project_id=PROVIDE_PROJECT_ID, deidentify_config=None, inspect_config=None, item=None, inspect_template_name=None, deidentify_template_name=None, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

De-identifies potentially sensitive info from a content item; limits input size and output size.

See also

For more information on how to use this operator, take a look at the guide: De-identify Content

Parameters
  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. If set to None or missing, the default project_id from the Google Cloud connection is used.

  • deidentify_config (dict | google.cloud.dlp_v2.types.DeidentifyConfig | None) – (Optional) Configuration for the de-identification of the content item. Items specified here will override the template referenced by the deidentify_template_name argument.

  • inspect_config (dict | google.cloud.dlp_v2.types.InspectConfig | None) – (Optional) Configuration for the inspector. Items specified here will override the template referenced by the inspect_template_name argument.

  • item (dict | google.cloud.dlp_v2.types.ContentItem | None) – (Optional) The item to de-identify. Will be treated as text.

  • inspect_template_name (str | None) – (Optional) Template to use. Any configuration specified directly in inspect_config overrides the corresponding settings in the template.

  • deidentify_template_name (str | None) – (Optional) Template to use. Any configuration specified directly in deidentify_config overrides the corresponding settings in the template.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('project_id', 'deidentify_config', 'inspect_config', 'item', 'inspect_template_name',...
execute(context)

Executes the operator. This is the main method to override when deriving a new operator.

The context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.
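
The sketch below shows one way the item, inspect_config, and deidentify_config parameters might fit together when no stored templates are referenced. The dict shapes follow the DLP API's ContentItem, InspectConfig, and DeidentifyConfig messages. The helper name, task ID, sample text, and placeholder are hypothetical.

```python
from typing import Any, Dict, List


def deidentify_kwargs(
    text: str, info_types: List[str], placeholder: str
) -> Dict[str, Any]:
    """Collect keyword arguments for CloudDLPDeidentifyContentOperator."""
    return {
        "task_id": "deidentify_content",  # hypothetical task ID
        # The content to de-identify, passed as a text value.
        "item": {"value": text},
        # Which infoTypes the inspector should look for.
        "inspect_config": {
            "info_types": [{"name": name} for name in info_types]
        },
        # How findings are transformed: replaced by a fixed string.
        "deidentify_config": {
            "info_type_transformations": {
                "transformations": [
                    {
                        "primitive_transformation": {
                            "replace_config": {
                                "new_value": {"string_value": placeholder}
                            }
                        }
                    }
                ]
            }
        },
    }


# Hypothetical sample text and placeholder.
example = deidentify_kwargs(
    "My phone number is (206) 555-0123",
    ["PHONE_NUMBER"],
    "[deidentified_number]",
)
```

If inspect_template_name or deidentify_template_name were also supplied, the values given directly in inspect_config and deidentify_config would take precedence over the templates, as described in the parameter list above.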

class airflow.providers.google.cloud.operators.dlp.CloudDLPDeleteDeidentifyTemplateOperator(*, template_id, organization_id=None, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Deletes a DeidentifyTemplate.

See also

For more information on how to use this operator, take a look at the guide: De-Identification Template

Parameters
  • template_id (str) – The ID of the deidentify template to be deleted.

  • organization_id (str | None) – (Optional) The organization ID. Required if the parent resource is an organization.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('template_id', 'organization_id', 'project_id', 'gcp_conn_id', 'impersonation_chain')
execute(context)

Executes the operator. This is the main method to override when deriving a new operator.

The context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.
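
The parameter descriptions above say to set organization_id when the parent resource is an organization and project_id only when it is a project. The sketch below encodes that rule as a client-side guard. Rejecting the case where both are set is a convention chosen for this example, not behavior confirmed for the operator, and the helper names, task ID, and IDs are hypothetical.

```python
from typing import Any, Dict, Optional


def parent_kwargs(
    organization_id: Optional[str] = None, project_id: Optional[str] = None
) -> Dict[str, str]:
    """Return the keyword argument that identifies the parent resource."""
    if organization_id is not None and project_id is not None:
        # Stricter than the operator itself: treat the combination as a mistake.
        raise ValueError("Set organization_id or project_id, not both.")
    if organization_id is not None:
        return {"organization_id": organization_id}
    if project_id is not None:
        return {"project_id": project_id}
    # Neither given: keep the operator's default project_id.
    return {}


def delete_template_kwargs(
    template_id: str,
    organization_id: Optional[str] = None,
    project_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for CloudDLPDeleteDeidentifyTemplateOperator."""
    return {
        "task_id": "delete_deidentify_template",  # hypothetical task ID
        "template_id": template_id,
        **parent_kwargs(organization_id, project_id),
    }


# Hypothetical template ID and organization ID.
example = delete_template_kwargs("my-deid-template", organization_id="123456789")
```

The same guard could be reused for the other operators on this page that accept both organization_id and project_id.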

class airflow.providers.google.cloud.operators.dlp.CloudDLPDeleteDLPJobOperator(*, dlp_job_id, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Deletes a long-running DlpJob.

This method indicates that the client is no longer interested in the DlpJob result. The job will be cancelled if possible.

See also

For more information on how to use this operator, take a look at the guide: Deleting Job

Parameters
  • dlp_job_id (str) – The ID of the DLP job resource to be deleted.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. If set to None or missing, the default project_id from the Google Cloud connection is used.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('dlp_job_id', 'project_id', 'gcp_conn_id', 'impersonation_chain')
execute(context)

Executes the operator. This is the main method to override when deriving a new operator.

The context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.dlp.CloudDLPDeleteInspectTemplateOperator(*, template_id, organization_id=None, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Deletes an InspectTemplate.

See also

For more information on how to use this operator, take a look at the guide: Deleting Template

Parameters
  • template_id (str) – The ID of the inspect template to be deleted.

  • organization_id (str | None) – (Optional) The organization ID. Required if the parent resource is an organization.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('template_id', 'organization_id', 'project_id', 'gcp_conn_id', 'impersonation_chain')
execute(context)

Executes the operator. This is the main method to override when deriving a new operator.

The context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.dlp.CloudDLPDeleteJobTriggerOperator(*, job_trigger_id, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Deletes a job trigger.

See also

For more information on how to use this operator, take a look at the guide: Content Method

Parameters
  • job_trigger_id (str) – The ID of the DLP job trigger to be deleted.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. If set to None or missing, the default project_id from the Google Cloud connection is used.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('job_trigger_id', 'project_id', 'gcp_conn_id', 'impersonation_chain')
execute(context)

Executes the operator. This is the main method to override when deriving a new operator.

The context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.dlp.CloudDLPDeleteStoredInfoTypeOperator(*, stored_info_type_id, organization_id=None, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Deletes a stored infoType.

See also

For more information on how to use this operator, take a look at the guide: Deleting Stored Info-Type

Parameters
  • stored_info_type_id (str) – The ID of the stored info type to be deleted.

  • organization_id (str | None) – (Optional) The organization ID. Required if the parent resource is an organization.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('stored_info_type_id', 'organization_id', 'project_id', 'gcp_conn_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
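The organization_id and project_id parameters above name the parent resource of the stored infoType. A minimal sketch of choosing between them follows. The helper parent_kwargs, the task_id, and all IDs are hypothetical and not part of the provider. The operator import is deferred so the helper can be exercised without Airflow installed.

```python
def parent_kwargs(organization_id=None, project_id=None):
    """Return the keyword arguments that name the parent resource.

    Hypothetical helper: prefers organization_id when it is given and
    otherwise falls back to project_id.
    """
    if organization_id:
        return {"organization_id": organization_id}
    if project_id:
        return {"project_id": project_id}
    # Neither given: pass nothing so the operator's own defaults apply.
    return {}


def build_delete_task(stored_info_type_id, organization_id=None, project_id=None):
    # Imported lazily; requires the apache-airflow-providers-google package.
    from airflow.providers.google.cloud.operators.dlp import (
        CloudDLPDeleteStoredInfoTypeOperator,
    )

    return CloudDLPDeleteStoredInfoTypeOperator(
        task_id="delete_stored_info_type",  # placeholder task name
        stored_info_type_id=stored_info_type_id,
        **parent_kwargs(organization_id, project_id),
    )
```

Passing only one of the two parent identifiers keeps the call aligned with the parameter descriptions, which ask for project_id only when the parent is a project rather than an organization.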

class airflow.providers.google.cloud.operators.dlp.CloudDLPGetDeidentifyTemplateOperator(*, template_id, organization_id=None, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Gets a DeidentifyTemplate.

See also

For more information on how to use this operator, take a look at the guide: De-Identification Template

Parameters
  • template_id (str) – The ID of the deidentify template to be read.

  • organization_id (str | None) – (Optional) The organization ID. Required if the parent resource is an organization.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('template_id', 'organization_id', 'project_id', 'gcp_conn_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
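The impersonation_chain parameter accepts either a single account or a sequence of accounts, and the last account in the sequence is the one impersonated in the request. The sketch below illustrates both forms. The service account names, template ID, and the helper final_identity are placeholders invented for illustration.

```python
# Placeholder service accounts; replace with identities from your own project.
SINGLE_ACCOUNT = "dlp-reader@example-project.iam.gserviceaccount.com"
CHAINED_ACCOUNTS = [
    "hop-one@example-project.iam.gserviceaccount.com",
    "dlp-reader@example-project.iam.gserviceaccount.com",
]


def final_identity(impersonation_chain):
    """Return the account that is ultimately impersonated, or None."""
    if impersonation_chain is None:
        return None
    if isinstance(impersonation_chain, str):
        return impersonation_chain
    # For a sequence, the last entry is the impersonated account.
    return impersonation_chain[-1]


def build_get_template_task(impersonation_chain=None):
    # Imported lazily; requires the apache-airflow-providers-google package.
    from airflow.providers.google.cloud.operators.dlp import (
        CloudDLPGetDeidentifyTemplateOperator,
    )

    return CloudDLPGetDeidentifyTemplateOperator(
        task_id="get_deidentify_template",  # placeholder task name
        template_id="my-deidentify-template",  # placeholder template ID
        impersonation_chain=impersonation_chain,
    )
```

In the chained form, each identity must grant the Service Account Token Creator role to the identity directly before it, as described in the parameter list.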

class airflow.providers.google.cloud.operators.dlp.CloudDLPGetDLPJobOperator(*, dlp_job_id, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Gets the latest state of a long-running DlpJob.

See also

For more information on how to use this operator, take a look at the guide: Retrieving Job

Parameters
  • dlp_job_id (str) – The ID of the DLP job resource to be read.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. If set to None or missing, the default project_id from the Google Cloud connection is used.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('dlp_job_id', 'project_id', 'gcp_conn_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
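Because dlp_job_id appears in template_fields, its value can be a Jinja expression that is rendered at run time. A minimal sketch, assuming the job ID is supplied through the task's params; the task_id, the params key, and the job ID are placeholders.

```python
GET_JOB_KWARGS = {
    "task_id": "get_dlp_job",  # placeholder task name
    # Rendered at run time because dlp_job_id is listed in template_fields.
    "dlp_job_id": "{{ params.dlp_job_id }}",
    # Seconds to wait for the request to complete.
    "timeout": 30.0,
}


def build_get_job_task():
    # Imported lazily; requires the apache-airflow-providers-google package.
    from airflow.providers.google.cloud.operators.dlp import (
        CloudDLPGetDLPJobOperator,
    )

    return CloudDLPGetDLPJobOperator(
        params={"dlp_job_id": "i-1234567890"},  # hypothetical job ID
        **GET_JOB_KWARGS,
    )
```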

class airflow.providers.google.cloud.operators.dlp.CloudDLPGetInspectTemplateOperator(*, template_id, organization_id=None, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Gets an InspectTemplate.

See also

For more information on how to use this operator, take a look at the guide: Retrieving Template

Parameters
  • template_id (str) – The ID of the inspect template to be read.

  • organization_id (str | None) – (Optional) The organization ID. Required if the parent resource is an organization.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('template_id', 'organization_id', 'project_id', 'gcp_conn_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.dlp.CloudDLPGetDLPJobTriggerOperator(*, job_trigger_id, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Gets a job trigger.

See also

For more information on how to use this operator, take a look at the guide: Retrieving Job Trigger

Parameters
  • job_trigger_id (str) – The ID of the DLP job trigger to be read.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. If set to None or missing, the default project_id from the Google Cloud connection is used.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('job_trigger_id', 'project_id', 'gcp_conn_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
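The retry and timeout parameters work together: when a retry object is given, the timeout applies to each individual attempt. The following sketch shows one way to pass both, assuming exponential backoff settings chosen purely for illustration. The task_id and the numeric values are placeholders.

```python
# Illustrative backoff settings, in seconds (multiplier is a plain factor).
RETRY_SETTINGS = {"initial": 1.0, "maximum": 10.0, "multiplier": 2.0}
PER_ATTEMPT_TIMEOUT = 20.0


def build_get_trigger_task(job_trigger_id):
    # Imported lazily; requires google-api-core and
    # the apache-airflow-providers-google package.
    from google.api_core.retry import Retry
    from airflow.providers.google.cloud.operators.dlp import (
        CloudDLPGetDLPJobTriggerOperator,
    )

    return CloudDLPGetDLPJobTriggerOperator(
        task_id="get_job_trigger",  # placeholder task name
        job_trigger_id=job_trigger_id,
        retry=Retry(**RETRY_SETTINGS),
        # With retry set, this bound applies to each attempt separately.
        timeout=PER_ATTEMPT_TIMEOUT,
    )
```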

class airflow.providers.google.cloud.operators.dlp.CloudDLPGetStoredInfoTypeOperator(*, stored_info_type_id, organization_id=None, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Gets a stored infoType.

See also

For more information on how to use this operator, take a look at the guide: Retrieve Stored Info-Type

Parameters
  • stored_info_type_id (str) – The ID of the stored info type to be read.

  • organization_id (str | None) – (Optional) The organization ID. Required if the parent resource is an organization.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('stored_info_type_id', 'organization_id', 'project_id', 'gcp_conn_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.dlp.CloudDLPInspectContentOperator(*, project_id=PROVIDE_PROJECT_ID, inspect_config=None, item=None, inspect_template_name=None, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Finds potentially sensitive info in content; limits input size, processing time, and output size.

See also

For more information on how to use this operator, take a look at the guide: Using Template

Parameters
  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. If set to None or missing, the default project_id from the Google Cloud connection is used.

  • inspect_config (dict | google.cloud.dlp_v2.types.InspectConfig | None) – (Optional) Configuration for the inspector. Items specified here will override the template referenced by the inspect_template_name argument.

  • item (dict | google.cloud.dlp_v2.types.ContentItem | None) – (Optional) The item to inspect. Will be treated as text.

  • inspect_template_name (str | None) – (Optional) Template to use. Any configuration directly specified in inspect_config will override the settings in the template.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('project_id', 'inspect_config', 'item', 'inspect_template_name', 'gcp_conn_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
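Both inspect_config and item accept plain dictionaries in place of the google.cloud.dlp_v2 types. A minimal sketch of an inspection task, assuming the dictionary keys mirror the InspectConfig and ContentItem fields; the sample text, the chosen infoTypes, and the task_id are illustrative only.

```python
# Dictionary form of InspectConfig: which infoTypes to look for.
INSPECT_CONFIG = {
    "info_types": [{"name": "EMAIL_ADDRESS"}, {"name": "PHONE_NUMBER"}],
    "include_quote": True,
}

# Dictionary form of ContentItem: the text to inspect.
ITEM = {"value": "Contact jane@example.com for details."}


def build_inspect_task():
    # Imported lazily; requires the apache-airflow-providers-google package.
    from airflow.providers.google.cloud.operators.dlp import (
        CloudDLPInspectContentOperator,
    )

    return CloudDLPInspectContentOperator(
        task_id="inspect_content",  # placeholder task name
        inspect_config=INSPECT_CONFIG,
        item=ITEM,
    )
```

Both arguments are listed in template_fields, so string values inside these dictionaries can also carry Jinja expressions.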

class airflow.providers.google.cloud.operators.dlp.CloudDLPListDeidentifyTemplatesOperator(*, organization_id=None, project_id=PROVIDE_PROJECT_ID, page_size=None, order_by=None, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Lists DeidentifyTemplates.

See also

For more information on how to use this operator, take a look at the guide: De-Identification Template

Parameters
  • organization_id (str | None) – (Optional) The organization ID. Required if the parent resource is an organization.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.

  • page_size (int | None) – (Optional) The maximum number of resources contained in the underlying API response.

  • order_by (str | None) – (Optional) Comma-separated list of fields to order by, each followed by an asc or desc postfix.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('organization_id', 'project_id', 'gcp_conn_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.dlp.CloudDLPListDLPJobsOperator(*, project_id=PROVIDE_PROJECT_ID, results_filter=None, page_size=None, job_type=None, order_by=None, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Lists DlpJobs that match the specified filter in the request.

See also

For more information on how to use this operator, take a look at the guide: Retrieving Job

Parameters
  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. If set to None or missing, the default project_id from the Google Cloud connection is used.

  • results_filter (str | None) – (Optional) Filter used to specify a subset of results.

  • page_size (int | None) – (Optional) The maximum number of resources contained in the underlying API response.

  • job_type (str | None) – (Optional) The type of job.

  • order_by (str | None) – (Optional) Comma-separated list of fields to order by, each followed by an asc or desc postfix.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('project_id', 'gcp_conn_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
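The listing parameters can be combined to narrow and order the result. A sketch under stated assumptions: the filter expression, job type, and sort field are illustrative values whose exact syntax is defined by the DLP API, so check them against the API reference before use. The task_id is a placeholder.

```python
LIST_JOBS_KWARGS = {
    "task_id": "list_dlp_jobs",  # placeholder task name
    # Illustrative filter; supported fields and values come from the DLP API.
    "results_filter": "state = DONE",
    # Upper bound on resources per underlying API response.
    "page_size": 50,
    # Assumed job type name; the parameter is a plain string.
    "job_type": "INSPECT_JOB",
    # Field name followed by an asc or desc postfix.
    "order_by": "create_time desc",
}


def build_list_jobs_task():
    # Imported lazily; requires the apache-airflow-providers-google package.
    from airflow.providers.google.cloud.operators.dlp import (
        CloudDLPListDLPJobsOperator,
    )

    return CloudDLPListDLPJobsOperator(**LIST_JOBS_KWARGS)
```

Note that template_fields for this operator lists only project_id, gcp_conn_id, and impersonation_chain, so results_filter and order_by are passed as literal strings rather than rendered as templates.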

class airflow.providers.google.cloud.operators.dlp.CloudDLPListInfoTypesOperator(*, project_id=PROVIDE_PROJECT_ID, language_code=None, results_filter=None, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Returns a list of the sensitive information types that the DLP API supports.

See also

For more information on how to use this operator, take a look at the guide: Retrieve Stored Info-Type

Parameters
  • language_code (str | None) – (Optional) BCP-47 language code for localized infoType friendly names. If omitted, or if localized strings are not available, en-US strings will be returned.

  • results_filter (str | None) – (Optional) Filter used to specify a subset of results.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('language_code', 'gcp_conn_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.dlp.CloudDLPListInspectTemplatesOperator(*, organization_id=None, project_id=PROVIDE_PROJECT_ID, page_size=None, order_by=None, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Lists InspectTemplates.

See also

For more information on how to use this operator, take a look at the guide: Retrieving Template

Parameters
  • organization_id (str | None) – (Optional) The organization ID. Required if the parent resource is an organization.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.

  • page_size (int | None) – (Optional) The maximum number of resources contained in the underlying API response.

  • order_by (str | None) – (Optional) Comma-separated list of fields to order by, each followed by an asc or desc postfix.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('organization_id', 'project_id', 'gcp_conn_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.dlp.CloudDLPListJobTriggersOperator(*, project_id=PROVIDE_PROJECT_ID, page_size=None, order_by=None, results_filter=None, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Lists job triggers.

See also

For more information on how to use this operator, take a look at the guide: Retrieving Job Trigger

Parameters
  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. If set to None or missing, the default project_id from the Google Cloud connection is used.

  • page_size (int | None) – (Optional) The maximum number of resources contained in the underlying API response.

  • order_by (str | None) – (Optional) Comma-separated list of fields to order by, each followed by an asc or desc postfix.

  • results_filter (str | None) – (Optional) Filter used to specify a subset of results.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('project_id', 'gcp_conn_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.dlp.CloudDLPListStoredInfoTypesOperator(*, organization_id=None, project_id=PROVIDE_PROJECT_ID, page_size=None, order_by=None, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Lists stored infoTypes.

See also

For more information on how to use this operator, take a look at the guide: Retrieve Stored Info-Type

Parameters
  • organization_id (str | None) – (Optional) The organization ID. Required if the parent resource is an organization.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.

  • page_size (int | None) – (Optional) The maximum number of resources contained in the underlying API response.

  • order_by (str | None) – (Optional) Comma-separated list of fields to order by, each followed by an asc or desc postfix.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('organization_id', 'project_id', 'gcp_conn_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.dlp.CloudDLPRedactImageOperator(*, project_id=PROVIDE_PROJECT_ID, inspect_config=None, image_redaction_configs=None, include_findings=None, byte_item=None, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Redacts potentially sensitive info from an image; limits input size, processing time, and output size.

See also

For more information on how to use this operator, take a look at the guide: Reference

Parameters
  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. If set to None or missing, the default project_id from the Google Cloud connection is used.

  • inspect_config (dict | google.cloud.dlp_v2.types.InspectConfig | None) – (Optional) Configuration for the inspector.

  • image_redaction_configs (None | list[dict] | list[google.cloud.dlp_v2.types.RedactImageRequest.ImageRedactionConfig]) – (Optional) The configuration for specifying what content to redact from images.

  • include_findings (bool | None) – (Optional) Whether the response should include findings along with the redacted image.

  • byte_item (dict | google.cloud.dlp_v2.types.ByteContentItem | None) – (Optional) The image content to redact. The content must be PNG, JPEG, SVG or BMP.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('project_id', 'inspect_config', 'image_redaction_configs', 'include_findings', 'byte_item',...[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
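
As a rough illustration of how the parameters above fit together, here is a minimal sketch that assembles keyword arguments for CloudDLPRedactImageOperator. The helper names, the task_id, and the chosen infoTypes are assumptions made for this example. The dict keys follow the field names of the google.cloud.dlp_v2 types as I understand them (in particular "type_" on ByteContentItem); verify them against the installed client library version.

```python
from __future__ import annotations


def build_redact_image_kwargs(image_bytes: bytes, project_id: str) -> dict:
    """Assemble keyword arguments for CloudDLPRedactImageOperator (illustrative values)."""
    info_types = [{"name": "EMAIL_ADDRESS"}, {"name": "PHONE_NUMBER"}]
    return {
        "task_id": "redact_image",  # hypothetical task name
        "project_id": project_id,
        # What to look for in the image.
        "inspect_config": {"info_types": info_types},
        # What to black out: one redaction config per infoType.
        "image_redaction_configs": [{"info_type": it} for it in info_types],
        # Ask for the findings in addition to the redacted image.
        "include_findings": True,
        # Assumption: "type_" and "IMAGE_PNG" match ByteContentItem in the
        # installed google-cloud-dlp release.
        "byte_item": {"type_": "IMAGE_PNG", "data": image_bytes},
    }


def make_redact_image_task(image_bytes: bytes, project_id: str):
    """Create the operator; call this inside a DAG definition."""
    # Imported lazily so the helper above works without Airflow installed.
    from airflow.providers.google.cloud.operators.dlp import (
        CloudDLPRedactImageOperator,
    )

    return CloudDLPRedactImageOperator(
        **build_redact_image_kwargs(image_bytes, project_id)
    )
```

Keeping the payload in a plain function makes it possible to check the configuration in a unit test before any DLP request is made.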

class airflow.providers.google.cloud.operators.dlp.CloudDLPReidentifyContentOperator(*, project_id=PROVIDE_PROJECT_ID, reidentify_config=None, inspect_config=None, item=None, inspect_template_name=None, reidentify_template_name=None, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Re-identifies content that has been de-identified.

See also

For more information on how to use this operator, take a look at the guide: Re-identify Content

Parameters
  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. If set to None or missing, the default project_id from the Google Cloud connection is used.

  • reidentify_config (dict | google.cloud.dlp_v2.types.DeidentifyConfig | None) – (Optional) Configuration for the re-identification of the content item.

  • inspect_config (dict | google.cloud.dlp_v2.types.InspectConfig | None) – (Optional) Configuration for the inspector.

  • item (dict | google.cloud.dlp_v2.types.ContentItem | None) – (Optional) The item to re-identify. Will be treated as text.

  • inspect_template_name (str | None) – (Optional) Template to use. Any configuration directly specified in inspect_config will override those set in the template.

  • reidentify_template_name (str | None) – (Optional) Template to use. References an instance of DeidentifyTemplate. Any configuration directly specified in reidentify_config or inspect_config will override those set in the template.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('project_id', 'reidentify_config', 'inspect_config', 'item', 'inspect_template_name',...[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
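
The sketch below shows one plausible shape for the reidentify_config, inspect_config, and item arguments, assuming the content was de-identified earlier with a deterministic crypto transformation and a surrogate infoType. The helper names, the surrogate name PHONE_TOKEN, the task_id, and the use of an unwrapped key are assumptions for illustration; a real pipeline would more likely reference a KMS-wrapped key or a stored template.

```python
from __future__ import annotations


def build_reidentify_kwargs(
    tokenized_text: str,
    key_bytes: bytes,
    project_id: str,
    surrogate_name: str = "PHONE_TOKEN",  # hypothetical surrogate infoType name
) -> dict:
    """Assemble keyword arguments for CloudDLPReidentifyContentOperator."""
    surrogate = {"name": surrogate_name}
    return {
        "task_id": "reidentify_content",  # hypothetical task name
        "project_id": project_id,
        # Reverse the transformation that produced the surrogate tokens.
        "reidentify_config": {
            "info_type_transformations": {
                "transformations": [
                    {
                        "info_types": [surrogate],
                        "primitive_transformation": {
                            "crypto_deterministic_config": {
                                # Assumption: same key as used for de-identification.
                                "crypto_key": {"unwrapped": {"key": key_bytes}},
                                "surrogate_info_type": surrogate,
                            }
                        },
                    }
                ]
            }
        },
        # Tell the inspector to detect the surrogate tokens in the text.
        "inspect_config": {
            "custom_info_types": [{"info_type": surrogate, "surrogate_type": {}}]
        },
        # The item is treated as text.
        "item": {"value": tokenized_text},
    }


def make_reidentify_task(tokenized_text: str, key_bytes: bytes, project_id: str):
    """Create the operator; call this inside a DAG definition."""
    # Imported lazily so the helper above works without Airflow installed.
    from airflow.providers.google.cloud.operators.dlp import (
        CloudDLPReidentifyContentOperator,
    )

    return CloudDLPReidentifyContentOperator(
        **build_reidentify_kwargs(tokenized_text, key_bytes, project_id)
    )
```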

class airflow.providers.google.cloud.operators.dlp.CloudDLPUpdateDeidentifyTemplateOperator(*, template_id, organization_id=None, project_id=PROVIDE_PROJECT_ID, deidentify_template=None, update_mask=None, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Updates the DeidentifyTemplate.

See also

For more information on how to use this operator, take a look at the guide: De-Identification Template

Parameters
  • template_id (str) – The ID of the deidentify template to be updated.

  • organization_id (str | None) – (Optional) The organization ID. Required if the parent resource is an organization.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.

  • deidentify_template (dict | google.cloud.dlp_v2.types.DeidentifyTemplate | None) – New DeidentifyTemplate value.

  • update_mask (dict | google.protobuf.field_mask_pb2.FieldMask | None) – Mask to control which fields get updated.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('template_id', 'organization_id', 'project_id', 'deidentify_template', 'update_mask',...[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
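
To illustrate how deidentify_template and update_mask relate, the following minimal sketch derives the mask from the fields being changed, so the two arguments cannot drift apart. The helper names and the task_id are hypothetical, and the sketch assumes that update_mask accepts the dict form of a FieldMask with a "paths" list of top-level field names.

```python
from __future__ import annotations


def field_mask_for(changes: dict) -> dict:
    """Build a FieldMask-style dict naming the top-level fields in `changes`."""
    return {"paths": sorted(changes)}


def build_update_deidentify_template_kwargs(
    template_id: str, project_id: str, changes: dict
) -> dict:
    """Assemble keyword arguments for CloudDLPUpdateDeidentifyTemplateOperator."""
    return {
        "task_id": "update_deidentify_template",  # hypothetical task name
        "template_id": template_id,
        "project_id": project_id,
        # Only the fields listed in the mask are meant to be updated.
        "deidentify_template": changes,
        "update_mask": field_mask_for(changes),
    }


def make_update_deidentify_template_task(
    template_id: str, project_id: str, changes: dict
):
    """Create the operator; call this inside a DAG definition."""
    # Imported lazily so the helpers above work without Airflow installed.
    from airflow.providers.google.cloud.operators.dlp import (
        CloudDLPUpdateDeidentifyTemplateOperator,
    )

    return CloudDLPUpdateDeidentifyTemplateOperator(
        **build_update_deidentify_template_kwargs(template_id, project_id, changes)
    )
```

Passing changes such as {"display_name": "Masked emails", "description": "v2"} yields a mask whose paths are "description" and "display_name".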

class airflow.providers.google.cloud.operators.dlp.CloudDLPUpdateInspectTemplateOperator(*, template_id, organization_id=None, project_id=PROVIDE_PROJECT_ID, inspect_template=None, update_mask=None, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Updates the InspectTemplate.

See also

For more information on how to use this operator, take a look at the guide: Updating Template

Parameters
  • template_id (str) – The ID of the inspect template to be updated.

  • organization_id (str | None) – (Optional) The organization ID. Required if the parent resource is an organization.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.

  • inspect_template (dict | google.cloud.dlp_v2.types.InspectTemplate | None) – New InspectTemplate value.

  • update_mask (dict | google.protobuf.field_mask_pb2.FieldMask | None) – Mask to control which fields get updated.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('template_id', 'organization_id', 'project_id', 'inspect_template', 'update_mask',...[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
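
Because organization_id and project_id select different parent resources, it can help to make the choice explicit in DAG code. The sketch below is one way to do that. The helper names and task_id are hypothetical, and the "exactly one parent" check is a convention of this example rather than something the operator is documented to enforce.

```python
from __future__ import annotations

from collections.abc import Sequence


def build_update_inspect_template_kwargs(
    template_id: str,
    inspect_template: dict,
    mask_paths: Sequence[str],
    *,
    organization_id: str | None = None,
    project_id: str | None = None,
) -> dict:
    """Assemble keyword arguments for CloudDLPUpdateInspectTemplateOperator."""
    # Convention of this sketch: the caller names exactly one parent resource.
    if (organization_id is None) == (project_id is None):
        raise ValueError("Set exactly one of organization_id or project_id")

    kwargs = {
        "task_id": "update_inspect_template",  # hypothetical task name
        "template_id": template_id,
        "inspect_template": inspect_template,
        # Dict form of a FieldMask; dotted paths address nested fields.
        "update_mask": {"paths": list(mask_paths)},
    }
    if organization_id is not None:
        kwargs["organization_id"] = organization_id
    else:
        kwargs["project_id"] = project_id
    return kwargs


def make_update_inspect_template_task(**options):
    """Create the operator; call this inside a DAG definition."""
    # Imported lazily so the helper above works without Airflow installed.
    from airflow.providers.google.cloud.operators.dlp import (
        CloudDLPUpdateInspectTemplateOperator,
    )

    return CloudDLPUpdateInspectTemplateOperator(
        **build_update_inspect_template_kwargs(**options)
    )
```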

class airflow.providers.google.cloud.operators.dlp.CloudDLPUpdateJobTriggerOperator(*, job_trigger_id, project_id=PROVIDE_PROJECT_ID, job_trigger=None, update_mask=None, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Updates a job trigger.

See also

For more information on how to use this operator, take a look at the guide: Updating Job Trigger

Parameters
  • job_trigger_id (str) – The ID of the DLP job trigger to be updated.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. If set to None or missing, the default project_id from the Google Cloud connection is used.

  • job_trigger (dict | google.cloud.dlp_v2.types.JobTrigger | None) – New JobTrigger value.

  • update_mask (dict | google.protobuf.field_mask_pb2.FieldMask | None) – Mask to control which fields get updated.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('job_trigger_id', 'project_id', 'job_trigger', 'update_mask', 'gcp_conn_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
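
A common use of this operator is changing how often a trigger fires. The following minimal sketch builds a job_trigger payload that replaces the schedule and a matching update_mask. The helper names and task_id are hypothetical, and the payload shape (a "triggers" list whose "schedule" holds a "recurrence_period_duration" in seconds) reflects the JobTrigger type as I understand it; check it against the installed google-cloud-dlp version.

```python
from __future__ import annotations

SECONDS_PER_DAY = 24 * 60 * 60


def schedule_every(days: int) -> dict:
    """Return a trigger entry that fires once every `days` days."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "schedule": {"recurrence_period_duration": {"seconds": days * SECONDS_PER_DAY}}
    }


def build_update_job_trigger_kwargs(
    job_trigger_id: str, project_id: str, days: int
) -> dict:
    """Assemble keyword arguments for CloudDLPUpdateJobTriggerOperator."""
    return {
        "task_id": "update_job_trigger",  # hypothetical task name
        "job_trigger_id": job_trigger_id,
        "project_id": project_id,
        # New value for the part of the trigger that changes.
        "job_trigger": {"triggers": [schedule_every(days)]},
        # Restrict the update to the schedule list.
        "update_mask": {"paths": ["triggers"]},
    }


def make_update_job_trigger_task(job_trigger_id: str, project_id: str, days: int):
    """Create the operator; call this inside a DAG definition."""
    # Imported lazily so the helpers above work without Airflow installed.
    from airflow.providers.google.cloud.operators.dlp import (
        CloudDLPUpdateJobTriggerOperator,
    )

    return CloudDLPUpdateJobTriggerOperator(
        **build_update_job_trigger_kwargs(job_trigger_id, project_id, days)
    )
```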

class airflow.providers.google.cloud.operators.dlp.CloudDLPUpdateStoredInfoTypeOperator(*, stored_info_type_id, organization_id=None, project_id=PROVIDE_PROJECT_ID, config=None, update_mask=None, retry=DEFAULT, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Updates the stored infoType by creating a new version.

See also

For more information on how to use this operator, take a look at the guide: Update Stored Info-Type

Parameters
  • stored_info_type_id (str) – The ID of the stored info type to be updated.

  • organization_id (str | None) – (Optional) The organization ID. Required if the parent resource is an organization.

  • project_id (str) – (Optional) Google Cloud project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.

  • config (dict | google.cloud.dlp_v2.types.StoredInfoTypeConfig | None) – Updated configuration for the storedInfoType. If not provided, a new version of the storedInfoType will be created with the existing configuration.

  • update_mask (dict | google.protobuf.field_mask_pb2.FieldMask | None) – Mask to control which fields get updated.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('stored_info_type_id', 'organization_id', 'project_id', 'config', 'update_mask', 'gcp_conn_id',...[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
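
The config parameter is optional: per the description above, leaving it out creates a new version of the stored infoType with the existing configuration. The sketch below makes that distinction explicit by only passing config and update_mask when there is something to change. The helper names and task_id are hypothetical, and deriving the mask from the top-level keys of config is an assumption of this example.

```python
from __future__ import annotations


def build_update_stored_info_type_kwargs(
    stored_info_type_id: str,
    project_id: str,
    config: dict | None = None,
) -> dict:
    """Assemble keyword arguments for CloudDLPUpdateStoredInfoTypeOperator."""
    kwargs = {
        "task_id": "update_stored_info_type",  # hypothetical task name
        "stored_info_type_id": stored_info_type_id,
        "project_id": project_id,
    }
    if config is not None:
        # Changing the configuration: send the new values plus a mask
        # naming the top-level fields being replaced.
        kwargs["config"] = config
        kwargs["update_mask"] = {"paths": sorted(config)}
    # With no config, the operator defaults (config=None, update_mask=None)
    # apply, which re-versions the stored infoType with its existing config.
    return kwargs


def make_update_stored_info_type_task(
    stored_info_type_id: str, project_id: str, config: dict | None = None
):
    """Create the operator; call this inside a DAG definition."""
    # Imported lazily so the helper above works without Airflow installed.
    from airflow.providers.google.cloud.operators.dlp import (
        CloudDLPUpdateStoredInfoTypeOperator,
    )

    return CloudDLPUpdateStoredInfoTypeOperator(
        **build_update_stored_info_type_kwargs(stored_info_type_id, project_id, config)
    )
```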
