airflow.providers.google.cloud.operators.automl

This module contains Google AutoML operators.

Module Contents

Classes

AutoMLTrainModelOperator

Creates a Google Cloud AutoML model.

AutoMLPredictOperator

Runs a prediction operation on Google Cloud AutoML.

AutoMLBatchPredictOperator

Performs a batch prediction on Google Cloud AutoML.

AutoMLCreateDatasetOperator

Creates a Google Cloud AutoML dataset.

AutoMLImportDataOperator

Imports data to a Google Cloud AutoML dataset.

AutoMLTablesListColumnSpecsOperator

Lists column specs in a table.

AutoMLTablesUpdateDatasetOperator

Updates a dataset.

AutoMLGetModelOperator

Gets a Google Cloud AutoML model.

AutoMLDeleteModelOperator

Deletes a Google Cloud AutoML model.

AutoMLDeployModelOperator

Deploys a model; if a model is already deployed, deploying it with the same parameters has no effect.

AutoMLTablesListTableSpecsOperator

Lists table specs in a dataset.

AutoMLListDatasetOperator

Lists AutoML datasets in a project.

AutoMLDeleteDatasetOperator

Deletes a dataset and all of its contents.

Attributes

MetaData

airflow.providers.google.cloud.operators.automl.MetaData[source]
class airflow.providers.google.cloud.operators.automl.AutoMLTrainModelOperator(*, model, location, project_id=PROVIDE_PROJECT_ID, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Creates a Google Cloud AutoML model.

See also

For more information on how to use this operator, take a look at the guide: Operations On Models

Parameters
  • model (dict) – Model definition.

  • project_id (str) – ID of the Google Cloud project where the model will be created. If None, the default project_id is used.

  • location (str) – The location of the project.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (MetaData) – Additional metadata that is provided to the method.

  • gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('model', 'location', 'project_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
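
A minimal sketch of how this operator might be declared, assuming a translation model. The project ID, location, dataset ID, and task_id are placeholders, and the keys of the model dict (display_name, dataset_id, translation_model_metadata) are an assumption based on the AutoML Model message rather than something stated on this page.

```python
# Hypothetical values; replace them with your own project, location and dataset.
GCP_PROJECT_ID = "my-project"
GCP_LOCATION = "us-central1"
DATASET_ID = "TRL0000000000000000000"

# Assumed shape of the AutoML Model message for a translation model.
MODEL = {
    "display_name": "example_translation_model",
    "dataset_id": DATASET_ID,
    "translation_model_metadata": {},
}


def build_train_model_task():
    """Return the operator; Airflow is imported lazily so this file loads without it."""
    from airflow.providers.google.cloud.operators.automl import AutoMLTrainModelOperator

    return AutoMLTrainModelOperator(
        task_id="train_model",  # forwarded to BaseOperator through **kwargs
        model=MODEL,
        location=GCP_LOCATION,
        project_id=GCP_PROJECT_ID,
    )
```

Because model is listed in template_fields, string values inside the dict may also contain Jinja expressions.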

class airflow.providers.google.cloud.operators.automl.AutoMLPredictOperator(*, model_id=None, endpoint_id=None, location, payload, operation_params=None, instances=None, project_id=PROVIDE_PROJECT_ID, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Runs a prediction operation on Google Cloud AutoML.

Warning

AutoMLPredictOperator for text, image, and video prediction has been deprecated. Please use endpoint_id param instead of model_id param.

See also

For more information on how to use this operator, take a look at the guide: Making Predictions

Parameters
  • model_id (str | None) – Name of the model requested to serve the prediction.

  • endpoint_id (str | None) – Name of the endpoint used for the prediction.

  • payload (dict) – The payload to run the prediction on.

  • project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • operation_params (dict[str, str] | None) – Additional domain-specific parameters for the predictions.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (MetaData) – Additional metadata that is provided to the method.

  • gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('model_id', 'location', 'project_id', 'impersonation_chain')[source]
hook()[source]
model()[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
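
The impersonation_chain rule in the parameter list above can be read as a list of IAM grants. The helper below is a sketch written for this page, not part of Airflow; its name and the example account names are hypothetical. It returns (grantor, grantee) pairs, where each grantor must give the grantee the Service Account Token Creator role.

```python
from __future__ import annotations

from collections.abc import Sequence


def required_token_creator_grants(
    originating_account: str,
    impersonation_chain: str | Sequence[str] | None,
) -> list[tuple[str, str]]:
    """List the (grantor, grantee) pairs implied by an impersonation chain."""
    if impersonation_chain is None:
        return []
    # A single string behaves like a one-element chain.
    if isinstance(impersonation_chain, str):
        chain = [impersonation_chain]
    else:
        chain = list(impersonation_chain)

    grants = []
    previous = originating_account
    for account in chain:
        # Each identity grants the role to the identity directly before it.
        grants.append((account, previous))
        previous = account
    return grants


# The last account in the chain is the one impersonated in the request.
grants = required_token_creator_grants(
    "airflow@example-project.iam.gserviceaccount.com",
    [
        "intermediate@example-project.iam.gserviceaccount.com",
        "target@example-project.iam.gserviceaccount.com",
    ],
)
```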

class airflow.providers.google.cloud.operators.automl.AutoMLBatchPredictOperator(*, model_id, input_config, output_config, location, project_id=PROVIDE_PROJECT_ID, prediction_params=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Performs a batch prediction on Google Cloud AutoML.

See also

For more information on how to use this operator, take a look at the guide: Making Predictions

Parameters
  • project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • model_id (str) – Name of the model requested to serve the batch prediction.

  • input_config (dict) – Required. The input configuration for batch prediction. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.BatchPredictInputConfig.

  • output_config (dict) – Required. The configuration specifying where output predictions should be written. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.BatchPredictOutputConfig.

  • prediction_params (dict[str, str] | None) – Additional domain-specific parameters for the predictions. Each string may be at most 25000 characters long.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (MetaData) – Additional metadata that is provided to the method.

  • gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('model_id', 'input_config', 'output_config', 'location', 'project_id', 'impersonation_chain')[source]
hook()[source]
model()[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
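
As a rough illustration, input_config and output_config can be plain dicts mirroring BatchPredictInputConfig and BatchPredictOutputConfig. The bucket, model ID, project, and task_id below are placeholders, and the choice of Cloud Storage as both source and destination is an assumption for the example.

```python
# Hypothetical values; replace them with your own resources.
GCP_PROJECT_ID = "my-project"
GCP_LOCATION = "us-central1"
MODEL_ID = "TBL0000000000000000000"
BUCKET = "gs://my-bucket"

# Dict form of BatchPredictInputConfig: read rows from a file in Cloud Storage.
INPUT_CONFIG = {"gcs_source": {"input_uris": [f"{BUCKET}/batch/input.csv"]}}

# Dict form of BatchPredictOutputConfig: write results under a Cloud Storage prefix.
OUTPUT_CONFIG = {"gcs_destination": {"output_uri_prefix": f"{BUCKET}/batch/output/"}}


def build_batch_predict_task():
    """Return the operator; Airflow is imported lazily so this file loads without it."""
    from airflow.providers.google.cloud.operators.automl import AutoMLBatchPredictOperator

    return AutoMLBatchPredictOperator(
        task_id="batch_predict",
        model_id=MODEL_ID,
        input_config=INPUT_CONFIG,
        output_config=OUTPUT_CONFIG,
        location=GCP_LOCATION,
        project_id=GCP_PROJECT_ID,
    )
```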

class airflow.providers.google.cloud.operators.automl.AutoMLCreateDatasetOperator(*, dataset, location, project_id=PROVIDE_PROJECT_ID, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Creates a Google Cloud AutoML dataset.

Warning

AutoMLCreateDatasetOperator for tables, video intelligence, vision and natural language has been deprecated and is no longer available. Please use airflow.providers.google.cloud.operators.vertex_ai.dataset.CreateDatasetOperator or airflow.providers.google.cloud.operators.translate.TranslateCreateDatasetOperator instead.

See also

For more information on how to use this operator, take a look at the guide: Creating Datasets

Parameters
  • dataset (dict) – The dataset to create. If a dict is provided, it must be of the same form as the protobuf message Dataset.

  • project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • params – Additional domain-specific parameters for the predictions.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (MetaData) – Additional metadata that is provided to the method.

  • gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('dataset', 'location', 'project_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
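
A hedged sketch of a dataset definition, assuming a translation dataset. The display name, language codes, project, location, and task_id are placeholders, and the translation_dataset_metadata key is taken from the AutoML Dataset message as an assumption.

```python
# Hypothetical values; replace them with your own project and location.
GCP_PROJECT_ID = "my-project"
GCP_LOCATION = "us-central1"

# Assumed dict form of the AutoML Dataset message for a translation dataset.
DATASET = {
    "display_name": "example_translation_dataset",
    "translation_dataset_metadata": {
        "source_language_code": "en",
        "target_language_code": "es",
    },
}


def build_create_dataset_task():
    """Return the operator; Airflow is imported lazily so this file loads without it."""
    from airflow.providers.google.cloud.operators.automl import AutoMLCreateDatasetOperator

    return AutoMLCreateDatasetOperator(
        task_id="create_dataset",
        dataset=DATASET,
        location=GCP_LOCATION,
        project_id=GCP_PROJECT_ID,
    )
```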

class airflow.providers.google.cloud.operators.automl.AutoMLImportDataOperator(*, dataset_id, location, input_config, project_id=PROVIDE_PROJECT_ID, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Imports data to a Google Cloud AutoML dataset.

Warning

AutoMLImportDataOperator for tables, video intelligence, vision and natural language has been deprecated and is no longer available. Please use airflow.providers.google.cloud.operators.vertex_ai.dataset.ImportDataOperator instead.

See also

For more information on how to use this operator, take a look at the guide: Creating Datasets

Parameters
  • dataset_id (str) – ID of the dataset to be updated.

  • input_config (dict) – The desired input location and its domain-specific semantics, if any. If a dict is provided, it must be of the same form as the protobuf message InputConfig.

  • project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • params – Additional domain-specific parameters for the predictions.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (MetaData) – Additional metadata that is provided to the method.

  • gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('dataset_id', 'input_config', 'location', 'project_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
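
For illustration only, an input_config pointing at a file in Cloud Storage might look like the following. The bucket path, dataset ID, project, and task_id are placeholders, and the gcs_source layout is an assumption based on the InputConfig message.

```python
# Hypothetical values; replace them with your own resources.
GCP_PROJECT_ID = "my-project"
GCP_LOCATION = "us-central1"
DATASET_ID = "TRL0000000000000000000"

# Assumed dict form of InputConfig: one or more Cloud Storage URIs to import.
IMPORT_INPUT_CONFIG = {
    "gcs_source": {"input_uris": ["gs://my-bucket/automl/translation.csv"]}
}


def build_import_data_task():
    """Return the operator; Airflow is imported lazily so this file loads without it."""
    from airflow.providers.google.cloud.operators.automl import AutoMLImportDataOperator

    return AutoMLImportDataOperator(
        task_id="import_data",
        dataset_id=DATASET_ID,
        input_config=IMPORT_INPUT_CONFIG,
        location=GCP_LOCATION,
        project_id=GCP_PROJECT_ID,
    )
```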

class airflow.providers.google.cloud.operators.automl.AutoMLTablesListColumnSpecsOperator(*, dataset_id, table_spec_id, location, field_mask=None, filter_=None, page_size=None, project_id=PROVIDE_PROJECT_ID, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Lists column specs in a table.

Warning

Operator AutoMLTablesListColumnSpecsOperator has been deprecated due to shutdown of a legacy version of AutoML Tables on March 31, 2024. For additional information see: https://cloud.google.com/automl-tables/docs/deprecations.

See also

For more information on how to use this operator, take a look at the guide: Listing Table And Columns Specs

Parameters
  • dataset_id (str) – Name of the dataset.

  • table_spec_id (str) – table_spec_id for path builder.

  • field_mask (dict | None) – Mask specifying which fields to read. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.FieldMask

  • filter_ – Filter expression, see go/filtering.

  • page_size (int | None) – The maximum number of resources contained in the underlying API response. If page streaming is performed per resource, this parameter does not affect the return value. If page streaming is performed per page, this determines the maximum number of resources in a page.

  • project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (MetaData) – Additional metadata that is provided to the method.

  • gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('dataset_id', 'table_spec_id', 'field_mask', 'filter_', 'location', 'project_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.automl.AutoMLTablesUpdateDatasetOperator(*, dataset, location, update_mask=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Updates a dataset.

Warning

Operator AutoMLTablesUpdateDatasetOperator has been deprecated due to shutdown of a legacy version of AutoML Tables on March 31, 2024. For additional information see: https://cloud.google.com/automl-tables/docs/deprecations. Please use airflow.providers.google.cloud.operators.vertex_ai.dataset.UpdateDatasetOperator instead.

See also

For more information on how to use this operator, take a look at the guide: Creating Datasets

Parameters
  • dataset (dict) – The dataset which replaces the resource on the server. If a dict is provided, it must be of the same form as the protobuf message Dataset.

  • update_mask (dict | None) – The update mask applies to the resource. If a dict is provided, it must be of the same form as the protobuf message FieldMask.

  • location (str) – The location of the project.

  • params – Additional domain-specific parameters for the predictions.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (MetaData) – Additional metadata that is provided to the method.

  • gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('dataset', 'update_mask', 'location', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
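
A sketch of how dataset and update_mask relate, assuming only the display name is being changed. The resource name, new display name, and task_id are placeholders; the paths key follows the protobuf FieldMask message. Note that this operator is deprecated, so the example is illustrative rather than a recommendation.

```python
# Hypothetical identifiers; replace them with your own.
GCP_PROJECT_ID = "my-project"
GCP_LOCATION = "us-central1"
DATASET_ID = "TBL0000000000000000000"

# The dataset dict identifies the resource by name and carries the new values.
UPDATED_DATASET = {
    "name": f"projects/{GCP_PROJECT_ID}/locations/{GCP_LOCATION}/datasets/{DATASET_ID}",
    "display_name": "renamed_dataset",
}

# Dict form of FieldMask: only the listed fields are meant to be overwritten.
UPDATE_MASK = {"paths": ["display_name"]}


def build_update_dataset_task():
    """Return the operator; Airflow is imported lazily so this file loads without it."""
    from airflow.providers.google.cloud.operators.automl import (
        AutoMLTablesUpdateDatasetOperator,
    )

    return AutoMLTablesUpdateDatasetOperator(
        task_id="update_dataset",
        dataset=UPDATED_DATASET,
        update_mask=UPDATE_MASK,
        location=GCP_LOCATION,
    )
```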

class airflow.providers.google.cloud.operators.automl.AutoMLGetModelOperator(*, model_id, location, project_id=PROVIDE_PROJECT_ID, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Gets a Google Cloud AutoML model.

Warning

AutoMLGetModelOperator for tables, video intelligence, vision and natural language has been deprecated and is no longer available. Please use airflow.providers.google.cloud.operators.vertex_ai.model_service.GetModelOperator instead.

See also

For more information on how to use this operator, take a look at the guide: Operations On Models

Parameters
  • model_id (str) – Name of the model to get.

  • project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • params – Additional domain-specific parameters for the predictions.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (MetaData) – Additional metadata that is provided to the method.

  • gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('model_id', 'location', 'project_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.automl.AutoMLDeleteModelOperator(*, model_id, location, project_id=PROVIDE_PROJECT_ID, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Deletes a Google Cloud AutoML model.

Warning

AutoMLDeleteModelOperator for tables, video intelligence, vision and natural language has been deprecated and is no longer available. Please use airflow.providers.google.cloud.operators.vertex_ai.model_service.DeleteModelOperator instead.

See also

For more information on how to use this operator, take a look at the guide: Operations On Models

Parameters
  • model_id (str) – Name of the model to delete.

  • project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • params – Additional domain-specific parameters for the predictions.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (MetaData) – Additional metadata that is provided to the method.

  • gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('model_id', 'location', 'project_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.automl.AutoMLDeployModelOperator(*, model_id, location, project_id=PROVIDE_PROJECT_ID, image_detection_metadata=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Deploys a model; if a model is already deployed, deploying it with the same parameters has no effect.

Deploying with different parameters (for example, changing node_number) will reset the deployment state without pausing the model_id’s availability.

Only applicable for Text Classification, Image Object Detection and Tables; all other domains manage deployment automatically.

Warning

Operator AutoMLDeployModelOperator has been deprecated due to shutdown of a legacy version of AutoML Natural Language, Vision, Video Intelligence on March 31, 2024. For additional information see: https://cloud.google.com/vision/automl/docs/deprecations . Please use airflow.providers.google.cloud.operators.vertex_ai.endpoint_service.DeployModelOperator instead.

See also

For more information on how to use this operator, take a look at the guide: Operations On Models

Parameters
  • model_id (str) – Name of the model to be deployed.

  • image_detection_metadata (dict | None) – Model deployment metadata specific to Image Object Detection. If a dict is provided, it must be of the same form as the protobuf message ImageObjectDetectionModelDeploymentMetadata

  • project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • params – Additional domain-specific parameters for the predictions.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.

  • gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('model_id', 'location', 'project_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.automl.AutoMLTablesListTableSpecsOperator(*, dataset_id, location, page_size=None, filter_=None, project_id=PROVIDE_PROJECT_ID, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Lists table specs in a dataset.

Warning

Operator AutoMLTablesListTableSpecsOperator has been deprecated due to shutdown of a legacy version of AutoML Tables on March 31, 2024. For additional information see: https://cloud.google.com/automl-tables/docs/deprecations.

See also

For more information on how to use this operator, take a look at the guide: Listing Table And Columns Specs

Parameters
  • dataset_id (str) – Name of the dataset.

  • filter_ – Filter expression, see go/filtering.

  • page_size (int | None) – The maximum number of resources contained in the underlying API response. If page streaming is performed per resource, this parameter does not affect the return value. If page streaming is performed per page, this determines the maximum number of resources in a page.

  • project_id (str) – ID of the Google Cloud project. If None, the default project_id is used.

  • location (str) – The location of the project.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (MetaData) – Additional metadata that is provided to the method.

  • gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('dataset_id', 'filter_', 'location', 'project_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.automl.AutoMLListDatasetOperator(*, location, project_id=PROVIDE_PROJECT_ID, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Lists AutoML datasets in a project.

Warning

AutoMLListDatasetOperator for tables, video intelligence, vision and natural language has been deprecated and is no longer available. Please use airflow.providers.google.cloud.operators.vertex_ai.dataset.ListDatasetsOperator instead.

See also

For more information on how to use this operator, take a look at the guide: Listing And Deleting Datasets

Parameters
  • project_id (str) – ID of the Google Cloud project where the datasets are located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (MetaData) – Additional metadata that is provided to the method.

  • gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or a chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, each identity in the list must grant the Service Account Token Creator IAM role to the directly preceding identity, with the first account in the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('location', 'project_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
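
The grant ordering described for impersonation_chain can be hard to follow in prose. The following is a minimal sketch, not part of the provider, that lists which account has to grant the Service Account Token Creator role to which identity. The function name required_token_creator_grants and the example account names are hypothetical and used only for illustration.

```python
from __future__ import annotations

from collections.abc import Sequence

# IAM role referred to in the parameter description above.
TOKEN_CREATOR_ROLE = "roles/iam.serviceAccountTokenCreator"


def required_token_creator_grants(
    originating_account: str,
    impersonation_chain: str | Sequence[str] | None,
) -> list[tuple[str, str]]:
    """Return (granting_account, receiving_identity) pairs.

    Hypothetical helper: each pair means "granting_account must give
    TOKEN_CREATOR_ROLE to receiving_identity".
    """
    if impersonation_chain is None:
        return []
    # A plain string is treated as a chain with a single account.
    if isinstance(impersonation_chain, str):
        chain = [impersonation_chain]
    else:
        chain = list(impersonation_chain)

    grants: list[tuple[str, str]] = []
    previous = originating_account
    for account in chain:
        # Each account grants the role to the identity directly before it.
        grants.append((account, previous))
        previous = account
    return grants


if __name__ == "__main__":
    # Example account names are made up for illustration.
    for granter, receiver in required_token_creator_grants(
        "airflow@example.iam", ["step-1@example.iam", "target@example.iam"]
    ):
        print(f"{granter} grants {TOKEN_CREATOR_ROLE} to {receiver}")
```

In this sketch the last account in the chain is the one impersonated in the request, which matches the description of the parameter.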

class airflow.providers.google.cloud.operators.automl.AutoMLDeleteDatasetOperator(*, dataset_id, location, project_id=PROVIDE_PROJECT_ID, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Deletes a dataset and all of its contents.

AutoMLDeleteDatasetOperator for tables, video intelligence, vision and natural language has been deprecated and is no longer available. Please use airflow.providers.google.cloud.operators.vertex_ai.dataset.DeleteDatasetOperator instead.

See also

For more information on how to use this operator, take a look at the guide: Listing And Deleting Datasets

Parameters
  • dataset_id (str | list[str]) – A single dataset ID, a list of dataset IDs, or a comma-separated string of dataset IDs to delete.

  • project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (MetaData) – Additional metadata that is provided to the method.

  • gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.

  • impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or a chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, each identity in the list must grant the Service Account Token Creator IAM role to the directly preceding identity, with the first account in the list granting this role to the originating account (templated).

template_fields: collections.abc.Sequence[str] = ('dataset_id', 'location', 'project_id', 'impersonation_chain')[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
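
Because dataset_id accepts three input shapes (a single ID, a list of IDs, or a comma-separated string), it can help to see how they relate. The following is a minimal sketch of one way to normalize them into a list. It is not the operator's own parsing logic; the function name normalize_dataset_ids, the whitespace stripping, and the example IDs are assumptions made for illustration.

```python
from __future__ import annotations


def normalize_dataset_ids(dataset_id: str | list[str]) -> list[str]:
    """Turn any of the accepted dataset_id shapes into a list of IDs.

    Hypothetical helper: the real operator may parse its input differently.
    """
    if isinstance(dataset_id, str):
        # A string may hold one ID or several separated by commas.
        parts = dataset_id.split(",")
    else:
        parts = list(dataset_id)
    # Assumption: surrounding whitespace and empty entries are not meaningful.
    return [part.strip() for part in parts if part.strip()]


if __name__ == "__main__":
    # Example IDs are made up for illustration.
    print(normalize_dataset_ids("TBL1"))
    print(normalize_dataset_ids("TBL1, TBL2,TBL3"))
    print(normalize_dataset_ids(["TBL1", "TBL2"]))
```

Since dataset_id is listed in template_fields, a rendered Jinja template is likely to arrive as a string, which is one reason a comma-separated form is convenient.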
