airflow.providers.google.cloud.hooks.automl
This module contains a Google AutoML hook.
Module Contents
Classes
CloudAutoMLHook – Google Cloud AutoML hook.
- class airflow.providers.google.cloud.hooks.automl.CloudAutoMLHook(gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
Bases:
airflow.providers.google.common.hooks.base_google.GoogleBaseHook
Google Cloud AutoML hook.
All the methods in the hook where project_id is used must be called with keyword arguments rather than positional.
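One way such a keyword-only rule can be enforced is with a wrapper that rejects positional arguments and fills in a missing project_id. The sketch below is a simplified stand-in written for illustration, not the provider's code; `require_keywords`, `FakeHook`, and `default_project_id` are hypothetical names.

```python
import functools


def require_keywords(func):
    """Reject positional arguments and fill in a missing project_id."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if args:
            raise TypeError(f"{func.__name__}() must be called with keyword arguments")
        # Fall back to the hook-level default when project_id is absent or None.
        kwargs["project_id"] = kwargs.get("project_id") or self.default_project_id
        return func(self, **kwargs)

    return wrapper


class FakeHook:
    """Hypothetical stand-in for a hook; it performs no network calls."""

    default_project_id = "my-default-project"

    @require_keywords
    def get_model(self, model_id, location, project_id=None):
        return f"projects/{project_id}/locations/{location}/models/{model_id}"


hook = FakeHook()
print(hook.get_model(model_id="TBL123", location="us-central1"))
```

Calling `hook.get_model("TBL123", "us-central1")` with positional arguments raises `TypeError` in this sketch, which mirrors the calling convention required above.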
- get_conn()
Retrieve connection to AutoML.
- Returns
Google Cloud AutoML client object.
- Return type
google.cloud.automl_v1beta1.AutoMlClient
- prediction_client()
Create a PredictionServiceClient.
- Returns
Google Cloud AutoML PredictionServiceClient client object.
- Return type
google.cloud.automl_v1beta1.PredictionServiceClient
- create_model(model, location, project_id=PROVIDE_PROJECT_ID, timeout=None, metadata=(), retry=DEFAULT)
Create a model and return a Model in the response field when it completes.
When you create a model, several model evaluations are created for it: a global evaluation, and one evaluation for each annotation spec.
- Parameters
model (dict | google.cloud.automl_v1beta1.Model) – The model to create. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.Model.
project_id (str) – ID of the Google Cloud project where the model will be created. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns
google.cloud.automl_v1beta1.types._OperationFuture instance
- Return type
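A minimal sketch of building the `model` dict and waiting for training to finish. It assumes `hook` is a `CloudAutoMLHook` instance, that the dict keys follow the v1beta1 `Model` message for AutoML Tables, and that the returned operation exposes `result()` as google.api_core long-running operations do. `build_tables_model` and `train_model` are hypothetical helper names.

```python
def build_tables_model(display_name, dataset_id, budget_milli_node_hours=1000):
    """Return a dict shaped like the v1beta1 Model message for AutoML Tables."""
    return {
        "display_name": display_name,
        "dataset_id": dataset_id,
        "tables_model_metadata": {
            "train_budget_milli_node_hours": budget_milli_node_hours,
        },
    }


def train_model(hook, model, location, project_id=None, wait_seconds=None):
    """Start training and block until the long-running operation finishes."""
    # Keyword arguments only, as the hook requires.
    operation = hook.create_model(model=model, location=location, project_id=project_id)
    # result() blocks until the operation completes and returns the created Model.
    return operation.result(timeout=wait_seconds)
```

Passing `wait_seconds=None` waits indefinitely; training can take hours, so a caller may prefer to keep the operation object and poll it instead.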
- batch_predict(model_id, input_config, output_config, location, project_id=PROVIDE_PROJECT_ID, params=None, retry=DEFAULT, timeout=None, metadata=())
Perform a batch prediction and return a long-running operation object.
Unlike the online Predict, the batch prediction result is not immediately available in the response. Instead, a long-running operation object is returned.
- Parameters
model_id (str) – Name of the model requested to serve the batch prediction.
input_config (dict | google.cloud.automl_v1beta1.BatchPredictInputConfig) – Required. The input configuration for batch prediction. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.BatchPredictInputConfig.
output_config (dict | google.cloud.automl_v1beta1.BatchPredictOutputConfig) – Required. The configuration specifying where output predictions should be written. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.BatchPredictOutputConfig.
params (dict[str, str] | None) – Additional domain-specific parameters for the predictions. Each string must be at most 25000 characters long.
project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns
google.cloud.automl_v1beta1.types._OperationFuture instance
- Return type
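The two configuration dicts can be sketched as follows, assuming Cloud Storage is used for both input and output. The keys follow the `BatchPredictInputConfig` and `BatchPredictOutputConfig` messages. `build_batch_predict_configs` and `start_batch_prediction` are hypothetical helpers, and `hook` is assumed to be a `CloudAutoMLHook` instance.

```python
def build_batch_predict_configs(input_uris, output_uri_prefix):
    """Return (input_config, output_config) dicts using Cloud Storage locations."""
    for uri in [*input_uris, output_uri_prefix]:
        if not uri.startswith("gs://"):
            raise ValueError(f"expected a gs:// URI, got {uri!r}")
    input_config = {"gcs_source": {"input_uris": list(input_uris)}}
    output_config = {"gcs_destination": {"output_uri_prefix": output_uri_prefix}}
    return input_config, output_config


def start_batch_prediction(hook, model_id, input_uris, output_uri_prefix, location, project_id=None):
    """Submit the batch job and return the long-running operation object."""
    input_config, output_config = build_batch_predict_configs(input_uris, output_uri_prefix)
    # The call returns an operation; the predictions are written to the
    # output location only once that operation has completed.
    return hook.batch_predict(
        model_id=model_id,
        input_config=input_config,
        output_config=output_config,
        location=location,
        project_id=project_id,
    )
```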
- predict(model_id, payload, location, project_id=PROVIDE_PROJECT_ID, params=None, retry=DEFAULT, timeout=None, metadata=())
Perform an online prediction and return the prediction result in the response.
- Parameters
model_id (str) – Name of the model requested to serve the prediction.
payload (dict | google.cloud.automl_v1beta1.ExamplePayload) – Required. Payload to perform a prediction on. The payload must match the problem type that the model was trained to solve. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.ExamplePayload.
params (dict[str, str] | None) – Additional domain-specific parameters. Each string must be at most 25000 characters long.
project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns
google.cloud.automl_v1beta1.types.PredictResponse instance
- Return type
google.cloud.automl_v1beta1.PredictResponse
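A sketch of preparing the `payload` and `params` arguments for a text model. The `text_snippet` keys follow the `ExamplePayload` message. `text_payload`, `validate_params`, and `MAX_PARAM_LENGTH` are hypothetical names, and the client-side length check is an illustration of the documented limit, not something the hook is known to perform.

```python
MAX_PARAM_LENGTH = 25000  # limit quoted in the params description above


def validate_params(params):
    """Raise ValueError if any key or value exceeds the documented length limit."""
    for key, value in (params or {}).items():
        if len(key) > MAX_PARAM_LENGTH or len(value) > MAX_PARAM_LENGTH:
            raise ValueError(f"params entry {key[:20]!r} exceeds {MAX_PARAM_LENGTH} characters")


def text_payload(content, mime_type="text/plain"):
    """Return an ExamplePayload-shaped dict for a text model."""
    return {"text_snippet": {"content": content, "mime_type": mime_type}}


# Example values only; the right payload depends on the model's problem type.
payload = text_payload("The delivery arrived two days late.")
params = {"score_threshold": "0.5"}
validate_params(params)
```

With a `CloudAutoMLHook` instance named `hook`, these would be passed as `hook.predict(model_id=..., payload=payload, params=params, location=...)`.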
- create_dataset(dataset, location, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=())
Create a dataset.
- Parameters
dataset (dict | google.cloud.automl_v1beta1.Dataset) – The dataset to create. If a dict is provided, it must be of the same form as the protobuf message Dataset.
project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns
google.cloud.automl_v1beta1.types.Dataset instance.
- Return type
google.cloud.automl_v1beta1.Dataset
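A sketch of a `dataset` dict for a translation dataset, plus a helper that pulls the dataset ID out of the resource name of the returned `Dataset`. It assumes names of the form `projects/<project>/locations/<location>/datasets/<id>`. `translation_dataset` and `dataset_id_from_name` are hypothetical helpers.

```python
def translation_dataset(display_name, source_lang, target_lang):
    """Return a dict shaped like the v1beta1 Dataset message for translation."""
    return {
        "display_name": display_name,
        "translation_dataset_metadata": {
            "source_language_code": source_lang,
            "target_language_code": target_lang,
        },
    }


def dataset_id_from_name(resource_name):
    """Extract the trailing ID from 'projects/*/locations/*/datasets/<id>'."""
    return resource_name.rpartition("/")[-1]


print(dataset_id_from_name("projects/my-project/locations/us-central1/datasets/TRL123"))
# prints "TRL123"
```

The extracted ID is the value that the other methods on this page accept as `dataset_id`.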
- import_data(dataset_id, location, input_config, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=())
Import data into a dataset. For Tables this method can only be called on an empty Dataset.
- Parameters
dataset_id (str) – Name of the AutoML dataset.
input_config (dict | google.cloud.automl_v1beta1.InputConfig) – The desired input location and its domain-specific semantics, if any. If a dict is provided, it must be of the same form as the protobuf message InputConfig.
project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns
google.cloud.automl_v1beta1.types._OperationFuture instance
- Return type
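The `input_config` dict can point at Cloud Storage files or at a BigQuery table. The sketch below picks the form from the URI scheme; the keys follow the `InputConfig` message, and `build_input_config` is a hypothetical helper.

```python
def build_input_config(uris):
    """Return an InputConfig-shaped dict chosen from the URI scheme."""
    uris = list(uris)
    if not uris:
        raise ValueError("at least one URI is required")
    if all(u.startswith("gs://") for u in uris):
        # One or more Cloud Storage files.
        return {"gcs_source": {"input_uris": uris}}
    if len(uris) == 1 and uris[0].startswith("bq://"):
        # A single BigQuery table.
        return {"bigquery_source": {"input_uri": uris[0]}}
    raise ValueError("expected gs:// URIs or exactly one bq:// URI")
```

With a `CloudAutoMLHook` instance named `hook`, the result would be passed as `hook.import_data(dataset_id=..., location=..., input_config=build_input_config([...]))`.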
- list_column_specs(dataset_id, table_spec_id, location, project_id=PROVIDE_PROJECT_ID, field_mask=None, filter_=None, page_size=None, retry=DEFAULT, timeout=None, metadata=())
List column specs in a table spec.
- Parameters
dataset_id (str) – Name of the AutoML dataset.
table_spec_id (str) – ID of the table spec, used to build the resource path.
field_mask (dict | google.protobuf.field_mask_pb2.FieldMask | None) – Mask specifying which fields to read. If a dict is provided, it must be of the same form as the protobuf message google.protobuf.field_mask_pb2.FieldMask.
filter_ – Filter expression, see go/filtering.
page_size (int | None) – The maximum number of resources contained in the underlying API response. If page streaming is performed per resource, this parameter does not affect the return value. If page streaming is performed per-page, this determines the maximum number of resources in a page.
project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns
An iterable of google.cloud.automl_v1beta1.types.ColumnSpec instances.
- Return type
google.cloud.automl_v1beta1.services.auto_ml.pagers.ListColumnSpecsPager
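A common use of the returned pager is looking up a column spec ID by its display name. The sketch below assumes each item exposes `name` and `display_name`, as `ColumnSpec` does. `find_column_spec_id` is a hypothetical helper, and the demo objects stand in for real `ColumnSpec` instances.

```python
from types import SimpleNamespace


def find_column_spec_id(column_specs, display_name):
    """Return the ID of the first column spec with the given display name, or None."""
    for spec in column_specs:
        if spec.display_name == display_name:
            # The ID is the last segment of the resource name.
            return spec.name.rpartition("/")[-1]
    return None


# Stand-ins for ColumnSpec objects; a real call would iterate the pager instead.
prefix = "projects/p/locations/l/datasets/d/tableSpecs/t/columnSpecs/"
demo_specs = [
    SimpleNamespace(name=prefix + "1", display_name="age"),
    SimpleNamespace(name=prefix + "2", display_name="label"),
]
print(find_column_spec_id(demo_specs, "label"))  # prints "2"
```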
- get_model(model_id, location, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=())
Get an AutoML model.
- Parameters
model_id (str) – Name of the model.
project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns
google.cloud.automl_v1beta1.types.Model instance.
- Return type
google.cloud.automl_v1beta1.Model
- delete_model(model_id, location, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=())
Delete an AutoML model.
- Parameters
model_id (str) – Name of the model.
project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns
google.cloud.automl_v1beta1.types._OperationFuture instance.
- Return type
- update_dataset(dataset, update_mask=None, retry=DEFAULT, timeout=None, metadata=())
Update a dataset.
- Parameters
dataset (dict | google.cloud.automl_v1beta1.Dataset) – The dataset which replaces the resource on the server. If a dict is provided, it must be of the same form as the protobuf message Dataset.
update_mask (dict | google.protobuf.field_mask_pb2.FieldMask | None) – The update mask applies to the resource. If a dict is provided, it must be of the same form as the protobuf message FieldMask.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns
google.cloud.automl_v1beta1.types.Dataset instance.
- Return type
google.cloud.automl_v1beta1.Dataset
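A sketch of pairing the `dataset` dict with an `update_mask` so that only the listed fields are changed. It assumes the dict form of `FieldMask` is `{"paths": [...]}` and handles top-level fields only. `build_dataset_update` is a hypothetical helper.

```python
def build_dataset_update(dataset_name, **changes):
    """Return (dataset, update_mask) dicts that touch only the given fields."""
    if not changes:
        raise ValueError("at least one field to update is required")
    # The full resource name identifies the dataset, which is why
    # update_dataset takes no project_id or location.
    dataset = {"name": dataset_name, **changes}
    update_mask = {"paths": sorted(changes)}
    return dataset, update_mask


dataset, update_mask = build_dataset_update(
    "projects/my-project/locations/us-central1/datasets/TBL123",
    display_name="renamed_dataset",
)
print(update_mask)  # prints {'paths': ['display_name']}
```

With a `CloudAutoMLHook` instance named `hook`, the pair would be passed as `hook.update_dataset(dataset=dataset, update_mask=update_mask)`.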
- deploy_model(model_id, location, project_id=PROVIDE_PROJECT_ID, image_detection_metadata=None, retry=DEFAULT, timeout=None, metadata=())
Deploy a model.
If a model is already deployed, deploying it with the same parameters has no effect. Deploying with different parameters (for example, changing node_number) resets the deployment state without pausing the model’s availability.
Only applicable for Text Classification, Image Object Detection and Tables; all other domains manage deployment automatically.
- Parameters
model_id (str) – Name of the model requested to serve the prediction.
image_detection_metadata (google.cloud.automl_v1beta1.ImageObjectDetectionModelDeploymentMetadata | dict | None) – Model deployment metadata specific to Image Object Detection. If a dict is provided, it must be of the same form as the protobuf message ImageObjectDetectionModelDeploymentMetadata.
project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns
google.cloud.automl_v1beta1.types._OperationFuture instance.
- Return type
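A sketch of assembling keyword arguments for a deployment, with the optional Image Object Detection metadata. It assumes the node setting is carried in a `node_count` field of `ImageObjectDetectionModelDeploymentMetadata`. `deploy_kwargs` is a hypothetical helper.

```python
def deploy_kwargs(model_id, location, node_count=None, project_id=None):
    """Return keyword arguments for deploy_model, adding metadata only when needed."""
    kwargs = {"model_id": model_id, "location": location, "project_id": project_id}
    if node_count is not None:
        if node_count < 1:
            raise ValueError("node_count must be at least 1")
        # Only meaningful for Image Object Detection models.
        kwargs["image_detection_metadata"] = {"node_count": node_count}
    return kwargs
```

With a `CloudAutoMLHook` instance named `hook`, the call would be `hook.deploy_model(**deploy_kwargs("IOD123", "us-central1", node_count=2))`, where the model ID is an example value.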
- list_table_specs(dataset_id, location, project_id=PROVIDE_PROJECT_ID, filter_=None, page_size=None, retry=DEFAULT, timeout=None, metadata=())
List table specs in a dataset.
- Parameters
dataset_id (str) – Name of the dataset.
filter_ – Filter expression, see go/filtering.
page_size (int | None) – The maximum number of resources contained in the underlying API response. If page streaming is performed per resource, this parameter does not affect the return value. If page streaming is performed per-page, this determines the maximum number of resources in a page.
project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns
A google.gax.PageIterator instance. By default, this is an iterable of google.cloud.automl_v1beta1.types.TableSpec instances. This object can also be configured to iterate over the pages of the response through the options parameter.
- Return type
google.cloud.automl_v1beta1.services.auto_ml.pagers.ListTableSpecsPager
- list_datasets(location, project_id, retry=DEFAULT, timeout=None, metadata=())
List datasets in a project.
- Parameters
project_id (str) – ID of the Google Cloud project where the datasets are located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns
A google.gax.PageIterator instance. By default, this is an iterable of google.cloud.automl_v1beta1.types.Dataset instances. This object can also be configured to iterate over the pages of the response through the options parameter.
- Return type
google.cloud.automl_v1beta1.services.auto_ml.pagers.ListDatasetsPager
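When only the first few results are needed, the pager does not have to be drained. The sketch below assumes the pager fetches further pages lazily during iteration, as google.api_core pagers generally do. `first_n` is a hypothetical helper, and the generator stands in for a real pager.

```python
from itertools import islice


def first_n(pager, n):
    """Return at most n items without consuming the rest of the iterable."""
    return list(islice(pager, n))


def fake_pager(total, consumed):
    """Stand-in for a pager that records how many items were actually produced."""
    for i in range(total):
        consumed.append(i)
        yield f"dataset-{i}"


consumed = []
print(first_n(fake_pager(100, consumed), 3))
# prints ['dataset-0', 'dataset-1', 'dataset-2']
print(len(consumed))  # prints 3
```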
- delete_dataset(dataset_id, location, project_id, retry=DEFAULT, timeout=None, metadata=())
Delete a dataset and all of its contents.
- Parameters
dataset_id (str) – ID of dataset to be deleted.
project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns
google.cloud.automl_v1beta1.types._OperationFuture instance
- Return type
- get_dataset(dataset_id, location, project_id, retry=DEFAULT, timeout=None, metadata=())
Retrieve the dataset for the given dataset_id.
- Parameters
dataset_id (str) – ID of dataset to be retrieved.
location (str) – The location of the project.
project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns
google.cloud.automl_v1beta1.types.dataset.Dataset instance.
- Return type
google.cloud.automl_v1beta1.Dataset