airflow.providers.google.cloud.hooks.automl
This module contains a Google AutoML hook.
Classes
CloudAutoMLHook | Google Cloud AutoML hook.
Module Contents
- class airflow.providers.google.cloud.hooks.automl.CloudAutoMLHook(gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
Bases: airflow.providers.google.common.hooks.base_google.GoogleBaseHook
Google Cloud AutoML hook.
All the methods in the hook where project_id is used must be called with keyword arguments rather than positional.
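As a minimal sketch of the keyword-argument rule above, the helper below forwards every argument by name. The `hook` parameter is assumed to be a `CloudAutoMLHook` instance; the helper name `fetch_model` is hypothetical and not part of the provider.

```python
from typing import Any, Optional


def fetch_model(hook: Any, model_id: str, location: str, project_id: Optional[str] = None) -> Any:
    """Call ``hook.get_model`` using keyword arguments only."""
    kwargs = {"model_id": model_id, "location": location}
    # Pass project_id only when the caller supplied one, so that the hook
    # can otherwise fall back to its default project_id.
    if project_id is not None:
        kwargs["project_id"] = project_id
    return hook.get_model(**kwargs)
```

With a real hook this might be called as `fetch_model(CloudAutoMLHook(), model_id="...", location="us-central1")`, where the location value is only an illustration.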
- get_conn()
Retrieve connection to AutoML.
- Returns:
Google Cloud AutoML client object.
- Return type:
google.cloud.automl_v1beta1.AutoMlClient
- property prediction_client: google.cloud.automl_v1beta1.PredictionServiceClient
Create a PredictionServiceClient.
- Returns:
Google Cloud AutoML PredictionServiceClient client object.
- Return type:
google.cloud.automl_v1beta1.PredictionServiceClient
- create_model(model, location, project_id=PROVIDE_PROJECT_ID, timeout=None, metadata=(), retry=DEFAULT)
Create a model and return a Model in the response field when the operation completes.
When you create a model, several model evaluations are created for it: a global evaluation, and one evaluation for each annotation spec.
- Parameters:
model (dict | google.cloud.automl_v1beta1.Model) – The model to create. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.Model
project_id (str) – ID of the Google Cloud project where the model will be created. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns:
google.cloud.automl_v1beta1.types._OperationFuture instance
- Return type:
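The following sketch shows one way to pass a dict-shaped model to `create_model` and wait for the result. It assumes an AutoML Tables model, that `hook` is a `CloudAutoMLHook` instance, and that the returned operation exposes `result()` as `google.api_core` operation futures do. The helper names are hypothetical.

```python
from typing import Any, Dict


def build_tables_model(display_name: str, dataset_id: str, budget_milli_node_hours: int) -> Dict[str, Any]:
    """Return a dict shaped like the Model protobuf message for AutoML Tables."""
    return {
        "display_name": display_name,
        "dataset_id": dataset_id,
        "tables_model_metadata": {"train_budget_milli_node_hours": budget_milli_node_hours},
    }


def create_model_and_wait(hook: Any, model: Dict[str, Any], location: str) -> Any:
    """Start model creation and block until the long-running operation finishes."""
    operation = hook.create_model(model=model, location=location)
    # Assumption: result() blocks until the operation completes and returns
    # the created Model.
    return operation.result()
```

Training can run for a long time, so inside a task it may be preferable to pass a timeout to `result()` rather than block indefinitely.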
- batch_predict(model_id, input_config, output_config, location, project_id=PROVIDE_PROJECT_ID, params=None, retry=DEFAULT, timeout=None, metadata=())
Perform a batch prediction and return a long-running operation object.
Unlike the online Predict, the batch prediction result is not immediately available in the response. Instead, a long-running operation object is returned.
- Parameters:
model_id (str) – Name of the model requested to serve the batch prediction.
input_config (dict | google.cloud.automl_v1beta1.BatchPredictInputConfig) – Required. The input configuration for batch prediction. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.BatchPredictInputConfig
output_config (dict | google.cloud.automl_v1beta1.BatchPredictOutputConfig) – Required. The configuration specifying where output predictions should be written. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.BatchPredictOutputConfig
params (dict[str, str] | None) – Additional domain-specific parameters for the predictions. Each string may be at most 25000 characters long.
project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns:
google.cloud.automl_v1beta1.types._OperationFuture instance
- Return type:
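As an illustration of the dict forms accepted for `input_config` and `output_config`, the sketch below reads from and writes to Cloud Storage. The `gcs_source` and `gcs_destination` keys follow the BatchPredictInputConfig and BatchPredictOutputConfig messages. The helper names are hypothetical, and `hook` is assumed to be a `CloudAutoMLHook` instance.

```python
from typing import Any, Dict, Iterable, Tuple


def gcs_batch_predict_configs(
    input_uris: Iterable[str], output_uri_prefix: str
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Build dict-shaped input and output configs that use Cloud Storage."""
    input_config = {"gcs_source": {"input_uris": list(input_uris)}}
    output_config = {"gcs_destination": {"output_uri_prefix": output_uri_prefix}}
    return input_config, output_config


def start_batch_predict(
    hook: Any, model_id: str, location: str, input_uris: Iterable[str], output_uri_prefix: str
) -> Any:
    """Submit a batch prediction and return the long-running operation."""
    input_config, output_config = gcs_batch_predict_configs(input_uris, output_uri_prefix)
    return hook.batch_predict(
        model_id=model_id,
        input_config=input_config,
        output_config=output_config,
        location=location,
    )
```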
- predict(model_id, payload, location, project_id=PROVIDE_PROJECT_ID, params=None, retry=DEFAULT, timeout=None, metadata=())
Perform an online prediction and return the prediction result in the response.
- Parameters:
model_id (str) – Name of the model requested to serve the prediction.
payload (dict | google.cloud.automl_v1beta1.ExamplePayload) – Required. Payload to perform a prediction on. The payload must match the problem type that the model was trained to solve. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.ExamplePayload
params (dict[str, str] | None) – Additional domain-specific parameters. Each string may be at most 25000 characters long.
project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns:
google.cloud.automl_v1beta1.types.PredictResponse instance
- Return type:
google.cloud.automl_v1beta1.PredictResponse
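A small sketch of the payload shape and of reading the response, assuming a text classification model. The `text_snippet` key follows the ExamplePayload message, and the response is assumed to carry classification annotations in its `payload` field. Both helper names are hypothetical.

```python
from typing import Any, Dict, Optional, Tuple


def text_payload(content: str) -> Dict[str, Any]:
    """Build a dict-shaped ExamplePayload for a plain-text snippet."""
    return {"text_snippet": {"content": content, "mime_type": "text/plain"}}


def top_label(response: Any) -> Optional[Tuple[str, float]]:
    """Return (display_name, score) of the highest-scoring annotation, or None."""
    annotations = list(response.payload)
    if not annotations:
        return None
    # Assumption: each annotation carries a classification score, which holds
    # for classification models but not for every problem type.
    best = max(annotations, key=lambda a: a.classification.score)
    return best.display_name, best.classification.score
```

A call such as `hook.predict(model_id=..., payload=text_payload("some text"), location=...)` would then produce the response passed to `top_label`.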
- create_dataset(dataset, location, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=())
Create a dataset.
- Parameters:
dataset (dict | google.cloud.automl_v1beta1.Dataset) – The dataset to create. If a dict is provided, it must be of the same form as the protobuf message Dataset.
project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns:
google.cloud.automl_v1beta1.types.Dataset instance.
- Return type:
google.cloud.automl_v1beta1.Dataset
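The sketch below creates an AutoML Tables dataset from a dict and derives the dataset ID from the returned resource name. It assumes the name has the form `projects/{project}/locations/{location}/datasets/{dataset_id}` and that `hook` is a `CloudAutoMLHook` instance. The helper names are hypothetical.

```python
from typing import Any, Dict


def build_tables_dataset(display_name: str) -> Dict[str, Any]:
    """Return a dict shaped like the Dataset protobuf message for AutoML Tables."""
    return {"display_name": display_name, "tables_dataset_metadata": {}}


def create_dataset_and_get_id(hook: Any, display_name: str, location: str) -> str:
    """Create the dataset and return the last segment of its resource name."""
    dataset = hook.create_dataset(dataset=build_tables_dataset(display_name), location=location)
    # Assumption: the dataset ID is the final path segment of dataset.name.
    return dataset.name.rsplit("/", 1)[-1]
```

The extracted ID is the value that methods such as `import_data` and `get_dataset` take as `dataset_id`.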
- import_data(dataset_id, location, input_config, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=())
Import data into a dataset. For Tables this method can only be called on an empty Dataset.
- Parameters:
dataset_id (str) – Name of the AutoML dataset.
input_config (dict | google.cloud.automl_v1beta1.InputConfig) – The desired input location and its domain-specific semantics, if any. If a dict is provided, it must be of the same form as the protobuf message InputConfig.
project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns:
google.cloud.automl_v1beta1.types._OperationFuture instance
- Return type:
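As a sketch of the dict form of `input_config`, the helper below imports files from Cloud Storage. The `gcs_source` key follows the InputConfig message. The `gs://` prefix check is an extra safeguard added for this example rather than something the hook does, and the helper name is hypothetical.

```python
from typing import Any, Iterable


def import_from_gcs(hook: Any, dataset_id: str, location: str, uris: Iterable[str]) -> Any:
    """Start a data import from Cloud Storage and return the operation."""
    uri_list = list(uris)
    # Fail early on values that are clearly not Cloud Storage URIs.
    bad = [uri for uri in uri_list if not uri.startswith("gs://")]
    if bad:
        raise ValueError(f"Not Cloud Storage URIs: {bad}")
    return hook.import_data(
        dataset_id=dataset_id,
        location=location,
        input_config={"gcs_source": {"input_uris": uri_list}},
    )
```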
- list_column_specs(dataset_id, table_spec_id, location, project_id=PROVIDE_PROJECT_ID, field_mask=None, filter_=None, page_size=None, retry=DEFAULT, timeout=None, metadata=())
List column specs in a table spec.
- Parameters:
dataset_id (str) – Name of the AutoML dataset.
table_spec_id (str) – ID of the table spec, used to build the resource path.
field_mask (dict | google.protobuf.field_mask_pb2.FieldMask | None) – Mask specifying which fields to read. If a dict is provided, it must be of the same form as the protobuf message google.protobuf.field_mask_pb2.FieldMask
filter_ (str | None) – Filter expression, see go/filtering.
page_size (int | None) – The maximum number of resources contained in the underlying API response. If page streaming is performed per resource, this parameter does not affect the return value. If page streaming is performed per-page, this determines the maximum number of resources in a page.
project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns:
A pager that yields google.cloud.automl_v1beta1.types.ColumnSpec instances.
- Return type:
google.cloud.automl_v1beta1.services.auto_ml.pagers.ListColumnSpecsPager
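Because the return value is a pager, it can be consumed with a plain loop. The sketch below collects the column specs of one table into a mapping from display name to resource name. It assumes `hook` is a `CloudAutoMLHook` instance, and the helper name is hypothetical.

```python
from typing import Any, Dict, Optional


def column_spec_names(
    hook: Any, dataset_id: str, table_spec_id: str, location: str, page_size: Optional[int] = None
) -> Dict[str, str]:
    """Map each column's display_name to its full resource name."""
    pager = hook.list_column_specs(
        dataset_id=dataset_id,
        table_spec_id=table_spec_id,
        location=location,
        page_size=page_size,
    )
    # Iterating the pager yields ColumnSpec items one by one.
    return {spec.display_name: spec.name for spec in pager}
```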
- get_model(model_id, location, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=())
Get an AutoML model.
- Parameters:
model_id (str) – Name of the model.
project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns:
google.cloud.automl_v1beta1.types.Model instance.
- Return type:
google.cloud.automl_v1beta1.Model
- delete_model(model_id, location, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=())
Delete an AutoML model.
- Parameters:
model_id (str) – Name of the model.
project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns:
google.cloud.automl_v1beta1.types._OperationFuture instance.
- Return type:
- update_dataset(dataset, update_mask=None, retry=DEFAULT, timeout=None, metadata=())
Update a dataset.
- Parameters:
dataset (dict | google.cloud.automl_v1beta1.Dataset) – The dataset which replaces the resource on the server. If a dict is provided, it must be of the same form as the protobuf message Dataset.
update_mask (dict | google.protobuf.field_mask_pb2.FieldMask | None) – The update mask applies to the resource. If a dict is provided, it must be of the same form as the protobuf message FieldMask.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns:
google.cloud.automl_v1beta1.types.Dataset instance.
- Return type:
google.cloud.automl_v1beta1.Dataset
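A minimal sketch of a partial update: only `display_name` is listed in the mask, so other fields are left as they are on the server. The dataset is identified by its full resource name because this method takes no `project_id` or `location`. The `paths` key follows the FieldMask message, and the helper name is hypothetical.

```python
from typing import Any


def rename_dataset(hook: Any, dataset_name: str, new_display_name: str) -> Any:
    """Change only the display name of an existing dataset."""
    return hook.update_dataset(
        # dataset_name is the full resource name of the dataset to update.
        dataset={"name": dataset_name, "display_name": new_display_name},
        # Restrict the update to the display_name field.
        update_mask={"paths": ["display_name"]},
    )
```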
- deploy_model(model_id, location, project_id=PROVIDE_PROJECT_ID, image_detection_metadata=None, retry=DEFAULT, timeout=None, metadata=())
Deploy a model.
If a model is already deployed, deploying it with the same parameters has no effect. Deploying with different parameters (for example, changing node_number) resets the deployment state without pausing the model’s availability.
Only applicable for Text Classification, Image Object Detection and Tables; all other domains manage deployment automatically.
- Parameters:
model_id (str) – Name of the model requested to serve the prediction.
image_detection_metadata (google.cloud.automl_v1beta1.ImageObjectDetectionModelDeploymentMetadata | dict | None) – Model deployment metadata specific to Image Object Detection. If a dict is provided, it must be of the same form as the protobuf message ImageObjectDetectionModelDeploymentMetadata
project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns:
google.cloud.automl_v1beta1.types._OperationFuture instance.
- Return type:
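The sketch below deploys a model and, for Image Object Detection only, passes deployment metadata as a dict. The `node_count` key is taken from the ImageObjectDetectionModelDeploymentMetadata message and should be treated as an assumption here. The helper name is hypothetical, and `hook` is assumed to be a `CloudAutoMLHook` instance.

```python
from typing import Any, Optional


def deploy_with_node_count(hook: Any, model_id: str, location: str, node_count: Optional[int] = None) -> Any:
    """Deploy a model, optionally setting the node count for object detection."""
    kwargs = {"model_id": model_id, "location": location}
    if node_count is not None:
        # Only meaningful for Image Object Detection models.
        kwargs["image_detection_metadata"] = {"node_count": node_count}
    return hook.deploy_model(**kwargs)
```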
- list_table_specs(dataset_id, location, project_id=PROVIDE_PROJECT_ID, filter_=None, page_size=None, retry=DEFAULT, timeout=None, metadata=())
List table specs in a dataset.
- Parameters:
dataset_id (str) – Name of the dataset.
filter_ (str | None) – Filter expression, see go/filtering.
page_size (int | None) – The maximum number of resources contained in the underlying API response. If page streaming is performed per resource, this parameter does not affect the return value. If page streaming is performed per-page, this determines the maximum number of resources in a page.
project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns:
A pager instance. Iterating over it yields google.cloud.automl_v1beta1.types.TableSpec instances. The response can also be iterated page by page.
- Return type:
google.cloud.automl_v1beta1.services.auto_ml.pagers.ListTableSpecsPager
- list_datasets(location, project_id, retry=DEFAULT, timeout=None, metadata=())
List datasets in a project.
- Parameters:
project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns:
A pager instance. Iterating over it yields google.cloud.automl_v1beta1.types.Dataset instances. The response can also be iterated page by page.
- Return type:
google.cloud.automl_v1beta1.services.auto_ml.pagers.ListDatasetsPager
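One common use of the listing is to look up a dataset by its display name. The sketch below does so with a linear scan over the pager. It assumes `hook` is a `CloudAutoMLHook` instance, and the helper name is hypothetical.

```python
from typing import Any, Optional


def find_dataset(hook: Any, display_name: str, location: str, project_id: str) -> Optional[Any]:
    """Return the first dataset whose display_name matches, or None."""
    for dataset in hook.list_datasets(location=location, project_id=project_id):
        if dataset.display_name == display_name:
            return dataset
    return None
```

Display names are not guaranteed to be unique in this sketch, so the first match is returned.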
- delete_dataset(dataset_id, location, project_id, retry=DEFAULT, timeout=None, metadata=())
Delete a dataset and all of its contents.
- Parameters:
dataset_id (str) – ID of dataset to be deleted.
project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns:
google.cloud.automl_v1beta1.types._OperationFuture instance
- Return type:
- get_dataset(dataset_id, location, project_id, retry=DEFAULT, timeout=None, metadata=())
Retrieve the dataset for the given dataset_id.
- Parameters:
dataset_id (str) – ID of dataset to be retrieved.
location (str) – The location of the project.
project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (collections.abc.Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
- Returns:
google.cloud.automl_v1beta1.types.dataset.Dataset instance.
- Return type:
google.cloud.automl_v1beta1.Dataset