Google Cloud Translate Operators¶
Prerequisite Tasks¶
To use these operators, you must do a few things:
Select or create a Cloud Platform project using the Cloud Console.
Enable billing for your project, as described in the Google Cloud documentation.
Enable the API, as described in the Cloud Console documentation.
Install API libraries via pip.
pip install 'apache-airflow[google]'
Detailed information is available for Installation.
CloudTranslateTextOperator¶
Translate a string or list of strings.
For parameter definition, take a look at CloudTranslateTextOperator.
Using the operator¶
Basic usage of the operator:
translate = CloudTranslateTextOperator(
    task_id="translate",
    values=["zażółć gęślą jaźń"],
    target_language="en",
    format_="text",
    source_language=None,  # None lets the API detect the source language
    model="base",
)
The translation result is available as a dictionary or a list of dictionaries, accessible through Airflow's usual XCom mechanisms:
translation_access = BashOperator(
    task_id="access",
    bash_command="echo '{{ task_instance.xcom_pull(\"translate\")[0] }}'",
)
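Because the pulled value can be either a single dictionary or a list of dictionaries, downstream code may want to normalize it before use. The sketch below shows one way to do that in plain Python. The helper name `translated_texts` is hypothetical, and the `translatedText` key is an assumption based on the Translation V2 response format, so check it against the actual XCom value.

```python
from typing import Any, Dict, List, Union


# Hypothetical helper, not part of Airflow or the Google provider.
def translated_texts(
    result: Union[Dict[str, Any], List[Dict[str, Any]]],
) -> List[str]:
    """Return the translated strings from a V2-style translation result."""
    # A single dict is wrapped so both shapes are handled the same way.
    items = [result] if isinstance(result, dict) else list(result)
    # Assumption: each dict stores the translation under "translatedText".
    return [item["translatedText"] for item in items]


# Illustrative data shaped like the assumed response, not real API output.
sample = [{"translatedText": "first"}, {"translatedText": "second"}]
print(translated_texts(sample))
print(translated_texts(sample[0]))
```

In a DAG, a function like this could be called from a Python task that pulls the `translate` task's XCom value.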
Templating¶
template_fields: Sequence[str] = (
    "values",
    "target_language",
    "format_",
    "source_language",
    "model",
    "gcp_conn_id",
    "impersonation_chain",
)
TranslateTextOperator¶
Translate an array of one or more text (or HTML) items.
Intended for moderate amounts of text data; for large volumes, use the TranslateTextBatchOperator.
For parameter definition, take a look at TranslateTextOperator.
Using the operator¶
Basic usage of the operator:
translate_text = TranslateTextOperator(
    task_id="translate_v3_op",
    contents=["Ciao mondo!", "Mi puoi prendere una tazza di caffè, per favore?"],
    source_language_code="it",
    target_language_code="en",
)
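To use the translations downstream, the task's return value has to be unpacked. The following is a minimal sketch. It assumes the XCom value mirrors the V3 `TranslateTextResponse` message as a dictionary with a `translations` list whose entries carry a `translated_text` field. That shape and the helper name `extract_translations` are assumptions, not something this page specifies.

```python
from typing import Any, Dict, List


# Hypothetical helper; the response shape is an assumption (see lead-in).
def extract_translations(response: Dict[str, Any]) -> List[str]:
    """Collect translated strings from a V3-style response dictionary."""
    # Missing keys fall back to empty values instead of raising KeyError.
    return [
        entry.get("translated_text", "")
        for entry in response.get("translations", [])
    ]


# Illustrative data only, shaped like the assumed response.
sample_response = {"translations": [{"translated_text": "first item"}]}
print(extract_translations(sample_response))
```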
TranslateTextBatchOperator¶
Translate a large amount of text data into up to 10 target languages in a single run. The list of files and other options are provided by the input configuration.
For parameter definition, take a look at TranslateTextBatchOperator.
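This page gives no usage example for the batch operator, so the sketch below only illustrates how its arguments might be assembled and how the 10-language limit could be checked before a run. The parameter names `input_configs`, `output_config`, and `target_language_codes` are assumed by analogy with TranslateDocumentBatchOperator on this page. The `gcs_destination`/`output_uri_prefix` and `mime_type` keys are assumptions from the V3 API, and the helper name and bucket paths are hypothetical.

```python
from typing import Any, Dict, List

MAX_TARGET_LANGUAGES = 10  # limit stated in the operator description


# Hypothetical helper; verify the keyword names against the operator reference.
def batch_translate_kwargs(
    source_language_code: str,
    target_language_codes: List[str],
    input_uris: List[str],
    output_uri_prefix: str,
    mime_type: str = "text/plain",
) -> Dict[str, Any]:
    """Assemble keyword arguments for a batch text translation task."""
    if not 1 <= len(target_language_codes) <= MAX_TARGET_LANGUAGES:
        raise ValueError(
            f"expected 1 to {MAX_TARGET_LANGUAGES} target languages, "
            f"got {len(target_language_codes)}"
        )
    return {
        "source_language_code": source_language_code,
        "target_language_codes": list(target_language_codes),
        # One input config per source file (assumed structure).
        "input_configs": [
            {"gcs_source": {"input_uri": uri}, "mime_type": mime_type}
            for uri in input_uris
        ],
        "output_config": {
            "gcs_destination": {"output_uri_prefix": output_uri_prefix}
        },
    }


kwargs = batch_translate_kwargs(
    "en", ["uk", "fr"],
    ["gs://example-bucket/input/sample.txt"],  # hypothetical path
    "gs://example-bucket/output/",  # hypothetical path
)
print(sorted(kwargs))
```

If the names match the operator's signature, the result could be passed as `TranslateTextBatchOperator(task_id=..., project_id=..., location=..., **kwargs)`.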
TranslateCreateDatasetOperator¶
Create a native translation dataset using Cloud Translate API (Advanced V3).
For parameter definition, take a look at TranslateCreateDatasetOperator.
Using the operator¶
Basic usage of the operator:
create_dataset_op = TranslateCreateDatasetOperator(
    task_id="translate_v3_ds_create",
    dataset=DATASET,
    project_id=PROJECT_ID,
    location=REGION,
)
TranslateImportDataOperator¶
Import data to the existing native dataset, using Cloud Translate API (Advanced V3).
For parameter definition, take a look at TranslateImportDataOperator.
Using the operator¶
Basic usage of the operator:
import_ds_data_op = TranslateImportDataOperator(
    task_id="translate_v3_ds_import_data",
    dataset_id=create_dataset_op.output["dataset_id"],
    input_config={
        "input_files": [{"usage": "UNASSIGNED", "gcs_source": {"input_uri": DATASET_DATA_PATH}}]
    },
    project_id=PROJECT_ID,
    location=REGION,
)
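When several files are imported at once, the `input_config` dictionary can be generated rather than written by hand. The sketch below builds the same structure as the example above from `(uri, usage)` pairs. The helper name is hypothetical. Only `UNASSIGNED` appears on this page, so the other usage values are assumptions taken from the V3 API and should be verified before use.

```python
from typing import Any, Dict, List, Tuple

# "UNASSIGNED" comes from the example above; the rest are assumed values.
ALLOWED_USAGES = {"TRAIN", "VALIDATION", "TEST", "UNASSIGNED"}


# Hypothetical helper, not part of the Google provider.
def build_import_input_config(files: List[Tuple[str, str]]) -> Dict[str, Any]:
    """Build an input_config dict from (gcs_uri, usage) pairs."""
    input_files = []
    for uri, usage in files:
        if not uri.startswith("gs://"):
            raise ValueError(f"expected a gs:// URI, got {uri!r}")
        if usage not in ALLOWED_USAGES:
            raise ValueError(f"unsupported usage {usage!r}")
        input_files.append({"usage": usage, "gcs_source": {"input_uri": uri}})
    return {"input_files": input_files}


# Hypothetical bucket path, for illustration only.
config = build_import_input_config(
    [("gs://example-bucket/data/pairs.tsv", "UNASSIGNED")]
)
print(config)
```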
TranslateDatasetsListOperator¶
Get a list of translation datasets using Cloud Translate API (Advanced V3).
For parameter definition, take a look at TranslateDatasetsListOperator.
Using the operator¶
Basic usage of the operator:
list_datasets_op = TranslateDatasetsListOperator(
    task_id="translate_v3_list_ds",
    project_id=PROJECT_ID,
    location=REGION,
)
TranslateDeleteDatasetOperator¶
Delete a native translation dataset using Cloud Translate API (Advanced V3).
For parameter definition, take a look at TranslateDeleteDatasetOperator.
Using the operator¶
Basic usage of the operator:
delete_ds_op = TranslateDeleteDatasetOperator(
    task_id="translate_v3_ds_delete",
    dataset_id=create_dataset_op.output["dataset_id"],
    project_id=PROJECT_ID,
    location=REGION,
)
TranslateCreateModelOperator¶
Create a native translation model using Cloud Translate API (Advanced V3).
For parameter definition, take a look at TranslateCreateModelOperator.
Using the operator¶
Basic usage of the operator:
create_model = TranslateCreateModelOperator(
    task_id="translate_v3_model_create",
    display_name=f"native_model_{ENV_ID}"[:32].replace("-", "_"),
    dataset_id=create_dataset_op.output["dataset_id"],
    project_id=PROJECT_ID,
    location=REGION,
)
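The `display_name` expression in the example truncates the name to 32 characters and then replaces hyphens with underscores. The sketch below factors that expression into a small function so its behavior is easier to see and test. The function name and its defaults are hypothetical, and the exact naming constraints the API enforces are not stated on this page.

```python
# Hypothetical helper that reproduces the expression used in the example above.
def model_display_name(
    env_id: str, prefix: str = "native_model_", max_len: int = 32
) -> str:
    """Build a display name: truncate first, then replace hyphens."""
    # Same order as the example: slice to max_len, then swap "-" for "_".
    return f"{prefix}{env_id}"[:max_len].replace("-", "_")


print(model_display_name("dev-1"))
print(len(model_display_name("x" * 50)))
```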
TranslateModelsListOperator¶
Get a list of native translation models using Cloud Translate API (Advanced V3).
For parameter definition, take a look at TranslateModelsListOperator.
Using the operator¶
Basic usage of the operator:
list_models = TranslateModelsListOperator(
    task_id="translate_v3_list_models",
    project_id=PROJECT_ID,
    location=REGION,
)
TranslateDeleteModelOperator¶
Delete a native translation model using Cloud Translate API (Advanced V3).
For parameter definition, take a look at TranslateDeleteModelOperator.
Using the operator¶
Basic usage of the operator:
delete_model = TranslateDeleteModelOperator(
    task_id="translate_v3_automl_delete_model",
    model_id=model_id,
    project_id=PROJECT_ID,
    location=REGION,
)
TranslateDocumentOperator¶
Translate a document using Cloud Translate API (Advanced V3).
For parameter definition, take a look at TranslateDocumentOperator.
Using the operator¶
Basic usage of the operator:
translate_document = TranslateDocumentOperator(
    task_id="translate_document_op",
    project_id=PROJECT_ID,
    location=REGION,
    source_language_code="en",
    target_language_code="uk",
    document_input_config=DOC_TRANSLATE_INPUT,
    document_output_config=GCS_OUTPUT_DST,
)
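The example references `DOC_TRANSLATE_INPUT` and `GCS_OUTPUT_DST` without showing their contents. The sketch below shows what such dictionaries might look like. It assumes they follow the V3 document input and output config messages (`gcs_source`/`input_uri`, `gcs_destination`/`output_uri_prefix`, and an optional `mime_type`). The helper name and bucket paths are hypothetical, and the field names should be checked against the API reference.

```python
from typing import Any, Dict, Optional, Tuple


# Hypothetical helper; the dictionary layout is an assumption (see lead-in).
def document_configs(
    input_uri: str,
    output_uri_prefix: str,
    mime_type: Optional[str] = None,
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (input_config, output_config) dicts for one document."""
    for uri in (input_uri, output_uri_prefix):
        if not uri.startswith("gs://"):
            raise ValueError(f"expected a gs:// URI, got {uri!r}")
    input_config: Dict[str, Any] = {"gcs_source": {"input_uri": input_uri}}
    if mime_type is not None:
        # Only set when the caller wants to state the type explicitly.
        input_config["mime_type"] = mime_type
    output_config = {"gcs_destination": {"output_uri_prefix": output_uri_prefix}}
    return input_config, output_config


# Hypothetical bucket paths, for illustration only.
doc_input, doc_output = document_configs(
    "gs://example-bucket/docs/sample.docx",
    "gs://example-bucket/translated/",
)
print(doc_input)
print(doc_output)
```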
TranslateDocumentBatchOperator¶
Translate documents using Cloud Translate API (Advanced V3), according to the given input configs.
For parameter definition, take a look at TranslateDocumentBatchOperator.
Using the operator¶
Basic usage of the operator:
translate_document_batch = TranslateDocumentBatchOperator(
    task_id="batch_translate_document_op",
    project_id=PROJECT_ID,
    location=REGION,
    source_language_code="en",
    target_language_codes=["uk", "fr"],
    input_configs=[BATCH_DOC_INPUT_ITEM_1, BATCH_DOC_INPUT_ITEM_2],
    output_config=BATCH_OUTPUT_CONFIG,
)
TranslateCreateGlossaryOperator¶
Create a translation glossary, using Cloud Translate API (Advanced V3).
For parameter definition, take a look at TranslateCreateGlossaryOperator.
Using the operator¶
Basic usage of the operator:
create_glossary = TranslateCreateGlossaryOperator(
    task_id="glossary_create",
    project_id=PROJECT_ID,
    location=REGION,
    input_config=GLOSSARY_FILE_INPUT,
    glossary_id=f"glossary_new_{PROJECT_ID}",
    language_pair={"source_language_code": "en", "target_language_code": "es"},
)
TranslateUpdateGlossaryOperator¶
Update a translation glossary using Cloud Translate API (Advanced V3).
Only the display_name and input_config fields can be updated.
Updating input_config updates the glossary dictionary.
For parameter definition, take a look at TranslateUpdateGlossaryOperator.
Using the operator¶
Basic usage of the operator:
glossary_id = create_glossary.output["glossary_id"]
update_glossary = TranslateUpdateGlossaryOperator(
    task_id="glossary_update",
    project_id=PROJECT_ID,
    location=REGION,
    new_input_config=UPDATE_GLOSSARY_FILE_INPUT,
    new_display_name=f"gl_{PROJECT_ID}_updated",
    glossary_id=glossary_id,
)
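Since only two glossary fields can be changed, it can help to reject anything else before the task is even defined. The sketch below maps requested changes onto the `new_display_name` and `new_input_config` arguments used in the example and raises on any other field. The helper name and the error messages are hypothetical.

```python
from typing import Any, Dict

# The two fields this page lists as updatable.
UPDATABLE_FIELDS = ("display_name", "input_config")


# Hypothetical helper, not part of the Google provider.
def glossary_update_kwargs(glossary_id: str, **changes: Any) -> Dict[str, Any]:
    """Map requested changes to new_* keyword arguments for the operator."""
    unknown = sorted(set(changes) - set(UPDATABLE_FIELDS))
    if unknown:
        raise ValueError(f"fields cannot be updated: {', '.join(unknown)}")
    if not changes:
        raise ValueError("at least one field must be updated")
    kwargs: Dict[str, Any] = {"glossary_id": glossary_id}
    for field, value in changes.items():
        # display_name -> new_display_name, input_config -> new_input_config
        kwargs[f"new_{field}"] = value
    return kwargs


print(glossary_update_kwargs("example_glossary", display_name="renamed"))
```

The resulting dictionary could then be unpacked into the operator call, for example `TranslateUpdateGlossaryOperator(task_id=..., project_id=..., location=..., **kwargs)`.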
TranslateListGlossariesOperator¶
List all available translation glossaries on the project.
For parameter definition, take a look at TranslateListGlossariesOperator.
Using the operator¶
Basic usage of the operator:
list_glossaries = TranslateListGlossariesOperator(
    task_id="list_glossaries",
    page_size=100,
    project_id=PROJECT_ID,
    location=REGION,
)
TranslateDeleteGlossaryOperator¶
Delete the translation glossary resource.
For parameter definition, take a look at TranslateDeleteGlossaryOperator.
Using the operator¶
Basic usage of the operator:
delete_glossary = TranslateDeleteGlossaryOperator(
    task_id="delete_glossary",
    glossary_id=glossary_id,
    project_id=PROJECT_ID,
    location=REGION,
)
More information¶
See:
Base (V2): Google Cloud Translate documentation.
Advanced (V3): Google Cloud Translate (Advanced) documentation.
Datasets: Legacy and native dataset comparison.