airflow.providers.amazon.aws.sensors.comprehend

Module Contents

Classes

ComprehendBaseSensor

General sensor behavior for Amazon Comprehend.

ComprehendStartPiiEntitiesDetectionJobCompletedSensor

Poll the state of the PII entities detection job until it reaches a completed state; fails if the job fails.

ComprehendCreateDocumentClassifierCompletedSensor

Poll the state of the document classifier until it reaches a completed state; fails if the job fails.

class airflow.providers.amazon.aws.sensors.comprehend.ComprehendBaseSensor(deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), **kwargs)

Bases: airflow.providers.amazon.aws.sensors.base_aws.AwsBaseSensor[airflow.providers.amazon.aws.hooks.comprehend.ComprehendHook]

General sensor behavior for Amazon Comprehend.

Subclasses must implement the following methods:
  • get_state()

Subclasses must set the following fields:
  • INTERMEDIATE_STATES

  • FAILURE_STATES

  • SUCCESS_STATES

  • FAILURE_MESSAGE

Parameters

deferrable (bool) – If True, the sensor will operate in deferrable mode. This mode requires aiobotocore module to be installed. (default: False, but can be overridden in config file by setting default_deferrable to True)

aws_hook_class
INTERMEDIATE_STATES: tuple[str, ...] = ()
FAILURE_STATES: tuple[str, ...] = ()
SUCCESS_STATES: tuple[str, ...] = ()
FAILURE_MESSAGE = ''
ui_color = '#66c3ff'
poke(context, **kwargs)

Override when deriving this class.

abstract get_state()

Implement in subclasses.
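
The subclass contract described for ComprehendBaseSensor (four class-level fields plus get_state()) can be illustrated with a plain-Python stand-in. This is a minimal sketch, not the provider's implementation: BaseStateSensor, ExampleJobSensor, and SensorFailure are hypothetical names, and the poke() rule shown here (raise on a failure state, otherwise report done once the state is no longer intermediate) is an assumption about how such a sensor would typically classify states.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class SensorFailure(Exception):
    """Hypothetical stand-in for the exception a real sensor would raise."""


class BaseStateSensor(ABC):
    """Plain-Python model of the contract above (not the Airflow class)."""

    INTERMEDIATE_STATES: tuple[str, ...] = ()
    FAILURE_STATES: tuple[str, ...] = ()
    SUCCESS_STATES: tuple[str, ...] = ()
    FAILURE_MESSAGE = ""

    @abstractmethod
    def get_state(self) -> str:
        """Return the current state of the remote resource."""

    def poke(self) -> bool:
        state = self.get_state()
        if state in self.FAILURE_STATES:
            raise SensorFailure(self.FAILURE_MESSAGE)
        # Assumed rule: any state that is not intermediate counts as done.
        return state not in self.INTERMEDIATE_STATES


class ExampleJobSensor(BaseStateSensor):
    """Hypothetical subclass that sets the four fields and implements get_state()."""

    INTERMEDIATE_STATES = ("IN_PROGRESS",)
    FAILURE_STATES = ("FAILED",)
    SUCCESS_STATES = ("COMPLETED",)
    FAILURE_MESSAGE = "Example job failed."

    def __init__(self, states):
        # A scripted sequence of states replaces a real service call.
        self._states = iter(states)

    def get_state(self) -> str:
        return next(self._states)
```

With a scripted sequence such as ["IN_PROGRESS", "COMPLETED"], the first poke() returns False and the second returns True; a "FAILED" state raises SensorFailure carrying FAILURE_MESSAGE.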

class airflow.providers.amazon.aws.sensors.comprehend.ComprehendStartPiiEntitiesDetectionJobCompletedSensor(*, job_id, max_retries=75, poke_interval=120, **kwargs)

Bases: ComprehendBaseSensor

Poll the state of the PII entities detection job until it reaches a completed state; fails if the job fails.

See also

For more information on how to use this sensor, take a look at the guide: Wait for an Amazon Comprehend Start PII Entities Detection Job

Parameters
  • job_id (str) – The ID of the Comprehend PII entities detection job.

  • deferrable – If True, the sensor will operate in deferrable mode. This mode requires aiobotocore module to be installed. (default: False, but can be overridden in config file by setting default_deferrable to True)

  • poke_interval (int) – Polling period in seconds to check for the status of the job. (default: 120)

  • max_retries (int) – Maximum number of polling attempts before returning the current state. (default: 75)

  • aws_conn_id – The Airflow connection used for AWS credentials. If this is None or empty then the default boto3 behaviour is used. If running Airflow in a distributed manner and aws_conn_id is None or empty, then default boto3 configuration would be used (and must be maintained on each worker node).

  • region_name – AWS region_name. If not specified then the default boto3 behaviour is used.

  • verify – Whether to verify SSL certificates. See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html

  • botocore_config – Configuration dictionary (key-values) for botocore client. See: https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html
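
To make the poke_interval and max_retries defaults concrete, the sketch below computes the longest time the sensor could spend waiting. It assumes each retry corresponds to one status check and that checks are spaced poke_interval seconds apart; max_wait is a hypothetical helper, not part of the provider.

```python
from datetime import timedelta


def max_wait(poke_interval: int = 120, max_retries: int = 75) -> timedelta:
    """Upper bound on waiting time, assuming one status check per interval."""
    return timedelta(seconds=poke_interval * max_retries)


print(max_wait())        # 2:30:00 with the documented defaults
print(max_wait(60, 30))  # 0:30:00
```

Under that assumption the documented defaults allow up to 9000 seconds, or two and a half hours, before the sensor gives up.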

INTERMEDIATE_STATES: tuple[str, ...] = ('IN_PROGRESS',)
FAILURE_STATES: tuple[str, ...] = ('FAILED', 'STOP_REQUESTED', 'STOPPED')
SUCCESS_STATES: tuple[str, ...] = ('COMPLETED',)
FAILURE_MESSAGE = 'Comprehend start pii entities detection job sensor failed.'
template_fields: collections.abc.Sequence[str]
execute(context)

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

get_state()

Implement in subclasses.
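
As a rough illustration of what get_state() might do for this sensor, the sketch below reads the job status from a describe_pii_entities_detection_job response and classifies it against the state tuples listed above. StubComprehendClient, get_pii_job_state, classify, and the job ID are hypothetical; the stub only mimics the response shape of the boto3 Comprehend client and makes no AWS calls.

```python
class StubComprehendClient:
    """Hypothetical stand-in for a boto3 Comprehend client; makes no AWS calls."""

    def __init__(self, status: str):
        self._status = status

    def describe_pii_entities_detection_job(self, JobId: str) -> dict:
        # Mirrors the nesting of the real response: properties dict with JobStatus.
        return {
            "PiiEntitiesDetectionJobProperties": {
                "JobId": JobId,
                "JobStatus": self._status,
            }
        }


# State tuples copied from the sensor's documented class attributes.
INTERMEDIATE_STATES = ("IN_PROGRESS",)
FAILURE_STATES = ("FAILED", "STOP_REQUESTED", "STOPPED")
SUCCESS_STATES = ("COMPLETED",)


def get_pii_job_state(client, job_id: str) -> str:
    """Return the JobStatus string for the given job."""
    response = client.describe_pii_entities_detection_job(JobId=job_id)
    return response["PiiEntitiesDetectionJobProperties"]["JobStatus"]


def classify(state: str) -> str:
    """Map a job status onto the sensor's three outcome groups."""
    if state in FAILURE_STATES:
        return "failure"
    if state in SUCCESS_STATES:
        return "success"
    if state in INTERMEDIATE_STATES:
        return "waiting"
    return "unknown"


client = StubComprehendClient("IN_PROGRESS")
print(classify(get_pii_job_state(client, "example-job-id")))  # waiting
```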

class airflow.providers.amazon.aws.sensors.comprehend.ComprehendCreateDocumentClassifierCompletedSensor(*, document_classifier_arn, fail_on_warnings=False, max_retries=75, poke_interval=120, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), aws_conn_id='aws_default', **kwargs)

Bases: airflow.providers.amazon.aws.sensors.base_aws.AwsBaseSensor[airflow.providers.amazon.aws.hooks.comprehend.ComprehendHook]

Poll the state of the document classifier until it reaches a completed state; fails if the job fails.

See also

For more information on how to use this sensor, take a look at the guide: Wait for an Amazon Comprehend Document Classifier

Parameters
  • document_classifier_arn (str) – The ARN of the Comprehend document classifier.

  • fail_on_warnings (bool) – If set to True, the document classifier training job will raise an error when the status is TRAINED_WITH_WARNING. (default: False)

  • deferrable (bool) – If True, the sensor will operate in deferrable mode. This mode requires aiobotocore module to be installed. (default: False, but can be overridden in config file by setting default_deferrable to True)

  • poke_interval (int) – Polling period in seconds to check for the status of the job. (default: 120)

  • max_retries (int) – Maximum number of polling attempts before returning the current state. (default: 75)

  • aws_conn_id (str | None) – The Airflow connection used for AWS credentials. If this is None or empty then the default boto3 behaviour is used. If running Airflow in a distributed manner and aws_conn_id is None or empty, then default boto3 configuration would be used (and must be maintained on each worker node).

  • region_name – AWS region_name. If not specified then the default boto3 behaviour is used.

  • verify – Whether to verify SSL certificates. See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html

  • botocore_config – Configuration dictionary (key-values) for botocore client. See: https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html

aws_hook_class
INTERMEDIATE_STATES: tuple[str, ...] = ('SUBMITTED', 'TRAINING')
FAILURE_STATES: tuple[str, ...] = ('DELETING', 'STOP_REQUESTED', 'STOPPED', 'IN_ERROR')
SUCCESS_STATES: tuple[str, ...] = ('TRAINED', 'TRAINED_WITH_WARNING')
FAILURE_MESSAGE = 'Comprehend document classifier failed.'
template_fields: collections.abc.Sequence[str]
execute(context)

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

poke(context, **kwargs)

Override when deriving this class.
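
The interaction between fail_on_warnings and the TRAINED_WITH_WARNING status can be sketched as a small decision function. This is an illustrative model under stated assumptions, not the sensor's actual poke() code: classifier_done and ClassifierFailed are hypothetical names, and the warning message text is invented for the example. The state tuples and FAILURE_MESSAGE are the documented values.

```python
# State tuples and failure message copied from the documented class attributes.
INTERMEDIATE_STATES = ("SUBMITTED", "TRAINING")
FAILURE_STATES = ("DELETING", "STOP_REQUESTED", "STOPPED", "IN_ERROR")
SUCCESS_STATES = ("TRAINED", "TRAINED_WITH_WARNING")
FAILURE_MESSAGE = "Comprehend document classifier failed."


class ClassifierFailed(Exception):
    """Hypothetical stand-in for the exception a real sensor would raise."""


def classifier_done(status: str, fail_on_warnings: bool = False) -> bool:
    """Return True when training finished, False while it is still running."""
    if status in FAILURE_STATES:
        raise ClassifierFailed(FAILURE_MESSAGE)
    if status == "TRAINED_WITH_WARNING" and fail_on_warnings:
        # Assumed behavior: warnings are promoted to a failure on request.
        raise ClassifierFailed("Document classifier trained with warnings.")
    return status not in INTERMEDIATE_STATES
```

With the default fail_on_warnings=False, TRAINED_WITH_WARNING is treated like TRAINED and the function returns True; with fail_on_warnings=True the same status raises instead.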
