airflow.providers.amazon.aws.operators.bedrock
Module Contents
Classes
BedrockInvokeModelOperator | Invoke the specified Bedrock model to run inference using the input provided.
BedrockCustomizeModelOperator | Create a fine-tuning job to customize a base model.
BedrockCreateProvisionedModelThroughputOperator | Create provisioned throughput for an existing Amazon Bedrock model.
BedrockCreateKnowledgeBaseOperator | Create a knowledge base that contains data sources used by Amazon Bedrock LLMs and Agents.
BedrockCreateDataSourceOperator | Set up an Amazon Bedrock Data Source to be added to an Amazon Bedrock Knowledge Base.
BedrockIngestDataOperator | Begin an ingestion job, in which an Amazon Bedrock data source is added to an Amazon Bedrock knowledge base.
- class airflow.providers.amazon.aws.operators.bedrock.BedrockInvokeModelOperator(model_id, input_data, content_type=None, accept_type=None, **kwargs)
Bases: airflow.providers.amazon.aws.operators.base_aws.AwsBaseOperator[airflow.providers.amazon.aws.hooks.bedrock.BedrockRuntimeHook]
Invoke the specified Bedrock model to run inference using the input provided.
Use InvokeModel to run inference for text models, image models, and embedding models. The format and content of the input_data field differ from model to model; refer to the Inference parameters docs for details.
See also
For more information on how to use this operator, take a look at the guide: Invoke an existing Amazon Bedrock Model
- Parameters
model_id (str) – The ID of the Bedrock model. (templated)
input_data (dict[str, Any]) – Input data in the format specified in the content-type request header. (templated)
content_type (str | None) – The MIME type of the input data in the request. (templated) Default: application/json
accept_type (str | None) – The desired MIME type of the inference body in the response. (templated) Default: application/json
aws_conn_id – The Airflow connection used for AWS credentials. If this is None or empty then the default boto3 behaviour is used. If running Airflow in a distributed manner and aws_conn_id is None or empty, then default boto3 configuration would be used (and must be maintained on each worker node).
region_name – AWS region_name. If not specified then the default boto3 behaviour is used.
verify – Whether or not to verify SSL certificates. See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html
botocore_config – Configuration dictionary (key-values) for botocore client. See: https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html
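A minimal sketch of how the parameters above might be combined. The model ID `amazon.titan-text-express-v1`, the `inputText` request key, and the task ID are assumptions chosen for illustration; the correct payload shape depends on the model, so check the Inference parameters docs. The import is guarded so the snippet also runs where the Amazon provider is not installed.

```python
import json

# Assumed for illustration: a Titan text model and its "inputText" request key.
invoke_kwargs = {
    "task_id": "invoke_model",  # hypothetical task ID
    "model_id": "amazon.titan-text-express-v1",
    "input_data": {"inputText": "What is the capital of France?"},
    "content_type": "application/json",
    "accept_type": "application/json",
}

# With a JSON content type, input_data has to be JSON-serializable.
serialized_body = json.dumps(invoke_kwargs["input_data"])

try:
    from airflow.providers.amazon.aws.operators.bedrock import (
        BedrockInvokeModelOperator,
    )
except ImportError:
    invoke_model = None  # provider not installed; only the kwargs are built
else:
    invoke_model = BedrockInvokeModelOperator(**invoke_kwargs)
```

Because `model_id`, `input_data`, `content_type`, and `accept_type` are templated, each value could also be a Jinja expression rendered at run time.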
- class airflow.providers.amazon.aws.operators.bedrock.BedrockCustomizeModelOperator(job_name, custom_model_name, role_arn, base_model_id, training_data_uri, output_data_uri, hyperparameters, ensure_unique_job_name=True, customization_job_kwargs=None, wait_for_completion=True, waiter_delay=120, waiter_max_attempts=75, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), **kwargs)
Bases: airflow.providers.amazon.aws.operators.base_aws.AwsBaseOperator[airflow.providers.amazon.aws.hooks.bedrock.BedrockHook]
Create a fine-tuning job to customize a base model.
See also
For more information on how to use this operator, take a look at the guide: Customize an existing Amazon Bedrock Model
- Parameters
job_name (str) – A unique name for the fine-tuning job.
custom_model_name (str) – A name for the custom model being created.
role_arn (str) – The Amazon Resource Name (ARN) of an IAM role that Amazon Bedrock can assume to perform tasks on your behalf.
base_model_id (str) – Name of the base model.
training_data_uri (str) – The S3 URI where the training data is stored.
output_data_uri (str) – The S3 URI where the output data is stored.
hyperparameters (dict[str, str]) – Parameters related to tuning the model.
ensure_unique_job_name (bool) – If True, the operator checks whether a model customization job with this name already exists and, on a name conflict, appends the current timestamp to the name. (default: True)
customization_job_kwargs (dict[str, Any] | None) – Any optional parameters to pass to the API.
wait_for_completion (bool) – Whether to wait for the customization job to complete. (default: True)
waiter_delay (int) – Time in seconds to wait between status checks. (default: 120)
waiter_max_attempts (int) – Maximum number of attempts to check for job completion. (default: 75)
deferrable (bool) – If True, the operator will wait asynchronously for the customization job to complete. This implies waiting for completion. This mode requires the aiobotocore module to be installed. (default: False)
aws_conn_id – The Airflow connection used for AWS credentials. If this is None or empty then the default boto3 behaviour is used. If running Airflow in a distributed manner and aws_conn_id is None or empty, then default boto3 configuration would be used (and must be maintained on each worker node).
region_name – AWS region_name. If not specified then the default boto3 behaviour is used.
verify – Whether or not to verify SSL certificates. See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html
botocore_config – Configuration dictionary (key-values) for botocore client. See: https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html
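The following sketch shows one way the customization parameters might be filled in, together with a rough illustration of what ensure_unique_job_name does on a name conflict. Every name, ARN, S3 URI, and hyperparameter key is a placeholder, and `with_timestamp_suffix` is a hypothetical helper: the exact suffix format the operator uses is not stated in this reference.

```python
from datetime import datetime, timezone

# Placeholders only: none of these resources exist.
customize_kwargs = {
    "task_id": "customize_model",  # hypothetical task ID
    "job_name": "my-finetune-job",
    "custom_model_name": "my-custom-model",
    "role_arn": "arn:aws:iam::123456789012:role/BedrockCustomizationRole",
    "base_model_id": "amazon.titan-text-express-v1",
    "training_data_uri": "s3://my-bucket/train.jsonl",
    "output_data_uri": "s3://my-bucket/output/",
    # Typed as dict[str, str], so numeric values are passed as strings.
    # The keys are assumed; valid keys depend on the base model.
    "hyperparameters": {"epochCount": "1", "batchSize": "1"},
    "ensure_unique_job_name": True,
}


def with_timestamp_suffix(job_name: str, now: datetime) -> str:
    """Hypothetical helper: append a Unix timestamp to avoid a name conflict."""
    return f"{job_name}-{int(now.timestamp())}"


unique_name = with_timestamp_suffix(
    customize_kwargs["job_name"], datetime.now(timezone.utc)
)

try:
    from airflow.providers.amazon.aws.operators.bedrock import (
        BedrockCustomizeModelOperator,
    )
except ImportError:
    customize_model = None  # provider not installed; only the kwargs are built
else:
    customize_model = BedrockCustomizeModelOperator(**customize_kwargs)
```

With the defaults waiter_delay=120 and waiter_max_attempts=75, polling would stop after roughly 120 × 75 = 9000 seconds (2.5 hours), so longer fine-tuning jobs may need larger values.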
- class airflow.providers.amazon.aws.operators.bedrock.BedrockCreateProvisionedModelThroughputOperator(model_units, provisioned_model_name, model_id, create_throughput_kwargs=None, wait_for_completion=True, waiter_delay=60, waiter_max_attempts=20, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), **kwargs)
Bases: airflow.providers.amazon.aws.operators.base_aws.AwsBaseOperator[airflow.providers.amazon.aws.hooks.bedrock.BedrockHook]
Create provisioned throughput for an existing Amazon Bedrock model.
See also
For more information on how to use this operator, take a look at the guide: Provision Throughput for an existing Amazon Bedrock Model
- Parameters
model_units (int) – Number of model units to allocate. (templated)
provisioned_model_name (str) – Unique name for this provisioned throughput. (templated)
model_id (str) – Name or ARN of the model to associate with this provisioned throughput. (templated)
create_throughput_kwargs (dict[str, Any] | None) – Any optional parameters to pass to the API.
wait_for_completion (bool) – Whether to wait for the provisioning to complete. (default: True)
waiter_delay (int) – Time in seconds to wait between status checks. (default: 60)
waiter_max_attempts (int) – Maximum number of attempts to check for job completion. (default: 20)
deferrable (bool) – If True, the operator will wait asynchronously for the provisioning to complete. This implies waiting for completion. This mode requires the aiobotocore module to be installed. (default: False)
aws_conn_id – The Airflow connection used for AWS credentials. If this is None or empty then the default boto3 behaviour is used. If running Airflow in a distributed manner and aws_conn_id is None or empty, then default boto3 configuration would be used (and must be maintained on each worker node).
region_name – AWS region_name. If not specified then the default boto3 behaviour is used.
verify – Whether or not to verify SSL certificates. See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html
botocore_config – Configuration dictionary (key-values) for botocore client. See: https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html
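As a rough illustration of the waiter settings above, the sketch below builds the operator arguments and estimates how long polling could last before the waiter gives up. The task ID, throughput name, and model ID are placeholders, and `max_wait_seconds` is a hypothetical helper that assumes one delay per status check.

```python
# Placeholders only: adjust to a model that supports provisioned throughput.
provision_kwargs = {
    "task_id": "provision_throughput",  # hypothetical task ID
    "model_units": 1,
    "provisioned_model_name": "my-provisioned-throughput",
    "model_id": "amazon.titan-text-express-v1",
    "wait_for_completion": True,
    "waiter_delay": 60,
    "waiter_max_attempts": 20,
}


def max_wait_seconds(delay: int, max_attempts: int) -> int:
    """Rough upper bound on polling time, assuming one delay per status check."""
    return delay * max_attempts


budget = max_wait_seconds(
    provision_kwargs["waiter_delay"], provision_kwargs["waiter_max_attempts"]
)
print(f"Polling stops after about {budget // 60} minutes")

try:
    from airflow.providers.amazon.aws.operators.bedrock import (
        BedrockCreateProvisionedModelThroughputOperator,
    )
except ImportError:
    provision_throughput = None  # provider not installed
else:
    provision_throughput = BedrockCreateProvisionedModelThroughputOperator(
        **provision_kwargs
    )
```

Setting deferrable=True would keep the same polling budget but release the worker slot while waiting, at the cost of requiring aiobotocore.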
- class airflow.providers.amazon.aws.operators.bedrock.BedrockCreateKnowledgeBaseOperator(name, embedding_model_arn, role_arn, storage_config, create_knowledge_base_kwargs=None, wait_for_indexing=True, indexing_error_retry_delay=5, indexing_error_max_attempts=20, wait_for_completion=True, waiter_delay=60, waiter_max_attempts=20, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), **kwargs)
Bases: airflow.providers.amazon.aws.operators.base_aws.AwsBaseOperator[airflow.providers.amazon.aws.hooks.bedrock.BedrockAgentHook]
Create a knowledge base that contains data sources used by Amazon Bedrock LLMs and Agents.
To create a knowledge base, you must first set up your data sources and configure a supported vector store.
See also
For more information on how to use this operator, take a look at the guide: Create an Amazon Bedrock Knowledge Base
- Parameters
name (str) – The name of the knowledge base. (templated)
embedding_model_arn (str) – ARN of the model used to create vector embeddings for the knowledge base. (templated)
role_arn (str) – The ARN of the IAM role with permissions to create the knowledge base. (templated)
storage_config (dict[str, Any]) – Configuration details of the vector database used for the knowledge base. (templated)
wait_for_indexing (bool) – Vector indexing can take some time and there is no apparent way to check the state before trying to create the Knowledge Base. If this is True, and creation fails due to the index not being available, the operator will wait and retry. (default: True) (templated)
indexing_error_retry_delay (int) – Seconds between retries if an index error is encountered. (default: 5) (templated)
indexing_error_max_attempts (int) – Maximum number of times to retry when encountering an index error. (default: 20) (templated)
create_knowledge_base_kwargs (dict[str, Any] | None) – Any additional optional parameters to pass to the API call. (templated)
wait_for_completion (bool) – Whether to wait for the knowledge base creation to complete. (default: True)
waiter_delay (int) – Time in seconds to wait between status checks. (default: 60)
waiter_max_attempts (int) – Maximum number of attempts to check for job completion. (default: 20)
deferrable (bool) – If True, the operator will wait asynchronously for the knowledge base creation to complete. This implies waiting for completion. This mode requires the aiobotocore module to be installed. (default: False)
aws_conn_id – The Airflow connection used for AWS credentials. If this is None or empty then the default boto3 behaviour is used. If running Airflow in a distributed manner and aws_conn_id is None or empty, then default boto3 configuration would be used (and must be maintained on each worker node).
region_name – AWS region_name. If not specified then the default boto3 behaviour is used.
verify – Whether or not to verify SSL certificates. See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html
botocore_config – Configuration dictionary (key-values) for botocore client. See: https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html
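The wait_for_indexing behaviour described above amounts to a fixed-delay retry around the create call. The sketch below illustrates that pattern in plain Python; it is not the operator's implementation. `IndexNotReadyError` and `create_with_index_retry` are hypothetical names, and whether the attempt count includes the first call is an assumption.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class IndexNotReadyError(Exception):
    """Hypothetical stand-in for the error raised while the index is unavailable."""


def create_with_index_retry(
    create: Callable[[], T],
    retry_delay: float = 5,
    max_attempts: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call create(); on IndexNotReadyError wait retry_delay seconds and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return create()
        except IndexNotReadyError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(retry_delay)
    raise ValueError("max_attempts must be at least 1")


# Simulated create call that fails twice before the index becomes available.
calls = {"count": 0}


def flaky_create() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise IndexNotReadyError("index not available yet")
    return "kb-123"  # placeholder knowledge base ID


waits = []  # record the delays instead of actually sleeping
kb_id = create_with_index_retry(flaky_create, sleep=waits.append)
```

With the defaults of a 5-second delay and 20 attempts, this pattern would tolerate an index that needs up to roughly 100 seconds to become available.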
- class airflow.providers.amazon.aws.operators.bedrock.BedrockCreateDataSourceOperator(name, knowledge_base_id, bucket_name=None, create_data_source_kwargs=None, **kwargs)
Bases: airflow.providers.amazon.aws.operators.base_aws.AwsBaseOperator[airflow.providers.amazon.aws.hooks.bedrock.BedrockAgentHook]
Set up an Amazon Bedrock Data Source to be added to an Amazon Bedrock Knowledge Base.
See also
For more information on how to use this operator, take a look at the guide: Create an Amazon Bedrock Data Source
- Parameters
name (str) – Name for the Amazon Bedrock Data Source being created. (templated)
bucket_name (str | None) – The name of the Amazon S3 bucket to use for data source storage. (templated)
knowledge_base_id (str) – The unique identifier of the knowledge base to which to add the data source. (templated)
create_data_source_kwargs (dict[str, Any] | None) – Any additional optional parameters to pass to the API call. (templated)
aws_conn_id – The Airflow connection used for AWS credentials. If this is None or empty then the default boto3 behaviour is used. If running Airflow in a distributed manner and aws_conn_id is None or empty, then default boto3 configuration would be used (and must be maintained on each worker node).
region_name – AWS region_name. If not specified then the default boto3 behaviour is used.
verify – Whether or not to verify SSL certificates. See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html
botocore_config – Configuration dictionary (key-values) for botocore client. See: https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html
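A small sketch of the data source parameters, assuming an S3-backed source. The helper `s3_data_source_config` is hypothetical: it shows one plausible shape of the S3 configuration that a bucket name maps to in the underlying API request, and it is an assumption that the operator builds the request this way. All names and IDs are placeholders.

```python
def s3_data_source_config(bucket_name: str) -> dict:
    """Hypothetical helper: S3 data source configuration for a bucket name."""
    return {
        "type": "S3",
        # S3 bucket ARNs carry no region or account ID.
        "s3Configuration": {"bucketArn": f"arn:aws:s3:::{bucket_name}"},
    }


data_source_kwargs = {
    "task_id": "create_data_source",  # hypothetical task ID
    "name": "my-data-source",
    "knowledge_base_id": "KB1234567890",  # placeholder ID
    "bucket_name": "my-knowledge-base-docs",
}

config = s3_data_source_config(data_source_kwargs["bucket_name"])

try:
    from airflow.providers.amazon.aws.operators.bedrock import (
        BedrockCreateDataSourceOperator,
    )
except ImportError:
    create_data_source = None  # provider not installed
else:
    create_data_source = BedrockCreateDataSourceOperator(**data_source_kwargs)
```

Since bucket_name defaults to None, other source settings would presumably be supplied through create_data_source_kwargs instead.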
- class airflow.providers.amazon.aws.operators.bedrock.BedrockIngestDataOperator(knowledge_base_id, data_source_id, ingest_data_kwargs=None, wait_for_completion=True, waiter_delay=60, waiter_max_attempts=10, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), **kwargs)
Bases: airflow.providers.amazon.aws.operators.base_aws.AwsBaseOperator[airflow.providers.amazon.aws.hooks.bedrock.BedrockAgentHook]
Begin an ingestion job, in which an Amazon Bedrock data source is added to an Amazon Bedrock knowledge base.
See also
For more information on how to use this operator, take a look at the guide: Ingest data into an Amazon Bedrock Data Source
- Parameters
knowledge_base_id (str) – The unique identifier of the knowledge base to which to add the data source. (templated)
data_source_id (str) – The unique identifier of the data source to ingest. (templated)
ingest_data_kwargs (dict[str, Any] | None) – Any additional optional parameters to pass to the API call. (templated)
wait_for_completion (bool) – Whether to wait for the ingestion job to complete. (default: True)
waiter_delay (int) – Time in seconds to wait between status checks. (default: 60)
waiter_max_attempts (int) – Maximum number of attempts to check for job completion. (default: 10)
deferrable (bool) – If True, the operator will wait asynchronously for the ingestion job to complete. This implies waiting for completion. This mode requires the aiobotocore module to be installed. (default: False)
aws_conn_id – The Airflow connection used for AWS credentials. If this is None or empty then the default boto3 behaviour is used. If running Airflow in a distributed manner and aws_conn_id is None or empty, then default boto3 configuration would be used (and must be maintained on each worker node).
region_name – AWS region_name. If not specified then the default boto3 behaviour is used.
verify – Whether or not to verify SSL certificates. See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html
botocore_config – Configuration dictionary (key-values) for botocore client. See: https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html
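Because knowledge_base_id and data_source_id are templated, they can be filled in at run time from upstream tasks. The sketch below assumes that tasks named `create_knowledge_base` and `create_data_source` exist and push their IDs to XCom as return values; this reference does not state that, so treat both the task IDs and the XCom behaviour as assumptions.

```python
ingest_kwargs = {
    "task_id": "ingest_data",  # hypothetical task ID
    # Jinja templates rendered by Airflow when the task runs.
    "knowledge_base_id": (
        "{{ task_instance.xcom_pull(task_ids='create_knowledge_base') }}"
    ),
    "data_source_id": (
        "{{ task_instance.xcom_pull(task_ids='create_data_source') }}"
    ),
    "wait_for_completion": True,
    "waiter_delay": 60,
    "waiter_max_attempts": 10,
}

# Rough polling budget, assuming one delay per status check.
max_wait = ingest_kwargs["waiter_delay"] * ingest_kwargs["waiter_max_attempts"]

try:
    from airflow.providers.amazon.aws.operators.bedrock import (
        BedrockIngestDataOperator,
    )
except ImportError:
    ingest_data = None  # provider not installed
else:
    ingest_data = BedrockIngestDataOperator(**ingest_kwargs)
```

With the defaults shown, polling would stop after about 600 seconds (10 minutes), so large ingestion jobs may need a higher waiter_max_attempts.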