airflow.providers.amazon.aws.transfers.s3_to_redshift

Module Contents

Classes

S3ToRedshiftOperator

Executes a COPY command to load files from S3 to Redshift.

Attributes

AVAILABLE_METHODS

airflow.providers.amazon.aws.transfers.s3_to_redshift.AVAILABLE_METHODS = ['APPEND', 'REPLACE', 'UPSERT']
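As an illustration of how this list can be used, the sketch below checks a candidate method value before it is passed to the operator. The helper name check_method is hypothetical and is not part of the provider; the list is copied from the attribute above, and the comparison in this sketch is case-sensitive.

```python
# Copied from the AVAILABLE_METHODS attribute documented above.
AVAILABLE_METHODS = ["APPEND", "REPLACE", "UPSERT"]


def check_method(method: str) -> str:
    """Return ``method`` unchanged if it is listed, else raise ValueError.

    Hypothetical helper for illustration only; not part of the provider.
    """
    if method not in AVAILABLE_METHODS:
        raise ValueError(
            f"Unknown method {method!r}; expected one of {AVAILABLE_METHODS}"
        )
    return method
```

A check like this can run at DAG-authoring time so that a misspelled method is caught before the task is scheduled.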
class airflow.providers.amazon.aws.transfers.s3_to_redshift.S3ToRedshiftOperator(*, table, s3_bucket, s3_key, schema=None, redshift_conn_id='redshift_default', aws_conn_id='aws_default', verify=None, column_list=None, copy_options=None, autocommit=False, method='APPEND', upsert_keys=None, redshift_data_api_kwargs=None, **kwargs)

Bases: airflow.models.BaseOperator

Executes a COPY command to load files from S3 to Redshift.

See also

For more information on how to use this operator, take a look at the guide: Amazon S3 To Amazon Redshift transfer operator

Parameters
  • table (str) – reference to a specific table in the Redshift database

  • s3_bucket (str) – reference to a specific S3 bucket

  • s3_key (str) – key prefix that selects single or multiple objects from S3

  • schema (str | None) – reference to a specific schema in the Redshift database. Do not provide when copying into a temporary table

  • redshift_conn_id (str) – reference to a specific Redshift database connection or a Redshift Data API connection

  • aws_conn_id (str | None) – reference to a specific S3 connection. If the AWS connection contains 'aws_iam_role' in extras, the operator will use AWS STS credentials with a token. See https://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-authorization.html#copy-credentials

  • verify (bool | str | None) – Whether to verify SSL certificates for the S3 connection. By default, SSL certificates are verified. You can provide the following values:

    • False: do not validate SSL certificates. SSL will still be used (unless use_ssl is False), but SSL certificates will not be verified.

    • path/to/cert/bundle.pem: the filename of the CA cert bundle to use. You can specify this argument if you want to use a different CA cert bundle than the one used by botocore.

  • column_list (list[str] | None) – list of column names used to load source data fields into specific target columns. See https://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-column-mapping.html#copy-column-list

  • copy_options (list | None) – reference to a list of COPY options

  • method (str) – Action to be performed on execution. Available methods are APPEND, UPSERT and REPLACE.

  • upsert_keys (list[str] | None) – List of fields to use as key on upsert action

  • redshift_data_api_kwargs (dict | None) – If using the Redshift Data API instead of the SQL-based connection, dict of arguments for the hook’s execute_query method. Cannot include any of these kwargs: {'sql', 'parameters'}
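The parameters above can be combined as in the following minimal sketch, which prepares the keyword arguments for an upsert load. The task ID, bucket, key prefix, table, column names, and COPY options are hypothetical values chosen for illustration; only the parameter names come from the signature above, and task_id is the usual BaseOperator argument passed through **kwargs.

```python
# Keyword arguments for S3ToRedshiftOperator. Parameter names come from the
# signature documented above; every value here is a hypothetical example.
copy_task_kwargs = {
    "task_id": "load_orders",  # BaseOperator argument, passed via **kwargs
    "s3_bucket": "example-bucket",
    "s3_key": "exports/orders/",
    "schema": "public",
    "table": "orders",
    "column_list": ["order_id", "customer_id", "amount"],
    "copy_options": ["CSV", "IGNOREHEADER 1"],
    "method": "UPSERT",
    "upsert_keys": ["order_id"],
    "redshift_conn_id": "redshift_default",
    "aws_conn_id": "aws_default",
}


def build_load_task():
    """Create the operator from the kwargs above.

    The import is done lazily so this snippet can be read and run
    without Airflow installed; calling this function requires the
    Amazon provider package.
    """
    from airflow.providers.amazon.aws.transfers.s3_to_redshift import (
        S3ToRedshiftOperator,
    )

    return S3ToRedshiftOperator(**copy_task_kwargs)
```

In a DAG file, build_load_task() would typically be called inside a DAG context so the resulting task is attached to that DAG.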

property use_redshift_data
template_fields: collections.abc.Sequence[str] = ('s3_bucket', 's3_key', 'schema', 'table', 'column_list', 'copy_options', 'redshift_conn_id',...
template_ext: collections.abc.Sequence[str] = ()
ui_color = '#99e699'
execute(context)

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
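For orientation, the sketch below shows roughly how the table, schema, column_list, s3_bucket, s3_key, and copy_options parameters could map onto a Redshift COPY statement. It is an assumption-laden illustration, not the operator's actual implementation: the function name build_copy_statement is hypothetical, and the authorization clause that a real COPY needs is deliberately left out.

```python
def build_copy_statement(
    table,
    s3_bucket,
    s3_key,
    schema=None,
    column_list=None,
    copy_options=None,
):
    """Assemble a COPY statement as a string (hypothetical helper).

    The authorization clause (credentials or IAM role) is intentionally
    omitted, so the result is illustrative rather than directly runnable
    against a cluster.
    """
    # Qualify the table with its schema only when a schema is given.
    destination = f"{schema}.{table}" if schema else table
    # Optional explicit column mapping, e.g. " (order_id, amount)".
    columns = f" ({', '.join(column_list)})" if column_list else ""
    lines = [
        f"COPY {destination}{columns}",
        f"FROM 's3://{s3_bucket}/{s3_key}'",
    ]
    # Each COPY option goes on its own line after the FROM clause.
    lines.extend(copy_options or [])
    return "\n".join(lines) + ";"


statement = build_copy_statement(
    table="orders",
    s3_bucket="example-bucket",
    s3_key="exports/orders/",
    schema="public",
    column_list=["order_id", "amount"],
    copy_options=["CSV", "IGNOREHEADER 1"],
)
```

Leaving schema as None yields an unqualified table name, which matches the note above about not providing a schema when copying into a temporary table.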

get_openlineage_facets_on_complete(task_instance)

Implement on_complete as we will query destination table.
