airflow.providers.amazon.aws.transfers.s3_to_sql¶

Classes¶

S3ToSqlOperator

Load Data from S3 into a SQL Database.

Module Contents¶

class airflow.providers.amazon.aws.transfers.s3_to_sql.S3ToSqlOperator(*, s3_key, s3_bucket, table, parser, column_list=None, commit_every=1000, schema=None, sql_conn_id='sql_default', sql_hook_params=None, aws_conn_id='aws_default', **kwargs)[source]¶

Bases: airflow.providers.amazon.version_compat.BaseOperator

Load Data from S3 into a SQL Database.

You need to provide a parser function that takes a filename as an input and returns an iterable of rows

See also

For more information on how to use this operator, take a look at the guide: Amazon S3 To SQL Transfer Operator

Parameters:

schema (str | None) – reference to a specific schema in SQL database
table (str) – reference to a specific table in SQL database
s3_bucket (str) – reference to a specific S3 bucket
s3_key (str) – reference to a specific S3 key
sql_conn_id (str) – reference to a specific SQL database. Must be of type DBApiHook
sql_hook_params (dict | None) – Extra config params to be passed to the underlying hook. Should match the desired hook constructor params.
aws_conn_id (str | None) – reference to a specific S3 / AWS connection
column_list (list[str] | None) – list of column names to use in the insert SQL.
commit_every (int) – The maximum number of rows to insert in one transaction. Set to 0 to insert all rows in one transaction.
parser (collections.abc.Callable[[str], collections.abc.Iterable[collections.abc.Iterable]]) –
parser function that takes a filepath as input and returns an iterable. e.g. to use a CSV parser that yields rows line-by-line, pass the following function:
```
def parse_csv(filepath):
    import csv

    with open(filepath, newline="") as file:
        yield from csv.reader(file)
```

template_fields: collections.abc.Sequence[str] = ('s3_bucket', 's3_key', 'schema', 'table', 'column_list', 'sql_conn_id')[source]¶

template_ext: collections.abc.Sequence[str] = ()[source]¶

ui_color = '#f4a460'[source]¶

s3_bucket[source]¶

s3_key[source]¶

table[source]¶

schema = None[source]¶

aws_conn_id = 'aws_default'[source]¶

sql_conn_id = 'sql_default'[source]¶

column_list = None[source]¶

commit_every = 1000[source]¶

parser[source]¶

sql_hook_params = None[source]¶

execute(context)[source]¶

Derive when creating an operator.

The main method to execute the task. Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

property db_hook[source]¶