airflow.providers.apache.drill.hooks.drill

Classes

DrillHook

Interact with Apache Drill via sqlalchemy-drill.

Module Contents

class airflow.providers.apache.drill.hooks.drill.DrillHook(*args, schema=None, log_sql=True, **kwargs)

Bases: airflow.providers.common.sql.hooks.sql.DbApiHook

Interact with Apache Drill via sqlalchemy-drill.

You can specify the SQLAlchemy dialect and driver that sqlalchemy-drill will employ to communicate with Drill in the extras field of your connection, e.g. {"dialect_driver": "drill+sadrill"} for communication over Drill’s REST API. See the sqlalchemy-drill documentation for descriptions of the supported dialects and drivers.

You can also set the default storage_plugin for the sqlalchemy-drill connection in the extras field, e.g. {"storage_plugin": "dfs"}.
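As a minimal sketch of the two extras described above, the snippet below assembles a connection definition and serializes it to JSON. The host `localhost` and port `8047` are assumptions chosen for illustration, and the sketch assumes an Airflow version that accepts JSON-serialized connections through an `AIRFLOW_CONN_<CONN_ID>` environment variable.

```python
import json

# Extras described for DrillHook: the SQLAlchemy dialect/driver and the
# default storage plugin.
extras = {"dialect_driver": "drill+sadrill", "storage_plugin": "dfs"}

# Host and port are assumptions for a local Drillbit REST endpoint.
connection = {
    "conn_type": "drill",
    "host": "localhost",
    "port": 8047,
    "extra": extras,
}

# The resulting string could be exported as AIRFLOW_CONN_DRILL_DEFAULT so that
# the default connection id ('drill_default') resolves to it.
connection_json = json.dumps(connection)
print(connection_json)
```

The same extras can be entered in the Extra field of the connection form in the Airflow UI instead.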

conn_name_attr = 'drill_conn_id'

default_conn_name = 'drill_default'

conn_type = 'drill'

hook_name = 'Drill'

supports_autocommit = False

get_conn()

Establish a connection to a Drillbit.

get_uri()

Return the connection URI.

e.g. drill://localhost:8047/dfs
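The shape of that URI can be illustrated with a small standalone helper. This is a sketch only: `build_drill_uri` is a hypothetical function, not part of the provider, and the actual `get_uri()` may add further components (such as the dialect driver) that the example above does not show. Defaulting the storage plugin to `dfs` is likewise an assumption.

```python
from typing import Optional


def build_drill_uri(
    host: str,
    port: Optional[int] = None,
    storage_plugin: str = "dfs",  # assumed default, for illustration only
) -> str:
    """Compose a URI of the form drill://<host>[:<port>]/<storage_plugin>."""
    # Hypothetical helper: it mirrors the documented example, not the hook's code.
    netloc = host if port is None else f"{host}:{port}"
    return f"drill://{netloc}/{storage_plugin}"


print(build_drill_uri("localhost", 8047, "dfs"))
```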

abstract set_autocommit(conn, autocommit)

Set the autocommit flag on the connection.

abstract insert_rows(table, rows, target_fields=None, commit_every=1000, replace=False, **kwargs)

Insert a collection of tuples into a table.

Rows are inserted in chunks; each chunk (of size commit_every) is committed in its own transaction.

Parameters:

table (str) – Name of the target table
rows (collections.abc.Iterable[tuple[str]]) – The rows to insert into the table
target_fields (collections.abc.Iterable[str] | None) – The names of the columns to fill in the table
commit_every (int) – The maximum number of rows to insert in one transaction. Set to 0 to insert all rows in one transaction.
replace (bool) – Whether to replace instead of insert
executemany – If True, all rows are inserted at once in chunks defined by the commit_every parameter. This only works if all rows have the same number of columns, but it leads to better performance.
fast_executemany – If True, the fast_executemany parameter is set on the cursor used by executemany, which improves performance if the driver supports it.
autocommit – What to set the connection’s autocommit setting to before executing the query.
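Because insert_rows is marked abstract on this hook, the following is only a sketch of the chunking semantics described for commit_every, not the provider's implementation. `iter_chunks` is a hypothetical helper; it treats a value of 0 as "everything in one chunk", matching the parameter description above.

```python
from collections.abc import Iterable, Iterator
from itertools import islice


def iter_chunks(rows: Iterable[tuple], commit_every: int) -> Iterator[list]:
    """Yield lists of at most commit_every rows; 0 yields one chunk with all rows."""
    if commit_every <= 0:
        chunk = list(rows)
        if chunk:
            yield chunk
        return
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, commit_every))
        if not chunk:
            return
        # In a real insert, each chunk would be written and then committed.
        yield chunk


rows = [(str(i),) for i in range(5)]
for chunk in iter_chunks(rows, commit_every=2):
    print(len(chunk), "rows in this transaction")
```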