airflow.providers.apache.hive.transfers.hive_to_mysql

This module contains an operator to move data from Hive to MySQL.

Module Contents

Classes

HiveToMySqlOperator

Moves data from Hive to MySQL.

class airflow.providers.apache.hive.transfers.hive_to_mysql.HiveToMySqlOperator(*, sql, mysql_table, hiveserver2_conn_id='hiveserver2_default', mysql_conn_id='mysql_default', mysql_preoperator=None, mysql_postoperator=None, bulk_load=False, hive_conf=None, **kwargs)

Bases: airflow.models.BaseOperator

Moves data from Hive to MySQL.

Note that the data is currently loaded into memory before being pushed to MySQL, so this operator should only be used for small amounts of data.

Parameters
  • sql (str) – SQL query to execute against Hive server. (templated)

  • mysql_table (str) – target MySQL table, use dot notation to target a specific database. (templated)

  • mysql_conn_id (str) – Reference to the connection id of the target MySQL database.

  • hiveserver2_conn_id (str) – Reference to the Hive Server2 thrift service connection id.

  • mysql_preoperator (str | None) – SQL statement to run against MySQL prior to the import, typically used to truncate or delete the rows that the incoming data replaces, allowing the task to be idempotent (running the task twice won’t double load data). (templated)

  • mysql_postoperator (str | None) – sql statement to run against mysql after the import, typically used to move data from staging to production and issue cleanup commands. (templated)

  • bulk_load (bool) – whether to use the bulk load path. When enabled, MySQL is loaded directly from a tab-delimited text file using the LOAD DATA LOCAL INFILE command. The MySQL server must allow loading local files via this command (it is disabled by default).

  • hive_conf (dict | None) – Hive configuration options (key/value pairs) to apply when running the query.
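The parameters above can be combined as in the following minimal sketch. The table names, column names, and task id are hypothetical and chosen only for illustration; the connection ids are the defaults from the signature. The pre-operator deletes the partition that is about to be reloaded, which is one way to get the idempotent behaviour described for mysql_preoperator. The provider import is kept inside a helper so the keyword arguments can be inspected without Airflow installed.

```python
# Hypothetical target table; dot notation selects the "reporting" database.
MYSQL_TABLE = "reporting.daily_sales"

# Keyword arguments for HiveToMySqlOperator. The strings containing
# "{{ ds }}" are left as-is here; Airflow renders them at run time because
# sql and mysql_preoperator are listed in template_fields.
OPERATOR_KWARGS = {
    "task_id": "hive_to_mysql_daily_sales",  # hypothetical task id
    # Hypothetical Hive query; keep the result set small, since the
    # operator loads it into memory.
    "sql": "SELECT ds, store_id, revenue FROM sales.daily_sales WHERE ds = '{{ ds }}'",
    "mysql_table": MYSQL_TABLE,
    "hiveserver2_conn_id": "hiveserver2_default",
    "mysql_conn_id": "mysql_default",
    # Remove rows for the same date first so a rerun does not double load.
    "mysql_preoperator": "DELETE FROM " + MYSQL_TABLE + " WHERE ds = '{{ ds }}'",
    "bulk_load": False,
}


def make_task(dag=None):
    """Create the operator; requires the apache-hive provider to be installed."""
    from airflow.providers.apache.hive.transfers.hive_to_mysql import (
        HiveToMySqlOperator,
    )

    return HiveToMySqlOperator(dag=dag, **OPERATOR_KWARGS)
```

In a DAG file, make_task(dag) would typically be called once at module level, or the operator would be instantiated inside a `with DAG(...)` block using the same keyword arguments.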

template_fields: collections.abc.Sequence[str] = ('sql', 'mysql_table', 'mysql_preoperator', 'mysql_postoperator')
template_ext: collections.abc.Sequence[str] = ('.sql',)
template_fields_renderers
ui_color = '#a0e08c'
execute(context)

Override this method when creating an operator.

The context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.
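As a rough illustration of the behaviour documented for this operator (pre-operator, load, post-operator, with an optional tab-delimited bulk load), here is a simplified sketch. It is not the provider's source code: fetch_rows, the mysql object, and RecordingMySql are hypothetical stand-ins for the Hive and MySQL hooks, used so the flow can be run without Airflow or a database.

```python
import csv
import os
import tempfile


def transfer(fetch_rows, mysql, table, preoperator=None, postoperator=None,
             bulk_load=False):
    """Copy rows into `table` and return the row count.

    `fetch_rows` is a hypothetical callable returning the Hive result rows;
    `mysql` is a hypothetical object with run, insert_rows and bulk_load.
    """
    rows = list(fetch_rows())  # the whole result set is held in memory
    if preoperator:
        mysql.run(preoperator)  # e.g. truncate or delete the old rows
    if bulk_load:
        # Write a tab-delimited file, the format LOAD DATA LOCAL INFILE reads.
        with tempfile.TemporaryDirectory() as tmp_dir:
            path = os.path.join(tmp_dir, "rows.tsv")
            with open(path, "w", newline="") as fh:
                writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
                writer.writerows(rows)
            mysql.bulk_load(table, path)
    else:
        mysql.insert_rows(table, rows)
    if postoperator:
        mysql.run(postoperator)  # e.g. move staging data to production
    return len(rows)


class RecordingMySql:
    """Stand-in that records calls instead of talking to a MySQL server."""

    def __init__(self):
        self.calls = []

    def run(self, sql):
        self.calls.append(("run", sql))

    def insert_rows(self, table, rows):
        self.calls.append(("insert_rows", table, list(rows)))

    def bulk_load(self, table, path):
        with open(path) as fh:
            self.calls.append(("bulk_load", table, fh.read()))


if __name__ == "__main__":
    db = RecordingMySql()
    transfer(lambda: [(1, "a"), (2, "b")], db, "staging.t",
             preoperator="DELETE FROM staging.t")
    for call in db.calls:
        print(call)
```

Passing the hooks in as arguments keeps the ordering of the steps easy to test in isolation; the real operator obtains its connections from hiveserver2_conn_id and mysql_conn_id instead.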
