Hive CLI Connection
The Hive CLI connection type enables the Hive CLI Integrations.
Authenticating to Hive CLI
There are two ways to connect to Hive using Airflow.
Use Hive Beeline, i.e. make a JDBC connection string with host, port, and schema. Optionally, you can connect as a proxy user and specify a login and password.
Use the Hive CLI, i.e. specify Hive CLI params in the extras field.
Only one authentication method can be used at a time. If you need to manage multiple credentials or keys, configure multiple connections.
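As a rough illustration of the Beeline option, the sketch below assembles a HiveServer2-style JDBC URL from the host, port, and schema fields, with an optional principal and proxy user appended as URL session variables. The function name `beeline_jdbc_url` and the port `10000` are assumptions made for this example; this is not the code Airflow itself runs.

```python
def beeline_jdbc_url(host, port, schema, principal=None, proxy_user=None):
    """Assemble a HiveServer2 JDBC URL from connection fields (illustrative only)."""
    url = f"jdbc:hive2://{host}:{port}/{schema}"
    if principal:
        # Kerberos principal of the Hive service, passed as a session variable.
        url += f";principal={principal}"
    if proxy_user:
        # Ask HiveServer2 to run statements as this user.
        url += f";hive.server2.proxy.user={proxy_user}"
    return url


# Hypothetical values, reusing the host and database names from this page.
print(beeline_jdbc_url("jdbc-hive-host", 10000, "hive-database"))
```
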
Default Connection IDs
All hooks and operators related to Hive CLI use hive_cli_default by default.
Configuring the Connection
- Login (optional)
Specify your username for a proxy user or for the Beeline CLI.
- Password (optional)
Specify your Beeline CLI password.
- Host (optional)
Specify your JDBC Hive host that is used for Hive Beeline.
- Port (optional)
Specify your JDBC Hive port that is used for Hive Beeline.
- Schema (optional)
Specify the JDBC Hive database that you want to connect to with Beeline, or specify a schema for an HQL statement to run with the Hive CLI.
- Use Beeline (optional)
Specify as True if using the Beeline CLI. Default is False.
- Proxy User (optional)
Specify a proxy user to run HQL code as this user.
- Principal (optional)
Specify the JDBC Hive principal to be used with Hive Beeline.
- High Availability (optional)
Specify as True if you want to connect to a Hive installation running in high availability mode. Specify host accordingly.
When specifying the connection in an environment variable, you should specify it using URI syntax.
Note that all components of the URI should be URL-encoded.
For example:
export AIRFLOW_CONN_HIVE_CLI_DEFAULT='hive-cli://beeline-username:beeline-password@jdbc-hive-host:80/hive-database?hive_cli_params=params&use_beeline=True&auth=noSasl&principal=hive%2F_HOST%40EXAMPLE.COM'
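To avoid URL-encoding each component by hand, the URI can be assembled with Python's standard library. The following is a minimal sketch that rebuilds the example value above and then parses it back to check the encoding round-trips. The helper name `build_hive_cli_uri` is hypothetical, and the environment variable name follows the `AIRFLOW_CONN_` plus upper-cased connection ID pattern seen in the example.

```python
from urllib.parse import parse_qs, quote, unquote, urlencode, urlsplit


def build_hive_cli_uri(login, password, host, port, schema, extras):
    """Build a hive-cli:// connection URI with every component URL-encoded."""
    query = urlencode(extras)  # percent-encodes '/' and '@' in values
    return (
        f"hive-cli://{quote(login, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(schema, safe='')}?{query}"
    )


uri = build_hive_cli_uri(
    login="beeline-username",
    password="beeline-password",
    host="jdbc-hive-host",
    port=80,
    schema="hive-database",
    extras={
        "hive_cli_params": "params",
        "use_beeline": "True",
        "auth": "noSasl",
        "principal": "hive/_HOST@EXAMPLE.COM",
    },
)

# Environment variable name derived from the connection ID.
env_name = "AIRFLOW_CONN_" + "hive_cli_default".upper()
print(f"export {env_name}='{uri}'")

# Parse the URI back to confirm the components decode to the original values.
parts = urlsplit(uri)
decoded_login = unquote(parts.username)
decoded_principal = parse_qs(parts.query)["principal"][0]
```

A password containing characters such as `@` or `/` would be percent-encoded by `quote(..., safe='')`, which is the main reason to build the value programmatically.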