apache-airflow-providers-apache-hive
Changelog¶
main¶
Warning
All deprecated classes, parameters and features have been removed from the Apache Hive provider package. The following breaking changes were introduced:
Removed deprecated GSSAPI for auth_mechanism. Use KERBEROS instead.
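A minimal sketch of that migration, assuming the connection extra is stored as a JSON object with an auth_mechanism key. The helper name migrate_auth_mechanism is hypothetical and is not part of the provider; it only rewrites the JSON text.

```python
import json


def migrate_auth_mechanism(extra_json: str) -> str:
    """Return the connection extra with GSSAPI replaced by KERBEROS.

    Hypothetical helper for illustration only: it rewrites the JSON text
    and never contacts Airflow or Hive.
    """
    extra = json.loads(extra_json) if extra_json else {}
    if extra.get("auth_mechanism") == "GSSAPI":
        extra["auth_mechanism"] = "KERBEROS"
    return json.dumps(extra, sort_keys=True)


print(migrate_auth_mechanism('{"auth_mechanism": "GSSAPI"}'))
# prints {"auth_mechanism": "KERBEROS"}
```

Other auth_mechanism values pass through unchanged, so the helper can be applied to every Hive connection extra without checking each one first.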
8.2.1¶
Misc¶
Add support for semicolon stripping to DbApiHook, PrestoHook, and TrinoHook (#41916)
Explain how to use uv with airflow virtualenv and make it works (#43604)
Move 'uncompress_file' function from 'airflow.utils' to Hive provider (#43526)
8.2.0¶
Note
This release of the provider is only available for Airflow 2.8+, as explained in the Apache Airflow providers support policy.
Misc¶
Bump minimum Airflow version in providers to Airflow 2.8.0 (#41396)
8.1.2¶
Misc¶
Update pandas minimum requirement for Python 3.12 (#40272)
implement per-provider tests with lowest-direct dependency resolution (#39946)
8.1.1¶
Misc¶
Faster 'airflow_version' imports (#39552)
Simplify 'airflow_version' imports (#39497)
Improvising high availability field name in hive hook (#39658)
8.1.0¶
Note
This release of the provider is only available for Airflow 2.7+, as explained in the Apache Airflow providers support policy.
Misc¶
Bump minimum Airflow version in providers to Airflow 2.7.0 (#39240)
8.0.0¶
Breaking changes¶
Changed the default value of use_beeline in the hive cli connection to True. Beeline is now enabled by default in this connection type.
Removed the deprecated parameter authMechanism from HiveHook and dependent operators. Use auth_mechanism in your extra instead.
HiveOperator: Removed the method get_hook in favor of the hook property.
HiveStatsCollectionOperator: Removed the deprecated col_blacklist in favor of excluded_columns.
Setting use_beeline by default for hive cli connection (#38763)
Removing deprecated code in hive provider (#38859)
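The two renames in the 8.0.0 breaking changes (authMechanism to auth_mechanism, col_blacklist to excluded_columns) can be applied mechanically. The sketch below is illustrative only: rename_removed_keys and RENAMED_KEYS are hypothetical names, and the function works on plain dictionaries rather than on Airflow objects.

```python
# Old name -> new name, taken from the 8.0.0 breaking changes.
RENAMED_KEYS = {
    "authMechanism": "auth_mechanism",    # key in the connection extra
    "col_blacklist": "excluded_columns",  # HiveStatsCollectionOperator argument
}


def rename_removed_keys(settings: dict) -> dict:
    """Return a copy of `settings` with removed names replaced by their successors."""
    migrated = {}
    for key, value in settings.items():
        new_key = RENAMED_KEYS.get(key, key)
        if new_key != key and new_key in settings:
            # Refuse to guess which value wins when both spellings are present.
            raise ValueError(f"both {key!r} and {new_key!r} are set")
        migrated[new_key] = value
    return migrated


print(rename_removed_keys({"authMechanism": "LDAP", "use_beeline": True}))
# prints {'auth_mechanism': 'LDAP', 'use_beeline': True}
```

Raising on a conflict, rather than silently preferring one spelling, keeps the migration from hiding a stale value.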
Features¶
Adding support to hive hook for high availability Hive installations (#38651)
7.0.1¶
Misc¶
Remove references from the code to Jira Issues (#37807)
Unify 'aws_conn_id' type to always be 'str | None' (#37768)
Limit 'pandas' to '<2.2' (#37748)
7.0.0¶
Breaking changes¶
Removed the ability to specify a proxy user as owner, login, or as_param in the connection. Instead, set the user in the Proxy User connection parameter or pass proxy_user to HiveHook.
Simplify hive client connection (#37043)
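To find connections affected by this change, a check along the following lines may help. It is a sketch under assumptions: the placeholder is assumed to live under a proxy_user key of the connection extra, and uses_legacy_proxy_user is a hypothetical helper, not provider code.

```python
# Placeholder values that 7.0.0 no longer interprets (from the note above).
LEGACY_PROXY_VALUES = ("owner", "login", "as_param")


def uses_legacy_proxy_user(extra: dict) -> bool:
    """Report whether a connection extra still relies on a removed placeholder.

    Assumption: the placeholder is stored under the "proxy_user" key.
    """
    return extra.get("proxy_user") in LEGACY_PROXY_VALUES


print(uses_legacy_proxy_user({"proxy_user": "owner"}))     # prints True
print(uses_legacy_proxy_user({"proxy_user": "etl_user"}))  # prints False
```

Connections flagged by the check need an explicit user name in the Proxy User connection parameter, or a proxy_user argument passed to the hook.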
Misc¶
Fix pyhive hive_pure_sasl extra name (#37323)
6.4.2¶
Bug Fixes¶
Fix assignment of template field in '__init__' in 'hive-stats' (#36905)
Misc¶
Set min pandas dependency to 1.2.5 for all providers and airflow (#36698)
6.4.0¶
Features¶
Add param proxy user for hive (#36221)
Misc¶
Add code snippet formatting in docstrings via Ruff (#36262)
6.3.0¶
Note
This release of the provider is only available for Airflow 2.6+, as explained in the Apache Airflow providers support policy.
Misc¶
Bump minimum Airflow version in providers to Airflow 2.6.0 (#36017)
6.2.0¶
Note
This release of the provider is only available for Airflow 2.5+, as explained in the Apache Airflow providers support policy.
Misc¶
Bump min airflow version of providers (#34728)
Consolidate hook management in HiveOperator (#34430)
6.1.6¶
Misc¶
Refactor regex in providers (#33898)
Replace sequence concatenation by unpacking in Airflow providers (#33933)
Replace single element slice by next() in hive provider (#33937)
Use a single statement with multiple contexts instead of nested statements in providers (#33768)
Use startswith once with a tuple in Hive hook (#33765)
Refactor: Simplify a few loops (#33736)
E731: replace lambda by a def method in Airflow providers (#33757)
Use f-string instead of in Airflow providers (#33752)
6.1.5¶
Note
The provider now uses pure-sasl, a pure-Python implementation of SASL. It is better maintained than the previous sasl implementation, even if it is a bit slower for the SASL interface. It also allows the Hive provider to be installed on Python 3.11.
Misc¶
Bring back hive support for Python 3.11 (#32607)
Refactor: Simplify code in Apache/Alibaba providers (#33227)
Simplify 'X for X in Y' to 'Y' where applicable (#33453)
Replace OrderedDict with plain dict (#33508)
Simplify code around enumerate (#33476)
Use str.splitlines() to split lines in providers (#33593)
Simplify conditions on len() in providers/apache (#33564)
Replace repr() with proper formatting (#33520)
Avoid importing pandas and numpy in runtime and module level (#33483)
Consolidate import and usage of pandas (#33480)
6.1.3¶
Bug Fixes¶
Fix Pandas2 compatibility for Hive (#32752)
Misc¶
Add more accurate typing for DbApiHook.run method (#31846)
Move Hive configuration to Apache Hive provider (#32777)
6.1.1¶
Note
This release dropped support for Python 3.7.
Bug Fixes¶
Sanitize beeline principal parameter (#31983)
Misc¶
Replace unicodecsv with standard csv library (#31693)
6.1.0¶
Note
This release of the provider is only available for Airflow 2.4+, as explained in the Apache Airflow providers support policy.
Misc¶
Bump minimum Airflow version in providers (#30917)
Update return types of 'get_key' methods on 'S3Hook' (#30923)
6.0.0¶
Breaking changes¶
The auth option is moved from the extra field to the auth parameter in the Hook. If you have extra parameters defined in your connections as auth, you should move them to the DAG where your HiveOperator or other Hive related operators are used.
Move auth parameter from extra to Hook parameter (#30212)
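One way to carry out this move is to split the old extra into the part that stays on the connection and the part that becomes a hook argument. The sketch below assumes the extra is a JSON object; split_auth_from_extra is a hypothetical helper name, and passing the resulting dictionary to a hook or operator is left to the DAG author.

```python
import json


def split_auth_from_extra(extra_json: str) -> tuple[str, dict]:
    """Split a legacy connection extra into (new extra, hook keyword arguments)."""
    extra = json.loads(extra_json) if extra_json else {}
    hook_kwargs = {}
    if "auth" in extra:
        # `auth` no longer belongs in the extra; hand it to the hook instead.
        hook_kwargs["auth"] = extra.pop("auth")
    return json.dumps(extra, sort_keys=True), hook_kwargs


print(split_auth_from_extra('{"auth": "NOSASL", "use_beeline": true}'))
# prints ('{"use_beeline": true}', {'auth': 'NOSASL'})
```

The first element is written back to the connection; the second holds the value to supply in the DAG where the Hive operator or hook is created.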
5.1.0¶
Features¶
The apache.hive provider now provides the hive macros that used to be provided by Airflow. As of version 5.1.0 of apache.hive, the hive macros are provided by the provider.
Move Hive macros to the provider (#28538)
Make pandas dependency optional for Amazon Provider (#28505)
5.0.0¶
Breaking changes¶
The hive_cli_params from the connection were moved to the Hook. If you have extra parameters defined in your connections as the hive_cli_params extra, you should move them to the DAG where your HiveOperator is used.
Move hive_cli_params to hook parameters (#28101)
Features¶
Improve filtering for invalid schemas in Hive hook (#27808)
4.1.0¶
Note
This release of the provider is only available for Airflow 2.3+, as explained in the Apache Airflow providers support policy.
Misc¶
Move min airflow version to 2.3.0 for all providers (#27196)
Bug Fixes¶
Filter out invalid schemas in Hive hook (#27647)
4.0.0¶
Breaking Changes¶
The hql parameter in get_records of HiveServer2Hook has been renamed to sql to match the get_records signature of DbApiHook. If you passed it as a positional parameter, nothing changes for you, but if you passed it as a keyword parameter, you need to rename it.
The hive_conf parameter has been renamed to parameters and is now the second parameter, to match the get_records signature of DbApiHook. You need to rename it if you used it.
The schema parameter in get_records is an optional extra keyword argument that you can add, to match the schema of get_records from DbApiHook.
Deprecate hql parameters and synchronize DBApiHook method APIs (#25299)
Remove Smart Sensors (#25507)
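For call sites that passed get_records arguments by keyword, the 4.0.0 renames (hql to sql, hive_conf to parameters) can be applied with a small translation step. This is only a sketch: translate_get_records_kwargs is a hypothetical helper that works on a plain dictionary of keyword arguments and does not call the hook.

```python
def translate_get_records_kwargs(kwargs: dict) -> dict:
    """Map pre-4.0.0 keyword names for get_records to the renamed ones."""
    renames = {"hql": "sql", "hive_conf": "parameters"}
    translated = {}
    for name, value in kwargs.items():
        new_name = renames.get(name, name)
        if new_name in translated:
            # Old and new spellings were both passed; do not pick one silently.
            raise TypeError(f"{new_name!r} supplied more than once")
        translated[new_name] = value
    return translated


old_call = {"hql": "SELECT 1", "hive_conf": {"k": "v"}, "schema": "default"}
print(translate_get_records_kwargs(old_call))
# prints {'sql': 'SELECT 1', 'parameters': {'k': 'v'}, 'schema': 'default'}
```

Positional calls that passed only the query need no change. A call that passed hive_conf positionally should be checked by hand, because that parameter also moved to the second position.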
3.1.0¶
Features¶
Move all SQL classes to common-sql provider (#24836)
Bug Fixes¶
fix connection extra parameter 'auth_mechanism' in 'HiveMetastoreHook' and 'HiveServer2Hook' (#24713)
3.0.0¶
Breaking changes¶
Note
This release of the provider is only available for Airflow 2.2+, as explained in the Apache Airflow providers support policy.
Misc¶
chore: Refactoring and Cleaning Apache Providers (#24219)
AIP-47 - Migrate hive DAGs to new design #22439 (#24204)
2.3.0¶
Features¶
Set larger limit get_partitions_by_filter in HiveMetastoreHook (#21504)
Bug Fixes¶
Fix Python 3.9 support in Hive (#21893)
Fix key typo in 'template_fields_renderers' for 'HiveOperator' (#21525)
Misc¶
Support for Python 3.10
Add how-to guide for hive operator (#21590)
2.2.0¶
Features¶
Add more SQL template fields renderers (#21237)
Add conditional 'template_fields_renderers' check for new SQL lexers (#21403)
2.0.2¶
Bug fixes¶
HiveHook fix get_pandas_df() failure when it tries to read an empty table (#17777)
Misc¶
Optimise connection importing for Airflow 2.2.0
2.0.0¶
Breaking changes¶
Auto-apply apply_default decorator (#15667)
Warning
Due to apply_default decorator removal, this version of the provider requires Airflow 2.1.0+.
If your Airflow version is < 2.1.0, and you want to install this provider version, first upgrade
Airflow to at least version 2.1.0. Otherwise your Airflow package version will be upgraded
automatically and you will have to manually run airflow upgrade db
to complete the migration.
1.0.3¶
Bug fixes¶
Fix mistake and typos in doc/docstrings (#15180)
Fix grammar and remove duplicate words (#14647)
Resolve issue related to HiveCliHook kill (#14542)
1.0.1¶
Updated documentation and readme files.
Bug fixes¶
Remove password if in LDAP or CUSTOM mode HiveServer2Hook (#11767)
1.0.0¶
Initial version of the provider.