apache-airflow-providers-apache-spark
Changelog¶
5.0.0¶
Note
This release of the provider is only available for Airflow 2.9+, as explained in the Apache Airflow providers support policy.
Breaking changes¶
Warning
All deprecated classes, parameters and features have been removed from the Apache Spark provider package. The following breaking changes were introduced:
Operators
Removed _sql() support for SparkSqlOperator. Please use the sql attribute instead. _sql was introduced in 2016 and, because it was listed as a templated field (which is no longer the case), it was treated as public API despite the _ prefix that marked it as private.
Remove deprecated code from apache spark provider (#44567)
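When preparing for the removal described above, it may help to locate code that still reads the private _sql attribute. The following is a minimal sketch, not part of the provider: the helper name find_private_sql_refs is hypothetical, and the pattern is a plain text match, so it can also flag unrelated attributes named _sql on other classes.

```python
import re

# Matches attribute access such as "op._sql" or "op._sql()", but not "op._sql_path".
PRIVATE_SQL_REF = re.compile(r"\._sql\b")


def find_private_sql_refs(source: str) -> list[int]:
    """Return 1-based line numbers in `source` that reference a `._sql` attribute."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if PRIVATE_SQL_REF.search(line)
    ]
```

Each reported line is a candidate for switching to the public sql attribute; the results still need a manual review.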
Misc¶
Bump minimum Airflow version in providers to Airflow 2.9.0 (#44956)
Fix failing mypy check on 'main' (#44191)
spark-submit: replace 'principle' by 'principal' (#44150)
Update DAG example links in multiple providers documents (#44034)
4.11.3¶
Misc¶
Move python operator to Standard provider (#42081)
4.11.2¶
Bug Fixes¶
Changed conf property from str to dict in SparkSqlOperator (#42835)
4.11.1¶
Misc¶
Refactor function resolve_kerberos_principal (#42777)
4.11.0¶
Features¶
Add kerberos related connection fields (principal, keytab) on SparkSubmitHook (#40757)
4.10.0¶
Note
This release of the provider is only available for Airflow 2.8+, as explained in the Apache Airflow providers support policy.
Misc¶
Bump minimum Airflow version in providers to Airflow 2.8.0 (#41396)
Resolve 'AirflowProviderDeprecationWarning' in 'SparkSqlOperator' (#41358)
4.9.0¶
Features¶
Add 'kubernetes_application_id' to 'SparkSubmitHook' (#40753)
Bug Fixes¶
(fix): spark submit pod name with driver as part of its name (#40732)
4.8.2¶
Misc¶
implement per-provider tests with lowest-direct dependency resolution (#39946)
4.8.1¶
Misc¶
Faster 'airflow_version' imports (#39552)
Simplify 'airflow_version' imports (#39497)
4.8.0¶
Note
This release of the provider is only available for Airflow 2.7+, as explained in the Apache Airflow providers support policy.
Bug Fixes¶
Rename SparkSubmitOperator argument queue as yarn_queue (#38852)
Misc¶
Bump minimum Airflow version in providers to Airflow 2.7.0 (#39240)
4.7.2¶
Misc¶
Rename 'SparkSubmitOperator' fields names to comply with templated fields validation (#38051)
Rename 'SparkSqlOperator' fields name to comply with templated fields validation (#38045)
4.7.1¶
Misc¶
Bump min version for grpcio-status in spark provider (#36662)
4.7.0¶
change spark connection form and add spark connections docs (#36419)
4.6.0¶
Features¶
SparkSubmit: Adding propertyfiles option (#36164)
SparkSubmit Connection Extras can be overridden (#36151)
Bug Fixes¶
Follow BaseHook connection fields method signature in child classes (#36086)
4.5.0¶
Note
This release of the provider is only available for Airflow 2.6+, as explained in the Apache Airflow providers support policy.
Misc¶
Bump minimum Airflow version in providers to Airflow 2.6.0 (#36017)
4.4.0¶
Features¶
Add pyspark decorator (#35247)
Add use_krb5ccache option to SparkSubmitOperator (#35331)
4.3.0¶
Features¶
Add 'use_krb5ccache' option to 'SparkSubmitHook' (#34386)
4.2.0¶
Note
This release of the provider is only available for Airflow 2.5+, as explained in the Apache Airflow providers support policy.
Misc¶
Bump min airflow version of providers (#34728)
4.1.5¶
Misc¶
Refactor regex in providers (#33898)
4.1.4¶
Misc¶
Refactor: Simplify code in Apache/Alibaba providers (#33227)
4.1.3¶
Bug Fixes¶
Validate conn_prefix in extra field for Spark JDBC hook (#32946)
4.1.2¶
Note
The provider now expects apache-airflow-providers-cncf-kubernetes version 7.4.0+ to be installed in order to run Spark on Kubernetes jobs. You can install the provider with the cncf.kubernetes extra, using pip install apache-airflow-providers-apache-spark[cncf.kubernetes], to get the right version of the cncf.kubernetes provider installed.
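As a rough illustration of the version requirement in the note above, the following sketch checks whether an installed cncf.kubernetes provider meets the 7.4.0 minimum. The helper names parse_version, meets_minimum and installed_k8s_provider_ok are hypothetical and not part of any provider API. The parsing is deliberately naive and handles only dotted numeric versions; a real check would more likely rely on a dedicated version-parsing library.

```python
from importlib import metadata

MIN_K8S_PROVIDER = (7, 4, 0)  # minimum version stated in the note above


def parse_version(text: str) -> tuple[int, ...]:
    """Parse a dotted numeric version such as '7.4.0'.

    Pads to three components so that '7.4' compares equal to '7.4.0'.
    Raises ValueError for pre-release tags such as '7.4.0rc1'.
    """
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def meets_minimum(installed: str, minimum: tuple[int, ...] = MIN_K8S_PROVIDER) -> bool:
    """Return True if the installed version string is at least `minimum`."""
    return parse_version(installed) >= minimum


def installed_k8s_provider_ok() -> bool:
    """Return True if the cncf.kubernetes provider is installed and recent enough."""
    try:
        installed = metadata.version("apache-airflow-providers-cncf-kubernetes")
    except metadata.PackageNotFoundError:
        return False
    return meets_minimum(installed)
```

Calling installed_k8s_provider_ok() in the target environment gives a quick yes/no answer before running Spark on Kubernetes jobs.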
Misc¶
Move all k8S classes to cncf.kubernetes provider (#32767)
4.1.1¶
Note
This release dropped support for Python 3.7
Misc¶
SparkSubmitOperator: rename spark_conn_id to conn_id (#31952)
4.1.0¶
Note
This release of the provider is only available for Airflow 2.4+, as explained in the Apache Airflow providers support policy.
Misc¶
Bump minimum Airflow version in providers (#30917)
4.0.1¶
Bug Fixes¶
Only restrict spark binary passed via extra (#30213)
Validate host and schema for Spark JDBC Hook (#30223)
Add spark3-submit to list of allowed spark-binary values (#30068)
4.0.0¶
Note
This release of the provider is only available for Airflow 2.3+, as explained in the Apache Airflow providers support policy.
Breaking changes¶
The spark-binary connection extra could previously be set to any binary, but as of version 4.0.0 only two values are allowed for it: spark-submit and spark2-submit.
The spark-home connection extra is no longer allowed; the binary must be available on the PATH in order to use SparkSubmitHook and SparkSubmitOperator.
Remove custom spark home and custom binaries for spark (#27646)
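The restrictions above can be checked before upgrading with a small script over the extras stored on existing Spark connections. This is only a sketch under stated assumptions: check_spark_extras is a hypothetical helper, the extras are assumed to be already loaded as a plain dict, and the provider's own validation logic is not shown in this changelog. The allowed set reflects 4.0.0 only; 4.0.1 adds spark3-submit.

```python
# Values permitted for the 'spark-binary' extra as of 4.0.0, per the notes above.
ALLOWED_SPARK_BINARIES = frozenset({"spark-submit", "spark2-submit"})


def check_spark_extras(extra: dict) -> list[str]:
    """Return human-readable problems found in a connection's extra dict."""
    problems = []
    if "spark-home" in extra:
        problems.append("'spark-home' is no longer allowed; put the binary on PATH instead")
    binary = extra.get("spark-binary")
    if binary is not None and binary not in ALLOWED_SPARK_BINARIES:
        problems.append(
            f"'spark-binary' must be one of {sorted(ALLOWED_SPARK_BINARIES)}, got {binary!r}"
        )
    return problems


# Example: an extra that sets both a custom home and a full binary path.
for problem in check_spark_extras(
    {"spark-home": "/opt/spark", "spark-binary": "/opt/spark/bin/spark-submit"}
):
    print(problem)
```

An empty list means the extras are compatible with the 4.0.0 rules as described here.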
Misc¶
Move min airflow version to 2.3.0 for all providers (#27196)
3.0.0¶
Breaking changes¶
Note
This release of the provider is only available for Airflow 2.2+, as explained in the Apache Airflow providers support policy.
Bug Fixes¶
Add typing for airflow/configuration.py (#23716)
Fix backwards-compatibility introduced by fixing mypy problems (#24230)
Misc¶
AIP-47 - Migrate spark DAGs to new design #22439 (#24210)
chore: Refactoring and Cleaning Apache Providers (#24219)
2.1.3¶
Bug Fixes¶
Fix mistakenly added install_requires for all providers (#22382)
2.1.2¶
Misc¶
Add Trove classifiers in PyPI (Framework :: Apache Airflow :: Provider)
2.1.1¶
Bug Fixes¶
fix param rendering in docs of SparkSubmitHook (#21788)
Misc¶
Support for Python 3.10
2.1.0¶
Features¶
Add more SQL template fields renderers (#21237)
Add optional features in providers. (#21074)
2.0.3¶
Bug Fixes¶
Ensure Spark driver response is valid before setting UNKNOWN status (#19978)
2.0.2¶
Bug Fixes¶
fix bug of SparkSql Operator log going to infinite loop. (#19449)
2.0.1¶
Misc¶
Optimise connection importing for Airflow 2.2.0
2.0.0¶
Breaking changes¶
Auto-apply apply_default decorator (#15667)
Warning
Due to the removal of the apply_default decorator, this version of the provider requires Airflow 2.1.0+. If your Airflow version is < 2.1.0 and you want to install this provider version, first upgrade Airflow to at least version 2.1.0. Otherwise your Airflow package version will be upgraded automatically and you will have to manually run airflow db upgrade to complete the migration.
Bug fixes¶
Make SparkSqlHook use Connection (#15794)
1.0.3¶
Bug fixes¶
Fix 'logging.exception' redundancy (#14823)
1.0.2¶
Bug fixes¶
Use apache.spark provider without kubernetes (#14187)
1.0.1¶
Updated documentation and readme files.
1.0.0¶
Initial version of the provider.