Example Airflow DAG that shows how to create a Dataproc cluster in Google Kubernetes Engine.
Required environment variables:
GKE_NAMESPACE = os.environ.get("GKE_NAMESPACE", f"{CLUSTER_NAME}")
A single GKE cluster can host multiple Dataproc clusters, each running in its own namespace.
Set GKE_NAMESPACE to choose a namespace; if it is unset, the value of CLUSTER_NAME is used as the default.
Note: the optional kubernetes_namespace parameter in VIRTUAL_CLUSTER_CONFIG should be the same as GKE_NAMESPACE.
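The relationship between GKE_NAMESPACE and VIRTUAL_CLUSTER_CONFIG can be sketched as follows. This is a minimal illustration, not the module's actual configuration: the project ID, cluster names, and the helper build_virtual_cluster_config are hypothetical placeholders, and only the fields needed to show the namespace wiring are included. The key layout assumes the Dataproc KubernetesClusterConfig message, where kubernetes_namespace sits next to gke_cluster_config.

```python
import os

# Hypothetical placeholder values; the real module derives these from its own settings.
PROJECT_ID = "my-project"
REGION = "us-central1"
CLUSTER_NAME = "cluster-dataproc-gke"
GKE_CLUSTER_NAME = "dataproc-gke-test"

# Fall back to the Dataproc cluster name when GKE_NAMESPACE is not set.
GKE_NAMESPACE = os.environ.get("GKE_NAMESPACE", CLUSTER_NAME)


def build_virtual_cluster_config(project_id, region, gke_cluster_name, namespace):
    """Return a partial virtual cluster config that pins the Kubernetes namespace.

    Hypothetical helper for illustration only; a complete config also needs
    node pool targets, a software config, and a staging bucket.
    """
    gke_target = f"projects/{project_id}/locations/{region}/clusters/{gke_cluster_name}"
    return {
        "kubernetes_cluster_config": {
            # Must match GKE_NAMESPACE so the virtual cluster lands in the intended namespace.
            "kubernetes_namespace": namespace,
            "gke_cluster_config": {"gke_cluster_target": gke_target},
        },
    }


VIRTUAL_CLUSTER_CONFIG = build_virtual_cluster_config(
    PROJECT_ID, REGION, GKE_CLUSTER_NAME, GKE_NAMESPACE
)
```

Deriving both values from one variable keeps them from drifting apart when several Dataproc clusters share the same GKE cluster.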
Module Contents
- tests.system.google.cloud.dataproc.example_dataproc_gke.ENV_ID
- tests.system.google.cloud.dataproc.example_dataproc_gke.DAG_ID = 'dataproc_gke'
- tests.system.google.cloud.dataproc.example_dataproc_gke.PROJECT_ID
- tests.system.google.cloud.dataproc.example_dataproc_gke.REGION = 'us-central1'
- tests.system.google.cloud.dataproc.example_dataproc_gke.CLUSTER_NAME_BASE
- tests.system.google.cloud.dataproc.example_dataproc_gke.CLUSTER_NAME_FULL
- tests.system.google.cloud.dataproc.example_dataproc_gke.CLUSTER_NAME
- tests.system.google.cloud.dataproc.example_dataproc_gke.GKE_CLUSTER_NAME
- tests.system.google.cloud.dataproc.example_dataproc_gke.WORKLOAD_POOL
- tests.system.google.cloud.dataproc.example_dataproc_gke.GKE_CLUSTER_CONFIG
- tests.system.google.cloud.dataproc.example_dataproc_gke.GKE_NAMESPACE
- tests.system.google.cloud.dataproc.example_dataproc_gke.VIRTUAL_CLUSTER_CONFIG
- tests.system.google.cloud.dataproc.example_dataproc_gke.create_gke_cluster
- tests.system.google.cloud.dataproc.example_dataproc_gke.test_run