tests.system.google.cloud.dataproc.example_dataproc_gke

Example Airflow DAG that shows how to create a Dataproc cluster in Google Kubernetes Engine.

Required environment variables:

GKE_NAMESPACE = os.environ.get("GKE_NAMESPACE", f"{CLUSTER_NAME}")

A GKE cluster can support multiple Dataproc clusters running in different namespaces. Define a namespace explicitly or fall back to the default, which is the value of CLUSTER_NAME.

Note: the optional kubernetes_namespace parameter in VIRTUAL_CLUSTER_CONFIG should be the same as GKE_NAMESPACE.
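The namespace requirement can be illustrated with a minimal sketch. It is not the module's actual configuration: the values of PROJECT_ID, CLUSTER_NAME, and GKE_CLUSTER_NAME below are hypothetical placeholders, and the dictionary is reduced to the fields needed to show how kubernetes_namespace is kept equal to GKE_NAMESPACE. The field names follow the Dataproc virtual cluster config as understood here and should be checked against the API reference.

```python
import os

# Hypothetical placeholder values for illustration only; the real module
# derives its names from ENV_ID and the system-test project settings.
PROJECT_ID = "example-project"
REGION = "us-central1"
CLUSTER_NAME = "example-dataproc-gke"
GKE_CLUSTER_NAME = "example-gke"

# Read the namespace from the environment, defaulting to the cluster name.
GKE_NAMESPACE = os.environ.get("GKE_NAMESPACE", f"{CLUSTER_NAME}")

# Reduced sketch of a virtual cluster config. Reusing GKE_NAMESPACE here
# keeps the Dataproc namespace and the GKE namespace in sync.
VIRTUAL_CLUSTER_CONFIG = {
    "kubernetes_cluster_config": {
        "kubernetes_namespace": GKE_NAMESPACE,
        "gke_cluster_config": {
            "gke_cluster_target": (
                f"projects/{PROJECT_ID}/locations/{REGION}"
                f"/clusters/{GKE_CLUSTER_NAME}"
            ),
        },
    },
}
```

Referencing the single GKE_NAMESPACE variable in both places, rather than repeating a literal string, avoids the two values drifting apart when the environment variable is overridden.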

Attributes

ENV_ID

DAG_ID

PROJECT_ID

REGION

CLUSTER_NAME_BASE

CLUSTER_NAME_FULL

CLUSTER_NAME

GKE_CLUSTER_NAME

WORKLOAD_POOL

GKE_CLUSTER_CONFIG

GKE_NAMESPACE

VIRTUAL_CLUSTER_CONFIG

create_gke_cluster

test_run

Module Contents

tests.system.google.cloud.dataproc.example_dataproc_gke.ENV_ID
tests.system.google.cloud.dataproc.example_dataproc_gke.DAG_ID = 'dataproc_gke'
tests.system.google.cloud.dataproc.example_dataproc_gke.PROJECT_ID
tests.system.google.cloud.dataproc.example_dataproc_gke.REGION = 'us-central1'
tests.system.google.cloud.dataproc.example_dataproc_gke.CLUSTER_NAME_BASE = ''
tests.system.google.cloud.dataproc.example_dataproc_gke.CLUSTER_NAME_FULL = ''
tests.system.google.cloud.dataproc.example_dataproc_gke.CLUSTER_NAME = ''
tests.system.google.cloud.dataproc.example_dataproc_gke.GKE_CLUSTER_NAME = ''
tests.system.google.cloud.dataproc.example_dataproc_gke.WORKLOAD_POOL = 'Uninferable.svc.id.goog'
tests.system.google.cloud.dataproc.example_dataproc_gke.GKE_CLUSTER_CONFIG
tests.system.google.cloud.dataproc.example_dataproc_gke.GKE_NAMESPACE
tests.system.google.cloud.dataproc.example_dataproc_gke.VIRTUAL_CLUSTER_CONFIG
tests.system.google.cloud.dataproc.example_dataproc_gke.create_gke_cluster
tests.system.google.cloud.dataproc.example_dataproc_gke.test_run
