airflow.providers.apache.kylin.operators.kylin_cube

Module Contents

Classes

KylinCubeOperator

Submit a Kylin build/refresh/merge request and track the job status.

class airflow.providers.apache.kylin.operators.kylin_cube.KylinCubeOperator(*, kylin_conn_id='kylin_default', project=None, cube=None, dsn=None, command=None, start_time=None, end_time=None, offset_start=None, offset_end=None, segment_name=None, is_track_job=False, interval=60, timeout=60 * 60 * 24, eager_error_status=('ERROR', 'DISCARDED', 'KILLED', 'SUICIDAL', 'STOPPED'), **kwargs)

Bases: airflow.models.BaseOperator

Submit a Kylin build/refresh/merge request and track the job status.

For more detailed information, see the Apache Kylin documentation.

Parameters
  • kylin_conn_id (str) – The connection id as configured in Airflow administration.

  • project (str | None) – Kylin project name; this parameter overrides the project configured in kylin_conn_id.

  • cube (str | None) – kylin cube name

  • dsn (str | None) – DSN URL of the Kylin connection, which overrides kylin_conn_id, for example: kylin://ADMIN:KYLIN@sandbox/learn_kylin?timeout=60&is_debug=1

  • command (str | None) – Kylin command to run; one of 'build', 'merge', 'refresh', 'delete', 'build_streaming', 'merge_streaming', 'refresh_streaming', 'disable', 'enable', 'purge', 'clone', 'drop' (see the usage sketch after this parameter list):
      build - uses the /kylin/api/cubes/{cubeName}/build REST API with buildType 'BUILD'; requires start_time and end_time
      refresh - uses the build REST API with buildType 'REFRESH'
      merge - uses the build REST API with buildType 'MERGE'
      build_streaming - uses the /kylin/api/cubes/{cubeName}/build2 REST API with buildType 'BUILD'; requires offset_start and offset_end
      refresh_streaming - uses the build2 REST API with buildType 'REFRESH'
      merge_streaming - uses the build2 REST API with buildType 'MERGE'
      delete - deletes a segment; requires segment_name
      disable - disables the cube
      enable - enables the cube
      purge - purges the cube
      clone - clones the cube; the new cube name is {cube_name}_clone
      drop - drops the cube

  • start_time (str | None) – build segment start time

  • end_time (str | None) – build segment end time

  • offset_start (str | None) – streaming build segment start time

  • offset_end (str | None) – streaming build segment end time

  • segment_name (str | None) – segment name

  • is_track_job (bool) – whether to track job status. If True, the operator tracks the job until its status is one of ("FINISHED", "ERROR", "DISCARDED", "KILLED", "SUICIDAL", "STOPPED") or the timeout is reached.

  • interval (int) – interval in seconds between job status checks; default is 60.

  • timeout (int) – timeout in seconds for job tracking; default is one day (60 * 60 * 24 seconds).

  • eager_error_status – job error statuses. If the job status is in this list, the task fails. Default is ("ERROR", "DISCARDED", "KILLED", "SUICIDAL", "STOPPED").
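
A minimal usage sketch, not taken from the provider's own examples: the DAG id, task id, and the learn_kylin / kylin_sales_cube names are illustrative assumptions, and the start/end values are example epoch-millisecond timestamps. It submits a batch build through the connection configured as kylin_default and tracks the job until it finishes.

from datetime import datetime

from airflow import DAG
from airflow.providers.apache.kylin.operators.kylin_cube import KylinCubeOperator

with DAG(
    dag_id="example_kylin_cube",        # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
):
    # Submit a 'build' job for one segment and poll Kylin until the job
    # reaches a terminal status (or the operator's timeout expires).
    build_segment = KylinCubeOperator(
        task_id="kylin_build",
        kylin_conn_id="kylin_default",   # connection defined in Airflow
        project="learn_kylin",           # example project name
        cube="kylin_sales_cube",         # example cube name
        command="build",
        start_time="1325347200000",      # segment start, epoch milliseconds
        end_time="1325433600000",        # segment end, epoch milliseconds
        is_track_job=True,               # wait for the Kylin job to finish
    )

To target a Kylin instance without creating an Airflow connection, pass dsn instead of kylin_conn_id, using the URL form shown for the dsn parameter above.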

template_fields: collections.abc.Sequence[str] = ('project', 'cube', 'dsn', 'command', 'start_time', 'end_time', 'segment_name', 'offset_start',...
ui_color = '#E79C46'
build_command
jobs_end_status
execute(context)

This is the main method to derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.
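
Because start_time and end_time are listed in template_fields, they can be rendered from this same context via Jinja at run time. A hedged sketch continuing the DAG above; the Jinja expressions and the assumption that the Kylin build API expects epoch milliseconds are mine, not from the provider docs, and should be checked against your Kylin version.

    # Build the segment matching the task's data interval. The Jinja
    # expressions render pendulum datetimes to epoch milliseconds.
    build_for_interval = KylinCubeOperator(
        task_id="kylin_build_interval",              # hypothetical task id
        kylin_conn_id="kylin_default",
        project="learn_kylin",                       # example project name
        cube="kylin_sales_cube",                     # example cube name
        command="build",
        start_time="{{ data_interval_start.int_timestamp * 1000 }}",
        end_time="{{ data_interval_end.int_timestamp * 1000 }}",
        is_track_job=True,
    )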
