airflow.providers.dbt.cloud.hooks.dbt¶

Attributes¶

`DBT_CAUSE_MAX_LENGTH`
`T`

Exceptions¶

`DbtCloudJobRunException`	An exception that indicates a job run failed to complete.
`DbtCloudResourceLookupError`	Exception raised when a dbt Cloud resource cannot be uniquely identified.

Classes¶

`TokenAuth`	Helper class for Auth when executing requests.
`JobRunInfo`	Type class for the `job_run_info` dictionary.
`DbtCloudJobRunStatus`	dbt Cloud Job statuses.
`DbtCloudHook`	Interact with dbt Cloud using the V2 (V3 if supported) API.

Functions¶

`fallback_to_default_account`(func)	Provide a fallback value for `account_id`.
`provide_account_id`(func)	Provide a fallback value for `account_id`.

Module Contents¶

airflow.providers.dbt.cloud.hooks.dbt.DBT_CAUSE_MAX_LENGTH = 255[source]¶

airflow.providers.dbt.cloud.hooks.dbt.fallback_to_default_account(func)[source]¶

Provide a fallback value for account_id.

If the account_id is None or not passed to the decorated function, the value will be taken from the configured dbt Cloud Airflow Connection.
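
The fallback behaviour described above can be sketched with a small decorator. This is an illustrative stand-in, not the provider's implementation: `with_default_account_id` and `describe_run` are hypothetical names, and the default is passed in as a plain argument here rather than read from an Airflow Connection.

```python
import functools
import inspect


def with_default_account_id(default_account_id):
    """Hypothetical decorator factory: fill in account_id when it is missing or None."""

    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Bind the call so positional and keyword usage are treated alike.
            bound = signature.bind(*args, **kwargs)
            if bound.arguments.get("account_id") is None:
                bound.arguments["account_id"] = default_account_id
            return func(*bound.args, **bound.kwargs)

        return wrapper

    return decorator


@with_default_account_id(default_account_id=1234)
def describe_run(run_id, account_id=None):
    return f"account {account_id}, run {run_id}"


print(describe_run(42))
print(describe_run(42, account_id=99))
```

An explicit `account_id` always wins; the default is used only when the argument is omitted or `None`.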

class airflow.providers.dbt.cloud.hooks.dbt.TokenAuth(token)[source]¶

Bases: requests.auth.AuthBase

Helper class for Auth when executing requests.

token[source]¶

__call__(request)[source]¶
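
A request auth helper of this kind is a callable that receives the outgoing request, adds headers, and returns it. The sketch below shows that pattern without importing `requests`. `SimpleTokenAuth` is a hypothetical name, and the `Authorization: Token <token>` header layout is an assumption about the dbt Cloud API rather than something stated on this page.

```python
from types import SimpleNamespace


class SimpleTokenAuth:
    """Hypothetical stand-in: attach an API token to an outgoing request."""

    def __init__(self, token):
        self.token = token

    def __call__(self, request):
        # Assumed header layout; adjust if your API expects a different scheme.
        request.headers["Authorization"] = f"Token {self.token}"
        return request


# A bare object with a headers dict stands in for a prepared request.
prepared = SimpleNamespace(headers={})
signed = SimpleTokenAuth("example-token")(prepared)
print(signed.headers)
```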

class airflow.providers.dbt.cloud.hooks.dbt.JobRunInfo[source]¶

Bases: TypedDict

Type class for the job_run_info dictionary.

account_id: int | None[source]¶

run_id: int[source]¶

class airflow.providers.dbt.cloud.hooks.dbt.DbtCloudJobRunStatus[source]¶

Bases: enum.Enum

dbt Cloud Job statuses.

QUEUED = 1[source]¶

STARTING = 2[source]¶

RUNNING = 3[source]¶

SUCCESS = 10[source]¶

ERROR = 20[source]¶

CANCELLED = 30[source]¶

NON_TERMINAL_STATUSES[source]¶

TERMINAL_STATUSES[source]¶

classmethod check_is_valid(statuses)[source]¶

Validate input statuses are a known value.

classmethod is_terminal(status)[source]¶

Check if the input status is that of a terminal type.
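
To illustrate how a terminal-status check might work, here is a minimal local mirror of the enum values listed above. `RunStatus`, `TERMINAL`, and the module-level `is_terminal` are hypothetical names, and treating SUCCESS, ERROR, and CANCELLED as the terminal group is an assumption, since this page does not list the contents of `TERMINAL_STATUSES`.

```python
from enum import Enum


class RunStatus(Enum):
    """Local mirror of the documented status values (not the provider class)."""

    QUEUED = 1
    STARTING = 2
    RUNNING = 3
    SUCCESS = 10
    ERROR = 20
    CANCELLED = 30


# Assumption: these three statuses mean the run will not change state again.
TERMINAL = frozenset({RunStatus.SUCCESS, RunStatus.ERROR, RunStatus.CANCELLED})


def is_terminal(status_code: int) -> bool:
    # RunStatus(status_code) raises ValueError for an unknown code,
    # which doubles as input validation.
    return RunStatus(status_code) in TERMINAL


print(is_terminal(RunStatus.RUNNING.value))
print(is_terminal(RunStatus.ERROR.value))
```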

exception airflow.providers.dbt.cloud.hooks.dbt.DbtCloudJobRunException[source]¶

Bases: airflow.exceptions.AirflowException

An exception that indicates a job run failed to complete.

exception airflow.providers.dbt.cloud.hooks.dbt.DbtCloudResourceLookupError[source]¶

Bases: airflow.exceptions.AirflowException

Exception raised when a dbt Cloud resource cannot be uniquely identified.

airflow.providers.dbt.cloud.hooks.dbt.T[source]¶

airflow.providers.dbt.cloud.hooks.dbt.provide_account_id(func)[source]¶

Provide a fallback value for account_id.

If the account_id is None or not passed to the decorated function, the value will be taken from the configured dbt Cloud Airflow Connection.

class airflow.providers.dbt.cloud.hooks.dbt.DbtCloudHook(dbt_cloud_conn_id=default_conn_name, *args, **kwargs)[source]¶

Bases: airflow.providers.http.hooks.http.HttpHook

Interact with dbt Cloud using the V2 (V3 if supported) API.

Parameters:

dbt_cloud_conn_id (str) – The ID of the dbt Cloud connection.

conn_name_attr = 'dbt_cloud_conn_id'[source]¶

default_conn_name = 'dbt_cloud_default'[source]¶

conn_type = 'dbt_cloud'[source]¶

hook_name = 'dbt Cloud'[source]¶

classmethod get_ui_field_behaviour()[source]¶

Build custom field behavior for the dbt Cloud connection form in the Airflow UI.

dbt_cloud_conn_id = 'dbt_cloud_default'[source]¶

static get_request_url_params(tenant, endpoint, include_related=None, *, api_version='v2')[source]¶

Form the request URL from the base URL and the endpoint URL.

Parameters:

tenant (str) – The tenant domain name to substitute into the base URL.
endpoint (str) – The endpoint URL to request.
include_related (list[str] | None) – Optional. List of related fields to pull with the run. Valid values are “trigger”, “job”, “repository”, and “environment”.
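
As a rough illustration of what forming a URL from a tenant and an endpoint can look like, consider the sketch below. `build_request_url` is a hypothetical helper, and the `https://<tenant>/api/<version>/accounts/<endpoint>` layout is an assumption based on the public dbt Cloud API, not a description of what this method returns.

```python
def build_request_url(tenant, endpoint, include_related=None, *, api_version="v2"):
    """Hypothetical helper: return (url, query_params) for a dbt Cloud request."""
    # Assumed layout: account-scoped endpoints live under /api/<version>/accounts/.
    url = f"https://{tenant}/api/{api_version}/accounts/{endpoint or ''}"
    params = {"include_related": include_related} if include_related else {}
    return url, params


url, params = build_request_url(
    "cloud.getdbt.com", "1234/runs/5678/", include_related=["trigger", "job"]
)
print(url)
print(params)
```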

async get_headers_tenants_from_connection()[source]¶

Get the headers and tenant from the connection details.

async get_job_details(run_id, account_id=None, include_related=None)[source]¶

Use an asynchronous HTTP call to retrieve metadata for a specific run of a dbt Cloud job.

Parameters:

run_id (int) – The ID of a dbt Cloud job run.
account_id (int | None) – Optional. The ID of a dbt Cloud account.
include_related (list[str] | None) – Optional. List of related fields to pull with the run. Valid values are “trigger”, “job”, “repository”, and “environment”.

async get_job_status(run_id, account_id=None, include_related=None)[source]¶

Retrieve the status for a specific run of a dbt Cloud job.

Parameters:

run_id (int) – The ID of a dbt Cloud job run.
account_id (int | None) – Optional. The ID of a dbt Cloud account.
include_related (list[str] | None) – Optional. List of related fields to pull with the run. Valid values are “trigger”, “job”, “repository”, and “environment”.
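
An asynchronous status check like this is typically used inside a polling loop. The sketch below assumes that awaiting `get_job_status` resolves to an integer status code and that 10, 20, and 30 are terminal. `poll_until_terminal` and `StubHook` are hypothetical; the stub replaces a real hook so the example runs without Airflow or network access.

```python
import asyncio

# Assumption: SUCCESS (10), ERROR (20), and CANCELLED (30) are terminal.
TERMINAL_CODES = {10, 20, 30}


async def poll_until_terminal(hook, run_id, account_id=None, interval=0.01, max_checks=20):
    """Await the hook's status call until the run reaches a terminal code."""
    for _ in range(max_checks):
        status = await hook.get_job_status(run_id, account_id)
        if status in TERMINAL_CODES:
            return status
        await asyncio.sleep(interval)
    raise TimeoutError(f"Run {run_id} was not terminal after {max_checks} checks")


class StubHook:
    """Stand-in for a hook: replays a fixed sequence of status codes."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    async def get_job_status(self, run_id, account_id=None, include_related=None):
        return next(self._statuses)


final_status = asyncio.run(poll_until_terminal(StubHook([1, 3, 3, 10]), run_id=5678))
print(final_status)
```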

property connection: airflow.models.Connection[source]¶

get_conn(*args, **kwargs)[source]¶

Create a Requests HTTP session.

Parameters:

headers – Additional headers to be passed through as a dictionary.
extra_options – Additional options to be used when executing the request.

Returns:

A configured requests.Session object.

Return type:

requests.sessions.Session

list_accounts()[source]¶

Retrieve all of the dbt Cloud accounts the configured API token is authorized to access.

Returns:

List of request responses.

Return type:

list[requests.models.Response]

get_account(account_id=None)[source]¶

Retrieve metadata for a specific dbt Cloud account.

Parameters:

account_id (int | None) – Optional. The ID of a dbt Cloud account.

Returns:

The request response.

Return type:

requests.models.Response

list_projects(account_id=None, name_contains=None)[source]¶

Retrieve metadata for all projects tied to a specified dbt Cloud account.

Parameters:

account_id (int | None) – Optional. The ID of a dbt Cloud account.
name_contains (str | None) – Optional. The case-insensitive substring of a dbt Cloud project name to filter by.

Returns:

List of request responses.

Return type:

list[requests.models.Response]

get_project(project_id, account_id=None)[source]¶

Retrieve metadata for a specific project.

Parameters:

project_id (int) – The ID of a dbt Cloud project.
account_id (int | None) – Optional. The ID of a dbt Cloud account.

Returns:

The request response.

Return type:

requests.models.Response

list_environments(project_id, *, name_contains=None, account_id=None)[source]¶

Retrieve metadata for all environments tied to a specified dbt Cloud project.

Parameters:

project_id (int) – The ID of a dbt Cloud project.
name_contains (str | None) – Optional. The case-insensitive substring of a dbt Cloud environment name to filter by.
account_id (int | None) – Optional. The ID of a dbt Cloud account.

Returns:

List of request responses.

Return type:

list[requests.models.Response]

get_environment(project_id, environment_id, *, account_id=None)[source]¶

Retrieve metadata for a specific project’s environment.

Parameters:

project_id (int) – The ID of a dbt Cloud project.
environment_id (int) – The ID of a dbt Cloud environment.
account_id (int | None) – Optional. The ID of a dbt Cloud account.

Returns:

The request response.

Return type:

requests.models.Response

list_jobs(account_id=None, order_by=None, project_id=None, environment_id=None, name_contains=None)[source]¶

Retrieve metadata for all jobs tied to a specified dbt Cloud account.

If a project_id is supplied, only jobs pertaining to this project will be retrieved. If an environment_id is supplied, only jobs pertaining to this environment will be retrieved.

Parameters:

account_id (int | None) – Optional. The ID of a dbt Cloud account.
order_by (str | None) – Optional. Field to order the result by. Use ‘-’ to indicate reverse order. For example, to sort by ID in reverse order, use order_by=-id.
project_id (int | None) – Optional. The ID of a dbt Cloud project.
environment_id (int | None) – Optional. The ID of a dbt Cloud environment.
name_contains (str | None) – Optional. The case-insensitive substring of a dbt Cloud job name to filter by.

Returns:

List of request responses.

Return type:

list[requests.models.Response]
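
Because the list methods return a list of responses rather than a list of records, callers usually flatten them. The sketch below assumes each response body is JSON with the records under a `"data"` key, which is an assumption about the dbt Cloud API payload. `collect_items` and `StubResponse` are hypothetical names.

```python
def collect_items(responses):
    """Flatten a list of response objects into one list of records."""
    items = []
    for response in responses:
        body = response.json()
        # Assumption: each page keeps its records under the "data" key.
        items.extend(body.get("data") or [])
    return items


class StubResponse:
    """Stand-in for a response object exposing .json()."""

    def __init__(self, payload):
        self._payload = payload

    def json(self):
        return self._payload


pages = [
    StubResponse({"data": [{"id": 1, "name": "nightly"}]}),
    StubResponse({"data": [{"id": 2, "name": "hourly"}]}),
]
job_names = [job["name"] for job in collect_items(pages)]
print(job_names)
```

With a real hook, the same helper could be applied to the result of a call such as `hook.list_jobs(project_id=..., order_by="-id")`.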

get_job(job_id, account_id=None)[source]¶

Retrieve metadata for a specific job.

Parameters:

job_id (int) – The ID of a dbt Cloud job.
account_id (int | None) – Optional. The ID of a dbt Cloud account.

Returns:

The request response.

Return type:

requests.models.Response

get_job_by_name(*, project_name, environment_name, job_name, account_id=None)[source]¶

Retrieve metadata for a specific job by combination of project, environment, and job name.

Raises DbtCloudResourceLookupError if the job is not found or cannot be uniquely identified by provided parameters.

Parameters:

project_name (str) – The name of a dbt Cloud project.
environment_name (str) – The name of a dbt Cloud environment.
job_name (str) – The name of a dbt Cloud job.
account_id (int | None) – Optional. The ID of a dbt Cloud account.

Returns:

The details of a job.

Return type:

dict
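
Since the `name_contains` filters on the list methods match substrings, a lookup by name still has to narrow the candidates to exactly one match. A minimal sketch of that uniqueness check follows. `ResourceLookupError` is a local stand-in for `DbtCloudResourceLookupError`, and `pick_unique` is a hypothetical helper, not the hook's own logic.

```python
class ResourceLookupError(Exception):
    """Local stand-in for DbtCloudResourceLookupError."""


def pick_unique(candidates, name, kind):
    """Return the single record whose name matches exactly, else raise."""
    matches = [item for item in candidates if item["name"] == name]
    if len(matches) != 1:
        raise ResourceLookupError(
            f"Found {len(matches)} {kind}(s) named {name!r}; expected exactly 1."
        )
    return matches[0]


jobs = [
    {"id": 1, "name": "nightly"},
    {"id": 2, "name": "hourly"},
    {"id": 3, "name": "hourly"},
]
print(pick_unique(jobs, "nightly", "job"))
```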

trigger_job_run(job_id, cause, account_id=None, steps_override=None, schema_override=None, retry_from_failure=False, additional_run_config=None)[source]¶

Triggers a run of a dbt Cloud job.

Parameters:

job_id (int) – The ID of a dbt Cloud job.
cause (str) – Description of the reason to trigger the job.
account_id (int | None) – Optional. The ID of a dbt Cloud account.
steps_override (list[str] | None) – Optional. List of dbt commands to execute when triggering the job instead of those configured in dbt Cloud.
schema_override (str | None) – Optional. Override the destination schema in the configured target for this job.
retry_from_failure (bool) – Optional. If set to True and the previous job run has failed, the job will be triggered using the “rerun” endpoint. This parameter cannot be used alongside steps_override, schema_override, or additional_run_config.
additional_run_config (dict[str, Any] | None) – Optional. Any additional parameters that should be included in the API request when triggering the job.

Returns:

The request response.

Return type:

requests.models.Response
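
The restriction on `retry_from_failure` can be checked on the caller's side before the request is made. The sketch below is one possible guard, not the hook's behaviour: `prepare_trigger_kwargs` is hypothetical, raising `ValueError` is this example's own policy, and truncating `cause` to 255 characters assumes that `DBT_CAUSE_MAX_LENGTH` is meant as a limit on that field.

```python
CAUSE_MAX_LENGTH = 255  # mirrors DBT_CAUSE_MAX_LENGTH from this module


def prepare_trigger_kwargs(
    job_id,
    cause,
    steps_override=None,
    schema_override=None,
    retry_from_failure=False,
    additional_run_config=None,
):
    """Validate and normalise keyword arguments for trigger_job_run."""
    overrides = (steps_override, schema_override, additional_run_config)
    if retry_from_failure and any(value is not None for value in overrides):
        raise ValueError(
            "retry_from_failure cannot be combined with steps_override, "
            "schema_override, or additional_run_config"
        )
    return {
        "job_id": job_id,
        # Assumption: the cause should not exceed the documented maximum length.
        "cause": cause[:CAUSE_MAX_LENGTH],
        "steps_override": steps_override,
        "schema_override": schema_override,
        "retry_from_failure": retry_from_failure,
        "additional_run_config": additional_run_config,
    }


trigger_kwargs = prepare_trigger_kwargs(job_id=42, cause="Triggered by Airflow")
print(trigger_kwargs["cause"])
```

The resulting dictionary could then be passed as `hook.trigger_job_run(**trigger_kwargs)`.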

list_job_runs(account_id=None, include_related=None, job_definition_id=None, order_by=None)[source]¶

Retrieve metadata for all dbt Cloud job runs for an account.

If a job_definition_id is supplied, only metadata for runs of that specific job are pulled.

Parameters:

account_id (int | None) – Optional. The ID of a dbt Cloud account.
include_related (list[str] | None) – Optional. List of related fields to pull with the run. Valid values are “trigger”, “job”, “repository”, and “environment”.
job_definition_id (int | None) – Optional. The ID of the dbt Cloud job whose run metadata should be retrieved.
order_by (str | None) – Optional. Field to order the result by. Use ‘-’ to indicate reverse order. For example, to sort by run ID in reverse order, use order_by=-id.

Returns:

List of request responses.

Return type:

list[requests.models.Response]

get_job_runs(account_id=None, payload=None)[source]¶

Retrieve metadata for a specific run of a dbt Cloud job.

Parameters:

account_id (int | None) – Optional. The ID of a dbt Cloud account.
payload – Optional. Query parameters.

Returns:

The request response.

Return type:

requests.models.Response

get_job_run(run_id, account_id=None, include_related=None)[source]¶

Retrieve metadata for a specific run of a dbt Cloud job.

Parameters:

run_id (int) – The ID of a dbt Cloud job run.
account_id (int | None) – Optional. The ID of a dbt Cloud account.
include_related (list[str] | None) – Optional. List of related fields to pull with the run. Valid values are “trigger”, “job”, “repository”, and “environment”.

Returns:

The request response.

Return type:

requests.models.Response

get_job_run_status(run_id, account_id=None)[source]¶

Retrieve the status for a specific run of a dbt Cloud job.

Parameters:

run_id (int) – The ID of a dbt Cloud job run.
account_id (int | None) – Optional. The ID of a dbt Cloud account.

Returns:

The status of a dbt Cloud job run.

Return type:

int

wait_for_job_run_status(run_id, account_id=None, expected_statuses=DbtCloudJobRunStatus.SUCCESS.value, check_interval=60, timeout=60 * 60 * 24 * 7)[source]¶

Wait for a dbt Cloud job run to match an expected status.

Parameters:

run_id (int) – The ID of a dbt Cloud job run.
account_id (int | None) – Optional. The ID of a dbt Cloud account.
expected_statuses (int | collections.abc.Sequence[int] | set[int]) – Optional. The desired status(es) to check against a job run’s current status. Defaults to the success status value.
check_interval (int) – Time in seconds between checks of the job run’s status.
timeout (int) – Time in seconds to wait for the job run to reach a terminal status or the expected status.

Returns:

Boolean indicating whether the job run has reached one of the expected_statuses.

Return type:

bool
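
A common pattern is to trigger a run and then block until it finishes. The sketch below chains `trigger_job_run` and `wait_for_job_run_status` using only the parameters documented on this page. Reading the run ID from `response.json()["data"]["id"]` is an assumption about the API payload, and `run_job_and_wait`, `StubHook`, and `StubResponse` are hypothetical stand-ins so the example runs without Airflow.

```python
SUCCESS = 10  # DbtCloudJobRunStatus.SUCCESS.value


def run_job_and_wait(hook, job_id, cause, check_interval=60, timeout=3600):
    """Trigger a job, then wait for the resulting run to succeed."""
    response = hook.trigger_job_run(job_id=job_id, cause=cause)
    # Assumption: the new run's ID is found under data.id in the response body.
    run_id = response.json()["data"]["id"]
    succeeded = hook.wait_for_job_run_status(
        run_id=run_id,
        expected_statuses=SUCCESS,
        check_interval=check_interval,
        timeout=timeout,
    )
    return run_id, succeeded


class StubResponse:
    def __init__(self, payload):
        self._payload = payload

    def json(self):
        return self._payload


class StubHook:
    """Stand-in for a hook: always creates run 5678, which succeeds."""

    def trigger_job_run(self, job_id, cause, **kwargs):
        return StubResponse({"data": {"id": 5678}})

    def wait_for_job_run_status(self, run_id, expected_statuses=SUCCESS, **kwargs):
        return expected_statuses == SUCCESS


result = run_job_and_wait(StubHook(), job_id=42, cause="Nightly refresh")
print(result)
```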

cancel_job_run(run_id, account_id=None)[source]¶

Cancel a specific dbt Cloud job run.

Parameters:

run_id (int) – The ID of a dbt Cloud job run.
account_id (int | None) – Optional. The ID of a dbt Cloud account.

list_job_run_artifacts(run_id, account_id=None, step=None)[source]¶

Retrieve a list of the available artifact files generated for a completed run of a dbt Cloud job.

By default, this returns artifacts from the last step in the run. To list artifacts from other steps in the run, use the step parameter.

Parameters:

run_id (int) – The ID of a dbt Cloud job run.
account_id (int | None) – Optional. The ID of a dbt Cloud account.
step (int | None) – Optional. The index of the Step in the Run to query for artifacts. The first step in the run has the index 1. If the step parameter is omitted, artifacts for the last step in the run will be returned.

Returns:

List of request responses.

Return type:

list[requests.models.Response]

get_job_run_artifact(run_id, path, account_id=None, step=None)[source]¶

Retrieve a single artifact file generated for a completed run of a dbt Cloud job.

By default, this returns the artifact from the last step in the run. To retrieve an artifact from another step in the run, use the step parameter.

Parameters:

run_id (int) – The ID of a dbt Cloud job run.
path (str) – The file path related to the artifact file. Paths are rooted at the target/ directory. Use “manifest.json”, “catalog.json”, or “run_results.json” to download dbt-generated artifacts for the run.
account_id (int | None) – Optional. The ID of a dbt Cloud account.
step (int | None) – Optional. The index of the Step in the Run to query for artifacts. The first step in the run has the index 1. If the step parameter is omitted, artifacts for the last step in the run will be returned.

Returns:

The request response.

Return type:

requests.models.Response
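
Once an artifact has been downloaded, its JSON body can be inspected directly. The sketch below pulls the IDs of failed nodes out of `run_results.json`. It assumes the standard dbt layout for that artifact (a `results` list whose entries carry `unique_id` and `status`). `failed_node_ids` and the stub classes are hypothetical.

```python
def failed_node_ids(hook, run_id, account_id=None):
    """Return the unique_id of every node that errored or failed in a run."""
    response = hook.get_job_run_artifact(run_id, "run_results.json", account_id=account_id)
    results = response.json().get("results", [])
    return [r["unique_id"] for r in results if r.get("status") in {"error", "fail"}]


class StubResponse:
    def __init__(self, payload):
        self._payload = payload

    def json(self):
        return self._payload


class StubHook:
    """Stand-in for a hook: serves a tiny run_results.json document."""

    def get_job_run_artifact(self, run_id, path, account_id=None, step=None):
        return StubResponse(
            {
                "results": [
                    {"unique_id": "model.shop.orders", "status": "success"},
                    {"unique_id": "model.shop.payments", "status": "error"},
                    {"unique_id": "test.shop.not_null_orders_id", "status": "fail"},
                ]
            }
        )


failed = failed_node_ids(StubHook(), run_id=5678)
print(failed)
```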

async get_job_run_artifacts_concurrently(run_id, artifacts, account_id=None, step=None)[source]¶

Retrieve a list of chosen artifact files generated for a step in a completed run of a dbt Cloud job.

By default, this returns artifacts from the last step in the run. This takes advantage of the asynchronous calls to speed up the retrieval.

Parameters:

run_id (int) – The ID of a dbt Cloud job run.
step (int | None) – The index of the Step in the Run to query for artifacts. The first step in the run has the index 1. If the step parameter is omitted, artifacts for the last step in the run will be returned.
artifacts – The file paths of the artifact files to retrieve. Paths are rooted at the target/ directory. Use “manifest.json”, “catalog.json”, or “run_results.json” to download dbt-generated artifacts for the run.
account_id (int | None) – Optional. The ID of a dbt Cloud account.

Returns:

The request response.
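
The speed-up from asynchronous calls comes from issuing all downloads at once rather than one after another. A generic sketch of that pattern with `asyncio.gather` is shown below. `fetch_all` and `fake_fetch` are hypothetical; `fake_fetch` simulates a download so the example runs offline, and the returned mapping of path to content is this example's choice, not a statement about the hook's return value.

```python
import asyncio


async def fetch_all(fetch_one, paths):
    """Run one fetch coroutine per path concurrently and map path -> content."""
    # gather preserves the order of its inputs, so zip lines results up with paths.
    contents = await asyncio.gather(*(fetch_one(path) for path in paths))
    return dict(zip(paths, contents))


async def fake_fetch(path):
    # Simulated network latency; a real implementation would issue an HTTP request.
    await asyncio.sleep(0.01)
    return f"<contents of {path}>"


artifacts = asyncio.run(fetch_all(fake_fetch, ["manifest.json", "run_results.json"]))
print(artifacts)
```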

retry_failed_job_run(job_id, account_id=None)[source]¶

Retry a failed run for a job from the point of failure, if the run failed. Otherwise, trigger a new run.

Parameters:

job_id (int) – The ID of a dbt Cloud job.
account_id (int | None) – Optional. The ID of a dbt Cloud account.

Returns:

The request response.

Return type:

requests.models.Response

test_connection()[source]¶

Test dbt Cloud connection.