@alexeyegorov alexeyegorov commented Jan 25, 2026

Execute DBT in session mode (e.g. on a job cluster)

This is a plan, created with Claude, to implement session mode for the dbt-databricks adapter.
It should allow:

  • execution of SQL and Python models through a Spark session, as in dbt-spark
  • running the whole dbt pipeline on a job cluster to save costs
  • ...

Asset Bundle DBT task

The native dbt_task on Databricks does not provide a Spark session, and there is no way to retrieve one from within the task.
Databricks would have to expose a Spark session within dbt_task for session mode to work there, which would keep the Asset Bundle deployment as simple as it is today.

As a workaround, session mode can still be used by defining the dbt tasks as Python scripts or notebooks.
This approach could be added as a template via Databricks Asset Bundles.
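
A minimal sketch of what such a Python task could look like, assuming dbt is invoked in-process through dbt-core's programmatic `dbtRunner` entry point (the tasks in this PR call the dbt CLI, so this is one possible variant, not the setup shown in the screenshots). The helper names `dbt_args` and `run_dbt` and the example paths are hypothetical:

```python
def dbt_args(command, project_dir, profiles_dir, extra=()):
    """Build the argument list for a single dbt invocation."""
    return [
        command,
        "--project-dir", project_dir,
        "--profiles-dir", profiles_dir,
        *extra,
    ]


def run_dbt(args):
    """Run dbt inside the current Python process and fail loudly on errors."""
    # Imported lazily so the module can be loaded where dbt-core is not installed.
    from dbt.cli.main import dbtRunner

    result = dbtRunner().invoke(args)
    if not result.success:
        # result.exception is set when dbt itself raised; it may be None
        # when individual nodes failed.
        raise RuntimeError(f"dbt {args[0]} failed: {result.exception}")
    return result


# Hypothetical usage inside a notebook or script task:
# run_dbt(dbt_args("run", "/tmp/dbt_repo", "/tmp/dbt_repo/profiles"))
```

Raising on failure makes the Databricks task fail when dbt fails, so downstream tasks do not start on a broken run.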

In our example job, this looks as follows:

image

With the job cluster selected, it now executes as expected:
image

  • prepare:
    • clone the repository with DBT code
    • run dbt deps and dbt seed
  • run: execute dbt run with an optional --full-refresh and a caller-supplied --select (e.g. a state selector, a specific model, or empty to select everything)
  • test: execute dbt test
  • docs: generate dbt docs
  • cleanup: remove cloned repository files
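
The dbt-facing part of the task list above can be sketched as plain argument lists. This is an illustration only: the function names `build_run_args` and `pipeline_commands` are hypothetical, and the clone and cleanup steps are left out because they do not involve dbt.

```python
def build_run_args(select="", full_refresh=False):
    """Arguments for the run task; an empty select means the full project."""
    args = ["run"]
    if full_refresh:
        args.append("--full-refresh")
    if select.strip():
        # Several selectors may be passed, separated by whitespace.
        args += ["--select", *select.split()]
    return args


def pipeline_commands(select="", full_refresh=False):
    """dbt invocations in the order of the prepare, run, test and docs tasks."""
    return [
        ["deps"],
        ["seed"],
        build_run_args(select, full_refresh),
        ["test"],
        ["docs", "generate"],
    ]
```

Each inner list can be prefixed with `dbt` for a CLI call or passed to whatever mechanism the task uses to invoke dbt.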

Example of the dbt CLI call for the run task

image

Pros/Cons

Pros:

  • keep profiles.yml and requirements.txt in the asset bundles repository
    • currently, profiles.yml has to be uploaded manually to an accessible path (e.g. on Databricks), and every change requires updating and re-uploading the file
    • this approach keeps the file in the asset bundles repository or in the original dbt repository, so the job can use it directly after the repository is cloned
  • execute the complete dbt pipeline with SQL and Python models on all-purpose clusters, job clusters, and SQL warehouses
  • use a cluster with init script to install 3rd party libraries (e.g. Apache Sedona for geospatial functions)

Cons:

  • not able to use the native dbt_task provided by Databricks -> could be adjusted by Databricks
  • slightly more setup for the definition of the tasks (Python notebooks using dbt CLI) -> could be added as a template to Asset Bundles repository

Description

Checklist

  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt-databricks next" section.

@alexeyegorov changed the title from "feat: plan for implementing session mode" to "[WIP] Add session mode for dbt-databricks adapter" on Jan 25, 2026
- Introduced `DatabricksSessionHandle` and `SessionCursorWrapper` to enable SparkSession-based execution.
- Updated `DatabricksConnectionManager` to handle session mode connections and capabilities.
- Enhanced `DatabricksCredentials` to auto-detect and validate connection methods.
- Added session mode handling in Python model submission and execution.
- Implemented cleanup for temporary views to prevent state leakage between models.

This update allows dbt to run entirely within a single SparkSession on Databricks job clusters, improving execution efficiency and compatibility.
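
For illustration, the temporary-view cleanup described in this commit could follow a snapshot-and-diff pattern like the one below. This is a sketch under assumptions, not the code in this PR: `isolated_temp_views` is a hypothetical name, and the only Spark calls used are PySpark's `spark.catalog.listTables()` and `spark.catalog.dropTempView()`.

```python
from contextlib import contextmanager


def _temp_view_names(spark):
    """Names of the temporary views currently visible in the session."""
    return {t.name for t in spark.catalog.listTables() if t.isTemporary}


@contextmanager
def isolated_temp_views(spark):
    """Drop every temporary view created inside the block when it exits."""
    before = _temp_view_names(spark)
    try:
        yield
    finally:
        # Only views that did not exist before the block are removed,
        # so views owned by the caller survive.
        for name in _temp_view_names(spark) - before:
            spark.catalog.dropTempView(name)
```

Wrapping each model execution in such a block would keep one model's temporary views from being visible to the next model in the same SparkSession.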
- Introduced comprehensive unit tests for session mode components, including `SessionCursorWrapper`, `DatabricksSessionHandle`, and session mode credentials.
- Enhanced test coverage for session mode auto-detection and validation in `DatabricksCredentials`.
- Implemented tests for session mode Python model submission and execution, ensuring proper handling of temporary views and execution errors.

These additions improve the reliability and robustness of session mode features in the Databricks adapter.
@alexeyegorov changed the title from "[WIP] Add session mode for dbt-databricks adapter" to "[WIP] Add session mode for dbt-databricks adapter for 1.11.x version" on Jan 26, 2026