DataGym.io A DataGym.io Resource
dbt cheat sheet

dbt cheat sheet: the interactive dbt reference, updated for the AI era

Commands

Not the man page. The verbs and flags that actually earn a spot in muscle memory.

Core verbs: your daily drivers

dbt build
Run, test, seed and snapshot every selected resource in DAG order, the one command most teams run.
dbt run
Materialize models (view / table / incremental …) in the warehouse.
dbt test
Execute data tests and unit tests defined in the project.
dbt seed
Load CSV files from seeds/ into the warehouse.
dbt snapshot
Capture slowly-changing-dimension history for sources.

Mnemonic: build = run + test + seed + snapshot, all interleaved in dependency order.

Worth remembering

dbt list
Show which resources a selector matches (no run).
dbt show
Preview a model’s rows after transformation.
dbt retry
Re-run the last invocation from the point of failure.
dbt clone
Copy selected nodes from a state into the target schema(s).
dbt compile
Render SQL without executing. Inspect the Jinja.
dbt source freshness
Check whether sources are within their declared freshness SLA.
dbt parse
Parse the project and emit timing, useful in CI.

Setup & utilities

dbt init
Scaffold a new project and set up your profiles.yml connection.
dbt deps
Install the packages declared in packages.yml / dependencies.yml.
dbt debug
Check the install, the active profile and the warehouse connection.
dbt clean
Delete the folders in clean-targets: target/, dbt_packages/.
dbt docs generate
Build the documentation site and the catalog.json artifact.
dbt docs serve
Serve the generated docs site locally in your browser.
dbt run-operation
Invoke a macro directly: dbt run-operation my_macro --args '{k: v}'.

Flags worth remembering

--select / -s
Choose which resources to act on. See Node selection.
--exclude
Remove resources from the --select set; identical syntax.
--target / -t
Pick a target from profiles.yml: -t prod.
--defer
Resolve unselected refs from another state instead of rebuilding them.
--state
Path to a manifest.json, required by state-aware selectors and --defer.
--full-refresh
Rebuild incremental models and seeds from scratch.
--vars
Pass project vars: --vars '{"start_date": "2024-01-01"}'.
--threads
Override the thread count from your profile.
--store-failures
Persist failing test rows to a table for debugging.
--empty
Build schema-only (zero-row) models: a fast structural check for CI.
--sql 1.12+
On dbt run-operation, execute ad-hoc SQL directly. No macro file needed.

Trivial / one-time flags: --help, --version, --debug, --log-level, --profiles-dir, --project-dir.

Essentials

How models actually build: the project knowledge a CLI sheet alone leaves out.

Materializations

Set with {{ config(materialized='…') }} or in dbt_project.yml.

view
Default. A CREATE VIEW: no storage, always fresh, recomputed on every read.
table
CREATE TABLE AS: fully rebuilt from scratch each run.
incremental
Built once, then each run only inserts / merges new or changed rows.
ephemeral
Not built at all. Inlined as a CTE into the models that ref() it.
materialized_view
A warehouse-managed materialized view: dbt owns the definition, the warehouse keeps the data refreshed.

Incremental strategies

Pick with config(materialized='incremental', incremental_strategy='…').

merge
Upsert on unique_key: updates existing rows, inserts new ones.
append
Insert new rows only. Fastest, but no de-duplication.
delete+insert
Delete rows matching the new batch, then insert: a merge without native MERGE.
microbatch 1.9+
Split the load into time-based batches by event_time: robust, retryable backfills.
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

is_incremental() is true only when the table already exists and you didn’t pass --full-refresh.

Tests

generic
Reusable data tests declared in YAML: unique, not_null, accepted_values, relationships.
singular
A one-off .sql file in tests/: any query that returns failing rows.
unit 1.8+
Test model logic against mock inputs and expected output, declared with unit_tests:.

dbt build runs each model’s tests right after it; failures stop dependents. Add --store-failures to keep the offending rows.

Packages worth installing

Add to packages.yml, then run dbt deps. Browse the rest on the dbt Package hub.

dbt_utils
The standard toolbox: cross-database SQL, generic tests and helper macros.
dbt_project_evaluator
Audits your project against dbt's best-practice rules.
dbt_expectations
Great Expectations-style data quality tests.
audit_helper
Compare two relations to validate a refactor or migration.
elementary
Anomaly-detection tests and data observability reports.
dbt_artifacts
Models your run results and metadata for monitoring.

Common patterns

Copy-pasteable recipes for the things you do every week.

  • Slim CI: build only what changed since prod dbt build -s state:modified+ --defer --state path/to/prod
  • Production build, skipping unit tests and unmodified views dbt build --exclude "test_type:unit state:unmodified,config.materialized:view" --state path/to/prod
  • Resume after a failure (pick up what failed / skipped) dbt retry
  • Rebuild incremental models from scratch dbt run --full-refresh -s config.materialized:incremental
  • Test only what you just touched dbt test -s state:modified+ --state path/to/prod
  • Preview what a selector would match (no run) dbt list -s "tag:nightly,config.materialized:table"
  • Refresh only what has newer source data dbt source freshness && dbt build -s source_status:fresher+ --state path/
  • Let Claude or Cursor operate your project safely (see dbt + AI) uvx dbt-mcp

Node selection

The most-Googled corner of dbt. Type a selector below and watch it light up the DAG.

matched not matched source

Specifying resources

The --select flag accepts one or more args. Each arg can be one of:

  • A package’s name
  • A model name
  • A fully-qualified path to a directory of models
  • A selection method (path:, tag:, config:, test_type:, test_name: etc)

Examples:

  • $ dbt run --select my_dbt_project_nameall models in your project
  • $ dbt run --select my_dbt_modela specific model
  • $ dbt run --select path.to.my.modelsall models in a specific directory
  • $ dbt run --select my_package.some_modela specific model in a specific package
  • $ dbt run --select tag:nightlymodels with the “nightly” tag
  • $ dbt run --select path/to/modelsmodels contained in path/to/models
  • $ dbt run --select path/to/my_model.sqla specific model by its path

Graph operators

Plus operator (+)

  • $ dbt run --select my_model+select my_model and all children
  • $ dbt run --select +my_modelselect my_model and all parents
  • $ dbt run --select +my_model+select my_model, and all of its parents and children

N-plus operator

  • $ dbt run --select my_model+1select my_model and its first-degree children
  • $ dbt run --select 2+my_modelselect my_model, its first-degree parents, and its second-degree parents (“grandparents”)
  • $ dbt run --select 3+my_model+4select my_model, its parents up to the 3rd degree, and its children down to the 4th degree

At operator (@)

  • $ dbt run -s @my_modelselect my_model, all its descendants, and the ancestors of those descendants

Wildcards (*, ?, [abc]) work inside selector strings. They aren’t graph operators. Use them within a name or a method:value.

Methods examples

tag
dbt run -s "tag:nightly"
source
dbt run -s "source:snowplow+"
resource_type
dbt list -s "resource_type:test"
path
dbt run -s "models/staging/github"
package
dbt run -s "package:snowplow"
config.X
dbt run -s "config.materialized:incremental"
fqn
dbt run -s "fqn:my_project.marts.finance.fct_orders"
file
dbt run -s "file:fct_orders.sql"
semantic_model
dbt parse -s "semantic_model:orders"
saved_query
dbt sl list -s "saved_query:weekly_revenue"
test_type
dbt test -s "test_type:generic"
test_name
dbt test -s "test_name:unique"
state
dbt run -s "state:modified" --state path/to/artifacts
exposure
dbt run -s "+exposure:weekly_kpis"
metric
dbt build -s "+metric:weekly_active_users"
result
dbt run -s "result:error" --state path/to/artifacts
source_status
dbt build -s "source_status:fresher+"
group
dbt run -s "group:finance"
access
dbt list -s "access:public"
version
dbt list -s "version:latest"
unit_test
dbt list -s "unit_test:*"
selector 1.12+
dbt run -s "selector:nightly_marts"

Set operators

Unions (space-delimited)

  • $ dbt run --select +snowplow_sessions +fct_ordersrun snowplow_sessions, all ancestors of snowplow_sessions, fct_orders, and all ancestors of fct_orders

Intersections (comma-separated)

  • $ dbt run --select +snowplow_sessions,+fct_ordersrun all the common ancestors of snowplow_sessions and fct_orders
  • $ dbt run --select marts.finance,tag:nightlyrun models that are in the marts/finance subdirectory and tagged nightly

Excluding models

dbt provides an --exclude flag with the same semantics as --select. Models specified with the --exclude flag will be removed from the set of models selected with --select.

Example:

  • $ dbt run --select path:models/marts/finance --exclude fct_orders+the finance marts, minus fct_orders and everything downstream of it

State

Some methods compare the project against a manifest.json from another run (a previous invocation, or production), passed via the --state flag.

  • $ dbt build -s "state:modified+" --defer --state path/to/artifacts

Selectors

  • state:newabsent from the comparison manifest
  • state:modifiednew nodes, plus any changed existing node
  • state:unmodifiedexisting nodes with no changes
  • state:oldpresent in the comparison manifest

Narrow state:modified to one kind of change with sub-selectors: .body, .configs, .relation, .macros, .contract.

Defer

--defer lets you build a model without first building its parents: a ref() to an unbuilt parent resolves to another environment’s state instead. It needs --state. Flip the switches to see where model_c’s ref('model_b') resolves.

dbt build -s model_c

Prod state Dev (your schema) model_a model_b model_c model_b model_c ref('model_b')

No --defer: ref('model_b') resolves to your dev schema, where model_b was never built. model_c fails with a 'relation not found' error.

dbt + AI · 2026

The part the original cheat sheet couldn’t have: how dbt works in the agent era.

dbt MCP server

A Model Context Protocol server that lets AI agents (Claude, Cursor, VS Code) operate your dbt project, safely and with governance.

Run it locally:

  • uvx dbt-mcplocal server: full dbt CLI access

Tool groups it exposes:

dbt CLI
run, build, test, compile, list, parse, show
Discovery
get_model_details, get_lineage, get_model_health, get_mart_models
Semantic Layer
list_metrics, query_metrics, get_dimensions
SQL
execute_sql, text_to_sql
Admin API
trigger_job_run, list_jobs, retry_job_run
Codegen
generate_model_yaml, generate_source, generate_staging_model

A remote MCP server (one endpoint per environment) is also available for data-consumption tools without any local setup.

Agents & Copilot

Developer Agent
Write and refactor models, generate tests, and validate changes from natural language inside the Studio IDE.
Analyst Agent
Answer natural-language questions with governed numbers, powered by the dbt Semantic Layer.
dbt Copilot
One-click inline generation of SQL, models, tests and documentation.
dbt Canvas
Visual, drag-and-drop model building. Governance and lineage maintained automatically.
dbt Insights
An AI query interface to explore, validate and share results with full lineage awareness.

Fusion engine

The dbt engine, rewritten in Rust: what changes when your project runs on it.

What Fusion brings

A ground-up rewrite of the dbt engine in Rust, shipped as a single binary (v2.0).

  • SQL comprehension: Fusion actually parses your SQL, so it catches column & type errors before a run.
  • Language server (LSP): go-to-definition, autocomplete and live errors in the editor.
  • Faster parse / compile, with native ADBC warehouse drivers.
  • Adds the session verbs environment, invocation, cancel, reattach.
  • Deprecated flags are removed, most notably --models / -m → use --select.