Visual Guide

Incremental Models

An incremental model processes only the rows that changed since the last run, rather than rebuilding the entire table from scratch every time.

The problem with rebuilding everything

By default, a dbt table model rebuilds itself completely on every run: drop, re-select, re-insert every row. For a 10,000-row dimension that is fine. For a two-billion-row events table, it is a slow, expensive query you run again and again to recompute rows that have not changed since yesterday.

An incremental model breaks that pattern. On the first run it builds the whole table. On every run after that, it processes only the new or changed rows and folds them into the table that already exists.

Start with table or view materialization unless you have a concrete reason to switch. Full rebuilds are simpler, easier to debug, and guarantee correctness. Reach for incremental when a table is genuinely too large and too slow to rebuild.

How it works

Two pieces make an incremental model work: a config that marks the model incremental, and an is_incremental() block that filters down to just the fresh rows.

{{ config(materialized='incremental', unique_key='order_id') }}

select * from {{ source('shop', 'orders') }}

{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

is_incremental() returns true only when all three conditions hold: the target table already exists, the model is materialized as incremental, and --full-refresh was not passed. When it is true, the where clause is active and only fresh rows reach the incremental logic.

{{ this }} is a dbt Jinja variable that resolves to the fully-qualified name of the model’s own target table, for example analytics.fct_orders. dbt substitutes it at compile time, so the sub-select always scans the live target table rather than a source.

The five strategies

The incremental_strategy tells dbt what to do with the rows that pass the is_incremental() filter. Choose based on your warehouse, data shape, and correctness requirements.

Full refresh and first run

When is_incremental() is false: on a model’s first run, or when you pass --full-refresh, dbt bypasses all incremental logic and rebuilds the entire table from scratch. Each strategy widget below also has a --full-refresh toggle to revisit this at any point.

select * from {{ source('shop', 'orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

Warehouse SQL

fct_orders: before

rows this run processes

fct_orders: after

append

Inserts every incoming row without checking for duplicates.

{{ config(
  materialized='incremental',
  incremental_strategy='append'
) }}

Fast and cheap. Works for event tables where the is_incremental() filter is guaranteed to produce non-overlapping rows, or where downstream models deduplicate. If the model reruns and the filter is not tight enough, you get duplicates; step through the widget below to see order 3 land twice.

table already exists --full-refresh

select * from {{ source('shop', 'orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

Warehouse SQL

fct_orders: before

rows this run processes

fct_orders: after

merge

Compares incoming rows against the existing table on unique_key. Rows that match are updated in place; rows that do not match are inserted. Default strategy on Snowflake, BigQuery, Postgres, and Redshift when unique_key is set.

{{ config(
  materialized='incremental',
  unique_key='order_id'
) }}

The most broadly useful strategy. Step through the widget below, then toggle unique_key configured off to see what happens without it: order 3 is duplicated instead of updated, the classic incremental footgun.

table already exists --full-refresh unique_key configured

select * from {{ source('shop', 'orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

Warehouse SQL

fct_orders: before

rows this run processes

fct_orders: after

delete+insert

Deletes every existing row whose key appears in the incoming batch, then inserts the full batch. Produces the same final table as merge but uses DELETE + INSERT instead of a MERGE statement. The widget shows both SQL statements and the “replaced” tag in the result; same end state as merge, different mechanics.

{{ config(
  materialized='incremental',
  unique_key='order_id',
  incremental_strategy='delete+insert'
) }}

table already exists --full-refresh

select * from {{ source('shop', 'orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

Warehouse SQL

fct_orders: before

rows this run processes

fct_orders: after

insert_overwrite

Replaces entire partitions rather than individual rows. No unique_key; the granularity is the partition, not the row. Supported on Snowflake, Databricks, Spark, Redshift, and Postgres. Not supported on BigQuery.

{{ config(
  materialized='incremental',
  incremental_strategy='insert_overwrite',
  partition_by={'field': 'created_date', 'data_type': 'date'}
) }}

The key risk: if the batch for a partition does not contain all the rows that were previously there, the missing rows are gone. In the example below, e6 and e7 existed in the Jan 15 partition but were not in the batch; they are permanently lost after the overwrite.

microbatch

Splits the run into one bounded query per time period (hour, day, month). Each batch is independent, so dbt can run them in parallel and retry just the failed batches. Built for large time-series tables where a single incremental query is too slow or too error-prone.

{{ config(
  materialized='incremental',
  incremental_strategy='microbatch',
  event_time='created_at',
  batch_size='day'
) }}

Supported on Snowflake, Databricks, Spark, Trino, and Athena. Not yet available on Postgres, BigQuery, or Redshift.

Instead of one query with an is_incremental() filter, dbt splits the run into one bounded query per time period, here per day. Each batch is independent, so dbt can run them in parallel and retry only the ones that fail.

{{ config(
    materialized='incremental',
    incremental_strategy='microbatch',
    event_time='created_at',
    batch_size='day',
    begin='2024-01-01',
) }}
select * from {{ ref('page_views') }}

Lookback a batch fails

Jun 9

Jun 10

Jun 11

Jun 12

Jun 13

Jun 14

Jun 15today

already loaded
this run's batches
failed batch

Advanced config

on_schema_change

Controls what happens when you add, remove, or change columns in the model SQL. The default ignore silently discards new columns, a frequent source of schema drift that only surfaces later.

In the widget below, the model SQL changes from [order_id, amount, legacy_field] to [order_id, amount, status]. Pick each option to see what the target table looks like after the next incremental run.

merge_update_columns / merge_exclude_columns

Restrict which columns are updated during a merge. Useful when some columns are set once on insert and should never be overwritten; for example, created_at.

{{ config(
  materialized='incremental',
  unique_key='order_id',
  merge_update_columns=['status', 'updated_at']
) }}

incremental_predicates

Limits the scan of the existing target table during merge, reducing cost on large tables. Use DBT_INTERNAL_DEST to reference the target and DBT_INTERNAL_SOURCE for the new batch.

{{ config(
  materialized='incremental',
  unique_key='order_id',
  incremental_predicates=["DBT_INTERNAL_DEST.created_date >= dateadd('day', -7, current_date)"]
) }}

This tells dbt: only look for matching rows in the last 7 days of the target table, not the full history. Make sure your is_incremental() filter produces rows only within the predicate window, otherwise existing rows outside it will not be updated.

Use the widget below to compare all three row-level strategies side by side. All controls are live.

is_incremental() gates a where filter; the incremental_strategy decides how the rows it lets through land in the table. Flip the switches to see what each run writes to fct_orders.

table already exists --full-refresh unique_key configured Strategy

select * from {{ source('shop', 'orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

dbt run -s fct_orders

fct_orders: before

rows this run processes

fct_orders: after

When to use this pattern

Reach for an incremental model when a table is large and append-mostly (event logs, clickstream, orders) and a full rebuild has become too slow or too expensive. For small tables, a plain table model is simpler and the extra moving parts are not worth it.

Incremental Models

The problem with rebuilding everything

How it works

The five strategies

Full refresh and first run

fct_orders: before

rows this run processes

fct_orders: after

append

fct_orders: before

rows this run processes

fct_orders: after

merge

fct_orders: before

rows this run processes

fct_orders: after

delete+insert

fct_orders: before

rows this run processes

fct_orders: after

insert_overwrite

fct_events: before

batch arriving, Jan 15 only

fct_events: after

microbatch

Advanced config

on_schema_change

merge_update_columns / merge_exclude_columns

incremental_predicates

fct_orders: before

rows this run processes

fct_orders: after

When to use this pattern

Keep going