Visual Guide

Incremental Models

An incremental model processes only the rows that changed since the last run, rather than rebuilding the entire table from scratch every time.

The problem with rebuilding everything

By default, a dbt table model rebuilds itself completely on every run: drop, re-select, re-insert every row. For a 10,000-row dimension that is fine. For a two-billion-row events table, it is a slow, expensive query you run again and again to recompute rows that have not changed since yesterday.

An incremental model breaks that pattern. On the first run it builds the whole table. On every run after that, it processes only the new or changed rows and folds them into the table that already exists.

Start with table or view materialization unless you have a concrete reason to switch. Full rebuilds are simpler, easier to debug, and guarantee correctness. Reach for incremental when a table is genuinely too large and too slow to rebuild.

How it works

Two pieces make an incremental model work: a config that marks the model incremental, and an is_incremental() block that filters down to just the fresh rows.

{{ config(materialized='incremental', unique_key='order_id') }}

select * from {{ source('shop', 'orders') }}

{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

is_incremental() returns true only when all three conditions hold: the target table already exists, the model is materialized as incremental, and --full-refresh was not passed. When it is true, the where clause is active and only fresh rows reach the incremental logic.

{{ this }} is a dbt Jinja variable that resolves to the fully-qualified name of the model’s own target table, for example analytics.fct_orders. dbt substitutes it at compile time, so the sub-select always scans the live target table rather than a source.

The five strategies

The incremental_strategy tells dbt what to do with the rows that pass the is_incremental() filter. Choose based on your warehouse, data shape, and correctness requirements.

Full refresh and first run

When is_incremental() is false: on a model’s first run, or when you pass --full-refresh, dbt bypasses all incremental logic and rebuilds the entire table from scratch. Each strategy widget below also has a --full-refresh toggle to revisit this at any point.

select * from {{ source('shop', 'orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

fct_orders: before

rows this run processes

fct_orders: after

append

Inserts every incoming row without checking for duplicates.

{{ config(
  materialized='incremental',
  incremental_strategy='append'
) }}

Fast and cheap. Works for event tables where the is_incremental() filter is guaranteed to produce non-overlapping rows, or where downstream models deduplicate. If the model reruns and the filter is not tight enough, you get duplicates; step through the widget below to see order 3 land twice.

select * from {{ source('shop', 'orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

fct_orders: before

rows this run processes

fct_orders: after

merge

Compares incoming rows against the existing table on unique_key. Rows that match are updated in place; rows that do not match are inserted. Default strategy on Snowflake, BigQuery, Postgres, and Redshift when unique_key is set.

{{ config(
  materialized='incremental',
  unique_key='order_id'
) }}

The most broadly useful strategy. Step through the widget below, then toggle unique_key configured off to see what happens without it: order 3 is duplicated instead of updated, the classic incremental footgun.

select * from {{ source('shop', 'orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

fct_orders: before

rows this run processes

fct_orders: after

delete+insert

Deletes every existing row whose key appears in the incoming batch, then inserts the full batch. Produces the same final table as merge but uses DELETE + INSERT instead of a MERGE statement. The widget shows both SQL statements and the “replaced” tag in the result; same end state as merge, different mechanics.

{{ config(
  materialized='incremental',
  unique_key='order_id',
  incremental_strategy='delete+insert'
) }}
select * from {{ source('shop', 'orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

fct_orders: before

rows this run processes

fct_orders: after

insert_overwrite

Replaces entire partitions rather than individual rows. No unique_key; the granularity is the partition, not the row. Supported on Snowflake, Databricks, Spark, Redshift, and Postgres. Not supported on BigQuery.

{{ config(
  materialized='incremental',
  incremental_strategy='insert_overwrite',
  partition_by={'field': 'created_date', 'data_type': 'date'}
) }}

The key risk: if the batch for a partition does not contain all the rows that were previously there, the missing rows are gone. In the example below, e6 and e7 existed in the Jan 15 partition but were not in the batch; they are permanently lost after the overwrite.

fct_events: before

Jan 13
e1 click
e2 view
Jan 14
e3 click
e4 purchase
Jan 15 this run's partition
e5 click
e6 view
e7 purchase

batch arriving, Jan 15 only

e5 scroll updated
e8 purchase new

e6 and e7 are not in this batch.

fct_events: after

Jan 13
e1 click
e2 view
Jan 14
e3 click
e4 purchase
Jan 15 partition replaced
e5 scroll new
e8 purchase new
⚠ e6 (view) and e7 (purchase) were in Jan 15 but not in the batch; permanently lost

microbatch

Splits the run into one bounded query per time period (hour, day, month). Each batch is independent, so dbt can run them in parallel and retry just the failed batches. Built for large time-series tables where a single incremental query is too slow or too error-prone.

{{ config(
  materialized='incremental',
  incremental_strategy='microbatch',
  event_time='created_at',
  batch_size='day'
) }}

Supported on Snowflake, Databricks, Spark, Trino, and Athena. Not yet available on Postgres, BigQuery, or Redshift.

Instead of one query with an is_incremental() filter, dbt splits the run into one bounded query per time period, here per day. Each batch is independent, so dbt can run them in parallel and retry only the ones that fail.

{{ config(
    materialized='incremental',
    incremental_strategy='microbatch',
    event_time='created_at',
    batch_size='day',
    begin='2024-01-01',
) }}
select * from {{ ref('page_views') }}
Jun 9
Jun 10
Jun 11
Jun 12
Jun 13
Jun 14
Jun 15today
  • already loaded
  • this run's batches
  • failed batch

Advanced config

on_schema_change

Controls what happens when you add, remove, or change columns in the model SQL. The default ignore silently discards new columns, a frequent source of schema drift that only surfaces later.

In the widget below, the model SQL changes from [order_id, amount, legacy_field] to [order_id, amount, status]. Pick each option to see what the target table looks like after the next incremental run.

Target table (current)
order_id amount legacy_field
New model SQL columns
order_id amount status

merge_update_columns / merge_exclude_columns

Restrict which columns are updated during a merge. Useful when some columns are set once on insert and should never be overwritten; for example, created_at.

{{ config(
  materialized='incremental',
  unique_key='order_id',
  merge_update_columns=['status', 'updated_at']
) }}

incremental_predicates

Limits the scan of the existing target table during merge, reducing cost on large tables. Use DBT_INTERNAL_DEST to reference the target and DBT_INTERNAL_SOURCE for the new batch.

{{ config(
  materialized='incremental',
  unique_key='order_id',
  incremental_predicates=["DBT_INTERNAL_DEST.created_date >= dateadd('day', -7, current_date)"]
) }}

This tells dbt: only look for matching rows in the last 7 days of the target table, not the full history. Make sure your is_incremental() filter produces rows only within the predicate window, otherwise existing rows outside it will not be updated.


Use the widget below to compare all three row-level strategies side by side. All controls are live.

is_incremental() gates a where filter; the incremental_strategy decides how the rows it lets through land in the table. Flip the switches to see what each run writes to fct_orders.

select * from {{ source('shop', 'orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

dbt run -s fct_orders

fct_orders: before

rows this run processes

fct_orders: after

When to use this pattern

Reach for an incremental model when a table is large and append-mostly (event logs, clickstream, orders) and a full rebuild has become too slow or too expensive. For small tables, a plain table model is simpler and the extra moving parts are not worth it.