Back to Resources
Data Engineering

Slowly Changing Dimensions: Types 1-6 Explained with Examples

Saad Selim
May 4, 2026
10 min read

Slowly Changing Dimensions (SCD) are dimension records in a data warehouse that change over time. A customer's address changes. An employee's department changes. A product's price changes. How you handle these changes determines whether your historical analysis is accurate.

The Problem

In a transactional database, you update the record and move on. But in a data warehouse used for analytics, you need to answer questions like:

  • "What was this customer's segment WHEN they placed that order?"
  • "How much revenue did the Northeast region generate last year?" (even though some accounts have since moved to a different region)

If you simply overwrite dimension values, you lose the ability to answer these historical questions accurately.

SCD Types

Type 0: Retain Original (Never Change)

The dimension value is fixed at the time of initial load and never updated, regardless of source changes.

Use case: Original customer acquisition channel, birth date, employee hire date.

Example: Customer's "original_signup_source" stays "Google Ads" forever, even if they later discover you through other channels.

Type 1: Overwrite (No History)

Replace the old value with the new value. No history preserved.

customer_idnamecity
101AliceNew York

Alice moves to Boston. After Type 1 update:

customer_idnamecity
101AliceBoston

Use case: Corrections (fixing typos), attributes where history does not matter (email format standardization).

Pros: Simple, no extra storage, no complexity. Cons: Cannot analyze historical state. "Revenue by customer city" will attribute all of Alice's past orders to Boston.

Type 2: Add New Row (Full History)

Insert a new row for the dimension record with effective dates. The old row is marked as expired.

customer_idnamecityeffective_fromeffective_tois_current
101AliceNew York2024-01-152026-03-01false
101AliceBoston2026-03-019999-12-31true

Use case: Any attribute where historical accuracy matters. Customer segment, territory assignment, product category.

Pros: Complete history, accurate point-in-time analysis. Cons: Table grows larger over time, joins require date range logic, more complex ETL.

Query pattern for Type 2:

-- Get customer's city at the time of their order
SELECT o.order_id, o.amount, c.city
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
    AND o.order_date >= c.effective_from
    AND o.order_date < c.effective_to;

Type 3: Add New Column (Limited History)

Add a column for the previous value. Only keeps one generation of history.

customer_idnamecurrent_cityprevious_citycity_change_date
101AliceBostonNew York2026-03-01

Use case: When you only need the current and immediately prior value (not full history).

Pros: No table growth, simple schema, no date range joins. Cons: Only one prior version kept. Subsequent changes overwrite the previous column.

Type 4: History Table (Separate)

Keep the current value in the main dimension table and store all historical changes in a separate history table.

Main table (current):

customer_idnamecity
101AliceBoston

History table:

customer_idcitychanged_fromchanged_to
101New York2024-01-152026-03-01
101Boston2026-03-01NULL

Use case: When the main dimension table must stay small and fast, but history is needed for auditing.

Pros: Main table stays compact, history available when needed. Cons: Requires joining to history table for point-in-time analysis.

Type 5: Type 4 + Type 1 (Mini-Dimension)

Combines a mini-dimension (frequently changing attributes) with the main dimension. The main dimension has a foreign key to the current mini-dimension row.

Use case: Large dimensions with a few rapidly changing attributes (e.g., customer demographics where income bracket and age group change frequently).

Type 6: Hybrid (Type 1 + 2 + 3 Combined)

Combines Type 1 (current value in every row), Type 2 (history rows), and Type 3 (previous value column):

surrogate_keycustomer_idcurrent_cityhistorical_cityeffective_fromeffective_tois_current
1001101BostonNew York2024-01-152026-03-01false
1002101BostonBoston2026-03-019999-12-31true

Note: current_city is always the LATEST value (updated across all rows via Type 1). historical_city is the value that was active during that row's effective period.

Use case: When you frequently need both current state and historical state in the same query.

Choosing the Right Type

ScenarioRecommended Type
Corrections and data cleaningType 1
Attributes with no analytical history needType 1
Anything requiring point-in-time accuracyType 2
Only need "before and after" comparisonType 3
Fast lookups + audit trailType 4
Large dimensions with mini-dimension patternType 5
Complex analytical needs (current + historical)Type 6

Implementation in dbt

Most modern data warehouses use dbt for SCD Type 2:

-- dbt snapshot (implements Type 2 automatically)
{% snapshot customer_snapshot %}
{{
    config(
      target_schema='snapshots',
      unique_key='customer_id',
      strategy='timestamp',
      updated_at='updated_at',
    )
}}
SELECT * FROM {{ source('raw', 'customers') }}
{% endsnapshot %}

dbt snapshots automatically track changes, add effective_from/effective_to columns, and maintain the is_current flag.

Performance Considerations

  1. Type 2 tables grow. A customer dimension with 1M customers where 10% change quarterly will have 1.4M rows after a year. Plan storage and indexing accordingly.
  2. Index effective dates. Queries on Type 2 dimensions always filter on date ranges. Index (business_key, effective_from, effective_to).
  3. Partition by is_current. Most queries only need current records. Partition or create a view for current-only access.
  4. Surrogate keys. Type 2 requires surrogate keys (auto-incrementing IDs) because the natural business key is no longer unique.

Summary

TypeApproachHistory?Complexity
0Never updateOriginal onlyMinimal
1OverwriteNoneLow
2New row with datesFullMedium
3Previous value columnOne generationLow
4Separate history tableFull (separate)Medium
5Mini-dimensionPartialHigh
6Hybrid (1+2+3)Full + currentHigh

Type 2 is the most common in practice because it balances complexity with analytical power. Use Type 1 for attributes that do not matter historically. Reserve Types 5-6 for edge cases that demand both performance and complete history.

Share this article

Saad Selim

The Skopx engineering and product team

Stay Updated

Get the latest insights on AI-powered code intelligence delivered to your inbox.