1. Introduction: Why Data Contracts Matter in Modern Analytics

Modern analytics stacks are becoming increasingly complex, with data flowing from numerous sources through multiple transformation layers before reaching dashboards, reports, and ML models. Without clear agreements about data structure, meaning, and quality, this complexity quickly leads to instability and distrust.

This is where data contracts come in: they are formal agreements that define what data should look like at different stages of your pipeline.

Data contracts aren’t just nice-to-have documentation; they’re executable specifications that verify your data matches expected patterns, ensuring:

  • Prevention of silent data pipeline failures
  • Immediate detection of schema violations
  • Clearer communication between data producers and consumers
  • Reduced time spent debugging unexpected data issues

In this guide, we’ll explore how to implement robust data contracts in dbt using the recently added schema contract enforcement features. You’ll learn implementation strategies, best practices, and how to integrate these contracts into your broader data quality framework.

2. Understanding dbt Contracts: More Than Just Types

Since version 1.5, dbt-core has offered contract enforcement capabilities that go beyond simple type annotations. Here’s what they enable you to do:

Core Contract Capabilities

  • Define expected column types for models
  • Enforce presence of required columns
  • Turn enforcement on or off per model to match your team’s needs
  • Validate against physical schemas in your data warehouse

What’s New in dbt-core 1.5+

dbt-core 1.5 and later pair contract enforcement with:

  • Constraints on column values (e.g., not_null, check), which the warehouse enforces where it supports them
  • Model versions for graceful contract evolution
  • Error messages that list each mismatched column

Together, these features turn the column definitions in your YAML files from passive documentation into a schema that dbt checks every time the model is built.

3. When to Use Contracts: Strategic Implementation

Data contracts aren’t needed everywhere, and over-implementing them can create unnecessary maintenance overhead. Here’s a strategic approach to where contracts deliver the most value:

High-Value Use Cases

  1. Interface layers between teams or systems
  2. Published data products consumed by multiple stakeholders
  3. Critical analytical tables that power key business decisions
  4. External data sharing with partners or customers
  5. Models with strict SLAs where failures are costly

Lower-Value Use Cases

  1. Internal intermediate models
  2. Exploratory or experimental models
  3. Rapidly evolving models during development

📌 Key Insight: Focus contract enforcement where stability and reliability matter most: interfaces, outputs, and critical business entities. Don’t try to contract everything at once.

4. How to Implement dbt Contracts: A Step-by-Step Guide

Let’s walk through implementing contracts in dbt from basic to advanced patterns.

Setting Up Your Project for Contracts

First, make sure you’re using dbt-core version 1.5 or later. Then, add these configurations to your dbt_project.yml to enable contracts:

# dbt_project.yml
 
models:
  your_project_name:
    +contract:
      enforced: false  # Default: don't enforce contracts
    marts:  # Apply contracts to exposed marts/dimensional models
      +contract:
        enforced: true  # Enable contract enforcement

Basic Column-Level Contracts

For your first model contract, declare every column the model returns along with its expected type. An enforced contract has to list all of them:

# models/marts/core/schema.yml
 
version: 2
 
models:
  - name: dim_customers
    description: "Core customer dimension with validated schema"
    config:
      contract:
        enforced: true
    columns:
      - name: customer_id
        data_type: varchar
        description: "Primary key for the customer dimension"
        constraints:
          - type: not_null
          - type: unique
      - name: customer_email
        data_type: varchar
        description: "Customer email address"
        constraints:
          - type: not_null
      - name: customer_name
        data_type: varchar
        description: "Customer full name"
      - name: signup_date
        data_type: date
        description: "Date when customer signed up"
      - name: total_orders
        data_type: integer
        description: "Count of customer's lifetime orders"
        constraints:
          - type: not_null
          - type: check
            expression: "total_orders >= 0"

What Happens Behind the Scenes

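When you run dbt build, dbt will:

  1. Run a preflight check that compares the columns returned by dim_customers against the contract
  2. Fail the build before the table is created if:
    • A declared column is missing, or the model returns a column the contract doesn’t declare
    • Column data types don’t match
  3. Add the declared constraints to the DDL that creates the table, so the warehouse rejects violating rows for the constraint types it enforces (support varies by warehouse)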

Database-Specific Data Types

dbt contracts compare the data_type you declare with the type your warehouse reports, so a contract should use type names your warehouse recognizes. Here’s a reference table of roughly equivalent types across warehouses:

Generic Type | Snowflake | BigQuery  | Redshift  | Postgres
varchar      | VARCHAR   | STRING    | VARCHAR   | VARCHAR
integer      | INTEGER   | INT64     | INTEGER   | INTEGER
float        | FLOAT     | FLOAT64   | FLOAT     | FLOAT
numeric      | NUMERIC   | NUMERIC   | NUMERIC   | NUMERIC
boolean      | BOOLEAN   | BOOL      | BOOLEAN   | BOOLEAN
timestamp    | TIMESTAMP | TIMESTAMP | TIMESTAMP | TIMESTAMP
date         | DATE      | DATE      | DATE      | DATE
array        | ARRAY     | ARRAY     | SUPER     | ARRAY
object       | OBJECT    | STRUCT    | SUPER     | JSONB

When you move a contract between warehouses, translate each data_type using the matching column.
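As an illustration of reading the table, here is a sketch of how the dim_customers contract from earlier might look on BigQuery. It assumes the BigQuery column of the table is the right translation for each type, so check your adapter’s documentation before relying on it:

```yaml
# models/marts/core/schema.yml (BigQuery variant, for illustration only)
version: 2

models:
  - name: dim_customers
    config:
      contract:
        enforced: true
    columns:
      - name: customer_id
        data_type: string   # varchar on Snowflake, Redshift, and Postgres
        constraints:
          - type: not_null
      - name: customer_email
        data_type: string
      - name: customer_name
        data_type: string
      - name: signup_date
        data_type: date     # same name on all four warehouses
      - name: total_orders
        data_type: int64    # integer on the other three warehouses
```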

5. Advanced Contract Enforcement Strategies

Once you’re comfortable with basic contracts, you can implement more sophisticated enforcement approaches.

Contract Strictness Levels

dbt controls contract enforcement with a single per-model switch, enforced. There is no separate strictness setting:

# models/marts/finance/schema.yml
models:
  - name: fct_transactions
    config:
      contract:
        enforced: true # Options: true, false (default)

Enforcement Levels:

  • enforced: true: Requires an exact schema match. The model must return every declared column with the declared data type, and nothing else
  • enforced: false: No contract check runs, and data_type values serve as documentation only

Partial Contracts

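An enforced contract has to declare every column the model returns, so you can’t enforce only a subset in place. For large models, one option is to expose the critical columns through a narrower model and put the contract there:

# models/marts/finance/schema.yml
models:
  - name: large_analytical_model_core  # Hypothetical model that selects only the critical columns
    config:
      contract:
        enforced: true
    columns:
      # Every column this narrower model returns must be listed
      - name: transaction_id
        data_type: varchar
      - name: amount
        data_type: numeric(18,2)
  # large_analytical_model itself keeps its other columns and carries no contract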

Conditional Contract Enforcement

Use dbt’s macro system to conditionally enforce contracts in different environments:

-- models/marts/core/dim_products.sql
 
{{
  config(
    contract = {
      'enforced': env_var('DBT_ENVIRONMENT', 'development') == 'production'
    }
  )
}}
 
select
  product_id,
  product_name,
  category,
  price
from {{ ref('stg_products') }}

This approach allows you to:

  • Enforce contracts strictly in production
  • Be more permissive during development and testing

6. Real-World Implementation: Data Contract Patterns

Let’s explore practical contract patterns for different types of models.

Pattern 1: Core Dimensional Models

For dimension tables that represent core business entities:

# models/marts/core/schema.yml
models:
  - name: dim_customers
    config:
      contract:
        enforced: true
    columns:
      - name: customer_id
        data_type: varchar
        constraints:
          - type: not_null
          - type: unique
      - name: customer_email
        data_type: varchar
        tests:
          - not_null
          - unique
      - name: first_name
        data_type: varchar
      - name: last_name
        data_type: varchar
      - name: full_name
        data_type: varchar
      - name: signup_date
        data_type: date
      - name: customer_status
        data_type: varchar
        constraints:
          - type: check
            expression: "customer_status in ('active', 'inactive', 'churned')"
      - name: is_deleted
        data_type: boolean
        constraints:
          - type: not_null

Pattern 2: Fact Tables with Constraints

For fact tables that capture business events:

# models/marts/sales/schema.yml
models:
  - name: fct_orders
    config:
      contract:
        enforced: true
    columns:
      - name: order_id
        data_type: varchar
        constraints:
          - type: not_null
          - type: unique
      - name: customer_id
        data_type: varchar
        constraints:
          - type: not_null
        tests:
          - relationships:
              to: ref('dim_customers')
              field: customer_id
      - name: order_date
        data_type: date
        constraints:
          - type: not_null
      - name: order_status
        data_type: varchar
        constraints:
          - type: not_null
          - type: check
            expression: "order_status in ('pending', 'processing', 'shipped', 'delivered', 'cancelled')"
      - name: item_count
        data_type: integer
        constraints:
          - type: not_null
          - type: check
            expression: "item_count >= 1"
      - name: order_amount
        data_type: numeric(18,2)
        constraints:
          - type: not_null
      - name: shipping_cost
        data_type: numeric(18,2)
      - name: tax_amount
        data_type: numeric(18,2)
      - name: total_amount
        data_type: numeric(18,2)
        constraints:
          - type: not_null

Pattern 3: Data Product APIs

For models explicitly exposed to consumers:

# models/apis/public_data_products/schema.yml
models:
  - name: product_api_daily_sales
    description: |
      Public data product showing daily sales aggregates.
      This model has a strict contract that will not change
      without explicit versioning and migration support.
    config:
      contract:
        enforced: true
    columns:
      - name: date_day
        data_type: date
        constraints:
          - type: not_null  # Not unique on its own: the model has one row per day per product
      - name: product_id
        data_type: varchar
        constraints:
          - type: not_null
      - name: product_name
        data_type: varchar
        constraints:
          - type: not_null
      - name: category
        data_type: varchar
        constraints:
          - type: not_null
      - name: total_quantity_sold
        data_type: integer
        constraints:
          - type: not_null
          - type: check
            expression: "total_quantity_sold >= 0"
      - name: total_revenue
        data_type: numeric(18,2)
        constraints:
          - type: not_null
      - name: average_unit_price
        data_type: numeric(18,2)

7. Implementing Contracts with Existing Tests

Data contracts and tests serve complementary purposes:

  • Contracts enforce structural expectations (columns, types)
  • Tests verify data quality expectations (uniqueness, relationships)

Here’s how to implement both effectively:

Combine Contracts with Tests

For comprehensive validation, combine contracts with tests:

# models/marts/core/schema.yml
models:
  - name: dim_products
    config:
      contract:
        enforced: true
    columns:
      - name: product_id
        data_type: varchar
        constraints:
          - type: not_null
        tests:
          - unique
      - name: product_name
        data_type: varchar
        # No not_null constraint here: the test below tolerates a small share of nulls
        tests:
          - dbt_utils.not_null_proportion:  # From the dbt_utils package
              at_least: 0.99  # Allow up to 1% missing names
      - name: category_id
        data_type: varchar
        constraints:
          - type: not_null
        tests:
          - relationships:
              to: ref('dim_categories')
              field: category_id
      - name: price
        data_type: numeric(18,2)
        tests:
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
              max_value: 1000
              mostly: 0.95  # Allow some outliers

Testing vs. Constraints: When to Use Each

Validation Type | Use Constraints When                  | Use Tests When
Not null checks | Critical for model functionality      | Monitoring data quality
Uniqueness      | Core identity requirement             | Statistical validation
Accepted values | Small, stable set of values           | Larger, changing set of values
Data boundaries | Hard limits that shouldn’t be crossed | Statistical ranges with exceptions
Relationships   | N/A - Use tests for this              | Always use tests for relationships

📌 Best Practice: Use constraints for structural guarantees that should never be violated, and tests for data quality checks that may have exceptions or require monitoring.
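To make the split concrete, here is a minimal sketch on a hypothetical fct_payments model (the model, its columns, and the accepted values are all assumptions for illustration). The hard limit is a check constraint, while the value list that may grow over time is a test that only warns:

```yaml
# models/marts/finance/schema.yml
models:
  - name: fct_payments  # Hypothetical model
    config:
      contract:
        enforced: true
    columns:
      - name: payment_id
        data_type: varchar
        constraints:
          - type: not_null  # Structural guarantee: never violated
      - name: amount
        data_type: numeric(18,2)
        constraints:
          - type: check  # Hard limit that shouldn't be crossed
            expression: "amount >= 0"
      - name: payment_method
        data_type: varchar
        tests:
          - accepted_values:  # Changing set of values: monitor instead of block
              values: ['card', 'bank_transfer', 'voucher']
              config:
                severity: warn
```

With severity set to warn, a new payment method shows up in the test results without stopping the build, whereas a negative amount is rejected by warehouses that enforce check constraints.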

8. Evolving Data Contracts Over Time

Data contracts shouldn’t be rigid - they need to evolve. Here’s how to manage that evolution:

Contract Versioning Strategy

  1. Minor changes (adding optional columns, relaxing constraints):

    • Can be done without breaking consumers
    • Update documentation and notify users
  2. Major changes (removing columns, changing types, adding required fields):

    • Create a new version of the model
    • Support both versions during migration period
    • Explicitly deprecate the old version

Example: Versioning a Contract

# Original model
models:
  - name: customer_api_v1
    config:
      contract:
        enforced: true
    columns:
      - name: customer_id
        data_type: varchar
      - name: email
        data_type: varchar
      - name: name
        data_type: varchar

When adding a breaking change:

# New version with breaking changes
models:
  - name: customer_api_v2
    config:
      contract:
        enforced: true
    columns:
      - name: customer_id
        data_type: varchar
      - name: email
        data_type: varchar
      - name: first_name  # Split name into components
        data_type: varchar
      - name: last_name
        data_type: varchar
      - name: phone  # New required field
        data_type: varchar
        constraints:
          - type: not_null
 
  # Keep old version during transition
  - name: customer_api_v1
    config:
      contract:
        enforced: true
      materialized: view  # Make it a view on top of v2
    columns:
      - name: customer_id
        data_type: varchar
      - name: email
        data_type: varchar
      - name: name
        data_type: varchar

With corresponding SQL for backward compatibility:

-- models/apis/customer_api_v1.sql
{{
  config(
    contract = {
      'enforced': true
    },
    materialized = 'view'
  )
}}
 
select
  customer_id,
  email,
  concat(first_name, ' ', last_name) as name
from {{ ref('customer_api_v2') }}
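dbt-core 1.5 also ships native model versions, which can express the same migration under a single model name. The following is a sketch under assumptions: it expects the SQL files to be named customer_api_v1.sql and customer_api_v2.sql, and it materializes v2 as a table so the not_null constraint can be applied:

```yaml
# models/apis/schema.yml
models:
  - name: customer_api
    latest_version: 2
    config:
      contract:
        enforced: true
      materialized: table
    # Top-level columns are shared by every version unless overridden
    columns:
      - name: customer_id
        data_type: varchar
      - name: email
        data_type: varchar
      - name: first_name
        data_type: varchar
      - name: last_name
        data_type: varchar
      - name: phone
        data_type: varchar
        constraints:
          - type: not_null
    versions:
      - v: 2  # Uses the top-level columns unchanged
      - v: 1
        config:
          materialized: view
        columns:
          # Start from the top-level columns and drop what v1 never had
          - include: all
            exclude: [first_name, last_name, phone]
          - name: name
            data_type: varchar
```

Consumers can pin a version with ref('customer_api', v=1), while an unpinned ref('customer_api') resolves to latest_version.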

9. Handling Failures and Troubleshooting

When contracts fail, you need clear debugging paths:

Common Contract Failure Scenarios

Failure Type             | Example Symptom                                                          | Troubleshooting Steps
Missing column           | The contract declares customer_status but the model doesn’t return it   | Check model SQL for missing column, verify it’s being selected
Type mismatch            | The contract declares numeric but the model returns varchar             | Examine source data, add explicit casting in model
Constraint violation     | The warehouse rejects nulls in order_id, which has a not_null constraint | Check for null handling in joins, verify source data quality
Contract reference error | The contract for dim_products isn’t picked up                           | Check schema file paths, verify model name spelling

Debugging Contract Issues

When you encounter contract errors, here’s a systematic approach:

  1. Examine the error message for specific column and type information
  2. Preview model data with a simple SELECT to verify actual types and values
  3. Check model SQL for missing columns or incorrect transformations
  4. Verify upstream data hasn’t changed unexpectedly
  5. Consider if contract is too strict for the current use case

Example troubleshooting command:

# Run with --fail-fast to stop at the first error
dbt build --select my_model --fail-fast

Then check the compiled SQL and actual output:

# Compile the model without running it
dbt compile --select my_model
# Check the compiled SQL under ./target/compiled/{project_name}/models/ (the path mirrors your models directory)
# Run this in your warehouse to examine actual data and types

10. Advanced Patterns: Beyond Basic Contracts

Let’s explore some advanced patterns for large-scale implementations:

Contract Inheritance

For related models that share similar contracts:

# Define reusable column definitions using YAML anchors
base_contracts:
  - &user_id_column
    name: user_id
    data_type: varchar
    constraints:
      - type: not_null
      - type: unique
  - &email_column
    name: email
    data_type: varchar
    constraints:
      - type: not_null
  - &signup_date_column
    name: signup_date
    data_type: timestamp
 
models:
  # Reuse the base columns
  - name: dim_users
    config:
      contract:
        enforced: true
    columns:
      - *user_id_column
      - *email_column
      - *signup_date_column

  # Extend the base columns
  - name: dim_premium_users
    config:
      contract:
        enforced: true
    columns:
      # Include base columns (YAML can't splice one list into another,
      # so each column is referenced individually)
      - *user_id_column
      - *email_column
      - *signup_date_column
      # Add more columns
      - name: subscription_level
        data_type: varchar
      - name: monthly_fee
        data_type: numeric(12,2)

Automated Contract Generation

For existing models without contracts, you can bootstrap contracts using dbt’s run-operation:

-- macros/generate_model_contract.sql
{% macro generate_model_contract(model_name) %}
  {% set relation = adapter.get_relation(
      database=target.database,
      schema=target.schema,
      identifier=model_name
  ) %}
  
  {% if relation %}
    {% set columns = adapter.get_columns_in_relation(relation) %}
    
    {% if execute %}
      {{ log('# Contract for ' ~ model_name ~ ':', info=True) }}
      {{ log('columns:', info=True) }}
      
      {% for column in columns %}
        {{ log('  - name: ' ~ column.name, info=True) }}
        {{ log('    data_type: ' ~ column.data_type, info=True) }}
      {% endfor %}
    {% endif %}
  {% else %}
    {{ exceptions.raise_compiler_error("Model " ~ model_name ~ " does not exist in the current environment.") }}
  {% endif %}
{% endmacro %}

Run it to generate a contract yaml:

dbt run-operation generate_model_contract --args '{"model_name": "dim_customers"}'

11. Integration with dbt Project Structure

Contracts should be integrated thoughtfully into your overall dbt project structure:

Organizing Contract Files

For large projects, consider these options:

  1. Embedded in schema.yml (simplest approach)

    models/
      marts/
        core/
          schema.yml  # Contains both tests and contracts
    
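  2. Dedicated contract files (for complex projects). dbt accepts only one YAML entry per model, so a contracted model’s tests and docs live alongside its contract:

    models/
      marts/
        core/
          schema.yml       # Tests and docs for models without contracts
          contracts.yml    # Contracted models, including their tests and docs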
    

Contract Implementation by Layer

Different layers of your dbt project require different contract approaches:

Layer               | Contract Approach                 | Example
Sources             | Monitoring, not enforcement       | Source freshness checks
Staging             | Light contracts on critical fields | Basic type enforcement
Intermediate        | Minimal contracts                 | Focus on critical models only
Marts/Dimensions    | Comprehensive contracts           | Full schema enforcement
APIs/Exposed models | Strict contracts                  | Version and document carefully
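One way to encode these layers is through directory-level configuration in dbt_project.yml. This is a sketch that assumes your models live in staging, intermediate, marts, and apis directories, and the contract-critical tag is only a naming convention:

```yaml
# dbt_project.yml
models:
  your_project_name:
    staging:
      +contract:
        enforced: false  # Opt in per model in its own YAML where a contract is needed
    intermediate:
      +contract:
        enforced: false
    marts:
      +contract:
        enforced: true   # Every model here must declare all columns and data types
    apis:
      +contract:
        enforced: true
      +tags: ["contract-critical"]  # Lets CI select these with tag:contract-critical
```

Turning enforcement on for a whole directory means any model in it without a complete column list will fail to build, so it is usually easier to enable it once the existing models are already described.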

Naming Conventions for Versioned Contracts

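For explicit contract versioning, keep each version in its own directory. Model names must stay unique across the project, so the version goes in the file name as well:

models/
  apis/
    v1/
      customer_api_v1.sql
      schema.yml
    v2/
      customer_api_v2.sql
      schema.yml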

12. Integrating Contracts with Your Data Development Lifecycle

To get maximum value from contracts, integrate them into your full development cycle:

CI/CD Integration

  1. Run contract validation in CI:

    # .github/workflows/dbt-contracts.yml
    name: Validate dbt Contracts
     
    on:
      pull_request:
        branches: [ main ]
     
    jobs:
      validate-contracts:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v3
          - name: Setup dbt
            uses: dbt-labs/dbt-github-actions/setup@v1.0
          - name: Run contract validation
            run: dbt build --select tag:contract-critical  # Contracts are checked when models are built, not when they are compiled
  2. Pre-commit hooks for early feedback:

    # .pre-commit-config.yaml
    repos:
    - repo: local
      hooks:
        - id: dbt-compile
          name: dbt compile
          entry: dbt compile --select tag:contract-critical
          language: system
          pass_filenames: false

Documentation Integration

Enhance your dbt docs with contract information:

# models/marts/core/schema.yml
models:
  - name: dim_customers
    description: |
      Core customer dimension table.
      
      ## Contract Information
      This model has a strict contract that guarantees:
      - Every row has a unique customer_id
      - Email addresses are always present
      - Customer status is one of: active, inactive, churned
      
      See our [data contract documentation](link-to-docs) for details on
      breaking vs. non-breaking changes.
    config:
      contract:
        enforced: true

This documentation will appear in your dbt docs site, helping users understand the guarantees provided by the contract.

13. Measuring Contract Effectiveness

To demonstrate the value of contracts, track metrics like:

  1. Contract coverage:

    • % of critical models with contracts
    • % of columns under contract enforcement
  2. Contract violations:

    • Count of contract failures caught in CI/CD
    • Time saved by early detection
  3. Consumer impact:

    • Reduction in downstream data quality issues
    • Decreased time spent debugging schema issues

Use these metrics to guide your contract implementation strategy and show the business value of your data quality investments.

14. Conclusion: The Future of Data Contracts in dbt

Data contracts represent a significant advancement in how we manage data quality and reliability in analytics workflows. By defining and enforcing expectations about data structure directly in dbt, we create more resilient pipelines and clearer communication between data producers and consumers.

As you implement data contracts in your organization:

  1. Start small with your most critical models
  2. Balance strictness against development velocity
  3. Integrate contracts with your existing testing strategy
  4. Document your contracts for stakeholders
  5. Version and evolve contracts intentionally

Remember that contracts are not a silver bullet for data quality, but part of a comprehensive approach that includes testing, monitoring, and governance. Use them strategically to reinforce the foundation of your data platform and build trust in your analytics outputs.

What are your experiences with data contracts in dbt? Have you found certain patterns particularly effective? Share your thoughts in the comments!
