
Data Developer Workflow

Validate data changes throughout your development lifecycle. This guide covers validating changes before creating a PR (dev sessions) and iterating on feedback after your PR is open.

Goal: Validate data changes at every stage of development, from local work through PR merge.

Prerequisites

  • Recce Cloud account
  • dbt project with CI/CD configured for Recce
  • Access to your data warehouse

Development Stages

Before PR: Dev Sessions

Validate changes locally before pushing to remote. Dev sessions let you run Recce validation without creating a PR.

Upload via Web UI

  1. Go to Recce Cloud
  2. Navigate to your project
  3. Click New Dev Session
  4. Upload your dbt artifacts:
       • target/manifest.json
       • target/catalog.json
  5. Select your base environment for comparison

Expected result: Dev session opens with lineage diff showing your changes.

Upload via CLI

Run from your dbt project directory:

recce-cloud upload --type dev

This uploads your current target/ artifacts and creates a dev session.

Required files:

File          | Location | Generated by
manifest.json | target/  | dbt run, dbt build, or dbt compile
catalog.json  | target/  | dbt docs generate
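
Because the upload depends on both files being present, a small pre-flight check can catch a missing artifact before the upload is attempted. The sketch below is one way to do that in shell. The check_artifacts function is a hypothetical helper written for this guide, not part of recce-cloud, and it only confirms that each file exists and is non-empty.

```shell
# Hypothetical helper (not a recce-cloud command): report which required
# dbt artifacts are missing or empty in a target directory.
# Prints one path per missing file and returns non-zero if any are missing.
check_artifacts() {
  local target_dir="${1:-target}"
  local missing=0
  local name
  for name in manifest.json catalog.json; do
    # -s is true only when the file exists and has a size greater than zero
    if [ ! -s "${target_dir}/${name}" ]; then
      echo "${target_dir}/${name}"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, running check_artifacts target && recce-cloud upload --type dev would skip the upload whenever either file is missing.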

When to Use Dev Sessions

  • Testing changes before committing
  • Validating complex refactoring locally
  • Exploring impact without creating a PR
  • Sharing work-in-progress with teammates

After PR: CI/CD Validation

Once you push changes and open a PR, the Recce Agent validates automatically.

What Happens

  1. Your CI pipeline runs recce-cloud upload
  2. The agent compares your PR branch against the base branch
  3. The agent runs validation checks based on detected changes
  4. A data review summary posts to your PR

Understanding the Agent Summary

The summary includes:

  • Change overview - Which models changed and how
  • Impact analysis - Downstream models affected
  • Validation results - Schema diffs, row counts, and other checks
  • Recommendations - Suggested actions for review

Fixing Issues

When the agent identifies issues:

  1. Review the validation results in the PR comment
  2. Click Launch Recce to explore details in the web UI
  3. Identify the root cause using lineage and data diffs
  4. Make fixes in your branch
  5. Push changes - the agent re-validates automatically

Iterating Until Checks Pass

Each push triggers a new validation cycle:

  1. Agent re-analyzes your changes
  2. New validation results post to the PR
  3. Previous results are updated (not duplicated)
  4. Continue until all checks pass

Validation Techniques

Check Lineage First

Start with lineage diff to understand your change scope:

  • Modified models highlighted in the DAG
  • Downstream impact visible at a glance
  • Schema changes shown per model

Validate Metadata

Low-cost checks using model metadata:

  • Schema diff - Column additions, removals, type changes
  • Row count diff - Record count comparison (uses warehouse metadata)

Validate Data

Higher-cost checks that query your warehouse:

  • Value diff - Column-level match percentage
  • Profile diff - Statistical comparison (count, distinct, min, max, avg)
  • Histogram diff - Distribution changes for numeric columns
  • Top-K diff - Distribution changes for categorical columns

Custom Queries

For flexible validation, use query diff:

-- Monthly revenue from the orders model, newest month first
SELECT
    date_trunc('month', order_date) AS month,
    SUM(amount) AS revenue
FROM {{ ref('orders') }}
GROUP BY month
ORDER BY month DESC

Add queries to your checklist for repeated use.

Verification

Confirm your workflow works:

  1. Make a small model change locally
  2. Generate artifacts: dbt build && dbt docs generate
  3. Upload a dev session: recce-cloud upload --type dev
  4. Verify the session appears in Recce Cloud
  5. Create a PR and confirm the agent posts a summary
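
Steps 2 and 3 can be chained so that the upload only runs after both artifact-generating commands succeed. The following is a minimal sketch, assuming dbt and recce-cloud are available on your PATH. The names run_step and dev_session_smoke_test, and the DRY_RUN variable, are hypothetical and introduced here for illustration; they are not recce-cloud features.

```shell
# Hypothetical wrapper: run a command, or just print it when DRY_RUN=1.
# DRY_RUN is an illustrative variable for this sketch only.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Regenerate dbt artifacts, then upload them as a dev session.
# Each step runs only if the previous one succeeded.
dev_session_smoke_test() {
  run_step dbt build &&
  run_step dbt docs generate &&
  run_step recce-cloud upload --type dev
}
```

Setting DRY_RUN=1 before calling dev_session_smoke_test prints the three commands without executing them, which is a low-risk way to confirm the sequence before running it against your warehouse.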

Troubleshooting

Issue                      | Solution
Dev session upload fails   | Check artifacts exist in target/; run dbt docs generate
Agent doesn't run on PR    | Verify CI workflow includes recce-cloud upload
Validation results missing | Check warehouse credentials in CI secrets
Summary not appearing      | Confirm GITHUB_TOKEN has PR write permissions
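
For the "Agent doesn't run on PR" case, one quick check is to search your workflow files for the upload command. This sketch assumes GitHub Actions with workflows stored under .github/workflows, which is an assumption based on the GITHUB_TOKEN reference above; adjust the directory for other CI systems. The find_recce_upload_steps function is a hypothetical helper, not part of recce-cloud.

```shell
# Hypothetical helper: list files under a workflows directory that
# contain the string "recce-cloud upload".
# Returns non-zero if the directory is missing or no file matches.
find_recce_upload_steps() {
  # Default path assumes GitHub Actions; pass another directory if needed
  local workflows_dir="${1:-.github/workflows}"
  [ -d "$workflows_dir" ] || return 1
  grep -rl -- 'recce-cloud upload' "$workflows_dir"
}
```

If the function prints nothing, no workflow in that directory invokes the upload, which would explain why the agent never runs on the PR.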

Next Steps