Data Developer Workflow
Validate data changes throughout your development lifecycle. This guide covers validating changes before creating a PR (dev sessions) and iterating on feedback after your PR is open.
Goal: Validate data changes at every stage of development, from local work through PR merge.
Prerequisites
- Recce Cloud account
- dbt project with CI/CD configured for Recce
- Access to your data warehouse
Development Stages
Before PR: Dev Sessions
Validate changes locally before pushing to remote. Dev sessions let you run Recce validation without creating a PR.
Upload via Web UI
- Go to Cloud
- Navigate to your project
- Click New Dev Session
- Upload your dbt artifacts:
  - target/manifest.json
  - target/catalog.json
- Select your base environment for comparison
Expected result: Dev session opens with lineage diff showing your changes.
Upload via CLI
Run from your dbt project directory:
recce-cloud upload --type dev
This uploads your current target/ artifacts and creates a dev session.
Required files:
| File | Location | Generated by |
|---|---|---|
| manifest.json | target/ | dbt run, dbt build, or dbt compile |
| catalog.json | target/ | dbt docs generate |
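Before uploading, it can help to confirm that both artifacts are present. The following is a minimal Python sketch, not part of Recce: the helper name `missing_artifacts` is hypothetical, and it only checks that the two files listed above exist under target/.

```python
from pathlib import Path

# The two artifacts listed in the table above.
REQUIRED_ARTIFACTS = ("manifest.json", "catalog.json")


def missing_artifacts(target_dir="target"):
    """Return the names of required artifacts not found in target_dir."""
    target = Path(target_dir)
    return [name for name in REQUIRED_ARTIFACTS if not (target / name).is_file()]


if __name__ == "__main__":
    missing = missing_artifacts()
    if missing:
        print("Missing artifacts:", ", ".join(missing))
    else:
        print("All required artifacts found in target/")
```

If catalog.json is reported missing, running dbt docs generate should produce it.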
When to Use Dev Sessions
- Testing changes before committing
- Validating complex refactoring locally
- Exploring impact without creating a PR
- Sharing work-in-progress with teammates
After PR: CI/CD Validation
Once you push changes and open a PR, the Recce Agent validates automatically.
What Happens
- Your CI pipeline runs recce-cloud upload
- The agent compares your PR branch against the base branch
- The agent runs validation checks based on detected changes
- A data review summary posts to your PR
Understanding the Agent Summary
The summary includes:
- Change overview - Which models changed and how
- Impact analysis - Downstream models affected
- Validation results - Schema diffs, row counts, and other checks
- Recommendations - Suggested actions for review
Fixing Issues
When the agent identifies issues:
- Review the validation results in the PR comment
- Click Launch Recce to explore details in the web UI
- Identify the root cause using lineage and data diffs
- Make fixes in your branch
- Push changes - the agent re-validates automatically
Iterating Until Checks Pass
Each push triggers a new validation cycle:
- Agent re-analyzes your changes
- New validation results post to the PR
- Previous results are updated (not duplicated)
- Continue until all checks pass
Validation Techniques
Check Lineage First
Start with lineage diff to understand your change scope:
- Modified models highlighted in the DAG
- Downstream impact visible at a glance
- Schema changes shown per model
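To make the idea of a lineage diff concrete, here is a rough Python sketch of how modified models and their downstream impact could be derived from two manifest-style dictionaries. It is an illustration only, not how Recce computes its lineage diff. It assumes the `nodes`, `checksum`, and `child_map` keys found in dbt's manifest.json (worth verifying against your dbt version), and the function names and sample node IDs are hypothetical.

```python
from collections import deque


def modified_models(base_manifest, current_manifest):
    """Return node IDs that are new or whose checksum differs from base."""
    base_nodes = base_manifest.get("nodes", {})
    changed = []
    for node_id, node in current_manifest.get("nodes", {}).items():
        base_node = base_nodes.get(node_id)
        if base_node is None or base_node.get("checksum") != node.get("checksum"):
            changed.append(node_id)
    return sorted(changed)


def downstream_of(manifest, start_ids):
    """Breadth-first walk of child_map to collect everything downstream."""
    child_map = manifest.get("child_map", {})
    seen = set()
    queue = deque(start_ids)
    while queue:
        for child in child_map.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen - set(start_ids))


# Hypothetical sample data shaped like a trimmed-down manifest.json.
base = {
    "nodes": {
        "model.shop.orders": {"checksum": {"name": "sha256", "checksum": "aaa"}},
        "model.shop.revenue": {"checksum": {"name": "sha256", "checksum": "bbb"}},
    }
}
current = {
    "nodes": {
        "model.shop.orders": {"checksum": {"name": "sha256", "checksum": "ccc"}},
        "model.shop.revenue": {"checksum": {"name": "sha256", "checksum": "bbb"}},
    },
    "child_map": {
        "model.shop.orders": ["model.shop.revenue"],
        "model.shop.revenue": [],
    },
}

changed = modified_models(base, current)
print(changed)                          # ['model.shop.orders']
print(downstream_of(current, changed))  # ['model.shop.revenue']
```

This sketch does not report models removed in the current branch; a fuller version would also compare in the other direction.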
Validate Metadata
Low-cost checks using model metadata:
- Schema diff - Column additions, removals, type changes
- Row count diff - Record count comparison (uses warehouse metadata)
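As an illustration of what a schema diff reports, the Python sketch below compares two column-to-type mappings and lists additions, removals, and type changes. It is not Recce's implementation. The function names are hypothetical, and `columns_from_catalog` assumes the `nodes` / `columns` / `type` layout of dbt's catalog.json.

```python
def schema_diff(base_columns, current_columns):
    """Compare two {column_name: type} mappings."""
    added = sorted(set(current_columns) - set(base_columns))
    removed = sorted(set(base_columns) - set(current_columns))
    type_changed = sorted(
        name
        for name in set(base_columns) & set(current_columns)
        if base_columns[name] != current_columns[name]
    )
    return {"added": added, "removed": removed, "type_changed": type_changed}


def columns_from_catalog(catalog, node_id):
    """Extract {column_name: type} for one node of a catalog.json-style dict."""
    columns = catalog.get("nodes", {}).get(node_id, {}).get("columns", {})
    return {name: meta.get("type") for name, meta in columns.items()}


# Hypothetical column sets for one model in the base and current environments.
base_cols = {"order_id": "INTEGER", "amount": "NUMERIC", "status": "TEXT"}
current_cols = {"order_id": "BIGINT", "amount": "NUMERIC", "order_date": "DATE"}

print(schema_diff(base_cols, current_cols))
# {'added': ['order_date'], 'removed': ['status'], 'type_changed': ['order_id']}
```

Because this comparison reads only artifact metadata, it needs no warehouse queries, which is why such checks are low-cost.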
Validate Data
Higher-cost checks that query your warehouse:
- Value diff - Column-level match percentage
- Profile diff - Statistical comparison (count, distinct, min, max, avg)
- Histogram diff - Distribution changes for numeric columns
- Top-K diff - Distribution changes for categorical columns
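The "column-level match percentage" behind a value diff can be pictured with a small Python sketch. This is one plausible reading of the metric, assuming rows are joined on a primary key and only rows present in both environments are compared. Recce's actual handling of nulls and added or removed rows may differ, and the function name and sample data are hypothetical.

```python
def value_match_percentages(base_rows, current_rows, key):
    """Per-column percentage of matching values among rows present in both sets.

    base_rows and current_rows are lists of dicts; key names the primary-key column.
    """
    base_by_key = {row[key]: row for row in base_rows}
    current_by_key = {row[key]: row for row in current_rows}
    common_keys = base_by_key.keys() & current_by_key.keys()
    if not common_keys:
        return {}

    # Only compare columns that exist on both sides, excluding the key itself.
    columns = set()
    for k in common_keys:
        columns |= base_by_key[k].keys() & current_by_key[k].keys()
    columns.discard(key)

    result = {}
    for col in sorted(columns):
        matches = sum(
            1
            for k in common_keys
            if base_by_key[k].get(col) == current_by_key[k].get(col)
        )
        result[col] = 100.0 * matches / len(common_keys)
    return result


# Hypothetical rows from the same model in base and current environments.
base_rows = [
    {"id": 1, "amount": 10, "status": "paid"},
    {"id": 2, "amount": 20, "status": "paid"},
    {"id": 3, "amount": 30, "status": "open"},
    {"id": 4, "amount": 40, "status": "open"},
]
current_rows = [
    {"id": 1, "amount": 10, "status": "paid"},
    {"id": 2, "amount": 25, "status": "paid"},
    {"id": 3, "amount": 30, "status": "open"},
    {"id": 4, "amount": 40, "status": "open"},
]

print(value_match_percentages(base_rows, current_rows, key="id"))
# {'amount': 75.0, 'status': 100.0}
```

In a warehouse this comparison runs as a query over full tables, which is what makes it a higher-cost check than a metadata diff.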
Custom Queries
For flexible validation, use query diff:
-- Monthly revenue from the orders model, newest month first
SELECT
    date_trunc('month', order_date) AS month,
    SUM(amount) AS revenue
FROM {{ ref('orders') }}
GROUP BY month
ORDER BY month DESC
Add queries to your checklist for repeated use.
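Conceptually, a query diff runs the same query against the base and current environments and compares the results. The Python sketch below imitates that with two in-memory SQLite databases. It is an illustration, not Recce's implementation: `strftime` stands in for `date_trunc`, a plain `orders` table stands in for the dbt `ref()`, and the helper names and sample rows are hypothetical.

```python
import sqlite3

# SQLite stand-in for the query above: strftime replaces date_trunc,
# and a plain table name replaces the dbt ref().
MONTHLY_REVENUE_SQL = """
SELECT strftime('%Y-%m', order_date) AS month,
       SUM(amount) AS revenue
FROM orders
GROUP BY month
ORDER BY month DESC
"""


def make_db(rows):
    """Create an in-memory database with a hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_date TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


def query_diff(base_conn, current_conn, sql):
    """Run sql in both environments; return differing rows keyed by first column."""
    base = {row[0]: row[1:] for row in base_conn.execute(sql)}
    current = {row[0]: row[1:] for row in current_conn.execute(sql)}
    return {
        key: (base.get(key), current.get(key))
        for key in sorted(base.keys() | current.keys())
        if base.get(key) != current.get(key)
    }


base_conn = make_db(
    [("2024-01-15", 100.0), ("2024-01-20", 50.0), ("2024-02-03", 75.0)]
)
current_conn = make_db(
    [("2024-01-15", 100.0), ("2024-01-20", 50.0), ("2024-02-03", 80.0)]
)

print(query_diff(base_conn, current_conn, MONTHLY_REVENUE_SQL))
# {'2024-02': ((75.0,), (80.0,))}
```

Each entry maps the month to a (base, current) pair, so an unchanged month such as 2024-01 does not appear in the output.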
Verification
Confirm your workflow works:
- Make a small model change locally
- Generate artifacts: dbt build && dbt docs generate
- Upload dev session: recce-cloud upload --type dev
- Verify session appears in Cloud
- Create PR and confirm agent posts summary
Troubleshooting
| Issue | Solution |
|---|---|
| Dev session upload fails | Check artifacts exist in target/; run dbt docs generate |
| Agent doesn't run on PR | Verify CI workflow includes recce-cloud upload |
| Validation results missing | Check warehouse credentials in CI secrets |
| Summary not appearing | Confirm GITHUB_TOKEN has PR write permissions |
Next Steps
- Data Reviewer Workflow - How reviewers use Recce
- Admin Setup - Set up your organization
- PR/MR Data Review - Understanding agent summaries