What is Recce (Data Review Agent)
Know exactly how code changes impact your data.
Recce is a Data Review Agent that automates data validation for pull requests. When you open a PR, it compares your dev environment against production and surfaces schema changes, data diffs, row counts, and downstream impacts. You see what changed, what it affects, and what passed, all before you merge.
No more merging PRs where the pipeline succeeded but the data is quietly wrong.
How Recce Works
When you open a PR with data changes, Recce automatically:
- Runs data diffs: Compares development data against production to validate the change
- Analyzes impact: Identifies what changed down to the column level using Column-Level Lineage (CLL)
- Reviews first: The agent provides a data review summary explaining the change and its impact
- Surfaces what matters: Shows only impacted items, not every downstream table
- Opens exploration: Spins up a Recce instance where you can run additional diffs, explore lineage, and investigate deeper
You review the agent's findings, add notes, and approve with confidence, not blind trust.
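To make the "only impacted items" idea concrete, here is a minimal sketch of how a column-level lineage graph can narrow a change down to the columns it actually reaches. This is not Recce's implementation: the `LINEAGE` mapping, the model and column names, and the `impacted_columns` helper are all hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical column-level lineage: each key is "model.column" and each
# value lists the columns that read from it directly.
LINEAGE = {
    "stg_orders.amount": ["orders.amount"],
    "stg_orders.status": ["orders.status"],
    "orders.amount": ["customers.lifetime_value"],
    "orders.status": [],
    "customers.lifetime_value": ["finance_report.revenue"],
    "finance_report.revenue": [],
}


def impacted_columns(changed, lineage):
    """Return every column reachable downstream of the changed columns."""
    seen = set()
    queue = deque(changed)
    while queue:
        column = queue.popleft()
        for child in lineage.get(column, []):
            if child not in seen:  # guard against revisiting shared children
                seen.add(child)
                queue.append(child)
    return seen


# A change to stg_orders.amount reaches three downstream columns.
# orders.status shares a parent model but is not on the path, so it is skipped.
impacted = impacted_columns(["stg_orders.amount"], LINEAGE)
print(sorted(impacted))
```

A model-level graph would flag everything downstream of `stg_orders`, including `orders.status`. Walking the graph at column level is what keeps the review focused on the items the change can actually affect.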
- PR Created
- Recce Triggered
- Agent Analyzes Production vs. Development Data
- Agent Generates Review Summary
- Human Explores in Recce Instance
- Human Reviews and Approves
- PR Merges
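The "Agent Analyzes Production vs. Development Data" step boils down to comparing the same model in two environments. The sketch below shows the idea for a schema diff and a row count diff. It is an illustration under stated assumptions, not how Recce does it: an in-memory SQLite database stands in for the warehouse, and the `prod_orders` and `dev_orders` tables and the `schema_of` and `diff_tables` helpers are hypothetical.

```python
import sqlite3


def schema_of(conn, table):
    """Map column name -> declared type for one table."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return {row[1]: row[2] for row in rows}


def diff_tables(conn, base, current):
    """Compare schema and row count of a base table against a current table."""
    base_schema = schema_of(conn, base)
    current_schema = schema_of(conn, current)
    base_rows = conn.execute(f"SELECT COUNT(*) FROM {base}").fetchone()[0]
    current_rows = conn.execute(f"SELECT COUNT(*) FROM {current}").fetchone()[0]
    return {
        "added_columns": sorted(set(current_schema) - set(base_schema)),
        "removed_columns": sorted(set(base_schema) - set(current_schema)),
        "type_changes": {
            col: (base_schema[col], current_schema[col])
            for col in base_schema
            if col in current_schema and base_schema[col] != current_schema[col]
        },
        "row_count": {"base": base_rows, "current": current_rows},
    }


# Hypothetical data: "prod_orders" plays production, "dev_orders" plays the PR build.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE prod_orders (id INTEGER, amount REAL, status TEXT);
    CREATE TABLE dev_orders (id INTEGER, amount INTEGER, region TEXT);
    INSERT INTO prod_orders VALUES (1, 10.0, 'paid'), (2, 20.0, 'open'), (3, 5.5, 'paid');
    INSERT INTO dev_orders VALUES (1, 10, 'EU'), (2, 20, 'US');
    """
)

report = diff_tables(conn, "prod_orders", "dev_orders")
print(report)
```

In this toy data the diff reports a dropped `status` column, a new `region` column, a type change on `amount`, and one fewer row. A pipeline run would succeed on all of these, which is why they need to be surfaced for review.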
Figure: Example of a Recce agent summary in a GitHub PR comment.

Automate Agent Data Review with CI/CD
Recce delivers the most value when it runs in CI/CD. Without it, you spend time triaging false alerts caused by source data updates and comparing environments by hand, hoping you caught everything.
With CI/CD:
- Every PR gets automatic validation
- Base and current environments are set up automatically
- Agent reviews before you do
- Checks accumulate as organizational knowledge (preset checks)
When to Use Recce
- Business-critical data: Data that's customer-facing or revenue-impacting
- Team collaboration: When reviewers need to understand impact, not just see code changes
- Standardized validation: When you need consistent review across senior and junior team members
- Unknown unknowns: When you can't predict what might break from a change
When Not to Use
- Teams that accept errors in production and fix them later
- Exploratory analysis that won't go to production
FAQ
Does Recce work without CI/CD? Yes, you can run Recce locally for dev sessions. But CI/CD unlocks the full value: automatic validation on every PR without manual setup.
What data platforms does Recce support? Recce works with data warehouses like Snowflake, BigQuery, Redshift, and Databricks. See Connect to Warehouse for setup.
Related
- Interactive Demo: Try the Data Review Agent
- Tutorial: Get Started with Recce Cloud
- Blog: The Problem with Data PR Reviews: Where Do You Even Start?
