What is Recce (Data Review Agent)
No more merging PRs where the pipeline succeeded but the data is quietly wrong.
Recce is a Data Review Agent that automates data validation for pull requests. When you open a PR, it compares your dev environment against production and surfaces schema changes, data diffs, row counts, and downstream impacts. You see what changed, what it affects, and what passed, all before you merge.
Recce is the product; the agent is the part of it that automates validation on your PRs. You can run Recce through Cloud (hosted, automated) or open source (local, manual).
- Get Started with Cloud
- Set Up Open Source
How Recce Works
When you open a PR with data changes, Recce automatically:
- Runs data diffing: Compares your dev environment against production to validate data changes
- Analyzes impact: Identifies what changed down to the column level using Column-Level Lineage (CLL)
- Reviews first: The agent provides a data review summary explaining the change and its impact
- Surfaces what matters: Shows only impacted items, not every downstream table
- Opens exploration: Spins up a Recce instance where you can run additional diffs, explore lineage, and investigate deeper
You review the agent's findings, add notes, and approve with confidence, not blind trust.
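As a rough illustration of what a data diff between two environments involves, here is a minimal sketch of a schema diff and a row count comparison. It is not Recce's implementation: it uses an in-memory SQLite database, and the table names `prod_orders` and `dev_orders` and all helper functions are hypothetical stand-ins for a production and a development copy of the same model.

```python
import sqlite3


def table_schema(conn, table):
    """Return {column_name: declared_type} using SQLite's table_info pragma."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return {row[1]: row[2] for row in rows}


def schema_diff(base, current):
    """Compare two schemas and report added, removed, and retyped columns."""
    added = {c: t for c, t in current.items() if c not in base}
    removed = {c: t for c, t in base.items() if c not in current}
    retyped = {
        c: (base[c], current[c])
        for c in base
        if c in current and base[c] != current[c]
    }
    return {"added": added, "removed": removed, "retyped": retyped}


def row_count(conn, table):
    """Count the rows in a table."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def build_demo():
    """Create hypothetical 'production' and 'development' tables for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE prod_orders (id INTEGER, amount REAL, status TEXT);
        CREATE TABLE dev_orders (id INTEGER, amount REAL, status TEXT, region TEXT);
        INSERT INTO prod_orders VALUES
            (1, 10.0, 'paid'), (2, 20.0, 'paid'), (3, 5.0, 'refunded');
        INSERT INTO dev_orders VALUES
            (1, 10.0, 'paid', 'EU'), (2, 20.0, 'paid', 'US');
        """
    )
    return conn


conn = build_demo()
diff = schema_diff(table_schema(conn, "prod_orders"), table_schema(conn, "dev_orders"))
counts = {
    "base": row_count(conn, "prod_orders"),
    "current": row_count(conn, "dev_orders"),
}
print(diff)
print(counts)
```

In this toy data the development table gains a `region` column and loses a row, which is the kind of change a successful pipeline run would not flag on its own.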
- PR Created
- Recce Triggered
- Agent Analyzes Production vs. Development Data
- Agent Generates Review Summary
- Human Explores in Recce Instance
- Human Reviews and Approves
- PR Merges
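The idea of showing only impacted items, rather than every downstream table, can be sketched as a graph traversal over column-level lineage. The example below is an assumption-laden illustration, not how Recce computes impact: the `LINEAGE` mapping, the model and column names, and the `impacted_columns` function are all hypothetical.

```python
from collections import deque

# Hypothetical column-level lineage: each upstream column maps to the
# downstream columns derived from it.
LINEAGE = {
    "stg_orders.amount": ["orders.total_amount"],
    "stg_orders.status": ["orders.is_paid"],
    "orders.total_amount": ["customers.lifetime_value"],
    "orders.is_paid": [],
    "customers.lifetime_value": ["finance_report.revenue"],
}


def impacted_columns(lineage, changed):
    """Breadth-first walk from the changed columns to everything downstream."""
    seen = set()
    queue = deque(changed)
    while queue:
        column = queue.popleft()
        for child in lineage.get(column, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


impacted = impacted_columns(LINEAGE, ["stg_orders.amount"])
print(sorted(impacted))
```

Here a change to `stg_orders.amount` reaches three downstream columns, while `orders.is_paid` is left out because it depends only on `stg_orders.status`. A table-level view would have marked all of `orders` as affected.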
Figure: Example of a Recce agent summary in a GitHub PR comment.

Automate Agent Data Review with CI/CD
Recce delivers the most value when it is integrated with CI/CD. Without that integration, you waste time triaging false alerts caused by source data updates and comparing environments by hand, hoping you caught everything.
With CI/CD:
- Every PR gets automatic validation
- Base and current environments are set up automatically
- Agent reviews before you do
- Checks accumulate as organizational knowledge (preset checks)
When to Use Recce
- Business-critical data: Data that's customer-facing or revenue-impacting
- Team collaboration: When reviewers need to understand impact, not just see code changes
- Standardized validation: When you need consistent pull request review across senior and junior team members
- Unknown unknowns: When you can't predict what might break from a change
When Not to Use
- Teams that accept errors in production and fix them later
- Exploratory analysis that won't go to production
FAQ
Does Recce work without CI/CD?
Yes, you can run Recce locally for dev sessions. But CI/CD unlocks the full value: automatic validation on every PR without manual setup.
What data platforms does Recce support?
Recce works with data warehouses like Snowflake, BigQuery, Redshift, and Databricks. See Connect to Warehouse for setup.
Next Steps
- Interactive Demo: Try the Data Review Agent
- Tutorial: Get Started with Recce Cloud
- Blog: The Problem with Data PR Reviews: Where Do You Even Start?
