Recce MCP Server
When data models change, downstream dashboards and reports can break without warning. The Recce MCP server lets your AI agent validate those changes before they reach production — directly from your editor, through natural language.
MCP (Model Context Protocol) is an open standard that lets AI assistants call external tools. Recce implements an MCP server so your AI agent can run data diffs against your warehouse on your behalf.
Unlike general-purpose database tools, Recce's MCP server is purpose-built for branch comparison. It reads dbt artifacts (manifest.json, catalog.json) to understand your model graph, so your AI agent can reason about lineage, column-level changes, and statistical differences — not just raw SQL.
Claude Code users: skip to the easy path
The Recce Claude Plugin handles all setup automatically — prerequisites, artifact generation, and server startup — in two commands. If you use Claude Code, start there.
What you can do
Once connected, ask your AI agent questions like:
- "What schema changes happened in this branch?"
- "Show me the Row Count Diff for all modified models"
- "Are there any breaking column changes in this PR?"
- "Profile the orders table and compare it against production"
- "Which downstream columns are affected by this change?"
- "Run a Value Diff on the orders model and show me which columns changed"
- "Run a custom SQL query against both dev and prod and show the differences"
Your agent translates these into the appropriate Recce tool calls and returns the results directly in your conversation.
How it works
Recce compares your current branch against a baseline from your main branch. It works from two sets of dbt artifacts: one representing your current work and one representing your base branch. The MCP server reads both artifact sets and runs diffs against your warehouse when your AI agent requests them.
Prerequisites
Before starting the MCP server, you need dbt artifacts for your current branch. Base artifacts are recommended for full diffing but not required.
Generate development artifacts
Run dbt in your current working branch:
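A minimal sketch of this step, assuming dbt and your adapter are installed and profiles.yml is configured. `generate_dev_artifacts` is a hypothetical helper name, not part of Recce or dbt; it wraps `dbt docs generate` (the command named in Troubleshooting) and confirms that both files were written:

```shell
# Hypothetical helper: build artifacts for the current branch and verify them.
generate_dev_artifacts() {
  # `dbt docs generate` writes manifest.json and catalog.json
  # into target/ by default.
  dbt docs generate || return 1
  for f in target/manifest.json target/catalog.json; do
    [ -f "$f" ] || { echo "missing: $f" >&2; return 1; }
  done
  echo "dev artifacts ready in target/"
}
```

Call `generate_dev_artifacts` from the root of your dbt project.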
This creates target/manifest.json and target/catalog.json.
Generate base artifacts
Generate artifacts from your base branch to a separate directory:
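One way to do this is sketched here, assuming a clean working tree and a base branch named `main`. `generate_base_artifacts` is a hypothetical helper name; `--target-path` is a standard dbt flag that redirects output away from target/. If your base environment uses a different dbt target, you may also need `--target <name>`, which depends on your profiles.yml:

```shell
# Hypothetical helper: build base artifacts into target-base/.
# Assumes local changes are committed or stashed before it runs.
generate_base_artifacts() {
  base_branch="${1:-main}"
  git checkout "$base_branch" || return 1
  # Write artifacts to target-base/ instead of target/.
  dbt docs generate --target-path target-base
  rc=$?
  # Always return to the branch you started on, even if dbt failed.
  git checkout - || return 1
  return "$rc"
}
```

Call `generate_base_artifacts` with no argument to use `main`, or pass another branch name such as `generate_base_artifacts develop`.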
This creates target-base/manifest.json and target-base/catalog.json. The MCP server compares these two artifact sets to produce diffs.
Note
If target-base/ is missing, the MCP server starts in single-environment mode — all tools remain available, but diff results show no changes because both sides reference the same data. Generate base artifacts to enable real comparisons.
Installation
Install Recce with the MCP extra dependency:
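A hedged sketch of the install step. The extra name `mcp` is an assumption based on the sentence above, so check the Recce installation docs if pip warns that the extra is unknown. `install_recce_mcp` is a hypothetical helper name:

```shell
# Hypothetical helper: install Recce with its MCP extra, then confirm the CLI.
install_recce_mcp() {
  # Quotes keep the shell from interpreting the square brackets.
  pip install 'recce[mcp]' || return 1
  # Confirm the recce command is reachable on PATH.
  command -v recce >/dev/null || { echo "recce not found on PATH" >&2; return 1; }
}
```

Run it inside the same virtual environment that holds your dbt adapter so that both tools share one Python installation.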
Recce works with all major dbt adapters, including Snowflake, BigQuery, Redshift, Databricks, DuckDB, and others.
Configuration
Follow the instructions for your AI agent. stdio is simpler (no separate process to manage) and works for most setups. Use SSE only if you need to share a single Recce server across multiple tools simultaneously.
Option A: Recce plugin (recommended)
The Recce Claude Plugin provides guided setup, handles prerequisite checks, generates artifacts, and starts the MCP server — all through interactive commands.
```
/plugin marketplace add DataRecce/recce-claude-plugin
/plugin install recce-quickstart@recce-claude-plugin
/recce-setup
```
See the Claude Plugin guide for full details.
Option B: Stdio
Configure Recce as an MCP server with stdio transport. Claude Code automatically launches the server when you start a session.
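A minimal sketch of the registration, assuming the Claude Code CLI (`claude`) is on your PATH and that `recce mcp-server` with no transport flag runs in stdio mode. `register_recce_stdio` is a hypothetical helper name:

```shell
# Hypothetical helper: register Recce with Claude Code over stdio.
register_recce_stdio() {
  # Everything after `--` is the command Claude Code launches per session.
  claude mcp add recce -- recce mcp-server
}
```

Run it from the root of your dbt project so the server can find target/ and target-base/.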
Then start Claude Code and verify the connection:
```
> /mcp
╭────────────────────────────────────────────────────────────╮
│ Manage MCP servers │
│ │
│ ❯ 1. recce ✔ connected · Enter to view details │
```
Option C: SSE
Launch a standalone MCP server that Claude Code connects to via HTTP. Use this if you want to keep the server running independently or share it across tools.
Start the server in a separate terminal:
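A sketch that uses only the flags shown in Troubleshooting (`--sse` and `--port`). The default port of 8000 is inferred from that section, and `start_recce_sse` is a hypothetical helper name:

```shell
# Hypothetical helper: run a standalone Recce MCP server over SSE.
start_recce_sse() {
  port="${1:-8000}"
  # Runs in the foreground; stop it with Ctrl+C.
  recce mcp-server --sse --port "$port"
}
```

Call `start_recce_sse` for port 8000, or `start_recce_sse 8001` if that port is taken.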
In another terminal, configure Claude Code:
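A hedged sketch of the client side. The `/sse` endpoint path and the `localhost` host are assumptions, so confirm the URL that the server prints on startup. `register_recce_sse` is a hypothetical helper name:

```shell
# Hypothetical helper: point Claude Code at an already-running SSE server.
register_recce_sse() {
  port="${1:-8000}"
  # Assumed endpoint: http://localhost:<port>/sse
  claude mcp add --transport sse recce "http://localhost:${port}/sse"
}
```

Pass the same port you used when starting the server, for example `register_recce_sse 8001`.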
Cursor: add to .cursor/mcp.json in your dbt project:
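A sketch of a stdio entry, assuming the common `mcpServers` layout used by MCP clients and that `recce mcp-server` is the launch command. `write_cursor_config` is a hypothetical helper that refuses to overwrite an existing file:

```shell
# Hypothetical helper: write .cursor/mcp.json unless one already exists.
write_cursor_config() {
  project_dir="${1:-.}"
  config="$project_dir/.cursor/mcp.json"
  if [ -e "$config" ]; then
    echo "$config already exists; merge the recce entry by hand" >&2
    return 1
  fi
  mkdir -p "$project_dir/.cursor"
  # Quoted heredoc delimiter: the JSON is written exactly as shown.
  cat > "$config" <<'EOF'
{
  "mcpServers": {
    "recce": {
      "command": "recce",
      "args": ["mcp-server"]
    }
  }
}
EOF
  echo "wrote $config"
}
```

Call `write_cursor_config` from your dbt project root, or pass the project path as the first argument.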
Or use SSE mode:
Windsurf: add to ~/.codeium/windsurf/mcp_config.json:
Other clients: any MCP-compatible client can use stdio transport:
Available tools
The MCP server exposes these tools to your AI agent. Tools are grouped by availability — some work in all modes, while diff tools require a running server with warehouse access.
Metadata and lineage tools
These tools are always available because they only require dbt artifacts and do not query your data warehouse:
| Tool | Description |
|---|---|
| lineage_diff | Compare data lineage between base and current branches. Returns nodes with change status and impact analysis |
| schema_diff | Detect schema changes (added, removed, or modified columns and type changes) |
| get_model | Get column details (names, types, constraints) for a model from both base and current branches |
| get_cll | Get Column-Level Lineage (CLL): trace which downstream columns are affected by changes |
| select_nodes | Resolve dbt selector expressions to node IDs. Useful for planning before running diffs |
| get_server_info | Get server context including adapter type, git branch, and supported tools |
Diff tools
These tools query your data warehouse and require an active warehouse connection:
| Tool | Description |
|---|---|
| row_count_diff | Compare row counts between branches for specified models |
| profile_diff | Statistical profiling comparison (min, max, avg, distinct count, nulls, and more) |
| value_diff | Compare row-level values using primary key join. Returns per-column match rates |
| value_diff_detail | Get detailed row-level diff showing actual changed, added, and removed values |
| top_k_diff | Compare top-K categorical value distributions between branches |
| histogram_diff | Compare numeric or datetime column distributions as histograms |
| query | Run arbitrary SQL against your data warehouse (supports Jinja and dbt macros) |
| query_diff | Run the same SQL against both branches and compare results |
Check management tools
These tools manage validation checks stored in the running Recce server instance (checks persist for the life of the server process):
| Tool | Description |
|---|---|
| list_checks | List all validation checks with their status and approval state |
| run_check | Run a specific validation check by ID |
| create_check | Create a persistent checklist item from analysis findings. Idempotent: updates existing checks with matching type and parameters |
Checks can also be configured as preset checks in recce.yml. See Preset checks for details.
Note
If base artifacts (target-base/) are not present, the server starts in single-environment mode — all tools remain available, but diff results show no changes. Generate base artifacts to enable real comparisons.
How agents use these tools
The metadata and diff tools work together in a structured validation workflow. A well-configured AI agent follows this pattern:
1. Understand the change
The agent starts with metadata tools to build context before querying the data warehouse:
- get_server_info: confirms the connection is ready and which tools are available
- lineage_diff: identifies which models changed and which downstream models are impacted
- select_nodes: resolves dbt selectors (like state:modified+) to specific node IDs for targeted analysis
- get_model: inspects column details of individual models before diffing
- get_cll: traces Column-Level Lineage to understand which downstream columns are affected
This planning phase helps the agent skip irrelevant models and focus warehouse queries on what matters.
2. Validate the data
With a clear picture of what changed, the agent runs diff tools to validate the data:
- schema_diff: detects structural changes (added, removed, or type-changed columns)
- row_count_diff: checks for unexpected volume changes
- profile_diff: compares statistical profiles (min, max, avg, distinct count, nulls)
- value_diff / value_diff_detail: compares actual row-level values using primary keys
- top_k_diff / histogram_diff: detects distribution shifts in categorical or numeric columns
- query / query_diff: runs custom SQL for cases not covered by built-in diffs
3. Persist findings as checks
After analysis, the agent calls create_check to save important findings as persistent checklist items. Each check runs automatically to produce verifiable evidence. These checks appear in Recce's validation checklist and PR comments, so reviewers can verify the results independently.
The agent can also use list_checks and run_check to work with existing preset checks configured in recce.yml.
Why metadata tools matter
Without select_nodes and get_cll, an agent would guess which models to validate or diff every model in the project. Metadata tools let the agent focus on what actually changed and what is impacted — reducing warehouse costs and response time.
Troubleshooting
MCP server fails to start
The most common cause is missing dbt artifacts. Check that your dbt artifacts exist:
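A small sketch of such a check, run from the dbt project root. `check_artifacts` is a hypothetical helper name. Treating both files in target/ as required is an assumption drawn from the Prerequisites section; the target-base/ files are reported but do not cause a failure:

```shell
# Hypothetical helper: report which dbt artifacts are present.
check_artifacts() {
  rc=0
  # Current-branch artifacts: needed before the server can start.
  for f in target/manifest.json target/catalog.json; do
    if [ -f "$f" ]; then echo "ok: $f"; else echo "MISSING (required): $f"; rc=1; fi
  done
  # Base artifacts: optional, but needed for real comparisons.
  for f in target-base/manifest.json target-base/catalog.json; do
    if [ -f "$f" ]; then echo "ok: $f"; else echo "missing (optional): $f"; fi
  done
  return "$rc"
}
```

A non-zero exit status means at least one current-branch artifact is missing.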
If missing, run dbt docs generate in your current branch. See Prerequisites.
Diff results show no changes
If the server starts but all diffs return empty results, you are likely in single-environment mode (missing base artifacts). Follow the Generate base artifacts steps to enable real comparisons.
Port already in use (SSE mode)
```shell
# Check what's using port 8000
lsof -i :8000
# Use a different port
recce mcp-server --sse --port 8001
```
Warehouse connection errors
The MCP server uses your profiles.yml to connect to your warehouse. Verify your connection:
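One way to verify it is dbt's built-in `dbt debug` command, which reads profiles.yml and attempts a connection. The sketch below assumes dbt is installed; `check_warehouse` is a hypothetical helper name that optionally checks a named target:

```shell
# Hypothetical helper: test the warehouse connection through dbt.
check_warehouse() {
  if [ -n "${1:-}" ]; then
    # Check a specific target from profiles.yml, for example "prod".
    dbt debug --target "$1"
  else
    # Check the default target.
    dbt debug
  fi
}
```

Call `check_warehouse` for the default target, or `check_warehouse prod` to check another target defined in your profile.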
Prefer guided setup over manual configuration
If you're using Claude Code and running into issues, the Recce Claude Plugin handles prerequisite checks and provides actionable error messages. The install and setup commands are listed under Option A: Recce plugin.
See the Claude Plugin guide for full setup instructions.
FAQ
"How do I validate data changes in my PR using an AI agent?"
Connect Recce's MCP server to your AI agent (Claude Code, Cursor, or Windsurf), then ask questions in natural language. Your agent calls the appropriate validation tools and returns the results.
"Which dbt adapters work with Recce MCP?"
Recce works with all major dbt adapters: Snowflake, BigQuery, Redshift, Databricks, DuckDB, and others.
"Do I need Recce Cloud to use the MCP server?"
No. The MCP server is part of Recce OSS and free to use. Recce Cloud adds automated PR review, team collaboration, and persistent validation history.
"What is MCP and how does Recce use it?"
MCP (Model Context Protocol) is an open standard that allows AI agents to call external tools. Recce implements an MCP server so AI agents can run data diffs against your warehouse on demand.
Next Steps
- Recce Claude Plugin: guided setup for Claude Code users with interactive commands
- Column-Level Lineage: trace how column changes propagate through your model graph
- Row Count Diff: understand row count validation
- Profile Diff: statistical profiling comparisons
- Value Diff: row-by-row data validation
- CI/CD Setup: automate validation in your workflow