Understanding Lineage Diff

The Lineage view is Recce's main interface for visualizing and analyzing how your dbt model changes impact your data pipeline. It shows you the potential area of impact from your modifications, helping you determine which models need further investigation and validation.

What is Data Lineage?

Data lineage tracks the flow and transformation of data through your dbt project. In Recce, the lineage graph shows:

  • Dependencies: Which models depend on others
  • Change Impact: How modifications ripple through your pipeline
  • Data Flow: The path data takes from sources to final outputs

Viewing the Lineage Graph

From the Lineage view, you can determine which models to investigate further and perform various data validation checks that serve as proof-of-correctness of your work.

Interactive lineage graph showing modified models

Getting Started

When you first open Recce, the lineage graph loads automatically and shows only the models affected by your changes. This focused view helps you quickly understand the impact of your work.

Filter Nodes

In the top control bar, you can change the rules used to filter nodes:

  1. Mode:
       • Changed Models: Show modified nodes, their downstream nodes, and their first-degree parents.
       • All: Show all nodes.
  2. Package: Filter by dbt package names.
  3. Select: Include nodes that match a node selection.
  4. Exclude: Exclude nodes that match a node selection.
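To make the Changed Models rule concrete, here is a minimal sketch of how such a set could be computed over a small dependency graph. It is not Recce's implementation. The function name changed_models_view, the edge list, and the model names are hypothetical, and the sketch assumes "first-degree parents" means the direct parents of the modified nodes only.

```python
from collections import defaultdict


def changed_models_view(edges, modified):
    """Return the nodes a 'Changed Models'-style filter would show.

    edges: iterable of (parent, child) pairs describing the DAG.
    modified: set of node names that were modified.
    """
    children = defaultdict(set)
    parents = defaultdict(set)
    for parent, child in edges:
        children[parent].add(child)
        parents[child].add(parent)

    visible = set(modified)

    # Walk downstream from every modified node.
    stack = list(modified)
    while stack:
        node = stack.pop()
        for child in children[node]:
            if child not in visible:
                visible.add(child)
                stack.append(child)

    # Add only the direct (first-degree) parents of the modified nodes.
    for node in modified:
        visible |= parents[node]
    return visible


# Hypothetical project: raw tables feed staging models, which feed marts.
edges = [
    ("raw_orders", "stg_orders"),
    ("raw_customers", "stg_customers"),
    ("stg_orders", "orders"),
    ("stg_customers", "customers"),
    ("orders", "customers"),
    ("customers", "customer_segments"),
]
print(sorted(changed_models_view(edges, {"orders"})))
```

With orders as the only modified model, the sketch keeps orders, its downstream models customers and customer_segments, and its direct parent stg_orders. The raw tables and stg_customers are filtered out.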

Select Nodes

Click a node to select it, or click the Select nodes button in the top-right corner to select multiple nodes for further operations. For details, see the Multi Nodes Selections section.

Row Count Diff

You can run a row count diff on the nodes chosen with the Select and Exclude options.

After selecting nodes, run the row count diff by:

  1. Clicking the 3 dots (...) button in the top-right corner.
  2. Clicking Row Count Diff by Selector.

Understanding Model Nodes

Visual Status Indicators

Example model node with status indicators

Models in the lineage graph are color-coded to indicate their status:

  • Green: Added models (new to your project)
  • Red: Removed models (deleted from your project)
  • Orange: Modified models (changed code or configuration)
  • Gray: Unchanged models (shown for context)
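The four statuses can be thought of as the result of comparing the set of models in the base environment with the set in the current environment. The sketch below illustrates that idea only. The function classify_nodes, the fingerprint strings, and the model names are hypothetical, and it assumes each model can be reduced to a single comparable fingerprint (for example, a checksum of its code and configuration).

```python
def classify_nodes(base, current):
    """Map each model name to 'added', 'removed', 'modified', or 'unchanged'.

    base / current: dicts of model name -> fingerprint (hypothetical values).
    """
    status = {}
    for name in base.keys() | current.keys():
        if name not in base:
            status[name] = "added"
        elif name not in current:
            status[name] = "removed"
        elif base[name] != current[name]:
            status[name] = "modified"
        else:
            status[name] = "unchanged"
    return status


# Colors as listed in the legend above.
STATUS_COLORS = {
    "added": "green",
    "removed": "red",
    "modified": "orange",
    "unchanged": "gray",
}

base = {"orders": "a1", "customers": "b2", "legacy_report": "c3"}
current = {"orders": "a1", "customers": "b9", "customer_segments": "d4"}

for name, state in sorted(classify_nodes(base, current).items()):
    print(f"{name}: {state} ({STATUS_COLORS[state]})")
```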

Change Detection Icons

Each model node displays two icons in the bottom-right corner that indicate detected changes:

  • Row Count Icon: Shows when row count differences are detected
  • Schema Icon: Shows when column or data type changes are detected

Grayed-out icons indicate no changes were detected in that category.

Model with Schema Change detected

Row Count Detection

The row count icon only appears after you've run a row count diff on that specific model. This helps you track which models you've already validated.

Open the node details panel

Investigating Model Changes

Opening the Node Details Panel

Click on any model in the lineage graph to open the node details panel. This is your starting point for deeper analysis.

Schema Diff

Schema diff helps you understand structural changes to your models.

Requirements

Schema diff requires catalog.json files in both your base and current environments. Make sure to run dbt docs generate in both environments before starting your Recce session.
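A quick pre-flight check can confirm that both catalog files exist before you start a session. The sketch below is one way to do that. The directory names target-base and target are assumptions about where your base and current artifacts live, and missing_catalogs is a hypothetical helper, so adjust both to match your project.

```python
from pathlib import Path


def missing_catalogs(base_dir="target-base", current_dir="target"):
    """Return the catalog.json paths that do not exist yet.

    The default directory names are assumptions; pass your own if they differ.
    """
    paths = [Path(base_dir) / "catalog.json", Path(current_dir) / "catalog.json"]
    return [str(p) for p in paths if not p.is_file()]


missing = missing_catalogs()
if missing:
    print("Run `dbt docs generate` for:", ", ".join(missing))
else:
    print("Both catalog.json files are present.")
```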

Viewing Schema Changes

Click on a model to view its schema diff in the node details panel.

Interactive schema diff showing column changes

Types of Schema Changes

Schema diff identifies:

  • Added columns: New fields in your model (shown in green)
  • Removed columns: Fields that no longer exist (shown in red)
  • Renamed columns: Fields that have changed names (shown with arrows)
  • Data type changes: Modifications to column types
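The categories above can be illustrated by comparing two column-to-type mappings. This is a minimal sketch, not Recce's algorithm: schema_diff and the sample columns are hypothetical, and it compares columns by name only. Under that assumption a renamed column shows up as one removed column plus one added column. How Recce itself recognizes renames is not described in this section.

```python
def schema_diff(base_columns, current_columns):
    """Compare two {column_name: data_type} mappings by column name."""
    added = {c: t for c, t in current_columns.items() if c not in base_columns}
    removed = {c: t for c, t in base_columns.items() if c not in current_columns}
    type_changed = {
        c: (base_columns[c], current_columns[c])
        for c in base_columns
        if c in current_columns and base_columns[c] != current_columns[c]
    }
    return {"added": added, "removed": removed, "type_changed": type_changed}


# Hypothetical schemas for the same model in two environments.
base_columns = {"customer_id": "INTEGER", "lifetime_value": "INTEGER", "first_order": "DATE"}
current_columns = {"customer_id": "INTEGER", "lifetime_value": "DOUBLE", "net_lifetime_value": "DOUBLE"}

diff = schema_diff(base_columns, current_columns)
print("Added:", diff["added"])
print("Removed:", diff["removed"])
print("Type changed:", diff["type_changed"])
```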

Schema diff showing renamed column

Code Diff

Understanding the code changes helps you analyze the root cause of data differences.

From any model's node details panel, you can view the exact code changes that were made. This helps you understand:

  • What SQL logic was modified
  • How transformations changed
  • Why data differences might be occurring

Learn more about viewing and analyzing code changes in the Code Diff guide.

Node Details

Node Details Overview

The node details panel provides comprehensive information about the selected model:

Node details panel with exploration options

From this panel, you can:

  • View model information: Node type, materialization, and basic metadata
  • Examine changes: See what specifically changed in the model
  • Run validations: Execute pre-built data diffs and custom queries
  • Add to checklist: Document important findings for review

Available Data Validation Checks

Click the "Explore Change" button to access pre-built validation checks, which save you the time of writing the SQL yourself:

  1. Row Count Diff: Compare the number of rows between environments
  2. Profile Diff: Analyze column-level statistics and distributions
  3. Value Diff: Identify specific value changes between datasets
  4. Top-K Diff: Compare the most common values in your data
  5. Histogram Diff: Visualize data distribution changes
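To give a feel for what two of these checks compare, the sketch below computes a row count diff and a top-K diff over small in-memory samples. It is an illustration under stated assumptions, not how Recce runs the checks: the functions row_count_diff and top_k_diff and the sample status values are hypothetical, and real checks run against your warehouse rather than Python lists.

```python
from collections import Counter


def row_count_diff(base_rows, current_rows):
    """Compare the number of rows in the two environments."""
    base, current = len(base_rows), len(current_rows)
    return {"base": base, "current": current, "delta": current - base}


def top_k_diff(base_values, current_values, k=3):
    """List the k most frequent values overall with their count per environment."""
    base_counts = Counter(base_values)
    current_counts = Counter(current_values)
    # Rank values by their combined frequency across both environments.
    combined = base_counts + current_counts
    return [
        (value, base_counts[value], current_counts[value])
        for value, _ in combined.most_common(k)
    ]


# Hypothetical order statuses in each environment.
base_status = ["completed"] * 5 + ["shipped"] * 3 + ["returned"] * 1
current_status = ["completed"] * 6 + ["shipped"] * 2 + ["placed"] * 2

print(row_count_diff(base_status, current_status))
for value, base_n, current_n in top_k_diff(base_status, current_status):
    print(f"{value}: base={base_n}, current={current_n}")
```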

Custom Query Analysis

Click "Query" to open the query interface where you can:

  • Write custom SQL to investigate changes
  • Run ad-hoc comparisons between environments
  • Validate specific business logic or data quality rules
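As an example of the kind of ad-hoc comparison you might write, the sketch below finds rows that exist in only one of two tables. It uses an in-memory SQLite database as a stand-in for your warehouse so that it runs anywhere. The table names base_orders and current_orders, their columns, and the helper changed_rows are hypothetical, and the SQL you write in the query interface will depend on your own models and warehouse dialect.

```python
import sqlite3

# In-memory database standing in for the two environments (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE base_orders (order_id INTEGER, amount REAL);
    CREATE TABLE current_orders (order_id INTEGER, amount REAL);
    INSERT INTO base_orders VALUES (1, 10.0), (2, 20.0), (3, 30.0);
    INSERT INTO current_orders VALUES (1, 10.0), (2, 25.0), (4, 40.0);
    """
)


def changed_rows(conn):
    """Return rows present in only one of the two tables, labeled by side."""
    query = """
        SELECT 'only_in_base' AS side, order_id, amount FROM (
            SELECT order_id, amount FROM base_orders
            EXCEPT
            SELECT order_id, amount FROM current_orders
        )
        UNION ALL
        SELECT 'only_in_current' AS side, order_id, amount FROM (
            SELECT order_id, amount FROM current_orders
            EXCEPT
            SELECT order_id, amount FROM base_orders
        )
        ORDER BY side, order_id
    """
    return conn.execute(query).fetchall()


for side, order_id, amount in changed_rows(conn):
    print(side, order_id, amount)
```

Order 1 is identical in both tables, so it does not appear. Order 2 appears on both sides because its amount changed, order 3 exists only in the base table, and order 4 exists only in the current table.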

Building Your Validation Checklist

As you investigate changes, you can add important findings to your checklist for documentation and collaboration purposes.

Collaboration Best Practice

Use the checklist feature to document your validation process. This creates a clear record of what you've tested and verified, making it easier for teammates to review your changes.

Next Steps

After reviewing the lineage changes:

  1. Validate: Run data diffs on critical models to verify changes are correct
  2. Document: Add key findings to your checklist with clear descriptions
  3. Collaborate: Share your analysis with team members for review
  4. Integrate: Use Recce's workflow integration to automate validation in your CI/CD process

Ready to dive deeper into specific validation techniques? Explore the Data Diffing section to learn about different ways to validate your changes.