Setup CI
Automatically validate your data changes in every merge request using Recce Cloud. Catch data issues before they reach production, with validation results right in your MR.
Purpose
Automated MR Validation prevents data regressions before merge.
- Triggers: MR opened/updated against main
- Action: Auto-update Recce session for MR validation
- Benefit: Automated data validation and comparison
Prerequisites
You need manifest.json and catalog.json files (dbt artifacts) for Recce Cloud. See Start Free with Cloud for instructions on preparing these files.
Implementation
1. Core Workflow
Add to your .gitlab-ci.yml:
include:
  - component: gitlab.com/recce/recce-cloud-cicd-component/[email protected]
    inputs:
      stage: test

stages:
  - build
  - test

variables:
  DBT_TARGET: "ci"

dbt-build:
  stage: build
  image: python:3.11-slim
  script:
    - pip install -r requirements.txt
    # Install dbt packages
    - dbt deps
    # Optional: Build tables to ensure they're materialized
    # - dbt build --target $DBT_TARGET
    # Required: Generate artifacts for comparison
    - dbt docs generate --target $DBT_TARGET
  artifacts:
    paths:
      - target/
    expire_in: 1 week
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
The included Recce Cloud component automatically:
- Creates a session in Recce Cloud for the merge request
- Uploads your dbt artifacts (manifest.json and catalog.json)
- Provides session URL for validation review
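The dbt-build job above runs for every merge request pipeline, whatever the target branch. If validation should apply only to MRs targeting main, as described under Purpose, the rule could be narrowed roughly as follows. This is a sketch rather than part of the Recce component: the job name example-job is hypothetical, while CI_PIPELINE_SOURCE and CI_MERGE_REQUEST_TARGET_BRANCH_NAME are GitLab predefined variables.

```yaml
# Hypothetical job name; only the rules clause matters here.
example-job:
  script:
    - echo "Runs only for merge requests targeting main"
  rules:
    # Merge request pipelines run when an MR is opened and when new commits
    # are pushed to its source branch; the second condition filters by target.
    - if: $CI_PIPELINE_SOURCE == "merge_request_event" && $CI_MERGE_REQUEST_TARGET_BRANCH_NAME == "main"
```

The same rules clause can replace the one in dbt-build if only main-targeted MRs should produce artifacts.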
2. Component Configuration Options
The component accepts optional inputs for customization:
include:
  - component: gitlab.com/recce/recce-cloud-cicd-component/[email protected]
    inputs:
      stage: test                          # Pipeline stage (default: test)
      dbt_target_path: target              # Path to dbt artifacts (default: target)
      base_branch: main                    # Base branch for comparison (default: main)
      gitlab_token: $CUSTOM_GITLAB_TOKEN   # Custom GitLab token (default: $CI_JOB_TOKEN)
Default Configuration (shown in example above):
- Component uses $CI_JOB_TOKEN automatically (no manual token setup required)
- Uploads from target/ directory by default
- Compares against main branch
Custom Token (optional):
If you need to use a custom GitLab token instead of the default $CI_JOB_TOKEN:
- Create a Project Access Token with api scope
- Add it as a CI/CD variable in your project
- Reference it in the component inputs:
include:
  - component: gitlab.com/recce/recce-cloud-cicd-component/[email protected]
    inputs:
      gitlab_token: $CUSTOM_GITLAB_TOKEN
3. Artifact Preparation Options
Default: Fresh Build (shown in example above)
- dbt docs generate is required and provides all needed artifacts
- dbt build is optional but ensures tables are materialized and updated
Alternative Methods:
- External Download: Download from dbt Cloud, Paradime, or other platforms
- Pipeline Integration: Use existing dbt build workflows
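As an illustration of the External Download option, the job below fetches pre-built artifacts instead of running dbt. It is a minimal sketch under stated assumptions: the job name fetch-dbt-artifacts and the CI/CD variable ARTIFACTS_BASE_URL are hypothetical, and the real download step depends on how your platform (dbt Cloud, Paradime, or other) exposes manifest.json and catalog.json, including any authentication it requires.

```yaml
# Sketch only: ARTIFACTS_BASE_URL is a hypothetical CI/CD variable pointing at
# a location that serves manifest.json and catalog.json. Add whatever
# authentication your platform requires.
fetch-dbt-artifacts:
  stage: build
  image: alpine:3.20
  script:
    - apk add --no-cache curl
    - mkdir -p target
    # --fail makes curl exit non-zero on HTTP errors so the job fails visibly
    - curl --fail --silent --show-error --location --output target/manifest.json "$ARTIFACTS_BASE_URL/manifest.json"
    - curl --fail --silent --show-error --location --output target/catalog.json "$ARTIFACTS_BASE_URL/catalog.json"
  artifacts:
    paths:
      - target/
    expire_in: 1 week
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
```

The job publishes the same target/ directory as the fresh-build approach, so the component's default dbt_target_path should still apply.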
4. Verification
Test with an MR
- Create a test MR with small data changes
- Check CI/CD → Pipelines for workflow execution
- Verify validation runs successfully
Verify Success
- ✅ Pipeline completes without errors in CI/CD → Pipelines
- ✅ MR session updated in Recce Cloud
- ✅ Session URL appears in pipeline job output
Review MR Session
To analyze the MR changes in detail:
- Go to your Recce Cloud
- Find the MR session that was created
- Launch Recce instance to explore data differences
Alternatively, open the session launch URL printed in the pipeline job output.
Troubleshooting
Missing dbt files
Error: Missing manifest.json or Missing catalog.json
Solution: Ensure dbt docs generate runs successfully before the Recce component:
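One way to surface this problem in the build job itself, rather than later in the Recce component, is to check for both files right after generating them. The sketch below reuses the dbt-build job from this guide; the two test -f lines are an addition for illustration and are not required by Recce.

```yaml
dbt-build:
  stage: build            # runs before the component's stage (test)
  image: python:3.11-slim
  script:
    - pip install -r requirements.txt
    - dbt deps
    - dbt docs generate --target $DBT_TARGET
    # Fail the job early if either artifact was not produced
    - test -f target/manifest.json
    - test -f target/catalog.json
  artifacts:
    paths:
      - target/
```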
Authentication issues
Error: Failed to create session: 401 Unauthorized
Solutions:
- Verify Recce Cloud GitLab integration is set up for your project
- Check that your project is connected in Recce Cloud settings
- For custom tokens, ensure the token has api scope (setup guide)
Upload failures
Error: Failed to upload manifest/catalog
Solutions:
- Check network connectivity to Recce Cloud
- Verify artifact files exist in target/ directory
- Review pipeline job logs for detailed error messages
- Ensure artifacts are passed between jobs:
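In GitLab, jobs in later stages download artifacts from jobs in earlier stages by default, so it is usually enough for the build job to declare target/ as an artifact and run in a stage that precedes the component's stage. The fragment below sketches that arrangement. It assumes the component job does not restrict which artifacts it downloads, which this guide does not state.

```yaml
stages:
  - build   # dbt-build runs here and publishes target/
  - test    # the Recce component runs here (stage: test input)

dbt-build:
  stage: build
  script:
    - dbt docs generate --target $DBT_TARGET
  artifacts:
    paths:
      - target/        # must contain manifest.json and catalog.json
    expire_in: 1 week  # artifacts that have expired cannot be downloaded
```

If the component runs in the same stage as the build job, or in an earlier one, the artifacts will not be available to it.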
Complete Example
Here's a full working example combining dbt build and Recce validation:
include:
  - component: gitlab.com/recce/recce-cloud-cicd-component/[email protected]
    inputs:
      stage: test

stages:
  - build
  - test

variables:
  DBT_TARGET: "ci"

dbt-build:
  stage: build
  image: python:3.11-slim
  before_script:
    - pip install -r requirements.txt
  script:
    - dbt deps
    - dbt build --target $DBT_TARGET
    - dbt docs generate --target $DBT_TARGET
  artifacts:
    paths:
      - target/
    expire_in: 1 week
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
See the complete example project for a full working configuration.
