Query Diff Documentation

Overview

Query Diff in DataDios helps you validate data after a migration by comparing the results of queries run against your source and target systems. Matching results help confirm that the data was migrated accurately and consistently.


Stages of Query Diff

  1. Locate Query Diff in DataDios
  2. Create or import the required Data Sources
  3. Define and select Key Columns for verification
  4. Process the data sources
  5. View summary and compare differences

Steps to Perform Query Diff

Step 1: Navigate to Query Diff

  1. Go to the Smart Diff tab in DataDios

  2. Click on Query Diff

    Query Diff Tab

  3. Click Create Diff. This opens the Source and Target Data Source Selection page

On this page, you can:

  • Switch from Datasource Diff (default) to Query Diff

  • Define your workflow name or diff job name

  • Add a description

  • Select Source and Target Data Sources

    Select Data Sources


Step 2: Add Queries

  1. Select your data sources and click Next

  2. Add queries for both Source and Target

    Ensure queries are validated before adding

  3. Click Next

    Add Query
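One way to check a query before adding it is to run it without fetching any rows. The sketch below is illustrative only: it uses Python's built-in sqlite3 module as a stand-in for your real data source, and the validate_query helper and customers table are hypothetical names, not part of DataDios.

```python
import sqlite3


def validate_query(conn, query):
    """Return (True, column_names) if the query runs, else (False, error_text)."""
    try:
        # Wrapping the query in a subquery with LIMIT 0 checks syntax and
        # column names without fetching rows. The query must not end with ";".
        cur = conn.execute(f"SELECT * FROM ({query}) LIMIT 0")
        return True, [col[0] for col in cur.description]
    except sqlite3.Error as exc:
        return False, str(exc)


# Hypothetical table standing in for a source system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

good = validate_query(conn, "SELECT id, name FROM customers")
bad = validate_query(conn, "SELECT id, nmae FROM customers")  # misspelled column
```

Running the same check against both the source and the target also shows early whether the two queries return the same column names.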


Step 3: Analyse Step

  1. The Analyse step automatically maps matching columns between queries

  2. Review the mapping

  3. Select the Key Columns to be used in comparison

    Analyse Step
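The mapping rules DataDios applies are not described here. As a rough mental model only, the sketch below pairs columns whose names match case-insensitively and lists the ones left over, which are the columns you would want to look at when reviewing the mapping. The map_columns function and the column names are hypothetical.

```python
def map_columns(source_cols, target_cols):
    """Pair columns whose names match case-insensitively; report the rest."""
    target_by_name = {name.lower(): name for name in target_cols}
    mapped, unmapped_source = [], []
    for name in source_cols:
        # pop() so that each target column is matched at most once
        match = target_by_name.pop(name.lower(), None)
        if match is None:
            unmapped_source.append(name)
        else:
            mapped.append((name, match))
    unmapped_target = list(target_by_name.values())
    return mapped, unmapped_source, unmapped_target


mapped, only_source, only_target = map_columns(
    ["ID", "Name", "created_at"], ["id", "name", "email"]
)
```

Here ID and Name are paired with id and name, while created_at and email are left unmapped on their respective sides.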


Step 4: Execute Stage

  1. Select the Key Columns that may differ across the two data sources

  2. (Optional) Schedule your workflow by clicking Schedule Workflow

  3. Click Next to start execution and view diff results

    Diff Analyse Step


Step 5: Diff Summary

  1. Wait for the diff process to complete (time varies by dataset size)

  2. Once complete, review the Diff Overview

    Diff Overview

  3. Click View Diff to explore detailed differences

Types of Diffs:

  • Metadata Diff: Shows differences between the structure (columns) of your queries

    Metadata Diff

  • Data Diff: Shows row-level differences in query results

    Data Diff
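To make the two diff types concrete, here is a minimal sketch of what each one compares. It is not how DataDios computes its diffs; it assumes query results are available as lists of dictionaries, and the function names and sample rows are hypothetical.

```python
def metadata_diff(source_cols, target_cols):
    """Structure-level diff: columns present on only one side."""
    return {
        "only_in_source": [c for c in source_cols if c not in target_cols],
        "only_in_target": [c for c in target_cols if c not in source_cols],
    }


def data_diff(source_rows, target_rows, key_cols):
    """Row-level diff: match rows on key_cols, then compare shared columns."""
    def index(rows):
        return {tuple(row[k] for k in key_cols): row for row in rows}

    src, tgt = index(source_rows), index(target_rows)
    changed = {}
    for key in src.keys() & tgt.keys():
        shared = src[key].keys() & tgt[key].keys()
        cols = sorted(c for c in shared if src[key][c] != tgt[key][c])
        if cols:
            changed[key] = cols
    return {
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "missing_in_source": sorted(tgt.keys() - src.keys()),
        "changed": changed,
    }


source = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Bob", "city": "Paris"},
    {"id": 3, "name": "Cy", "city": "Rome"},
]
target = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Bob", "city": "Berlin"},
    {"id": 4, "name": "Di", "city": "Oslo"},
]

meta = metadata_diff(["id", "name", "city"], ["id", "name", "country"])
rows = data_diff(source, target, key_cols=["id"])
```

In this example the row with id 2 differs in the city column, id 3 exists only in the source, and id 4 exists only in the target.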


Best Practices

  1. Create Data Sources first: Always configure and test your source and target before running Query Diff
  2. Validate queries: Ensure SQL queries are correct and return expected results
  3. Choose Key Columns wisely: These should uniquely identify rows to ensure accurate comparison
  4. Use scheduling (if required): Automate Query Diff execution for repeated validation tasks
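A quick way to test whether a candidate set of Key Columns uniquely identifies rows is to count how often each key value occurs. The sketch below assumes the query results are available as a list of dictionaries; the duplicate_keys helper and the sample orders data are hypothetical.

```python
from collections import Counter


def duplicate_keys(rows, key_cols):
    """Return key values that occur more than once, with their counts."""
    counts = Counter(tuple(row[k] for k in key_cols) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


orders = [
    {"order_id": 1, "line": 1, "sku": "A"},
    {"order_id": 1, "line": 2, "sku": "B"},
    {"order_id": 2, "line": 1, "sku": "A"},
]

by_order = duplicate_keys(orders, ["order_id"])
by_order_line = duplicate_keys(orders, ["order_id", "line"])
```

Here order_id alone is not a safe key because order 1 appears twice, while the combination of order_id and line has no duplicates and is a better choice.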