Reconciliation Policy

A Reconciliation Policy in ADOC ensures that two datasets are consistent and trustworthy. It validates whether data has been moved, transformed, or processed correctly across systems.

A policy is a set of rules that define how two datasets (a source and a sink) should be compared. A rule is a specific condition or check applied during reconciliation, for example, comparing row counts, validating column equality, or using hashed row checks.

A reconciliation policy passes only if all its rules are satisfied. Policies can be executed manually or scheduled, and results are tracked to ensure ongoing consistency across systems.

Example:

  • Compare a staging orders table with the production orders table.
  • Rules might include:
    • “Row counts must match between source and sink.” (Row Count Match)
    • “Order IDs in the source must equal Order IDs in the sink.” (Data Equality)
    • “Hashed rows between source and sink must match.” (Hashed Data Equality)

Creating a Reconciliation Policy

You can create a Reconciliation Policy in two main ways:

Option 1: Through Manage Policies

  1. Navigate to Data Reliability > Manage Policies.
  2. Click Add Policy (top-right).
  3. Select Reconciliation as the policy type.
  4. Choose your source asset and sink asset.
  5. The Create Reconciliation Policy page opens for configuration.

Option 2: Through the Asset Details Page

There are two ways to create a Reconciliation Policy from the Asset Details page:

Overview Tab

  1. Open the dataset in the Asset Details page.
  2. In the Overview tab, click Actions > Add Reconciliation Policy.

Policies Tab

  1. Navigate to the Policies tab.
  2. Click Add Policy or use the Actions button and select Reconciliation.

In both cases, the Create Reconciliation Policy page opens for configuration.

Configure Data Selection

Before defining rules, refine the data that will be compared:

  1. Source Asset SQL Filter: Narrow down the dataset from the source asset.

    • Example: region = 'APAC'
  2. Sink Asset SQL Filter – Narrow down the dataset from the sink asset.

    • Example: status = 'Active'

Use valid column names from the chosen assets. Filters must follow Spark SQL syntax.

Configure Rules

Reconciliation rules define how datasets should be compared. Use one or more rule types:

Rule TypeDescriptionExample
Data EqualityCompare values in source vs. sink columns.customer_id = client_id
Hashed Data EqualityCompare hashed rows for integrity.order_hash in source = order_hash sink
Row Count MatchCompare total row counts.Source row count = Sink row count

Each rule includes parameters:

  • Left Column (source) and Right Column (sink)
  • Operator (=, !=, <, >)
  • Ignore Null Values (optional)
  • Success Threshold (0–100%)
  • Warning Threshold (0–100%)

Example Rule Table:

Match TypeLeft ColumnRight ColumnOperatorIgnore NullSuccess ThresholdWarning Threshold
Data Equalitycustomer_idclient_id=Yes10090
Hashed Data Equalityorder_hashorder_hash=No10095
Row Count Match10095

Scheduling and Incremental Checks (Optional)

  • Incremental Strategy: Track changes using an ID, datetime, or partition column.
  • Auto Increment Column: Select the column for incremental runs.
  • Initial Offset: Define the starting marker for the first run.
  • Schedule: Run hourly, daily, weekly, or monthly, with time zone selection.

Example: Run reconciliation daily at midnight, comparing only records where transaction_date is greater than the last successful run.

Alerts & Notifications (Optional)

Alerts help you respond quickly to reconciliation issues.

  • Severity Levels: Critical, High, Medium, Low.
  • Channels: Email, Slack, Microsoft Teams, Webhook, ServiceNow, Chat.
  • Notify on Success / Warning: Optional toggles.
  • Re-notification Options:
    • Never
    • After n failed runs
    • Every time

Example: Send a Slack alert if the row count mismatch exceeds 5%.

Advanced Policy Configuration (Optional)

  • Persistence: Store Good and Bad records for analysis.

  • Timeouts: Define run timeout and total timeout.

  • Execution Engine:

    • Spark (default): For non-SQL or file-based sources.
    • Pushdown: For SQL-based sources, runs directly in the database engine for faster performance.
  • Resource Strategy: Small, Medium, Large, Global, or Custom (define Spark resources manually).

Use Pushdown whenever possible to reduce resource overhead.

Executing a Reconciliation Policy

  1. Navigate to Data Reliability > Manage Policies.
  2. Find your policy and click the Play icon.
  3. Choose execution type:
    • All Data: Compare the entire dataset.
    • Incremental: Use the defined incremental strategy.
    • Selective: Compare data within a specific ID or date range.

Monitoring a Reconciliation Policy

  • View results in the Execution Details page.
  • Review pass or fail status for each rule.
  • Check overall data quality scores.
  • Policies using Pushdown and Spark appear together for easy comparison.
Type to search, ESC to discard
Type to search, ESC to discard
Type to search, ESC to discard