Asset Details
To access the details of a data asset, navigate to Reliability > Data Discovery and select an asset. If the asset is a table, it will lead you to the Overview tab; otherwise, you will be directed to the Structure tab, where you can explore specific tables or columns within the asset.
To monitor data freshness and detect anomalies for an asset, enable data freshness in the settings window. For more information, see Data Freshness Policy.
Overview

The following table explains the properties of the asset displayed in the asset details Overview tab:
Details | Description |
---|---|
Overall Reliability | Displays the total reliability score (a combination of all four policy scores). |
Data Freshness | Displays the total data freshness score. |
Data Anomaly | Identifies dataset anomalies that may indicate unexpected behavior or trends. |
Data Quality | Displays the total score for Data Quality policies. |
Reconciliation | Displays the total score for Reconciliation policies. |
Data Drift | Displays the total score for Data Drift policies. |
Schema Drift | Displays the total score for Schema Drift policies. |
Last Profiled | Displays the date and time at which the most recent asset profile occurred. |
Performance Trend
The Performance Trend chart displays the Data Reliability score over a period of time. You can set the period to one day, one week, one month, the last six months, or the last year.

Freshness Trend
The Freshness Trend chart shows data freshness over time. The x-axis represents dates, and the y-axis shows freshness in days.
- Blue bars indicate data freshness.
- Red dots with numbers represent additional freshness metrics.
Hover over a data point for details. Use this chart to track trends and detect inconsistencies.

You can change the displayed metric using the Metric dropdown menu, which includes options such as:
- Data Freshness
- Absolute Asset Size
- Change in Asset Size
- Absolute Row Count
- Change in Row Count
Pipelines
The Pipelines tab provides an overview of all data pipelines associated with the selected asset. It displays key metrics such as the total number of pipelines, recent runs, execution statuses, and alert severity levels. Users can filter pipelines based on multiple parameters, including source, run status, alert severity, policy type, and more. For more information, see About Pipelines.

Policies
The Policies tab displays policy details and scores for all applied policies on the selected asset.
- Data Reliability Score: Shows the overall data reliability score, including the highest, lowest, and average values.
- Policy Scores: Displays scores for different policy types, such as Data Quality, Reconciliation, Freshness, Anomaly, Data Drift, Schema Drift, and Auto Anomaly.
The graph provides an overview of data reliability scores and policy execution trends over time. You can view scores for the last day, week, month, six months, or year. Hovering over a bar reveals the policy type and score.

The Policies table provides details on all applied policies, their execution status, and quality scores. It helps track policy performance and identify potential data issues.

Column Name | Description |
---|---|
Name | The policy name. Click to view detailed information. |
Policy Type | The type of policy, such as Data Quality, Schema Drift, Freshness, Anomaly, or Reconciliation. |
Tags | The associated tags for categorization. |
Quality Score | The percentage score representing the policy's quality assessment. |
Open Alerts | The number of unresolved alerts triggered by the policy. |
Overall Status | The execution result, such as Successful, Warning, or Errored. |
Rule Count | The number of rules applied within the policy. |
Records Processed | The total number of records evaluated by the policy. |
Last Executed | The most recent execution timestamp of the policy. |
Avg Quality Score | The average quality score across multiple executions. |
Total Executions | The number of times the policy has been executed. |
Schedule | The policy execution schedule, if applicable. |
Last Updated | The last modification date of the policy. |
Cadence
The Cadence tab displays key metrics for your assets, such as freshness and changes in data size. You can compare how your asset has grown or shrunk over a period of time, and also view the average hourly increase or decrease in asset size.
This tab is applicable only for Snowflake, Databricks, MySQL, PostgreSQL, BigQuery, S3, ADLS, Azure Data Lake, GCS, HDFS, Hive, and Redshift data assets.
To view the Cadence tab:
- Open the asset detail view of either a Snowflake, Databricks, MySQL, PostgreSQL, BigQuery, S3, ADLS, Azure Data Lake, GCS, HDFS, or Redshift asset.
- Click the Cadence tab.

This tab has two visualizations, described as follows:
Freshness Trend
The Freshness Trend section displays a bar graph representing changes in asset rows or asset size during a selected time period.
Y-Axis: Displays one of the following, based on user selection:
- Data Freshness
- Absolute Asset Size
- Change in Asset Size
- Absolute Row Count
- Change in Row Count
X-Axis: Represents the timeline.
Monitoring Execution Jobs for Hive Data Source
When monitoring Hive data sources, execution jobs can be triggered and observed using two methods:
Adhoc Request (Raw Cadence)
The Adhoc Request mode is designed for manual, on-demand metric collection. In this mode:
Metrics must be triggered manually through a POST API request.
A one-time job is executed that:
- Is not scheduled.
- Collects raw-level metrics for monitoring.
- Does not run anomaly detection or policy evaluations.
Only one execution per assembly is allowed at a time.
Triggering an Adhoc Job
Initiates a raw cadence job for a specified assembly ID.
Request:
POST https://{host}/catalog-server/api/monitors/datasource/{assemblyId}
Example:
https://demo.acceldata.app/catalog-server/api/monitors/datasource/{assemblyId}
Curl Command:
curl --location --request POST 'https://demo.acceldata.tech/catalog-server/api/monitors/datasource/1' \
--header 'accesskey: {REPLACE WITH ACCESS KEY}' \
--header 'secretKey: {REPLACE WITH SECRET KEY}'
Note: Replace the accesskey and secretKey header values with your actual credentials.
Response for Successful Trigger:
{
  "id": 124,
  "status": "RUNNING",
  "error": null
}
Response if Job is Already in Progress:
{
  "errors": [
    {
      "message": "previous monitor execution is not completed. Execution Id: 125",
      "status": 400,
      "details": null
    }
  ]
}
You can monitor the job status on the Cadence Jobs tab on the Jobs page.
Terminating a Running Job
Terminates an active adhoc execution job. Use the executionId obtained from the trigger response.
Request:
DELETE https://{host}/catalog-server/api/monitors/executions/{executionId}
Example:
https://demo.acceldata.app/catalog-server/api/monitors/executions/{executionId}
Curl Command:
curl --location --request DELETE 'https://demo.acceldata.tech/catalog-server/api/monitors/executions/125' \
--header 'accesskey: {REPLACE WITH ACCESS KEY}' \
--header 'secretKey: {REPLACE WITH SECRET KEY}'
Note: Replace the accesskey and secretKey header values with your actual credentials.
Response for Successful Termination:
{
  "status": "SUCCESS",
  "requestId": null,
  "message": "Successfully cancelled the job.",
  "parentList": {},
  "isParentAvailable": false
}
Metrics Collected in Adhoc Mode
- Raw Data Freshness: Indicates the freshness of the asset.
- Raw Absolute Size: Represents the total size of the asset in bytes.
Note: These metrics are available in the Cadence page dropdown but are excluded from anomaly detection and data freshness policies.
Hive MetaStore Request (Normal Cadence)
The Normal Cadence mode represents standard scheduled monitoring behavior.
Overview
- Runs on a recurring schedule (typically hourly).
- Supports anomaly detection and data freshness policies.
- Aligned with the platform’s default cadence monitoring strategy.
- Collects a comprehensive set of metrics.
Hive Tables Access in Spark 3.3.3
- The ADOC platform uses Apache Spark 3.3.3, which relies on Hive 2.3.9 for metadata access.
- Hive 2.3.9 does not support transactional (ACID) tables.
- Attempting to access transactional tables through a Hive-enabled Spark session results in errors (e.g., class not found).
- As a result, transactional tables cannot be accessed or monitored using this feature.

Other Derived Metrics
This section allows you to compare the asset metrics during different time periods. You can view the percentage change in asset metrics during the selected time period.
You can select the first duration to be Yesterday, Last Week, or Last Month, and the second duration to be Last Week, Last Month, or Last Year.
- If you select Last Week as the first duration, you can only select Last Month or Last Year as the second duration.
- If you select Last Month as the first duration, you can only select Last Year as the second duration.
ADOC compares the first duration's value with the second duration's value and displays both the percentage change and the absolute change on the widgets. The available widgets are Change in File Size, Size added/hour, Change in File Count, and Rows added/hour.
In the following image, the first comparison duration is Yesterday and the second is Last Month. The widgets display the percentage change and absolute change in values from yesterday to last month.

Data Cadence Reports
The Data Cadence reports feature notifies you about changes in your assets. To receive notifications for an asset, you must enable asset watch for the required assets.
To enable asset watch:
- Navigate to Data Reliability and select Data Discovery.
- Click the ellipsis menu for the required asset and select Watch Asset.
Alternatively, you can also enable asset watch from the Data Assets tab on the ADOC Discovery page.
Once you enable asset watch, ADOC notifies you about changes to your asset through an email notification. The email notifications are sent to the email ID used to log in to ADOC. If you log in to ADOC through SSO, the notifications are sent to the email ID used to log in to SSO. Email notifications are sent every hour, even if there are no changes to the watched assets.
The following image displays the Notification Email you receive from ADOC.

The contents of the email are described in the following table:
Component Name | Description |
---|---|
Asset Name | The name of the asset. All the assets for which you have enabled asset watch are listed here. |
Last Updated | The date and time when the asset was last updated. |
Record Count | The details of the changes in asset records. You can view the total number of records that are currently present, the total number of records that were previously present, and the change in the number of records. |
Volume Count | The details of the changes in asset volume. You can view the current volume of the asset, the previous volume, and the change in asset volume. |
SQL View
The SQL View allows you to change the name and other properties of a custom asset. This allows you to easily update and adjust the asset's information without having to change the underlying database structure. You can also use the SQL View to design and apply complex filters or queries to extract specific subsets of data from the asset.

In terms of behavior, a custom asset is essentially similar to a standard asset. It is backed by a custom SQL query, and whenever the custom asset is processed, the underlying SQL is used. This is useful for supplying custom logic to describe data.
For example, if you wish to connect multiple tables or aggregate data from one or more tables, write that logic in custom SQL and use this SQL View in the same way that you would a regular data asset for data quality and profiling activities.
Field | Description |
---|---|
Name | Displays the Asset's name as well as the option to edit it. |
Select Data Source | Displays the name of the Data Source. |
Select Database | Displays the selected Database. |
Select Schema | Displays the schema. |
Description | Displays the description. |
SQL | Displays the SQL view of the data. |
After making the changes, click the Preview button to display the preview.
To save the changes, click the Save button.
Profile
Asset profiling serves as an essential precursor to any data quality enhancement initiative, as it enables organizations to better understand the current state of their data, recognize vulnerabilities, and take informed actions to rectify issues and optimize their data assets for more accurate and reliable analytics and decision-making.
ADOC provides the capability to perform data profiling not only for structured data but also for semi-structured data, enabling you to gain valuable insights from both types of data assets.

The Profile tab displays the following information about the assets profiled within the selected table, for a selected date and time.
Property Name | Definition | Example |
---|---|---|
Executed Profile | Defines the most recent date and time at which the asset was profiled. Click the drop-down and select a date and time to view the details of previous profile executions. | Aug 24, 2023 8:26pm |
Rows Profiled | Number of rows profiled. | 2976508 |
Profiling Type | Indicates whether the asset profiling was Full or Sample. | FULL |
Start Time | Defines the date and time at which the profiling of the asset started. | Aug 24, 2023 8:26pm |
End Time | Defines the date and time at which the profiling has ended. | Aug 24, 2023 8:27pm |
Start Value | Defines the value with which the profiling began. | 169271...114824 |
End Value | Defines the value with which the profiling completed. | 169288...763048 |
- Compare Profiles: Click on the shuffle icon to compare the current profiled data of an asset with previously profiled data.
Profiling an Asset
To start profiling, click the Action button and then select either Full Profile, Incremental, or Selective from under Profile.

Action Button
Once profiling is completed, a table is generated listing each column present in the table, and various metrics are calculated for each column. Each column has one data type, and the metrics generated for structured column data types are as follows:
Data Type | Statistical Measures |
---|---|
String | |
Integral | |
Fractional | |
Time Stamp | |
Boolean | |
Similarly, the metrics generated for semi-structured column data types are as follows:
Data Type | Statistical Measures |
---|---|
Struct | |
Array[String] | |
Array[Integral/Fractional] | |
Array[Boolean] | |
Array[Struct] | |
Viewing Column Data Insights
To gain deeper insights into any column type, whether structured or not, simply click on the column name. This action will open a modal window presenting the following details:
Column Statistics
This section provides a table showcasing statistics for the selected column, accompanied by a bar graph illustrating percentage-based evaluations like % Null values and % Unique values.

Most Frequent Values
This section provides a list of the most frequent values found for the selected column.

Detected Patterns
This section provides a list of common patterns found for the selected column.
Anomalies & Trends
Within this section, you'll find a variety of charts that offer valuable insights into your data. These visualizations present key metrics such as skewness, distinct count, completeness, and kurtosis. Using the historical data, the upper and lower bounds for the current value are calculated and plotted on the graph, as shown in the following image:
These charts help you understand the distribution and patterns within your data, enabling you to identify potential anomalies and trends that may influence your analysis and decision-making processes.

Every time a table is profiled, a data point is recorded. Over time, a number of data points are recorded for each metric of every column in the table. The following observations can be made from the graph:
- If a data point lies between the upper bound curve and the lower bound curve, the data point is non-anomalous.
- If a data point lies outside the upper bound curve or the lower bound curve, the data point is anomalous.
The following fields must be configured for anomaly detection in the Data Retention window:
- Historical Metrics Interval for Anomaly Detection
- Minimum Required Historical Metrics For Anomaly Detection
- For columns of complex data types (array, map, and struct), sub-columns of data types other than string, numeric, and Boolean are treated as strings.
- For columns of array type, the pattern profile and top values show values only; the count is not displayed.
- For columns of array type, the total non-null count is not displayed.
- Anomaly detection is not supported for nested columns of an asset in the current version of ADOC.
- By default, ADOC can profile up to five levels of a complex data structure. This limit can be updated with the PROFILE_DATATYPE_COMPLEX_SUPPORTED_LEVEL environment variable in the analysis service deployment in the data plane, as shown in the example below.
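The following is a minimal sketch of how this variable might be set, assuming the analysis service runs as a Kubernetes deployment in the data plane; the deployment name, namespace, and the value 7 are placeholders for your environment.
# Assumption: the analysis service is deployed on Kubernetes in the data plane.
# Replace the deployment name and namespace with the values used in your environment;
# 7 is only an example value raising the supported nesting depth above the default of 5.
kubectl set env deployment/<analysis-service-deployment> \
  -n <dataplane-namespace> \
  PROFILE_DATATYPE_COMPLEX_SUPPORTED_LEVEL=7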
Sample Data
The Sample Data tab displays the complete data of an asset: the full table with all its columns, rows, and the data in each cell. This gives you a quick insight into the asset's data.
To view only selected columns, filter the data by Column Data Types and Column Names.

Asset Selection
If columns are selected at the asset level, then the columns not present in the Asset Selection are grayed out.
Segments
The Segments tab provides an organized view of data segments within the selected asset. Segments help categorize and analyze data based on specific attributes or conditions.

Segment Details
Each segment consists of the following components:
- Segment Name – Identifies the segment.
- Columns Count – Displays the number of columns included in the segment.
- Sub-segments Count – Indicates the number of sub-segments linked to the main segment.
- Linked Policies – Shows the number of policies associated with the segment.
Segment Actions
You can perform the following actions on a segment:
- Edit – Modify the segment name, columns, or linked policies.
- Duplicate – Create a copy of an existing segment for reuse.
- Delete – Permanently remove the segment from the system.
To access these options, users can click on the menu icon (three dots) next to a segment card. This enables efficient management of data segments while ensuring flexibility in organizing data.

Managing Segments
You can add new segments by selecting the + Add New Segment option. Existing segments are displayed as cards, each showing relevant metadata.

You can define a segment by:
- Searching for and selecting relevant columns from a dropdown list.
- Viewing details such as column name, data type (Integral, String, Fractional), and distinct values.
- Clicking Save to confirm the selection or Cancel to discard changes.
This functionality allows you to efficiently group and analyze data based on relevant attributes.
Lineage
The Lineage tab pictorially represents the dependency of an asset on other assets. It depicts the flow of data from one asset to another. This data flow diagram is known as Lineage. Lineage is currently examined at two levels:
- Table Level Lineage
- Column Level Lineage
Lineage at Table Level
Lineage at the table level depicts the flow of data from one table to another through a process. This process is responsible for the flow of data from the source asset to the sink asset.
For more detailed information, toggle the Show Sub-Level Lineage button. The lineage at the column level is also displayed. Click on one of the columns to view the flow of data.

If a query is part of the lineage, hovering over it allows you to view the full query and copy it.

Lineage at Column Level
Lineage at the column level depicts the flow of data from one column to another through a process. The process is responsible for the flow of data from the source asset to the sink asset.
Users can search for a column by name within a table and trace its lineage through the dotted flow line.

Manual column-to-column lineage can also be added to represent transformations that are not automatically captured. When a column-level lineage is manually added:
- A process node is created between the selected source and target columns.
- A corresponding process node is also created between the parent assets (e.g., tables) of those columns.
- Deleting the column-level lineage also removes the associated parent-level process node.
This functionality is currently available through the API; an example curl request follows the field descriptions below:
- Endpoint: /api/assets/{assetId}/column-lineage
- Method: POST
- Sample Payload:
{
  "direction": "DOWNSTREAM",
  "assetIds": [49239734],
  "process": {
    "name": "test-lineage",
    "description": "Demo of Manual Addition of Column Lineage"
  }
}
Payload Field Descriptions
Field | Description |
---|---|
direction | "UPSTREAM" or "DOWNSTREAM" — defines the flow direction between columns. |
assetIds | List of column asset IDs involved in the lineage — either upstream or downstream depending on the direction. |
process.name | Name of the column-level lineage process node. |
process.description | Description of the process node for reference. |
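For reference, a minimal curl sketch of this request is shown below, following the same header-based authentication as the other curl examples on this page. The host and asset ID placeholders are illustrative; confirm the full base URL for your deployment.
Curl Command:
curl --location --request POST 'https://{host}/api/assets/{assetId}/column-lineage' \
--header 'accesskey: {REPLACE WITH ACCESS KEY}' \
--header 'secretKey: {REPLACE WITH SECRET KEY}' \
--header 'Content-Type: application/json' \
--data-raw '{
  "direction": "DOWNSTREAM",
  "assetIds": [49239734],
  "process": {
    "name": "test-lineage",
    "description": "Demo of Manual Addition of Column Lineage"
  }
}'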
Query Analyzer Service
The lineage information is extracted by the Query Analyzer Service. The Query Analyzer Service periodically (at one-hour intervals) fetches queries executed on a data source and stores them as query logs. It extracts information from these logs, and lineage is inferred at both the table level and the column level.
The user can also add custom queries, to extract lineage information. To add custom queries, see here.
Add Lineage
An external pipeline or asset can be added to an existing lineage. To add an asset to a particular lineage, click on Add Lineage.
The Query Analyzer Service supports only Redshift and Snowflake data sources.
The Add Lineage modal window is displayed.

Add Lineage Window
The following section describes the parameters to be entered in the Add Lineage window:
Lineage Type: There are two types of lineage, namely:
- Upstream: The upstream lineage type indicates that data is flowing from the added asset to the selected table. On selecting upstream, the node is added before the table.
- Downstream: The downstream lineage type indicates that data is flowing from the selected table to the added asset. On selecting downstream, the node is added after the table.
Target Data Asset: To add a target data asset, click Add Asset. The Data Asset Picker window pops up. Select an asset from the data source list or search for an asset by its name in the search bar. Click the Select button.
Process Name: Define a name for the process. For example, Create Table.
Process Description: Describe the process, that is, the flow of data from the source asset to the sink asset.
Click the Add button.

The STUDENTS table is now added as a node to the existing lineage.
Lineage Asset Node Types and Visual Legends
Type | Source Type | Clickable | Example |
---|---|---|---|
Type 1: Source Nodes | DATABASE, SCHEMA, BIG_QUERY_DATASET | | VISUAL_VIEW_INPUT2 (from Snowflake) |
Type 2: Data Entities | TABLE, VIEW, FILE, KAFKA_TOPIC, COLLECTION, SHEET, SQL_VIEW, VISUAL_VIEW | | telecom_data (from Hadoop) |
Type 3: Other Asset Types | Any asset type not in Type 1 or Type 2 | | test2 (from Power BI) |
Type 4: Unknown Source | Unidentified or unclassified data sources | | VISUAL_VIEW_INPUT2 (when the source is unavailable or unknown) |
Deleted Nodes | Deleted or no longer available assets | | DBTOOLS$EXECUTION_HISTORY (marked as Deleted) |
Relationships
The Relationships tab displays assets related to the selected asset. It displays the hierarchy of the asset.
For example, the name of the selected table is SALES_DATA. The table is part of the schema named TPCSCHEMASMALL. This schema belongs to the database named TPCDSSMALL. This database belongs to the data source named SNOWFLAKE_SALES_DS.
The Relationships tab also displays the terms related to the selected asset.

Select a column from the table, to view the column hierarchy.

Foreign Key Relationship
The Relationships tab displays the Foreign Key relationship between tables and columns as well.
Suppose a table contains a few columns and thousands of rows of data. Each row is uniquely identified by a column known as the primary key. When this primary key is used to define a column in another table, the second table is said to have a Foreign Key relationship with the first table.
The following image displays a single foreign key relationship:

Single Foreign Key Relationship
The following image displays a composite foreign key relationship:

Composite Foreign Key Relationship
Schema Changes
The Schema Changes tab displays a panel with all the snapshots that are taken every time a crawler runs. Select exactly two snapshots for comparison.

To show only the differences between the two executions, click the Show Only Differences? toggle. It displays the number of columns added and deleted, as well as any updates to the table.
Handling Schema Changes in Data Profiling
ADOC's data profiling capabilities are designed to adapt dynamically to changes in your data schema. When new columns are added to an asset or existing columns are removed, ADOC's profiler can continue to function without the need for an immediate re-crawl of the asset. This ensures that schema changes will not disrupt your primary data flows, especially for data sources with challenging schema detection issues.
Profiling with Schema Changes
When profiling data whose schema has changed since the last crawl:
For Missing Columns: If the profiler encounters a column present in the catalog but missing in the current data, it will skip profiling for the column without causing the entire process to fail. Results for those columns will not be generated.
For New Columns:
- Primitive Data Type: When a new column is of a primitive data type, the profiler will include it in the profiling results as expected.
- Complex Data Types: If the new column is of a complex data type (such as array, map, or structured data types) and Spark can natively detect these types, the profiler will attempt to generate semi-structured statistics.
Considerations for Semi-Structured Data
If a new complex column is introduced in the data source but the asset is not recrawled, Spark receives those complex columns as strings (for example, the Snowflake data types Variant, Object, and Array). Profile results are then presented as string statistics rather than semi-structured statistics. You can obtain semi-structured statistics after recrawling and profiling.
User Action Required
While ADOC's profiler can handle schema modifications, to effectively use ADOC's data profiling features for semi-structured data you should re-crawl the asset so that the catalog is updated with the most recent schema.
This step is critical for semi-structured data to obtain accurate and detailed profiling statistics.
Metadata
The Metadata page provides detailed information related to an asset, including general details, tags, metadata attributes, and additional categorization options. This page allows users to manage and organize asset-related information efficiently.

Section | Description |
---|---|
General Details | Displays the asset's general information and description. |
Tags | Displays the list of tags created on the asset. Click on the Add Custom Metadata button to add a new tag to the assets. |
MetaData | Displays the metadata information about the asset. Click on the Custom tab to add custom metadata to the asset. |
UDT Variables | Displays the User Defined Templates created on the asset. |
Labels | Labels are used to group assets. The Labels panel under the Settings tab displays a table with the existing labels. To add labels, click the ADD button, enter the Key and Value pair, and click the Save button. |
Settings
The Settings page provides various configuration options to optimize data processing, validation, and resource management. You can fine-tune profiling settings, apply incremental strategies, allocate job resources efficiently, and set up reference validation for assets.

Profile Settings
One of the key features in the ADOC V3.4.0 release is the Profile Settings - Column selection tool in the data discovery module, which is specifically designed to optimize data profiling by allowing users to selectively target specific columns for profiling and anomaly detection.
The Profile Settings interface displays the total number of columns available, as well as a breakdown of how many are set for profiling and anomaly detection.

Columns Selection Feature
Traditionally, profiling an asset in a data reliability project involved analyzing every column, which could be costly and often unnecessary.
Consider a scenario where a customer has an asset with a thousand columns. With the Profile Settings, they can choose to only profile the 600 relevant columns, effectively cutting down the cost and increasing the efficiency of their data profiling operations.
Now, with this feature, users can:
- Selectively Profile Columns: Choose only the relevant columns for profiling from potentially large datasets, avoiding unnecessary processing.
- Detect Anomalies Efficiently: Specify which columns to monitor for anomalies, streamlining the process and focusing on the most impactful data.
- Reduce Cost and Time: Focus only on selected columns to reduce the resources required for profiling.
Using the Columns Setting
- Navigate to Reliability > Discover Assets.
- Select an asset.
- Access Settings from the left-side menu.
- Enter Profile Settings.
- Here, users can click on the Columns option to select columns for profiling or anomaly detection.
- Once selections are made, click the Back button to return to profile settings.
- Only columns with Profile enabled can have Data Anomaly selected; otherwise, the option remains inactive.
The Profile Settings window also allows you to configure the following capabilities for an asset:
- Auto-Tag
- Most Frequent Values
- Patterns
Schedule Profiling
You can schedule the profiling of an asset. To schedule profiling, perform the following steps:
- Enable the Schedule recurring profile job toggle button.
- Select one of the following profiling types from the Profiling Type drop-down list:
Profiling Type | Description |
---|---|
Full | Profiles all the columns in the table. |
Incremental | Incremental profile uses a monotonically increasing date column to determine the bounds for selecting data from the data source. |
When selecting Full Profile, you must schedule the profiling by choosing a time frequency—such as hour, day, week, or month—from the drop-down list and selecting the appropriate time zone.

On selecting Incremental profiling, you must define an incremental strategy and then schedule the profiling by selecting the time from the drop-down list.
Notification
To receive profile notifications for an asset, perform the following steps:
- Select the Profile Notifications checkbox.
- Click the Select notification channels drop-down and select all the channels that you would like to be notified on.
- Enable the Notify on Success toggle button to be notified if a profile on the asset was successful. By default you are notified if a profile on an asset fails.
- Click Save Changes to apply your profile configurations to the asset.
Incremental Strategy
The user can profile an asset incrementally, that is, first profile a number of rows and then profile successive rows. For incremental profiling, you must define an incremental strategy.

Acceldata Data Observability Cloud (ADOC) supports the following types of incremental strategies:
- Auto Increment Id based
- Partition based
- Incremental date based
- Incremental file based
- Time stamp based
To define an incremental strategy at the asset level, click the Settings tab of the asset, then select Incremental Strategy. The following sections explain these incremental strategies, along with the required inputs:
Auto Increment Id Based
Incremental profile uses a monotonically increasing value of a column to determine the bounds for selecting data from the data source.
For example, every time new rows of data are added to the database, they are assigned an auto-incrementing numeric value. For instance, on adding 1000 rows of data to the database, each row is given an ID from 1 to 1000. When the data source is profiled, these first 1000 rows are taken into consideration. Suppose you then add another thousand rows of data. The auto increment ID based strategy continues from the last value of the preceding set of rows, that is, 1001 to 2000. On the next profiling run, only the new set of rows is profiled.
The following table describes the required inputs:
Input Required | Description |
---|---|
Auto Increment Column | Select the auto increment column. |
Initial Offset | If a value is specified, then the starting marker for the first incremental profile will be set to this value. If left blank, then the first incremental profile will start from the beginning. |
Partition Based
Incremental profile uses a date-based partition column to determine the bounds for selecting data from the data source. The required inputs vary depending on the type of Sub Strategy selected. Currently, there are two types of sub strategies:
- default
- day-month-year
Default - Sub Strategy
Incremental profile uses a date-based partition column to determine the bounds for selecting data from the data source. This sub strategy is useful only if the data source supports partitioning. The required inputs are as follows:
- Partition Column: Select the partition column
- Day Format: Provide a date format to save the date timestamp. For example, YYYY-MM-DD
- Frequency: The profiling frequency can be set to hourly, daily, weekly, monthly, quarterly and yearly.
For more options, click on Advanced. Enter the following data under the advanced fields:
- Time Zone: Select a time zone from the drop-down list.
- Offset: If the selected time zone is offset by a few hours or minutes, then enter the number of hours or minutes in the field provided.
- Data Prefix: If the selected partition column has some prefix attached to its values, then specify the prefix data.
- Data Suffix: If the selected partition column has some suffix attached to its values, then specify the suffix data.
If you check Round End Date, the last executed date value is rounded up by the frequency selected from the Frequency drop-down list for the next execution of the policy. For instance, if the last data row was processed at 12:20 and you checked Round End Date with an Hourly frequency, the next time the policy is executed it only runs on the data created at 13:20, and so on.
Click the Save button.
Day-month-year - Sub Strategy
Incremental profile uses three partition columns, based on day, month, and year, to determine the bounds for selecting data from the data source. This sub strategy is useful only if the data source supports partitioning.
The required inputs are as follows:
- Day Column: Date based partition column
- Month Column: Month based partition column
- Year Column: Year based partition column
- Day Format: Provide a date format. For example, DD
- Month Format: Provide a month format. For example, MM
- Year Format: Provide a year format. For example, YY
- Frequency: The profiling frequency can be set to hourly, daily, weekly, monthly, quarterly and yearly.
For more options, click on Advanced. Enter the following data under the advanced field:
- Time Zone: Select a time zone from the drop-down list.
- Minute Offset: If the selected time zone is offset by a few hours or minutes, then enter the number of minutes in the field provided.
Incremental Date Based
Incremental profile uses a monotonically increasing date column to determine the bounds for selecting data from the data source. The following table describes the required inputs:
Input Required | Description |
---|---|
Date Column | Select the column name that is used to save dates and timestamps. |
Date Format | Provide a date format to save the date timestamp. For example, YYYY-MM-DD |
Initial Offset | If a date is specified, then the starting marker for the first incremental profile will begin from the specified date. If left blank, then the first incremental profile will start from the beginning. |
For more options, click on Advanced. Enter the following data under the advanced field:
- Time Zone: Select a time zone from the drop-down list.
- Minute Offset: If the selected time zone is offset by a few hours or minutes, then enter the number of minutes in the field provided.
If you check Round End Date, the last executed date value is rounded up by the frequency selected from the Frequency drop-down list for the next execution of the policy. For instance, if the last data row was processed at 12:20 and you checked Round End Date with an Hourly frequency, the next time the policy is executed it only runs on the data created at 13:20, and so on.
Click the Save button.
Incremental File Based
When the Monitoring File Channel Type is enabled for an asset in an S3, GCS, or ADLS data source during configuration, the data plane's monitoring service actively watches for file events such as file additions, deletions, or renames on that specific data source. Once a file event is captured, it is forwarded to the catalog server, which then stores the event details.
To enable a file-based incremental strategy for an asset, go to the asset's details page, click on Settings from the right-hand side, and click Incremental Strategy from the left menu.
Incremental profile uses a monotonically increasing date column to determine the bounds for selecting data from the data source. The following table describes the required inputs:
Input Required | Description |
---|---|
Initial Offset | If specified, the starting marker for the first incremental profile will be set to this value; if left blank, the first incremental profile will start from the beginning. |
Time Zone | Select the desired time zone for configuring the initial offset. |
When the Round End Date checkbox is selected, the most recent date value the policy ran on is rounded up by the frequency selected from the Frequency drop-down list for the next time the policy is executed. For example, if you checked Round End Date with an Hourly frequency and the last row of data was processed at 12:20, the next time the policy is executed it only runs on the data created at 13:20, and so on.
Click the Save button.
Time Stamp Based
The Time Stamp Based incremental strategy allows you to incrementally profile an asset using time stamps. Data is profiled in chronological order, making this method well suited for datasets with time-related data points. It establishes the bounds for data selection using a time-stamp column, ensuring that only new or changed data points (after the previous time stamp) are profiled in successive runs.
Input Required | Description |
---|---|
Select Incremental Strategy | Choose the Time Stamp Based option from the dropdown to set your incremental strategy to Time stamp. |
Date Format | Define the format in which the time stamp data is saved, e.g., YYYY-MM-DD HH:MM:SS. |
Initial Offset | If specified, the starting marker for the first incremental profile will begin from this value. If left blank, the profiling starts from the beginning. |
Time Zones | Choose the appropriate time zone based on the origin of the data or the intended audience for the profiling findings. This ensures that timestamp-based data profiling is correct. |
Job Resources
This setting allows you to assign resources for Spark job executions. You can set resources like executors and CPU cores on this page.

This page has the following settings.
- Number of Executors to Use for the Spark jobs: In this field, you must set the number of executors to be assigned for Spark jobs execution. By default, two executors are assigned.
- Number of CPU cores to use for the Spark jobs per executor: In this field, you must set the number of CPU cores to be assigned for Spark jobs execution. By default, one CPU core is assigned.
- Amount of memory provided to each executor process: In this field, you must set the amount of memory to be assigned to each executor process. The memory allocation must be in JVM memory string format (for example, 512m or 2g). By default, 2 GB of memory is allocated.
Click Save once you have configured all the settings.
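For context, these three settings correspond to standard Spark resource options. The sketch below is only an illustrative mapping of the concepts onto spark-submit flags using the default values; ADOC applies these values internally, and the application jar name is a placeholder.
# Illustrative mapping of the Job Resources settings to spark-submit options
# using the defaults: 2 executors, 1 core per executor, 2g of memory per executor.
# ADOC sets these values internally; the jar name below is a placeholder.
spark-submit \
  --num-executors 2 \
  --executor-cores 1 \
  --executor-memory 2g \
  example-profiling-job.jar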
Reference Asset

You can use this setting to mark the asset as a Reference asset. Reference assets can be used for Lookup in a Lookup rule in Data Quality policies. To learn more about this setting, see Lookup Data Quality Policy.
Persistence
Whenever you execute a data quality policy or a reconciliation policy, data violations are generated if the policy fails. Data violations are the rows of data that contributed to the failure of the policy.

The Persistence section under Settings for a selected asset allows users to define where data quality results will be stored. Users can configure a specific result location, such as Amazon S3, HDFS, or Google Cloud Storage, at the asset level. If a custom persistence path is specified, it overrides the default global path for storing data quality results. Users can also select the Record Type to persist, including:
- All (default)
- Bad Rows (records that failed validation)
- Good Rows (records that passed validation)
- Summary (aggregated results)
Additionally, a Base Path field is available for specifying the storage path, and users can save or clear their configurations as needed.
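For illustration, a Base Path value for an Amazon S3 result location typically looks like the following; the bucket name and prefix are placeholders, and the exact format expected may vary by storage type.
# Hypothetical example of a Base Path for S3-backed persistence of data quality results
s3://your-dq-results-bucket/adoc/quality-results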
Recommendations
The Recommendations section for a selected asset provides insights and actionable suggestions to enhance data reliability coverage.

Query Logs

The Query Logs section provides a detailed record of database transactions for a selected asset. It includes insights into Transaction Logs, Access Info, and related Row Trends.
- Transaction Logs: Displays recent queries executed on the asset, including the query type (e.g., INSERT), timestamp, and user who executed the query. This helps track data modifications and query activity over time.
- Access Info: Shows the number of times the table has been accessed and identifies the most frequent user accessing the asset.
- Row Trends: Highlights changes in data rows within the selected period. If no modifications are detected, it indicates data stability.
- Most Associated Tables: Lists other tables that are frequently linked or queried alongside the selected asset, aiding in understanding data relationships.