Asset Details Lineage Tab

The Lineage tab shows visually how one asset depends on another. It depicts the flow of data from one asset to another; this data flow diagram is known as lineage. Lineage is currently examined at two levels:

  1. Table Level Lineage
  2. Column Level Lineage

Lineage at Table Level

Lineage at the table level depicts the flow of data from one table to another through a process. This process is responsible for the flow of data from the source asset to the sink asset.

For example, the following diagram shows that the table mobile data is derived from the employee table through an insert query.
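The source, process, and sink relationship described above can be sketched as a small data structure. This is only an illustration of the concept, not the product's internal model: the class `LineageEdge`, the helpers `sources_of` and `sinks_of`, and the identifier `mobile_data` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageEdge:
    """One hop of lineage: data flows from source to sink through a process."""
    source: str   # source asset (a table)
    process: str  # process that moves the data, e.g. an insert query
    sink: str     # sink asset (a table)


def sources_of(edges, table):
    """Return the tables that feed directly into `table`."""
    return sorted({e.source for e in edges if e.sink == table})


def sinks_of(edges, table):
    """Return the tables that `table` feeds directly."""
    return sorted({e.sink for e in edges if e.source == table})


# Hypothetical edge mirroring the example: employee -> (insert query) -> mobile data
edges = [LineageEdge(source="employee", process="insert query", sink="mobile_data")]

print(sources_of(edges, "mobile_data"))  # ['employee']
print(sinks_of(edges, "employee"))       # ['mobile_data']
```

Keeping the process on the edge rather than on either table reflects the description above, where the process is what carries data from the source asset to the sink asset.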

For more detail, toggle the Show Sub-Level Lineage button to also display the lineage at the column level. Click one of the columns to view its flow of data.

Lineage at Column Level

Lineage at the column level depicts the flow of data from one column to another through a process. The process is responsible for the flow of data from the source asset to the sink asset.

For example, the following diagram shows that the columns CUSTOMER_NAME and NAME are derived from the column FIRST_NAME through an insert query.

Query Analyzer Service

The lineage information is extracted by the Query Analyzer Service. The service periodically (at one-hour intervals) fetches the queries executed on a data source and stores them as query logs. It then extracts information from the queries and infers lineage at both the table level and the column level.
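To illustrate how lineage might be inferred from a logged query, here is a minimal sketch. It is not the Query Analyzer Service's actual implementation, which is not described here. It assumes a single simple statement of the form `INSERT INTO <sink> (<columns>) SELECT <columns> FROM <source>`, and the table name `customers`, the function `infer_lineage`, and the pattern `INSERT_SELECT` are hypothetical.

```python
import re

# Matches only the simple form:
#   INSERT INTO <sink> (<col>, ...) SELECT <col>, ... FROM <source>
INSERT_SELECT = re.compile(
    r"insert\s+into\s+(?P<sink>\w+)\s*\((?P<sink_cols>[^)]*)\)\s*"
    r"select\s+(?P<src_cols>.+?)\s+from\s+(?P<source>\w+)",
    re.IGNORECASE | re.DOTALL,
)


def infer_lineage(query):
    """Return (table_edge, column_edges) for a simple INSERT ... SELECT, else None."""
    match = INSERT_SELECT.search(query)
    if match is None:
        return None
    sink, source = match["sink"], match["source"]
    sink_cols = [c.strip() for c in match["sink_cols"].split(",")]
    src_cols = [c.strip() for c in match["src_cols"].split(",")]
    if len(sink_cols) != len(src_cols):
        return None  # column lists do not line up; give up rather than guess
    table_edge = (source, sink)
    # Pair columns by position: the i-th selected column feeds the i-th target column.
    column_edges = [
        (f"{source}.{src}", f"{sink}.{dst}") for src, dst in zip(src_cols, sink_cols)
    ]
    return table_edge, column_edges


# Hypothetical logged query, loosely modeled on the column-level example above.
query = """
INSERT INTO customers (CUSTOMER_NAME, NAME)
SELECT FIRST_NAME, FIRST_NAME FROM employee
"""
table_edge, column_edges = infer_lineage(query)
print(table_edge)
for edge in column_edges:
    print(edge)
```

A regular expression only covers this one narrow query shape. Joins, subqueries, expressions, and aliases would need a real SQL parser.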

The user can also add custom queries to extract lineage information. To add custom queries, see here.

The Query Analyzer Service supports only Redshift and Snowflake data sources.

Add Lineage

An external pipeline or asset can be added to an existing lineage. To add an asset to a particular lineage, click on Add Lineage.

The Add Lineage window pops up.

The following section describes the parameters to be entered in the Add Lineage window:

  1. Lineage Type: There are two types of lineage, namely:

    1. Upstream: The upstream lineage type indicates that data is flowing from the added asset to the selected table. On selecting upstream, the node is added before the table.
    2. Downstream: The downstream lineage type indicates that data is flowing from the selected table to the added asset. On selecting downstream, the node is added after the table.
  2. Target Data Asset: To add a target data asset, click Add Asset. The Data Asset Picker window pops up. Select an asset from the data source list, or search for an asset by name in the search bar. Click the Select button.

  3. Process Name: Define a name for the process. For example, Create Table.
  4. Process Description: A description of the process, i.e., the flow of data from the source asset to the sink asset.

Click the Add button.

The STUDENTS table is now added as a node to the existing lineage.
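The difference between the two lineage types comes down to the direction of the new edge. The sketch below models that choice only. The function `add_lineage`, its parameter names, and the selected table `employee` are hypothetical and do not correspond to a documented API, and the direction chosen in the usage example is an assumption.

```python
def add_lineage(edges, selected_table, target_asset, lineage_type, process_name):
    """Return a new edge list with one (source, process, sink) edge added."""
    if lineage_type == "upstream":
        # Data flows from the added asset into the selected table,
        # so the new node sits before the table.
        new_edge = (target_asset, process_name, selected_table)
    elif lineage_type == "downstream":
        # Data flows from the selected table into the added asset,
        # so the new node sits after the table.
        new_edge = (selected_table, process_name, target_asset)
    else:
        raise ValueError(f"unknown lineage type: {lineage_type!r}")
    return edges + [new_edge]


# Hypothetical usage: attach STUDENTS downstream of a selected table.
existing = [("employee", "insert query", "mobile_data")]
updated = add_lineage(existing, "employee", "STUDENTS", "downstream", "Create Table")
print(updated[-1])  # ('employee', 'Create Table', 'STUDENTS')
```

Returning a new list instead of mutating the input keeps the existing lineage intact, in line with adding a node to an existing lineage rather than rewriting it.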
