Trino Integration with OpenLineage

ADOC integrates with Trino using OpenLineage to capture query execution metadata and automatically build pipeline observability. When configured, Trino emits OpenLineage events for every qualifying query. ADOC ingests these events to construct pipelines and pipeline runs, providing visibility into how data is processed and moved across Trino workloads.

How Trino Integration with OpenLineage Works

  • Trino emits OpenLineage events for each qualifying query execution.
  • ADOC ingests these events and extracts query metadata.
  • Each unique query pattern is modeled as a Pipeline.
  • Each execution of that query is modeled as a Pipeline Run.

Example: Consider the following queries executed on a Trino table T1:

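A minimal sketch of such a workload, assuming the Trino CLI is available: the same INSERT statement is executed twice. The server URL and the `staging_t1` source table are hypothetical, and the table names are unqualified, so either pass `--catalog` and `--schema` or use fully qualified names in practice.

```shell
# Placeholder server URL; override TRINO_SERVER for your environment.
TRINO_SERVER="${TRINO_SERVER:-http://localhost:8080}"

# Hypothetical statement: staging_t1 is an assumed source table.
QUERY="INSERT INTO T1 SELECT * FROM staging_t1"

if command -v trino >/dev/null 2>&1; then
  # Execute the same statement twice; each execution is a separate run.
  for run in 1 2; do
    trino --server "$TRINO_SERVER" --execute "$QUERY"
  done
else
  echo "trino CLI not found; run the statement twice from any Trino client"
fi
```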

In ADOC, this results in:

  • 1 Pipeline → INSERT INTO T1
  • 2 Pipeline Runs → one per execution

This keeps pipelines stable while capturing execution-level observability.

Pipeline Stitching at Query Level

Unlike traditional job-based pipelines, Trino pipelines in ADOC are stitched at the query level.

  • The query signature defines the pipeline.
  • Each runtime execution of that query becomes a pipeline run.
  • Inputs and outputs are automatically derived from OpenLineage metadata.

This approach allows ADOC to model highly dynamic, ad-hoc Trino workloads without requiring predefined pipeline definitions.
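
The grouping idea can be sketched in a few lines of shell. This is illustrative only: how ADOC actually derives a query signature is not described here, so the `query_signature` helper and its whitespace-normalize-then-checksum rule are hypothetical.

```shell
# Hypothetical signature: collapse whitespace, then checksum the query text.
query_signature() {
  printf '%s' "$1" | tr -s '[:space:]' ' ' | cksum | cut -d' ' -f1
}

# Three executions: the first two share a query pattern, the third does not.
for q in "INSERT INTO T1 SELECT * FROM S1" \
         "INSERT  INTO T1 SELECT * FROM S1" \
         "INSERT INTO T2 SELECT * FROM S1"; do
  query_signature "$q"
done | sort | uniq -c   # one line per pipeline, prefixed by its run count
```

Under this assumed rule, the first two executions collapse into one signature with two runs, and the third forms a separate signature with one run.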

With this integration, ADOC can:

  • Identify pipelines reading from a table
  • Identify pipelines writing to a table
  • Show upstream and downstream dependencies

Use cases:

  • Impact analysis
  • Data quality investigations
  • Change management

Configuration

Complete the following steps on your Trino server to enable OpenLineage event emission to ADOC.

Step 1: Install the OpenLineage Plugin

The OpenLineage plugin must be present in the Trino plugin directory before the event listener can be configured.

  1. Navigate to the Trino plugin directory.
  2. Download the plugin JAR.
  3. Verify the download.
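
The three steps above might look like the following sketch. The plugin directory path is an assumption, and the download URL is not given in this guide, so it must be supplied through the hypothetical `PLUGIN_JAR_URL` variable.

```shell
# Assumed plugin directory; adjust to your Trino installation.
TRINO_PLUGIN_DIR="${TRINO_PLUGIN_DIR:-/usr/lib/trino/plugin}"

# The download URL is not specified here; the script stops if it is unset.
: "${PLUGIN_JAR_URL:?set PLUGIN_JAR_URL to the OpenLineage plugin download URL}"

# 1. Navigate to the plugin directory.
cd "$TRINO_PLUGIN_DIR" || exit 1

# 2. Download the JAR (-f fails on HTTP errors, -L follows redirects,
#    -O keeps the remote file name).
curl -fLO "$PLUGIN_JAR_URL"

# 3. Verify the download.
ls | grep openlineage
```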

Expected output: openlineage-trino-1.23.0.jar

Step 2: Configure the OpenLineage Event Listener

Create or update the event listener configuration file at /etc/trino/event-listener.properties with the following:

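As a hedged sketch, the file could be generated as shown below. The property names are those documented for Trino's OpenLineage event listener and may differ for the plugin build used here. The `ADOC_URL` variable, the way `TENANT_NAME` enters the endpoint, and the `accessKey`/`secretKey` header names are assumptions, so confirm the exact values against your ADOC environment before installing the file.

```shell
# Placeholders; export real values before running.
TRINO_HOST="${TRINO_HOST:-TRINO_HOST}"
ADOC_URL="${ADOC_URL:-https://ADOC_ENDPOINT_FOR_TENANT_NAME}"   # hypothetical
ADOC_ACCESS_KEY="${ADOC_ACCESS_KEY:-ADOC_ACCESS_KEY}"
ADOC_SECRET_KEY="${ADOC_SECRET_KEY:-ADOC_SECRET_KEY}"

# Write the file locally first so it can be reviewed before installation.
cat > event-listener.properties <<EOF
event-listener.name=openlineage
openlineage-event-listener.trino.uri=http://${TRINO_HOST}:8080
openlineage-event-listener.transport.type=HTTP
openlineage-event-listener.transport.url=${ADOC_URL}
openlineage-event-listener.transport.headers=accessKey:${ADOC_ACCESS_KEY},secretKey:${ADOC_SECRET_KEY}
EOF

# After review: sudo cp event-listener.properties /etc/trino/event-listener.properties
```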

Replace the placeholders with values from your environment:

Placeholder        Description
TRINO_HOST         Hostname or IP address of the Trino server
TENANT_NAME        Your Acceldata tenant name
ADOC_ACCESS_KEY    Acceldata access key
ADOC_SECRET_KEY    Acceldata secret key

Step 3: Configure TLS TrustStore

  1. Download the certificate from the ADOC UI and save it as acceldata-cert.pem.
  2. Place the file in a directory, for example certs.
  3. Generate a truststore using the following command:
  4. Mount the certs directory in the Trino instance at /etc/trino/certs.
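
For the truststore generation in step 3, a sketch using the JDK `keytool` utility could look like this. The alias, the `truststore.p12` file name, the PKCS12 store type, and the password are placeholders rather than values required by ADOC.

```shell
CERT_DIR="${CERT_DIR:-certs}"
# Placeholder password; choose your own (keytool requires at least 6 characters).
TRUSTSTORE_PASSWORD="${TRUSTSTORE_PASSWORD:-changeit}"

# Import the ADOC certificate into a new truststore without prompting.
keytool -importcert -noprompt \
  -alias acceldata \
  -file "$CERT_DIR/acceldata-cert.pem" \
  -keystore "$CERT_DIR/truststore.p12" \
  -storetype PKCS12 \
  -storepass "$TRUSTSTORE_PASSWORD"
```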

Step 4: Restart Trino

Restart the Trino service to apply the configuration changes.

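On a systemd host, the restart might look like the following, assuming the service unit is named `trino` (the unit name is not specified here and may differ in your installation).

```shell
# Assumed unit name; override TRINO_SERVICE if yours differs.
TRINO_SERVICE="${TRINO_SERVICE:-trino}"

sudo systemctl restart "$TRINO_SERVICE"

# Confirm the service came back up.
systemctl status "$TRINO_SERVICE" --no-pager
```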

Note: If Trino is not running as a systemd service, stop and start the server manually by running ./bin/launcher stop followed by ./bin/launcher start from your Trino installation directory.

Step 5: Verify the Integration

After Trino restarts, confirm the OpenLineage listener has loaded correctly.

Check the server log

Run the following command to confirm the event listener is registered:

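A small sketch of this check, assuming the server log lives at /var/log/trino/server.log (the path depends on your installation, and the `listener_registered` helper is hypothetical):

```shell
# Succeeds if the given log file contains the listener registration line.
listener_registered() {
  grep -q "Registered event listener openlineage" "$1" 2>/dev/null
}

# Assumed log location; override TRINO_LOG for your installation.
TRINO_LOG="${TRINO_LOG:-/var/log/trino/server.log}"

if listener_registered "$TRINO_LOG"; then
  echo "OpenLineage listener registered"
else
  echo "Listener registration not found in $TRINO_LOG"
fi
```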

Expected log entry: Registered event listener openlineage

Confirm Trino is listening

Verify that Trino is accepting connections on port 8080:

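One way to run this check is with `ss`, filtering its output to listening TCP sockets on the Trino port. The `listening_on` helper is hypothetical, and the sketch assumes `ss` (from iproute2) and `awk` are installed on the Trino host.

```shell
TRINO_PORT="${TRINO_PORT:-8080}"

# Keep only `ss -ltn` rows in LISTEN state whose local address ends in :PORT.
listening_on() {
  awk -v suffix=":$1" '$1 == "LISTEN" && $4 ~ (suffix "$")'
}

# -l listening sockets, -t TCP only, -n numeric ports.
ss -ltn | listening_on "$TRINO_PORT"
```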

A successful response shows port 8080 in a listening state.

Confirm events are reaching ADOC

Execute a qualifying query on Trino, then navigate to the Pipelines view in ADOC to confirm a pipeline and pipeline run have been created for the query.

What's Next

After completing this configuration, explore the following topics:

  • Pipelines – Understand how ADOC models Trino queries as pipelines and pipeline runs.
  • Lineage Explorer – Trace upstream and downstream data dependencies across Trino workloads.
  • Impact Analysis – Assess the downstream effect of changes to Trino tables or query patterns.
  • Data Reliability – Apply reliability policies to tables written to by Trino pipelines.