Pipeline Information & Management

This guide covers the foundational APIs for discovering, exploring, and managing pipelines in Acceldata ADOC. Think of this as your "getting to know the system" workflow - you'll learn how to find pipelines, understand their structure, and navigate your data observability landscape.

Why This Matters

Before you can monitor, debug, or create pipelines, you need to understand what's already in your system. These APIs help you:

  • Discover existing pipelines across your organization
  • Understand pipeline structure through visual graphs
  • Find the right pipeline when investigating data issues
  • Explore data lineage by viewing how data flows through jobs and assets
  • Organize pipelines using tags and metadata

Real-World Scenarios

Scenario 1: New Team Member Onboarding

"I just joined the data engineering team. What pipelines do we have?"

Solution: Use GET /pipelines/summary to see all pipelines with their current status, then drill into specific ones using GET /pipelines/:identity.

Scenario 2: Investigating a Data Quality Issue

"Our customer dashboard shows outdated data. Which pipeline feeds it?"

Solution: Search for pipelines by name or tag, then use GET /pipelines/:pipelineId/graph to see the data flow and identify the bottleneck.

Scenario 3: Understanding System Architecture

"I need to document our data pipelines for compliance."

Solution: List all pipelines, retrieve their graphs, and export the structure showing inputs, transformations, and outputs.

"We're migrating from Athena to Snowflake. Which pipelines will be affected?"

Solution: Use GET /tags to find all pipelines tagged with "athena" or search by data source in pipeline metadata.

Prerequisites

  • API credentials (accessKey and secretKey) - see the setup sketch after this list
  • Basic understanding of your organization's data infrastructure
  • Access to Acceldata ADOC
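
The request examples in this guide are minimal sketches, not verbatim ADOC commands: they assume the access key and secret key are sent as request headers and use placeholder environment variables for the base URL and credentials. Adjust the host path and authentication scheme to match your deployment.

```bash
# Placeholder values - substitute your ADOC host and API credentials.
# Passing the keys as "accessKey"/"secretKey" headers is an assumption here;
# confirm the exact authentication scheme in your ADOC API documentation.
export ADOC_BASE_URL="https://<your-adoc-host>/api"
export ADOC_ACCESS_KEY="<your-access-key>"
export ADOC_SECRET_KEY="<your-secret-key>"
```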

API Reference

This workflow uses 6 APIs:

  1. GET /pipelines/summary - List all pipelines
  2. GET /pipelines/:identity - Get specific pipeline details
  3. GET /pipelines/:pipelineId/graph - View pipeline structure
  4. GET /tags - List available tags
  5. PUT /pipelines - Create or update pipeline
  6. GET /nodes/:nodeId - Get node details

Workflow: Discover and Explore Pipelines

Step 1: List All Pipelines

Get an overview of all pipelines in your system.

API Call

GET /pipelines/summary

Query Parameters (optional):

  • page: Page number (default: "0")
  • size: Results per page (default: "50")

Example Request

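A sketch of the request with explicit pagination, using the placeholder variables from the Prerequisites note:

```bash
curl -s "$ADOC_BASE_URL/pipelines/summary?page=0&size=50" \
  -H "accessKey: $ADOC_ACCESS_KEY" \
  -H "secretKey: $ADOC_SECRET_KEY"
```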

Response

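The exact response shape depends on your ADOC version; an illustrative payload containing the fields discussed below might look like this (names and values are made up):

```json
{
  "pipelines": [
    {
      "id": 15,
      "uid": "customer-etl-daily",
      "name": "Customer ETL (Daily)",
      "enabled": true,
      "scheduled": true,
      "lastRunStatus": "SUCCESS",
      "totalRunsCount": 342
    }
  ],
  "page": 0,
  "size": 50
}
```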

What to Look For

  • enabled: Is the pipeline currently active?

  • scheduled: Does it run automatically?

  • lastRunStatus: Recent execution status

  • totalRunsCount: How often it's been executed

Tip: Filter by status to find problematic pipelines, or sort by totalRunsCount to identify critical pipelines.

Step 2: Get Detailed Pipeline Information

Once you've identified a pipeline of interest, get its full details.

API Call

GET /pipelines/:identity

Path Parameter:

  • identity: Pipeline ID (numeric like 15) or UID (string like customer-etl-daily)

Example Requests

By numeric ID:

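A sketch using the example ID 15 and the placeholder variables from the Prerequisites note:

```bash
curl -s "$ADOC_BASE_URL/pipelines/15" \
  -H "accessKey: $ADOC_ACCESS_KEY" \
  -H "secretKey: $ADOC_SECRET_KEY"
```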

By string UID:

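Or, using the example UID customer-etl-daily:

```bash
curl -s "$ADOC_BASE_URL/pipelines/customer-etl-daily" \
  -H "accessKey: $ADOC_ACCESS_KEY" \
  -H "secretKey: $ADOC_SECRET_KEY"
```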

Response

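An illustrative response sketching the fields described below (field names and values are made up; your ADOC version may differ):

```json
{
  "id": 15,
  "uid": "customer-etl-daily",
  "name": "Customer ETL (Daily)",
  "owner": "jane.doe@example.com",
  "team": "data-engineering",
  "tags": ["production", "customer-data", "daily"],
  "codeLocation": "https://git.example.com/data/customer-etl",
  "schedulerType": "EXTERNAL",
  "enabled": true,
  "scheduled": true
}
```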

What This Tells You

  • Owner & Team: Who to contact for questions
  • Tags: How it's categorized
  • Code Location: Where to find the implementation
  • Scheduler Type: INTERNAL (ADOC manages) or EXTERNAL (like Airflow)

Step 3: Visualize Pipeline Structure

Understand how data flows through the pipeline.

API Call

GET /pipelines/:pipelineId/graph

Path Parameter:

  • pipelineId: Numeric pipeline ID (e.g., 15)

Example Request

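A sketch of the request for pipeline 15, using the placeholder variables from the Prerequisites note:

```bash
curl -s "$ADOC_BASE_URL/pipelines/15/graph" \
  -H "accessKey: $ADOC_ACCESS_KEY" \
  -H "secretKey: $ADOC_SECRET_KEY"
```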

Response

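An illustrative graph payload using the node and edge types described below (IDs, names, and the exact field layout are made up):

```json
{
  "nodes": [
    { "id": 100, "name": "customers_raw", "type": "ASSET" },
    { "id": 101, "name": "transform_customers", "type": "JOB" },
    { "id": 102, "name": "customers_mart", "type": "ASSET" }
  ],
  "edges": [
    { "from": 100, "to": 101, "type": "INPUT" },
    { "from": 101, "to": 102, "type": "OUTPUT" }
  ]
}
```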

Understanding the Graph

Nodes represent:

  • JOB: Processing steps (extract, transform, load)
  • ASSET: Data sources and destinations (tables, files)

Edges represent:

  • INPUT: Data source → Job
  • FLOW: Job → Job (dependency)
  • OUTPUT: Job → Data destination

Visualization Tip:

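For example, the illustrative graph above reads as a simple left-to-right flow:

```
customers_raw (ASSET) --INPUT--> transform_customers (JOB) --OUTPUT--> customers_mart (ASSET)
```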

Step 4: Explore Pipeline Tags

Find pipelines by category or purpose.

API Call

GET /tags

Parameters: None

Response

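An illustrative response; whether tags come back as plain strings or objects depends on your ADOC version, so treat this shape as an assumption:

```json
{
  "tags": [
    { "name": "production" },
    { "name": "customer-data" },
    { "name": "daily" },
    { "name": "critical" },
    { "name": "etl" }
  ]
}
```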

How to Use Tags

  • Environment: production, staging, dev
  • Data Domain: customer-data, sales-data, inventory
  • Frequency: hourly, daily, weekly, real-time
  • Priority: critical, standard, low-priority
  • Type: etl, streaming, batch

Tip: Use tags to filter pipelines in Step 1 by adding them to your search criteria.

Step 5: Get Detailed Node Information

Drill into specific jobs or assets in the pipeline graph.

API Call

GET /nodes/:nodeId

Path Parameter:

  • nodeId: Numeric node ID from the graph (e.g., 101)

Example Request

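A sketch of the request for node 101 from the example graph, using the placeholder variables from the Prerequisites note:

```bash
curl -s "$ADOC_BASE_URL/nodes/101" \
  -H "accessKey: $ADOC_ACCESS_KEY" \
  -H "secretKey: $ADOC_SECRET_KEY"
```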

Response

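An illustrative node payload covering the kinds of details listed below (field names are assumptions, not the exact ADOC schema):

```json
{
  "id": 101,
  "name": "transform_customers",
  "type": "JOB",
  "owner": "jane.doe@example.com",
  "meta": {
    "expectedRuntimeMinutes": 20,
    "maxRetries": 2
  },
  "inputs": [100],
  "outputs": [102]
}
```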

What This Reveals

  • Job configuration and metadata
  • Performance expectations
  • Ownership and contacts
  • Dependencies and constraints

Common Workflow Patterns

Pattern 1: Pipeline Discovery

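A sketch chaining Steps 1-3, with the ID and UID from the earlier examples standing in for real values:

```bash
# 1. List all pipelines and pick the one you care about
curl -s "$ADOC_BASE_URL/pipelines/summary" \
  -H "accessKey: $ADOC_ACCESS_KEY" -H "secretKey: $ADOC_SECRET_KEY"

# 2. Get its full details by UID
curl -s "$ADOC_BASE_URL/pipelines/customer-etl-daily" \
  -H "accessKey: $ADOC_ACCESS_KEY" -H "secretKey: $ADOC_SECRET_KEY"

# 3. Get its graph by numeric ID
curl -s "$ADOC_BASE_URL/pipelines/15/graph" \
  -H "accessKey: $ADOC_ACCESS_KEY" -H "secretKey: $ADOC_SECRET_KEY"
```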

Pattern 2: Impact Analysis

"If I modify this Athena table, what breaks?"

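One way to approach this, assuming jq is installed and the illustrative graph shape from Step 3; the pipeline IDs and table name below are placeholders:

```bash
# Pull each candidate pipeline's graph and look for the Athena table among its ASSET nodes
for id in 15 16 17; do
  echo "pipeline $id:"
  curl -s "$ADOC_BASE_URL/pipelines/$id/graph" \
    -H "accessKey: $ADOC_ACCESS_KEY" -H "secretKey: $ADOC_SECRET_KEY" \
    | jq '.nodes[]? | select(.type == "ASSET" and (.name | test("customers_raw")))'
done
```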

Pattern 3: Creating Documentation

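A sketch that exports each pipeline's details and graph to JSON files you can feed into your documentation; the IDs are placeholders (see Step 1 for how to list them):

```bash
mkdir -p pipeline-docs
for id in 15 16 17; do
  # Full pipeline details
  curl -s "$ADOC_BASE_URL/pipelines/$id" \
    -H "accessKey: $ADOC_ACCESS_KEY" -H "secretKey: $ADOC_SECRET_KEY" \
    > "pipeline-docs/pipeline-$id.json"
  # Structure: jobs, assets, and how data flows between them
  curl -s "$ADOC_BASE_URL/pipelines/$id/graph" \
    -H "accessKey: $ADOC_ACCESS_KEY" -H "secretKey: $ADOC_SECRET_KEY" \
    > "pipeline-docs/pipeline-$id-graph.json"
done
```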

Quick Reference

| What You Want | API to Use | Key Info |
| --- | --- | --- |
| See all pipelines | GET /pipelines/summary | Overview with status |
| Find specific pipeline | GET /pipelines/:identity | Full details |
| Understand data flow | GET /pipelines/:pipelineId/graph | Visual structure |
| Browse by category | GET /tags | Available tags |
| Inspect a job/asset | GET /nodes/:nodeId | Node details |

Troubleshooting

| Issue | Solution |
| --- | --- |
| Too many results | Use pagination: ?page=0&size=25 |
| Can't find pipeline | Try searching by UID instead of ID |
| Empty graph | Pipeline may not have jobs defined yet |
| Missing metadata | Owner can update via PUT /pipelines |