xStore Unified Data Catalog

What is xStore?

xStore is the unified data catalog and governance layer for the Acceldata xDP platform. It provides a single place to register, organize, and govern all of your catalogs, whether they live in legacy or modern data stacks (Hadoop, cloud object stores, relational databases, Iceberg, data lakes, and so on), and makes those catalogs discoverable and queryable by any connected compute engine, such as Spark or Trino.

Key Concepts

Metalake: The top-level entity, which manages a set of catalogs. You can create multiple metalakes, for example one per business use case.
Catalog: A registered data source within a metalake. A catalog maps directly to a specific storage system or database (for example, a Hive metastore, HDFS, S3, a PostgreSQL database, or an Apache Iceberg lakehouse).
Schema: A logical grouping of tables within a catalog, equivalent to a database schema or namespace. Schemas let you organize related tables and apply governance boundaries at a fine-grained level.
Table: A named collection of structured data within a schema, defined by its columns and their types. xStore tracks table metadata (columns, data types, partition specs, and properties) without storing the underlying data itself.
xStore Iceberg REST Catalog service: An Iceberg REST server that follows the Apache Iceberg REST specification. It enables any Iceberg-native client or compute engine to interact with Iceberg tables using the open REST protocol.
Hive Metastore Service: xStore includes a Hive Metastore service for clients that want to manage their catalogs through xStore itself.
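
The containment relationship between these concepts (a metalake holds catalogs, a catalog holds schemas, a schema holds tables) can be pictured as nested containers. The following is a minimal Python sketch for illustration only. The class names, field names, provider labels, and the dotted catalog.schema.table naming are assumptions made for this example; they are not the xStore API or its storage model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Table:
    name: str
    # Column name -> data type. Only metadata is modeled; no row data is held.
    columns: Dict[str, str] = field(default_factory=dict)
    partition_spec: List[str] = field(default_factory=list)


@dataclass
class Schema:
    name: str
    tables: Dict[str, Table] = field(default_factory=dict)


@dataclass
class Catalog:
    name: str
    provider: str  # illustrative label such as "hive", "postgresql", "iceberg"
    schemas: Dict[str, Schema] = field(default_factory=dict)


@dataclass
class Metalake:
    name: str
    catalogs: Dict[str, Catalog] = field(default_factory=dict)

    def qualified_names(self) -> List[str]:
        """List every table as catalog.schema.table (hypothetical naming)."""
        return [
            f"{c.name}.{s.name}.{t.name}"
            for c in self.catalogs.values()
            for s in c.schemas.values()
            for t in s.tables.values()
        ]


# Build one metalake containing one Iceberg catalog, one schema, one table.
orders = Table("orders", {"id": "bigint", "placed_at": "timestamp"}, ["placed_at"])
sales = Schema("sales", {"orders": orders})
lake = Catalog("lakehouse", "iceberg", {"sales": sales})
finance = Metalake("finance", {"lakehouse": lake})

print(finance.qualified_names())
```

Keeping the table object limited to column and partition metadata mirrors the point made above: the catalog describes the data but does not store it.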

Supported Catalog Providers

Type	Use Case
Relational	PostgreSQL, MySQL, Hive
Object Store	Files on Azure, GCS, S3
Lakehouse	Upsert-capable open table format
Cloud Data Warehouse	Snowflake databases and schemas
Lakehouse	Databricks Unity Catalog federation

Capabilities

Multi-Source Metadata Federation: Register and govern data assets across heterogeneous systems (Iceberg, Hive, PostgreSQL, Snowflake, S3, Kafka, and more) through a single UI and API, without moving the underlying data.
Hive-Compatible for Legacy Workloads: Existing Spark jobs and Hive workloads discover and query xStore-managed tables without modification, through a built-in Hive Metastore-compatible interface.
Open Standards, No Vendor Lock-In: Built for the Apache Iceberg REST Catalog specification. Any Iceberg-native client, such as Spark, Trino, or PyIceberg, can connect using the open REST protocol without proprietary drivers or SDKs.
Automatic Governance Registration: Every catalog registered in xStore is automatically backed by an Apache Ranger service definition. Fine-grained access policies at the catalog, schema, and table level are enforced at query time, from the moment a catalog is created.
Native Iceberg Lakehouse Management: Create and manage Iceberg namespaces and tables directly from the portal, with built-in support for schema evolution, hidden partitioning, time travel, and snapshot history. No external metadata store is required.
One-Click Catalog Sync to Compute: Push catalog configuration to Trino and Spark in one click. No manual reconfiguration is needed; compute engines pick up new catalogs within seconds.
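
To make the open-protocol and Hive-compatibility points concrete, the sketch below builds the Spark properties a client would typically set by hand to reach an Iceberg REST catalog or a Hive Metastore. The property keys are the standard Apache Iceberg and Spark ones. The helper function names, the catalog name "xstore", and both endpoint URLs are placeholders assumed for this example, not documented xStore values; the actual endpoints and any authentication settings depend on your deployment.

```python
from typing import Dict


def iceberg_rest_spark_conf(catalog_name: str, rest_uri: str) -> Dict[str, str]:
    """Spark properties that register an Iceberg REST catalog under catalog_name."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        "spark.sql.extensions": (
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions"
        ),
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        f"{prefix}.uri": rest_uri,
    }


def hive_metastore_spark_conf(thrift_uri: str) -> Dict[str, str]:
    """Spark properties that point existing Hive-style jobs at a metastore."""
    return {
        "spark.sql.catalogImplementation": "hive",
        "spark.hadoop.hive.metastore.uris": thrift_uri,
    }


# Placeholder endpoints (assumptions); substitute the values for your environment.
rest_conf = iceberg_rest_spark_conf("xstore", "https://xstore.example.com/iceberg")
hms_conf = hive_metastore_spark_conf("thrift://xstore.example.com:9083")

# Render the REST catalog settings as spark-submit flags.
for key, value in rest_conf.items():
    print(f"--conf {key}={value}")
```

With properties like these in place, a Spark session would address tables as xstore.<schema>.<table>. The one-click sync described above is meant to remove the need to maintain such settings manually.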
