Apache Hive Catalog

Overview

The Apache Hive catalog connects xStore to an existing Hive Metastore. Once registered, all databases and tables tracked by that Hive Metastore are visible in the Catalog Browser tree and can be synced to an xCompute Compute Cluster for querying with Trino or Spark.

The Hive catalog works with both Hive-on-HDFS and Hive-on-S3 setups. Authentication can be Simple (no Kerberos) or Kerberos-secured for environments that require it.

When to Use

Use the Hive catalog when:

  • You have an existing Hadoop or cloud-based data warehouse managed through a Hive Metastore.
  • Your tables are in ORC, Parquet, Avro, or text formats stored on HDFS or S3, accessed via Hive schemas.
  • You want Trino or Spark on xCompute to read and write Hive-managed tables without reconfiguring each compute cluster manually.

Prerequisites

Before creating a Hive catalog, ensure:

  • You have a running xStore cluster. See (Link Removed).
  • You have a metalake. See (Link Removed).
  • The Hive Metastore Thrift service is running and reachable from the xStore cluster. The default Thrift port is 9083.
  • If your Metastore is Kerberos-secured, you have a valid Kerberos principal and keytab for the service account that xStore will use.
  • If your tables are on HDFS with custom configuration, you have your core-site.xml and hdfs-site.xml files available for upload.

Creating a Hive Catalog

  1. Navigate to Data Catalog → Browse → Catalog in the sidebar.
  2. Select your metalake in the Catalog Browser tree. If it has no catalogs yet, you will see the empty state shown below. Click + New Catalog (top-right) or + Create Catalog in the empty state.
  3. In the Create Catalog form, select Relational as the Catalog Type, then for Provider choose Apache Hive.
  4. Under Basic Information, set the Catalog Name (for example, hive_warehouse) and an optional Description.

The form now shows all the Hive-specific configuration sections:

  5. Under Backend, fill in the connection fields:

| Field | Required | Description |
| --- | --- | --- |
| Metastore URI | Yes | Thrift URI for the Hive Metastore (e.g. thrift://hive-metastore.internal:9083) |
| Hive Service Principal (Kerberos) | Kerberos only | The service principal of the Hive Metastore (e.g. hive/_HOST@EXAMPLE.COM) |
| List All Tables | No | Set to true to include tables created outside standard Hive DDL (e.g. by Spark or Impala) |

  6. Under Additional Properties, upload any Hadoop configuration files your environment requires:

| File | Description |
| --- | --- |
| Core Site XML | Upload your core-site.xml for custom HDFS settings |
| HDFS Site XML | Upload your hdfs-site.xml |
| Hive Site XML | Upload your hive-site.xml for Hive-specific settings |
| Kerberos Config | Upload your krb5.conf if using Kerberos |

Click the upload area or drag and drop each file into the corresponding field.

  7. Under Authentication, select the type that matches your Hive Metastore:
  • Simple: No Kerberos. Use this for non-Kerberos clusters.
  • Kerberos: For secured Hive Metastores. Selecting Kerberos reveals two additional fields:

| Field | Required | Description |
| --- | --- | --- |
| Kerberos Principal | Yes | The principal xStore authenticates as (e.g. xstore@EXAMPLE.COM) |
| Kerberos Keytab | Yes | Upload the .keytab file for the Kerberos principal |

The screenshot below shows a completed Hive catalog form with the Metastore URI, List All Tables, Kerberos Principal, and Keytab all configured:

  8. Click Create Catalog.
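
Before filling in the form, it can help to check which required fields are still missing for the chosen authentication type. The following is a minimal sketch of the required/optional rules from the tables above. The `HiveCatalogForm` class, its attribute names, and `missing_fields` are hypothetical helpers for illustration only; they are not an xStore API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HiveCatalogForm:
    """Hypothetical mirror of the Create Catalog form; not an xStore API."""
    catalog_name: str
    metastore_uri: str
    auth_type: str = "simple"                      # "simple" or "kerberos"
    hive_service_principal: Optional[str] = None   # needed for Kerberos only
    kerberos_principal: Optional[str] = None       # needed for Kerberos only
    kerberos_keytab: Optional[str] = None          # path to the .keytab file
    list_all_tables: bool = False                  # optional
    description: str = ""                          # optional


def missing_fields(form: HiveCatalogForm) -> List[str]:
    """Return the labels of required fields that are still empty."""
    missing = []
    if not form.catalog_name:
        missing.append("Catalog Name")
    if not form.metastore_uri:
        missing.append("Metastore URI")
    if form.auth_type == "kerberos":
        if not form.hive_service_principal:
            missing.append("Hive Service Principal (Kerberos)")
        if not form.kerberos_principal:
            missing.append("Kerberos Principal")
        if not form.kerberos_keytab:
            missing.append("Kerberos Keytab")
    return missing


form = HiveCatalogForm(
    catalog_name="hive_warehouse",
    metastore_uri="thrift://hive-metastore.internal:9083",
    auth_type="kerberos",
    kerberos_principal="xstore@EXAMPLE.COM",
)
print(missing_fields(form))
# ['Hive Service Principal (Kerberos)', 'Kerberos Keytab']
```

With Simple authentication, only the catalog name and Metastore URI are checked, matching the Required column above.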

Browsing the Hive Catalog

After creation, the catalog appears in the Catalogs list with Hive and Relational type badges and an Active status.

To explore its contents, expand the catalog in the Catalog Browser tree:

  1. Click the catalog name to reveal its schemas (Hive databases).
  2. Click a schema to reveal its tables.
  3. Click any table to open the Table Detail Panel on the right, which shows column names, data types, partition columns, and table properties.

If the tree shows no schemas, verify that the Metastore URI is correct and that the xStore cluster can reach the Metastore on port 9083.

Syncing and Querying

Sync the Hive catalog to your xCompute Compute Cluster to make it available in Trino:

  1. Select the Hive catalog in the tree.
  2. In the right-hand panel, click Linked Clusters.
  3. Click Sync next to your Compute Cluster.
  4. Once the sync completes, open the SQL Editor and run a query against the catalog to verify that its schemas and tables are visible.
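
As a rough illustration of the verification step, the sketch below builds two standard Trino statements that list the schemas and tables of a catalog. It assumes the synced catalog is exposed in Trino under the same name given at creation (`hive_warehouse` here) and that the Metastore has a `default` database; both are assumptions, and `verification_queries` is a hypothetical helper. The printed statements can be pasted into the SQL Editor.

```python
def verification_queries(catalog: str, schema: str = "default") -> list:
    """Build Trino statements that confirm a synced catalog is visible.

    Names are inserted as-is, so this sketch only suits simple
    identifiers that need no quoting.
    """
    return [
        f"SHOW SCHEMAS FROM {catalog}",
        f"SHOW TABLES FROM {catalog}.{schema}",
    ]


for statement in verification_queries("hive_warehouse"):
    print(statement + ";")
```

If the first statement returns the Hive databases you expect, the sync has likely succeeded.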

Updating the Hive Catalog

To update connection details after the catalog is created:

  1. In the Catalog Browser tree, hover over the catalog name and click the Edit icon.
  2. Update the Metastore URI, authentication settings, or upload new configuration files.
  3. Click Save.
  4. Trigger a Sync to apply the changes to all linked Compute Clusters.

Common Issues

No schemas appear after catalog creation

  • Confirm that the Metastore Thrift service is running and reachable by running telnet hive-metastore.internal 9083 from inside the xStore cluster's network.
  • Verify that the URI has the form thrift://host:port, not http://host:port or a bare host:port.
  • If you recently created the catalog, wait a few seconds and refresh the tree.
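
The first two checks above can also be scripted where telnet is not installed. This is a minimal sketch using only the Python standard library; `parse_metastore_uri` and `is_reachable` are hypothetical helper names, and the script has to run from a host inside the xStore cluster's network for the result to be meaningful.

```python
import socket
from urllib.parse import urlparse


def parse_metastore_uri(uri: str):
    """Return (host, port) for a thrift://host:port URI, or raise ValueError."""
    parsed = urlparse(uri)
    if parsed.scheme != "thrift":
        raise ValueError(f"expected thrift://host:port, got {uri!r}")
    if not parsed.hostname or parsed.port is None:
        raise ValueError(f"URI must include both host and port: {uri!r}")
    return parsed.hostname, parsed.port


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Attempt a plain TCP connection, similar to `telnet host port`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False


host, port = parse_metastore_uri("thrift://hive-metastore.internal:9083")
print(host, port)
# hive-metastore.internal 9083
```

Calling `is_reachable(host, port)` afterwards only confirms that the TCP port accepts connections. It does not prove that the service behind it is a healthy Hive Metastore.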

Kerberos authentication failure

  • Confirm the keytab file corresponds to the principal specified in Kerberos Principal.
  • Ensure the Hive Service Principal matches the principal configured on the Hive Metastore (typically hive/_HOST@REALM where _HOST resolves to the Metastore hostname).
  • Verify that the krb5.conf file points to the correct KDC server.
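
To see what a principal such as hive/_HOST@REALM resolves to for a given Metastore host, the substitution can be sketched as below. `resolve_service_principal` is a hypothetical helper, and lowercasing the hostname is an assumption about how Hadoop-style services normalize it; compare the result against the principal actually configured on your Metastore.

```python
def resolve_service_principal(principal: str, metastore_host: str) -> str:
    """Expand the _HOST placeholder in a primary/instance@REALM principal."""
    primary_instance, sep, realm = principal.partition("@")
    parts = primary_instance.split("/")
    if len(parts) == 2 and parts[1] == "_HOST":
        # Assumption: the hostname is compared in lowercase form.
        parts[1] = metastore_host.lower()
    return "/".join(parts) + sep + realm


print(resolve_service_principal("hive/_HOST@EXAMPLE.COM", "hive-metastore.internal"))
# hive/hive-metastore.internal@EXAMPLE.COM
```

Principals without a _HOST instance, such as xstore@EXAMPLE.COM, are returned unchanged.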

HDFS permission errors when querying

  • Ensure the Kerberos principal (or the xStore service account for Simple auth) has read permission on the HDFS paths where your Hive table data is stored.
  • If using core-site.xml overrides, confirm the fs.defaultFS value matches your HDFS namenode address.
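
One way to confirm the fs.defaultFS value before uploading is to read it straight from the file. The sketch below parses the standard Hadoop configuration layout (a `configuration` root containing `property` elements with `name` and `value` children). The sample XML and the namenode address in it are made up for illustration, and `hadoop_property` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def hadoop_property(xml_text: str, name: str):
    """Return the value of a named <property> in Hadoop XML, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            return prop.findtext("value", default="").strip()
    return None


# Illustrative content only; substitute the text of your own core-site.xml.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.internal:8020</value>
  </property>
</configuration>"""

print(hadoop_property(SAMPLE_CORE_SITE, "fs.defaultFS"))
# hdfs://namenode.internal:8020
```

To check a real file, read its contents with `open(path).read()` and pass the text to `hadoop_property`.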

Tables are missing from a schema

  • If some tables were created outside of standard Hive DDL (for example, by Spark or Impala), enable List All Tables to include them.