Deploy the Delta-Hive Connector

This page outlines the steps to integrate Delta Lake with Hive 4.0.0. The integration involves deploying a custom Delta Lake connector JAR to Hive and using it to create and manage Delta tables.

Download Prebuilt Connector

Download the Delta Lake Hive assembly connector JAR v3.3.1, matching the Delta client shipped with Spark 3.5.5, from the internal Nexus link (if available) instead of building it from source.
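If the Nexus link is available, the download is a single fetch. The URL below is deliberately a placeholder (<NEXUS_URL>); the real internal Nexus path is not reproduced here:

```shell
# <NEXUS_URL> is a placeholder for the internal Nexus repository path.
curl -fSL -O "<NEXUS_URL>/delta-hive-assembly_2.12-3.3.1.3.3.6.3-1.jar"
```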

Deploy the JAR to Hive

  1. Copy the JAR: Place the delta-hive-assembly_2.12-3.3.1.3.3.6.3-1 JAR into the Hive lib directory.
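The copy step might look like the following sketch; the HIVE_LIB path and the .jar extension on the assembly file are assumptions, so adjust them for your cluster layout:

```shell
# Copy the connector JAR into Hive's classpath.
# HIVE_LIB is an assumed ODP-style path; override it for your installation.
HIVE_LIB="${HIVE_LIB:-/usr/odp/current/hive-server2/lib}"
JAR="delta-hive-assembly_2.12-3.3.1.3.3.6.3-1.jar"
if [ -f "$JAR" ]; then
    cp "$JAR" "$HIVE_LIB/"
fi
```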
  2. Restart Hive: Restart the Hive service (e.g., from Ambari) to apply the changes.

Create a Delta Table in Spark

Create and Populate the Table in Spark
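A sketch of the Spark step, assuming spark-sql is on the PATH; the table name, schema, and location are illustrative, while the two --conf settings are the standard open-source Delta Lake ones:

```shell
# Launch spark-sql with the Delta Lake extensions, then create and populate a table.
# delta_demo and /tmp/delta_demo are illustrative names.
spark-sql \
  --conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" \
  --conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog" \
  -e "CREATE TABLE delta_demo (id INT, name STRING) USING DELTA LOCATION '/tmp/delta_demo';
      INSERT INTO delta_demo VALUES (1, 'alpha'), (2, 'beta');"
```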

Register and Query the Delta Table from Hive

Register the Delta Table in Hive

Open a Hive shell.
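In Hive 4 the usual entry point is Beeline; the host below is a placeholder for your HiveServer2 endpoint:

```shell
# Replace <hiveserver2-host> with your HiveServer2 hostname.
beeline -u "jdbc:hive2://<hiveserver2-host>:10000/default"
```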

Create the external Delta table.
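A minimal sketch of the DDL, assuming a Delta table already exists at /tmp/delta_demo; the table name and location are illustrative, and io.delta.hive.DeltaStorageHandler is the storage handler class used by the open-source Delta Hive connector:

```sql
-- Illustrative names; the Delta table at this location must already exist.
-- io.delta.hive.DeltaStorageHandler is the connector's storage handler class.
CREATE EXTERNAL TABLE delta_demo_hive (id INT, name STRING)
STORED BY 'io.delta.hive.DeltaStorageHandler'
LOCATION '/tmp/delta_demo';
```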

If the table is partitioned, do not add a PARTITIONED BY clause; declare the partition column as a regular column instead. The Delta Hive connector reads the partition information from the Delta transaction log (_delta_log) and handles partitioning transparently. This is by design, to ensure consistency between the Hive table definition and the underlying Delta metadata.
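For example, for a Delta table partitioned by a dt column, the DDL would declare dt as an ordinary column (names are illustrative):

```sql
-- No PARTITIONED BY clause: dt is the Delta partition column, declared as a regular column.
CREATE EXTERNAL TABLE delta_demo_part (id INT, name STRING, dt STRING)
STORED BY 'io.delta.hive.DeltaStorageHandler'
LOCATION '/tmp/delta_demo_part';
```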

Verify Table Creation

After running the CREATE EXTERNAL TABLE command, you should see a success message indicating that the table has been created.

You can also verify from the Hive shell.
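A check along these lines works from the Hive shell (table name illustrative):

```sql
SHOW TABLES LIKE 'delta_demo_hive';
DESCRIBE FORMATTED delta_demo_hive;
SELECT * FROM delta_demo_hive LIMIT 10;
```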

Data Insertion Behavior

Write Data from Hive (Unsupported)

Writing to a Delta table directly through Hive is not supported. You must use Spark for data insertion.

Attempting to insert data directly via Hive results in an error.
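A hypothetical attempt, using an illustrative table name:

```sql
-- This fails: the connector is read-only from Hive.
INSERT INTO delta_demo_hive VALUES (3, 'gamma');
```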

The insert fails with an UnsupportedOperationException, thrown by the connector's DeltaOutputFormat class (see the limitations below).

Limitations and Support Matrix

As part of the Delta Lake integration (ODP-1697), querying Delta tables via Hive is supported using the internal Delta-Hive connector JAR, built specifically for Hive 4 compatibility in ODP 3.3.6.x. However, note the following key limitations:

  • Read-only support: Only SELECT and CREATE EXTERNAL TABLE operations are supported from Hive.
  • No write support: INSERT, UPDATE, DELETE, MERGE, or any DML operations on Delta tables from Hive are not supported, by design. This is enforced by the DeltaOutputFormat class in the connector, which throws UnsupportedOperationException for any write attempt.
  • Only EXTERNAL tables are supported: The Delta table must be pre-created using Spark with proper transaction logs (_delta_log), and then referenced in Hive via CREATE EXTERNAL TABLE with a STORED BY clause that points at the connector's Delta storage handler.
  • The Delta Lake connector from the open-source Delta GitHub repo does not yet support Hive 4, hence we use our internally built version.

Support Matrix: Hive Operations on Delta Tables

Operation                                          | Hive Support
---------------------------------------------------|-------------
CREATE EXTERNAL TABLE (pointing to existing Delta) | Yes
SELECT / READ                                      | Yes
INSERT                                             | No
UPDATE                                             | No
DELETE                                             | No
MERGE                                              | No