Deploy the Delta-Hive Connector
This page outlines the steps to integrate Delta Lake with Hive 4.0.0. The integration involves deploying a custom Delta Lake connector JAR to Hive and using it to create and manage Delta tables.
Download Prebuilt Connector
Download the Delta Lake Hive assembly connector JAR v3.3.1, matching the Delta client present in Spark 3.5.5, using the internal Nexus link (if available) instead of building it from source:
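For example, a hypothetical download command; the Nexus host and repository path below are placeholders for your internal environment:

```bash
# Placeholder URL -- substitute your internal Nexus repository path
wget https://nexus.internal.example/repository/releases/io/delta/delta-hive-assembly_2.12/3.3.1.3.3.6.3-1/delta-hive-assembly_2.12-3.3.1.3.3.6.3-1.jar
```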
Deploy the JAR to Hive
- Copy the JAR: Place the `delta-hive-assembly_2.12-3.3.1.3.3.6.3-1.jar` into the Hive `lib` directory (a quick verification check follows this list). For example:

```bash
cp delta-hive-assembly_2.12-3.3.1.3.3.6.3-1.jar /usr/odp/3.3.6.3-1/hive/lib/
```

- Restart Hive: Restart the Hive service (e.g., from Ambari) to apply the changes.
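Before restarting, a quick sanity check that the JAR landed in the lib directory (path as in the example above):

```bash
# Confirm the connector JAR is present in the Hive lib directory
ls /usr/odp/3.3.6.3-1/hive/lib/ | grep delta-hive
```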
Create a Delta Table in Spark
Create & Insert Table in Spark
```scala
// Using Spark Shell or Scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("Create Delta Table")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

// Needed for toDF on local collections
import spark.implicits._

// Create sample data
val data = Seq(
  (1L, "uuid1", "rider1", "driver1", 10.5, "city1"),
  (2L, "uuid2", "rider2", "driver2", 20.5, "city2")
)
val df = data.toDF("ts", "uuid", "rider", "driver", "fare", "city")

// Write the data as a Delta table on HDFS
df.write.format("delta").save("hdfs:///warehouse/tablespace/external/hive/my_delta_table")
```
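To sanity-check the write before moving to Hive, the table can be read back in the same Spark session (standard Delta read API, same path as above):

```scala
// Read the Delta table back and confirm both rows landed
val check = spark.read.format("delta")
  .load("hdfs:///warehouse/tablespace/external/hive/my_delta_table")
check.show()
```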
Register and Query the Delta Table from Hive
Register the Delta Table in Hive
Open Hive Shell:
```bash
hive
```

Create the External Delta Table:

```sql
CREATE EXTERNAL TABLE deltaTableTest(
  ts BIGINT,
  uuid STRING,
  rider STRING,
  driver STRING,
  fare DOUBLE,
  city STRING
)
STORED BY 'io.delta.hive.DeltaStorageHandler'
LOCATION 'hdfs:///warehouse/tablespace/external/hive/my_delta_table';
```

If the table is partitioned, do not add a PARTITIONED BY clause; declare the partition column as a regular column instead (see the example below). The Delta Hive connector reads the partition information from the Delta transaction log (`_delta_log`) and handles partitioning transparently. This is by design, to ensure consistency between the Hive table definition and the underlying Delta metadata.
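For illustration, a hypothetical partitioned-table registration might look like the following. The table name and location are placeholders, and `city` is assumed to be the Delta partition column (e.g., written from Spark with `partitionBy("city")`):

```sql
-- Hypothetical example: 'city' is the Delta partition column, declared as a
-- regular column; note the absence of a PARTITIONED BY clause.
CREATE EXTERNAL TABLE deltaTablePartitioned(
  ts BIGINT,
  uuid STRING,
  rider STRING,
  driver STRING,
  fare DOUBLE,
  city STRING
)
STORED BY 'io.delta.hive.DeltaStorageHandler'
LOCATION 'hdfs:///warehouse/tablespace/external/hive/my_delta_table_partitioned';
```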
Verify Table Creation
After running the CREATE EXTERNAL TABLE command, you should see a success message indicating that the table has been created.
You can also verify with:
```
0: jdbc:hive2://oraupg03.acceldata.ce:2181,or> DESCRIBE FORMATTED deltaTableTest;
INFO : Compiling command(queryId=hive_20260202023428_624c2a02-26c0-4a19-9e5b-5ea846a5c6c2): DESCRIBE FORMATTED deltaTableTest
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:col_name, type:string, comment:from deserializer), FieldSchema(name:data_type, type:string, comment:from deserializer), FieldSchema(name:comment, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=hive_20260202023428_624c2a02-26c0-4a19-9e5b-5ea846a5c6c2); Time taken: 0.164 seconds
INFO : Operation DESCTABLE obtained 0 locks
INFO : Executing command(queryId=hive_20260202023428_624c2a02-26c0-4a19-9e5b-5ea846a5c6c2): DESCRIBE FORMATTED deltaTableTest
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20260202023428_624c2a02-26c0-4a19-9e5b-5ea846a5c6c2); Time taken: 0.156 seconds
+-------------------------------+----------------------------------------------------+----------+
|           col_name            |                     data_type                      | comment  |
+-------------------------------+----------------------------------------------------+----------+
| ts                            | bigint                                             |          |
| uuid                          | string                                             |          |
| rider                         | string                                             |          |
| driver                        | string                                             |          |
| fare                          | double                                             |          |
| city                          | string                                             |          |
|                               | NULL                                               | NULL     |
| # Detailed Table Information  | NULL                                               | NULL     |
| Database:                     | default                                            | NULL     |
| OwnerType:                    | USER                                               | NULL     |
| Owner:                        | hive                                               | NULL     |
| CreateTime:                   | Mon Feb 02 02:34:16 IST 2026                       | NULL     |
| LastAccessTime:               | UNKNOWN                                            | NULL     |
| Retention:                    | 0                                                  | NULL     |
| Location:                     | hdfs://nameservice1/warehouse/tablespace/external/hive/my_delta_table | NULL |
| Table Type:                   | EXTERNAL_TABLE                                     | NULL     |
| Table Parameters:             | NULL                                               | NULL     |
|                               | EXTERNAL                                           | TRUE     |
|                               | bucketing_version                                  | 2        |
|                               | numFiles                                           | 2        |
|                               | spark.sql.sources.provider                         | DELTA    |
|                               | storage_handler                                    | io.delta.hive.DeltaStorageHandler |
|                               | totalSize                                          | 3662     |
|                               | transient_lastDdlTime                              | 1769979856 |
|                               | NULL                                               | NULL     |
| # Storage Information         | NULL                                               | NULL     |
| SerDe Library:                | org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe | NULL |
| InputFormat:                  | null                                               | NULL     |
| OutputFormat:                 | null                                               | NULL     |
| Compressed:                   | No                                                 | NULL     |
| Num Buckets:                  | -1                                                 | NULL     |
| Bucket Columns:               | []                                                 | NULL     |
| Sort Columns:                 | []                                                 | NULL     |
| Storage Desc Params:          | NULL                                               | NULL     |
|                               | path                                               | hdfs://nameservice1/warehouse/tablespace/external/hive/my_delta_table |
|                               | serialization.format                               | 1        |
+-------------------------------+----------------------------------------------------+----------+
36 rows selected (0.421 seconds)

-- To read data from Hive, set hive.tez.input.format
0: jdbc:hive2://oraupg03.acceldata.ce:2181,or> SET hive.tez.input.format=io.delta.hive.HiveInputFormat;
No rows affected (0.012 seconds)
0: jdbc:hive2://oraupg03.acceldata.ce:2181,or> SELECT * FROM deltaTableTest LIMIT 10;
INFO : Compiling command(queryId=hive_20260202023459_d89b9135-9e0e-4e1d-941b-bf2f4a0fdac1): SELECT * FROM deltaTableTest LIMIT 10
INFO : No Stats for default@deltatabletest, Columns: fare, driver, city, uuid, rider, ts
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:deltatabletest.ts, type:bigint, comment:null), FieldSchema(name:deltatabletest.uuid, type:string, comment:null), FieldSchema(name:deltatabletest.rider, type:string, comment:null), FieldSchema(name:deltatabletest.driver, type:string, comment:null), FieldSchema(name:deltatabletest.fare, type:double, comment:null), FieldSchema(name:deltatabletest.city, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20260202023459_d89b9135-9e0e-4e1d-941b-bf2f4a0fdac1); Time taken: 0.159 seconds
INFO : Operation QUERY obtained 0 locks
INFO : Executing command(queryId=hive_20260202023459_d89b9135-9e0e-4e1d-941b-bf2f4a0fdac1): SELECT * FROM deltaTableTest LIMIT 10
INFO : Completed executing command(queryId=hive_20260202023459_d89b9135-9e0e-4e1d-941b-bf2f4a0fdac1); Time taken: 0.004 seconds
+--------------------+----------------------+-----------------------+------------------------+----------------------+----------------------+
| deltatabletest.ts  | deltatabletest.uuid  | deltatabletest.rider  | deltatabletest.driver  | deltatabletest.fare  | deltatabletest.city  |
+--------------------+----------------------+-----------------------+------------------------+----------------------+----------------------+
| 2                  | uuid2                | rider2                | driver2                | 20.5                 | city2                |
| 1                  | uuid1                | rider1                | driver1                | 10.5                 | city1                |
+--------------------+----------------------+-----------------------+------------------------+----------------------+----------------------+
```

Data Insertion Behavior
Write Data from Hive (Unsupported Test)
Writing to a Delta table directly through Hive is not supported. You must use Spark for data insertion.
Attempting to insert data directly via Hive will result in an error:
```sql
INSERT INTO deltaTableTest VALUES
  (1693400000000, 'uuid1', 'rider1', 'driver1', 15.5, 'city1'),
  (1693400001000, 'uuid2', 'rider2', 'driver2', 20.0, 'city2');
```

Error Message:

```
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Writing to a Delta table in Hive is not supported. Please use Spark to write.
INFO : Completed executing command(queryId=hive_20260202023700_6d89c9ea-ea02-463b-be16-e7d8164a9328); Time taken: 0.152 seconds
Error: Error while compiling statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Writing to a Delta table in Hive is not supported. Please use Spark to write. (state=08S01,code=1)
```
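The supported path for the same insert is a Spark append; a minimal sketch, reusing the Spark session and table path from the earlier example:

```scala
// Append the rows via Spark instead of Hive
val newRows = Seq(
  (1693400000000L, "uuid1", "rider1", "driver1", 15.5, "city1"),
  (1693400001000L, "uuid2", "rider2", "driver2", 20.0, "city2")
).toDF("ts", "uuid", "rider", "driver", "fare", "city")

newRows.write.format("delta")
  .mode("append")
  .save("hdfs:///warehouse/tablespace/external/hive/my_delta_table")
```

Subsequent Hive `SELECT` queries should pick up the new rows, since the connector resolves the table's files from the Delta transaction log at query time.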
Limitations and Support Matrix

As part of the Delta Lake integration (ODP-1697), querying Delta tables from Hive is supported through the internal Delta–Hive connector JAR, built specifically for Hive 4 compatibility in ODP 3.3.6.x. However, note the following key limitations:
- Read-only support: Only `SELECT` and `CREATE EXTERNAL TABLE` operations are supported from Hive.
- No write support: `INSERT`, `UPDATE`, `DELETE`, `MERGE`, and all other DML operations on Delta tables from Hive are not supported, by design. This is enforced by the `DeltaOutputFormat` class in the connector, which throws an `UnsupportedOperationException` for any write attempt (see the sketch after this list).
- Only EXTERNAL tables are supported: The Delta table must be pre-created using Spark with proper transaction logs (`_delta_log`), and then referenced in Hive using:

```sql
STORED BY 'io.delta.hive.DeltaStorageHandler'
```

- The Delta Lake connector from the open-source Delta GitHub repo does not yet support Hive 4, hence we use our internally built version.
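As a mental model for the write-rejection behavior described above, the connector's output format never hands Hive a usable record writer. An illustrative sketch only, not the connector's actual source code:

```scala
// Illustrative sketch only -- not the connector's actual source code.
// Any attempt by Hive to obtain a record writer for a Delta table fails,
// which surfaces as the TezTask execution error shown earlier.
class WriteRejectingOutputFormat {
  def getRecordWriter(): Nothing =
    throw new UnsupportedOperationException(
      "Writing to a Delta table in Hive is not supported. Please use Spark to write.")
}
```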
Support Matrix for Hive Operations on Delta Tables
| Operation | Hive Support |
|---|---|
| CREATE EXTERNAL TABLE (pointing to an existing Delta table) | Yes |
| SELECT / READ | Yes |
| INSERT | No |
| UPDATE | No |
| DELETE | No |
| MERGE | No |