Configure CDP Hive and Tez

This page describes how to configure Hive and Tez in CDP clusters so that Pulse can collect query statistics, performance metrics, and lineage information.

Configure CDP Hive for Pulse

View Hive Table Details in Pulse

To display Hive table details with data in the Pulse UI:

  1. Enable automatic statistics gathering. In the hive-site.xml file, set the statistics auto-gather properties to true. This allows Hive to compute table statistics automatically.
  2. Compute statistics manually (optional). You can also run a Hive ANALYZE TABLE command to compute table statistics manually.
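
As a minimal sketch of both steps: the property names below are the standard Hive settings hive.stats.autogather and hive.stats.column.autogather, which are assumed to be the ones intended here. The table name and the JDBC URL in the comment are hypothetical.

```shell
# Sketch only: property names are assumed, table name and JDBC URL are hypothetical.

# Print the hive-site.xml entries that turn on automatic statistics gathering.
stats_properties() {
  local name
  for name in hive.stats.autogather hive.stats.column.autogather; do
    printf '<property>\n  <name>%s</name>\n  <value>true</value>\n</property>\n' "$name"
  done
}

# Build the HiveQL for the manual step: table-level, then column-level statistics.
analyze_sql() {
  local table="$1"
  printf 'ANALYZE TABLE %s COMPUTE STATISTICS;\n' "$table"
  printf 'ANALYZE TABLE %s COMPUTE STATISTICS FOR COLUMNS;\n' "$table"
}

stats_properties
analyze_sql "sales_db.orders"

# To run the statements, pass them to Beeline, for example:
#   beeline -u "jdbc:hive2://hs2-host.example.com:10000/default" -e "$(analyze_sql sales_db.orders)"
```

For a partitioned table, ANALYZE TABLE also accepts a PARTITION clause, so statistics can be refreshed for one partition at a time.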

Enable JMX for Hive Metastore and HiveServer2

For Hive Metastore Server:

  1. Navigate to Hive > Java Configuration Options.
  2. Update the property with the following values.

For HiveServer2:

  1. Navigate to Hive on Tez > Java Configuration Options.
  2. Update the property with the following values.
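
The option values for the two services are not reproduced here. As a sketch, the standard JVM remote-JMX flags could be generated as follows. The ports 8008 and 8009 are placeholders, and turning off authentication and SSL is an assumption that only suits a trusted network.

```shell
# Build the standard JVM remote-JMX flags for one service.
# Usage: jmx_opts <port>
jmx_opts() {
  local port="$1" opts
  opts="-Dcom.sun.management.jmxremote"
  opts="$opts -Dcom.sun.management.jmxremote.port=${port}"
  opts="$opts -Dcom.sun.management.jmxremote.authenticate=false"
  opts="$opts -Dcom.sun.management.jmxremote.ssl=false"
  printf '%s\n' "$opts"
}

# Placeholder ports: each service on the same host needs its own free port.
echo "Hive Metastore Server: $(jmx_opts 8008)"
echo "HiveServer2:           $(jmx_opts 8009)"
```

Append the generated flags to whatever the Java Configuration Options field already contains instead of replacing the existing value.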

Place Hive Hook JARs

  • Get the Hive hook JARs from the Acceldata team (refer to the version mapping table).
  • Place the JARs on all edge and HiveServer2 nodes in the local path:

Hook Version Mapping

Distro Version   Hive Version   Tez Version   Pulse Hook Jar Name
CDP              3.1.3          0.9.1         ad-hive-hook_cdp_3.1.3-assembly-2.0.0.jar
  • Ensure the hook directory is readable and executable by all users.
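
The placement and permission steps can be sketched as below. The hook directory /opt/acceldata in the usage comment is a placeholder, since the actual path is not given here, and install_hook_jar is a hypothetical helper name.

```shell
# Copy a hook JAR into the hook directory and open up read/execute access.
# Usage: install_hook_jar <jar-file> <hook-dir>
install_hook_jar() {
  local jar="$1" hook_dir="$2"
  mkdir -p "$hook_dir"
  cp "$jar" "$hook_dir/"
  chmod 755 "$hook_dir"                      # directory: readable and executable by all users
  chmod 644 "$hook_dir/$(basename "$jar")"   # JAR: readable by all users
}

# Example, run on every edge and HiveServer2 node as a user who can write to the target:
#   install_hook_jar ./<hook-jar>.jar /opt/acceldata
```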

Update Hive Environment with Hook JAR

In Cloudera Manager (CM), search for:

  • Hive Service Environment Advanced Configuration Snippet (Safety Valve)
  • Hive on Tez Service Environment Advanced Configuration Snippet (Safety Valve)

Add the following property:

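
The property itself is not reproduced here. One way to expose an extra JAR to Hive through the service environment is the AUX_CLASSPATH variable. Assuming that is the variable intended, and using a placeholder hook directory, the entry could be built like this (aux_classpath_entry is a hypothetical helper):

```shell
# Build an AUX_CLASSPATH entry listing every JAR in the hook directory.
# AUX_CLASSPATH is assumed to be the intended variable; the directory is a placeholder.
aux_classpath_entry() {
  local hook_dir="$1" jar joined=""
  for jar in "$hook_dir"/*.jar; do
    [ -e "$jar" ] || continue          # skip the unexpanded glob when no JAR exists
    joined="${joined:+$joined:}$jar"   # join paths with ':'
  done
  printf 'AUX_CLASSPATH=%s\n' "$joined"
}

# Example:
#   aux_classpath_entry /opt/acceldata
```

Paste the printed line into both safety valves listed above so that the Hive and Hive on Tez services see the same JAR.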

Update Hive Site Properties

In CM, search for:

  • Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml
  • Hive on Tez Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml

Switch to XML view and add the following properties:

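
The properties are not reproduced here. Hive registers query hooks through hive.exec.pre.hooks, hive.exec.post.hooks, and hive.exec.failure.hooks. Assuming the Pulse hook is wired in through these, a sketch of the XML to paste into the safety valve follows. HOOK_CLASS is a placeholder for the class name supplied by Acceldata, and any Pulse-specific connection properties are left out because they are not given here.

```shell
# Placeholder: replace with the hook class name provided by Acceldata.
HOOK_CLASS="${HOOK_CLASS:-REPLACE_WITH_PULSE_HOOK_CLASS}"

# Print one <property> element for hive-site.xml.
xml_property() {
  printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

# Register the same class as pre-, post-, and failure-execution hook.
hive_hook_properties() {
  local name
  for name in hive.exec.pre.hooks hive.exec.post.hooks hive.exec.failure.hooks; do
    xml_property "$name" "$HOOK_CLASS"
  done
}

hive_hook_properties
```

These three properties hold comma-separated lists. If the cluster already sets any of them, append the hook class to the existing value instead of replacing it.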

Restart Hive Services

  • Restart affected Hive components.
  • Deploy the updated client configuration.

Configure CDP Tez for Pulse

Place Tez Hook JARs

  1. Obtain the Hive hook JARs from the Acceldata team (refer to the version mapping table).
  2. Log in to any HDFS client node.

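Note: Do not use the "Upload Tez tar file to HDFS" action available under Tez. It re-uploads the stock tarball and would overwrite the one that contains the Pulse hook JAR.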

Update Tez Tarball

Update the Tez tarball with the Pulse hook JAR.

During upgrades or hook updates, ensure the Tez tarball contains only the latest Acceldata hook JAR. Remove any older hook files to avoid conflicts.

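
The commands are not reproduced here. A minimal sketch of the local repackaging step follows, assuming the tarball has already been fetched from HDFS. The file name pattern for old hook JARs, the location of the JAR inside the tarball, and the HDFS path in the comments are all assumptions, and repack_tez_tarball is a hypothetical helper.

```shell
# Rebuild a Tez tarball so that it contains exactly one Pulse hook JAR.
# Usage: repack_tez_tarball <tez.tar.gz> <hook-jar>   (absolute paths recommended)
repack_tez_tarball() {
  local tarball="$1" hook_jar="$2" work
  work="$(mktemp -d)"
  tar -xzf "$tarball" -C "$work"
  # Remove older hook JARs anywhere in the tree (file name pattern is an assumption).
  find "$work" -type f -name 'ad-*-hook*.jar' -delete
  # Place the new JAR at the tarball root (location is an assumption).
  cp "$hook_jar" "$work/"
  tar -czf "$tarball" -C "$work" .
  rm -rf "$work"
}

# Surrounding HDFS steps, with a placeholder path:
#   hdfs dfs -get /user/tez/<version>/tez.tar.gz .
#   repack_tez_tarball "$PWD/tez.tar.gz" "$PWD/<hook-jar>.jar"
#   hdfs dfs -put -f tez.tar.gz /user/tez/<version>/tez.tar.gz
```

Deleting the old JARs before copying the new one keeps the tarball to a single hook version, which is the conflict the upgrade note above warns about.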

Update Tez Site Properties

  1. In CM, search for: Tez Client Advanced Configuration Snippet (Safety Valve) for tez-conf/tez-site.xml
  2. Switch to XML view and add the required Pulse properties.

Result

The CDP cluster services are configured with Pulse hook JARs and JMX properties. Pulse can collect metrics, capture query data, and integrate with Hive and Tez.
