Cluster Configuration Changes
Several Pulse connectors and services require cluster configuration changes. The maintenance restarts needed to apply these changes can be performed before, during, or after Pulse installation.
To enable SSL and Basic Authentication on the remote JMX port, ensure that the jmxremote.password, jmxremote.access, truststore.jks, and keystore.jks files are already in place in their respective directories.
Common Changes
HDFS
For HDFS, if the Pulse user is not permitted to query the NameNode API from the Pulse server, the following options are available:
- Under HDFS configurations, add the property dfs.cluster.administrators in advanced or custom hdfs-site.xml, with a value such as the Pulse Kerberos username (see the sketch after this list).
- Alternatively, provide a NameNode service keytab to the Pulse server.
- Restart all affected components and deploy the new client configuration.
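A minimal hdfs-site.xml sketch of the first option; the pulse username below is an assumption and should match your actual Pulse Kerberos principal:
<property>
  <name>dfs.cluster.administrators</name>
  <!-- Pulse Kerberos user; "pulse" is illustrative -->
  <value>pulse</value>
</property>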
MapReduce
Configure ODP as described below to display MapReduce jobs in YARN > Application Explorer.
Add the HDFS user to the properties listed below in Ambari > MapReduce configuration.
- mapreduce.cluster.administrators
- mapreduce.cluster.acls.enabled (enabled by default)
- mapreduce.job.acl-modify-job
- mapreduce.job.acl-view-job
Add the HDFS user to the property listed below in Ambari > YARN configuration (see the example after this list).
- yarn.admin.acl
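For illustration, the resulting values might look like the following; the pre-existing entries on your cluster will differ, so append the HDFS user rather than replacing them:
mapreduce.cluster.administrators=hadoop,hdfs
mapreduce.cluster.acls.enabled=true
mapreduce.job.acl-modify-job=hdfs
mapreduce.job.acl-view-job=hdfs
yarn.admin.acl=yarn,hdfs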
After completing the configuration, you need to restart the ad-connector service on Pulse Master.
Access Privileges
For services managed by Ranger or another authorization mechanism that requires permission privileges for a non-HDFS user, follow these steps:
For a non-HDFS user, create a policy that grants read and execute permissions on the following HDFS paths:
Spark v1, v2, and v3 log directories. The following are the default locations; verify them on the respective cluster:
- HDP 2.x - /spark2-history
- CDH, CDP - /user/spark/applicationHistory, /user/spark/spark2ApplicationHistory
Hive query paths. The following are the default locations; verify them on the respective cluster:
- HDP 2.x, CDH 5.x or 6.x - /tmp/ad
- HDP 3.x - /warehouse/tablespace/external/hive/sys.db/dag_data, /warehouse/tablespace/external/hive/sys.db/query_data
- CDP 7.x - /warehouse/tablespace/managed/hive/sys.db/dag_data, /warehouse/tablespace/managed/hive/sys.db/query_data
If the HDFS user, or any other user that does not have privileges to read metadata from all Kafka topics, is used to connect to the Kafka service:
- Add the user to the default access policy with Describe permissions under all topics (see the sketch below).
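On clusters where Kafka ACLs are managed from the CLI rather than Ranger, an equivalent Describe grant might look like the following; the broker address and the hdfs principal are assumptions:
./kafka-acls.sh --bootstrap-server <broker ip> --command-config client-kerb.prop --add --allow-principal User:hdfs --operation Describe --topic '*'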
Add SELECT privileges in Ranger on all databases, tables, and columns using the following steps:
- Log in to the Ranger UI.
- Navigate to Hadoop SQL and click it. The list of Hadoop SQL policies appears.
- On the List of Policies: Hadoop SQL page, click the edit button under Action for the all - database, table, column policy.
- On the Edit Policy page, add the SELECT privileges to the non-HDFS user.
Grant MySQL Permissions
To enable Pulse to collect Hive and Oozie metadata stored in MySQL, you must grant the required permissions.
- Log in to MySQL as root or an administrative user:
mysql -u root -p
- Create the users (if they do not already exist):
CREATE USER 'hive'@'pulse_host' IDENTIFIED BY 'password';
CREATE USER 'oozie'@'pulse_host' IDENTIFIED BY 'password';
- Grant read-only (SELECT) privileges (replace placeholders with actual values):
GRANT SELECT ON hive_database.* TO 'hive_user'@'pulse_host' IDENTIFIED BY 'password';
GRANT SELECT ON oozie_database.* TO 'oozie_user'@'pulse_host' IDENTIFIED BY 'password';
The exact commands vary with the MySQL version; MySQL 8.x, for example, no longer accepts IDENTIFIED BY inside GRANT, so create the user first and then grant without it.
- hive_database / oozie_database: Names of the Hive and Oozie metadata databases.
- hive_user / oozie_user: Usernames Pulse uses to access these databases.
- pulse_host: Hostname or IP address of the Pulse server. Use % to allow access from any host.
- password: Password assigned to the database user.
- Apply the changes:
FLUSH PRIVILEGES;
Example:
GRANT SELECT ON hive.* TO 'hivepulse'@'192.168.1.10' IDENTIFIED BY 'Hive@123';
GRANT SELECT ON oozie.* TO 'ooziepulse'@'192.168.1.10' IDENTIFIED BY 'Oozie@123';
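To confirm the grants took effect, you can optionally inspect them from the MySQL prompt; the user and host below are the ones from the example above:
SHOW GRANTS FOR 'hivepulse'@'192.168.1.10';
SHOW GRANTS FOR 'ooziepulse'@'192.168.1.10';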
HDP 2.x & 3.x
Kafka
- Log in to the Ambari Admin Web UI.
- Navigate to: Kafka > Configs > Advanced kafka-env > kafka-env.
- Go to the end of the file and add the following line:
export JMX_PORT=${JMX_PORT:-9999}
- To enable Basic Authentication on the JMX remote port, use the following parameters:
export KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file>"
Replace the values in <> with appropriate values.
- To enable TLS/SSL on the JMX remote port, use the following parameters:
export KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file> -Dcom.sun.management.jmxremote.ssl=true -Dcom.sun.management.jmxremote.registry.ssl=true -Djavax.net.ssl.keyStore=</path/to/keystore.jks/file> -Djavax.net.ssl.keyStorePassword=<Keystore Password> -Djavax.net.ssl.trustStore=</path/to/truststore.jks/file> -Djavax.net.ssl.trustStorePassword=<Truststore Password>"
Replace the values in <> with appropriate values.
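For reference, the JDK access and password files referenced above use a simple two-column format. A minimal sketch, where pulseuser is an illustrative name:
# jmxremote.access (user and permission)
pulseuser readonly
# jmxremote.password (user and password)
pulseuser <password>
The password file must be readable only by the service user (for example, chmod 600), or the JVM will refuse to start.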
Kafka 3
- Log in to the Ambari Admin Web UI.
- Navigate to: Kafka > Configs > Advanced kafka3-env > kafka3-env.
- Go to the end of the file and add the following line:
For Kafka 3 with Zookeeper:
export JMX_PORT=${JMX_PORT:-8987}
For Kafka 3 with KRaft:
export JMX_PORT=${JMX_PORT:-8988}
- To enable Basic Authentication on the JMX remote port, use the following parameters:
export KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file>"
Replace the values in <> with appropriate values.
- To enable TLS/SSL on the JMX remote port, use the following parameters:
export KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file> -Dcom.sun.management.jmxremote.ssl=true -Dcom.sun.management.jmxremote.registry.ssl=true -Djavax.net.ssl.keyStore=</path/to/keystore.jks/file> -Djavax.net.ssl.keyStorePassword=<Keystore Password> -Djavax.net.ssl.trustStore=</path/to/truststore.jks/file> -Djavax.net.ssl.trustStorePassword=<Truststore Password>"
Replace the values in <> with appropriate values.
Set ACLs
To set ACLs, run the following commands.
./kafka-acls.sh --bootstrap-server <broker ip> --command-config client-kerb.prop --add --allow-principal User:hdfs --allow-host '*' --operation All --topic '*'
./kafka-acls.sh --bootstrap-server <broker ip> --command-config client-kerb.prop --add --allow-principal User:hdfs --allow-host '*' --operation All --group '*'
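The client-kerb.prop file passed via --command-config is assumed to carry the Kerberos client settings; a minimal sketch for a Kerberized cluster:
security.protocol=SASL_PLAINTEXT
sasl.kerberos.service.name=kafka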
Zookeeper
- Log in to the Ambari Admin Web UI.
- Navigate to: Zookeeper > Configs > Advanced zookeeper-env > zookeeper-env template.
- Go to the end of the file and add the following line:
Before adding any of the following lines, make sure the JMXDISABLE environment variable is set first.
export JMXDISABLE="true"
export SERVER_JVMFLAGS="$SERVER_JVMFLAGS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=8989 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dzookeeper.jmx.log4j.disable=true"
- To enable Basic Authentication on the JMX remote port, use the following parameters:
export SERVER_JVMFLAGS="$SERVER_JVMFLAGS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=8989 -Dzookeeper.jmx.log4j.disable=true -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file>"
Replace the values in <> with appropriate values.
- To enable SSL on the JMX remote port, use the following parameters:
export SERVER_JVMFLAGS="$SERVER_JVMFLAGS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=8989 -Dzookeeper.jmx.log4j.disable=true -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file> -Dcom.sun.management.jmxremote.ssl=true -Dcom.sun.management.jmxremote.registry.ssl=true -Djavax.net.ssl.keyStore=</path/to/keystore.jks/file> -Djavax.net.ssl.keyStorePassword=<Keystore Password> -Djavax.net.ssl.trustStore=</path/to/truststore.jks/file> -Djavax.net.ssl.trustStorePassword=<Truststore Password>"
Replace the values in <> with appropriate values.
Spark
- Log in to the Ambari Admin Web UI.
- Navigate to: Spark > Configs > Advanced spark2-metrics-properties.
- Go to the end of the file and add the following lines:
# Graphite sink class
*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
# Location of your Graphite instance
*.sink.graphite.host=localhost
*.sink.graphite.port=12003
*.sink.graphite.protocol=tcp
*.sink.graphite.prefix=spark.metrics
*.sink.graphite.period=20
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource
This change requires the Pulse node agent to be running on all Spark clients.
Additional Note:
For edge nodes where Spark clients are not managed by Ambari, append the above properties to the file /etc/spark2/conf/metrics.properties.
Make sure the following properties are enabled for any Spark job (spark-defaults.conf):
spark.eventLog.enabled=true
spark.eventLog.dir=hdfs:///spark2-history/
Update the same properties in the managed configurations for applications running on Spark 1.x and 3.x.
Hive
Hive Server 2
- Log in to the Ambari Admin Web UI.
- Navigate to: Hive > Configs > Advanced hive-env > hive-env template.
- Go to the end of the file and add the following lines:
Avoid JMX changes for Hive 1.x using the MR engine, as it has a bug that causes query failures when JMX is enabled.
if [ "$SERVICE" = "hiveserver2" ]; thenexport HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=8008"fi- To enable Basic Authentication in the JMX Remote Port, use the following parameters:
if [ "$SERVICE" = "hiveserver2" ]; thenexport HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file> -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=8008"fiChange the values of <> with appropriate values.
- To enable TLS/SSL on the JMX Remote Port, use the following parameters:
if [ "$SERVICE" = "hiveserver2" ]; thenexport HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file> -Dcom.sun.management.jmxremote.ssl=true -Dcom.sun.management.jmxremote.registry.ssl=true -Djavax.net.ssl.keyStore=</path/to/keystore.jks/file> -Djavax.net.ssl.keyStorePassword=<Keystore Password> -Djavax.net.ssl.trustStore=</path/to/truststore.jks/file> -Djavax.net.ssl.trustStorePassword=<Truststore Password> -Dcom.sun.management.jmxremote.port=8008"fiChange the values of <> with appropriate values.
Hive Meta Store
- Log in to the Ambari Admin Web UI.
- Navigate to: Hive > Configs > Advanced hive-env > hive-env template.
- Go to the end of the file and add the following lines:
if [ "$SERVICE" = "metastore" ]; thenexport HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=8009"fi- To enable Basic Authentication in the JMX Remote Port, use the following parameters:
if [ "$SERVICE" = "metastore" ]; thenexport HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file> -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=8008"fiChange the values of <> with appropriate values.
- To enable TLS/SSL on the JMX Remote Port, use the following parameters:
if [ "$SERVICE" = "metastore" ]; thenexport HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file> -Dcom.sun.management.jmxremote.ssl=true -Dcom.sun.management.jmxremote.registry.ssl=true -Djavax.net.ssl.keyStore=</path/to/keystore.jks/file> -Djavax.net.ssl.keyStorePassword=<Keystore Password> -Djavax.net.ssl.trustStore=</path/to/truststore.jks/file> -Djavax.net.ssl.trustStorePassword=<Truststore Password> -Dcom.sun.management.jmxremote.port=8008"fiChange the values of <> with appropriate values.
Place Hook Jars
| Distro Version | Hive Version | Tez Version | Pulse Hook Jar Name |
|---|---|---|---|
| HDP 2.x | 1.2.x | 0.7.x | ad-hive-hook_hdp_1.2.x-assembly-1.2.3.jar |
| HDP 2.x | 2.1.x (LLAP) | 0.7.x | ad-hive-hook_hdp_2.1.x-assembly-1.2.3.jar |
| HDP 3.1.0.x | 3.1.x | 0.9.x | ad-hive-hook_hdp_3.1.0.3.1.0.0-78-assembly-1.2.3.jar |
| HDP 3.1.4.x | 3.1.x | 0.9.x | ad-hive-hook_hdp_3.1.0.3.1.4.0-315-assembly-1.2.3.jar |
For the above Hive versions, Pulse uses Hive hooks to capture query statistics, which requires the following configuration changes:
- Get the hive-hook jars (shared by the Acceldata team) as listed in the table above.
- Place the provided hook jars on all edge, HiveServer2, and Hive interactive nodes under the local path /opt/acceldata.
- The hook directory must be readable and executable by all users (see the sketch after this list).
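A minimal sketch of staging a jar on one node, using the HDP 2.x jar name from the table above as an example:
mkdir -p /opt/acceldata
cp ad-hive-hook_hdp_1.2.x-assembly-1.2.3.jar /opt/acceldata/
# make the directory and jar readable and executable by all users
chmod -R 755 /opt/acceldata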
- Log in to the Ambari Admin Web UI. Navigate to: Hive > Configs > Advanced hive-env, go to the end of the file, and add the following lines:
Change the hook jar name in the properties below according to the installed HDP distro version.
export AUX_CLASSPATH=/opt/acceldata/ad-hive-hook_hdp_1.2.x-assembly-1.2.3.jar
- Navigate to: Hive > Configs > Advanced hive-interactive-env, go to the end of the file, and add the following lines:
export AUX_CLASSPATH=/opt/acceldata/ad-hive-hook_hdp_2.1.x-assembly-1.2.3.jar
- Navigate to: Hive > Configs > Custom hive-site and Custom hive-interactive-site and add the following new property values:
- ad.events.streaming.servers=(<Pulse IP>:19009)
- ad.cluster=(cluster name as specified in Pulse installation)
- Navigate to: Hive > Configs > General and append io.acceldata.hive.AdHiveHook, comma-separated if needed, to the following properties (see the example after this list):
- hive.exec.failure.hooks
- hive.exec.pre.hooks
- hive.exec.post.hooks
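On HDP these properties often already carry the ATSHook; after appending, a value might look like the following (existing entries on your cluster may differ):
hive.exec.post.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook,io.acceldata.hive.AdHiveHook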
Tez
- Get the same hive-hook jars (shared by the Acceldata team) as listed in the mapping table above.
- Log in to any HDFS client node and follow the steps below to add the Pulse hook jar inside the Tez tar.
For Hive 3.x on HDP 3.x, use the following locations to update the hook jar for the version (the example here uses the HDP 3.1.4 hook jar):
- Use ad-hive-hook_hdp_3.1.0.3.1.4.0-315-assembly-1.2.3.jar for Hive 3.x on HDFS path /hdp/apps/${hdp.version}/tez/tez.tar.gz
For both Hive 1.x and Hive 2.x (LLAP), such as HDP 2.6.x, use the following locations to update the respective hook jars:
- Use ad-hive-hook_hdp_1.2.x-assembly-1.2.3.jar for Hive 1.x on HDFS path /hdp/apps/${hdp.version}/tez/tez.tar.gz
- Use ad-hive-hook_hdp_2.1.x-assembly-1.2.3.jar for Hive 2.x on HDFS path /hdp/apps/${hdp.version}/tez_hive2/tez.tar.gz
# Create a directory
mkdir -p tez_pack/ && cd tez_pack
# Take a backup of the existing Tez tarball in HDFS /tmp
hdfs dfs -cp /hdp/apps/<cluster_version>/tez/tez.tar.gz /tmp
# Download the Tez tarball from HDFS to local; switch to an accessible user
hdfs dfs -get /hdp/apps/<cluster_version>/tez/tez.tar.gz .
# Unpack the tarball
tar -zxvf tez.tar.gz
# Copy the Pulse hook jar to the Tez libs
cp </location../../pulse_hook.jar> ./lib/
# Package the Tez tarball
tar -cvzf /tmp/tez.tar.gz .
# Upload it back and set the right permissions and ownership
hdfs dfs -put -f /tmp/tez.tar.gz /hdp/apps/<cluster_version>/tez/tez.tar.gz
hdfs dfs -chown hdfs:hadoop /hdp/apps/<cluster_version>/tez/tez.tar.gz
hdfs dfs -chmod 755 /hdp/apps/<cluster_version>/tez/tez.tar.gz
- Navigate to: Tez > Configs > Custom tez-site and add/update the following property values:
- tez.history.logging.service.class=io.acceldata.hive.AdTezEventsNatsClient
- ad.events.streaming.servers (PULSE_IP:19009)
- ad.cluster (your cluster name, for example ad_hdp3_dev)
- [Optional step for Hive 3.x] ad.hdfs.sink is set to true by default; if set to false, Tez does not publish query metadata proto logging details to HDFS. See the XML sketch after this list.
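In XML form, the Custom tez-site entries above might look like this; the server address and cluster name are illustrative:
<property>
  <name>tez.history.logging.service.class</name>
  <value>io.acceldata.hive.AdTezEventsNatsClient</value>
</property>
<property>
  <name>ad.events.streaming.servers</name>
  <!-- Pulse server address; illustrative -->
  <value>192.168.1.10:19009</value>
</property>
<property>
  <name>ad.cluster</name>
  <value>ad_hdp3_dev</value>
</property>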
Sqoop
Copy and place the hook jars specified above in the Sqoop client classpath directory (for example, /usr/hdp/current/sqoop-client/lib). For an LLAP (Hive interactive) enabled cluster, copy both the Hive v1.2.x and v2.1.x jars to the classpath.
ODP 3.2.x and 3.3.x
All Ambari changes, including hook jars and JMX changes, are available as part of the release except for a few components. Validate the following details as part of general checks:
ODP Kafka
- Log in to the Ambari Admin Web UI.
- Navigate to: Kafka > Configs > Advanced kafka-env > kafka-env.
- Go to the end of the file and add the following line:
export JMX_PORT=${JMX_PORT:-9999}
- To enable Basic Authentication on the JMX remote port, use the following parameters:
export KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file>"
Replace the values in <> with appropriate values.
- To enable TLS/SSL on the JMX remote port, use the following parameters:
export KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file> -Dcom.sun.management.jmxremote.ssl=true -Dcom.sun.management.jmxremote.registry.ssl=true -Djavax.net.ssl.keyStore=</path/to/keystore.jks/file> -Djavax.net.ssl.keyStorePassword=<Keystore Password> -Djavax.net.ssl.trustStore=</path/to/truststore.jks/file> -Djavax.net.ssl.trustStorePassword=<Truststore Password>"
Replace the values in <> with appropriate values.
ODP Kafka 3
- Log in to the Ambari Admin Web UI.
- Navigate to: Kafka > Configs > Advanced kafka3-env > kafka3-env.
- Go to the end of the file and add the following line:
For Kafka 3 with Zookeeper:
export JMX_PORT=${JMX_PORT:-8987}
For Kafka 3 with KRaft:
export JMX_PORT=${JMX_PORT:-8988}
- To enable Basic Authentication on the JMX remote port, use the following parameters:
export KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file>"
Replace the values in <> with appropriate values.
- To enable TLS/SSL on the JMX remote port, use the following parameters:
export KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file> -Dcom.sun.management.jmxremote.ssl=true -Dcom.sun.management.jmxremote.registry.ssl=true -Djavax.net.ssl.keyStore=</path/to/keystore.jks/file> -Djavax.net.ssl.keyStorePassword=<Keystore Password> -Djavax.net.ssl.trustStore=</path/to/truststore.jks/file> -Djavax.net.ssl.trustStorePassword=<Truststore Password>"
Replace the values in <> with appropriate values.
Set ACLs
To set ACLs, run the following commands.
./kafka-acls.sh --bootstrap-server <broker ip> --command-config client-kerb.prop --add --allow-principal User:hdfs --allow-host '*' --operation All --topic '*'
./kafka-acls.sh --bootstrap-server <broker ip> --command-config client-kerb.prop --add --allow-principal User:hdfs --allow-host '*' --operation All --group '*'
ODP Zookeeper
- Log in to the Ambari Admin Web UI.
- Navigate to: Zookeeper > Configs > Advanced zookeeper-env > zookeeper-env template.
- Go to the end of the file and add the following line:
Before adding any of the following lines, make sure the JMXDISABLE environment variable is set first.
export JMXDISABLE="true"
export SERVER_JVMFLAGS="$SERVER_JVMFLAGS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=8989 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dzookeeper.jmx.log4j.disable=true"
- To enable Basic Authentication on the JMX remote port, use the following parameters:
export SERVER_JVMFLAGS="$SERVER_JVMFLAGS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=8989 -Dzookeeper.jmx.log4j.disable=true -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file>"
Replace the values in <> with appropriate values.
- To enable SSL on the JMX remote port, use the following parameters:
export SERVER_JVMFLAGS="$SERVER_JVMFLAGS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=8989 -Dzookeeper.jmx.log4j.disable=true -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file> -Dcom.sun.management.jmxremote.ssl=true -Dcom.sun.management.jmxremote.registry.ssl=true -Djavax.net.ssl.keyStore=</path/to/keystore.jks/file> -Djavax.net.ssl.keyStorePassword=<Keystore Password> -Djavax.net.ssl.trustStore=</path/to/truststore.jks/file> -Djavax.net.ssl.trustStorePassword=<Truststore Password>"
Replace the values in <> with appropriate values.
ODP Hive
To see Hive table details with data on the UI, set hive.stats.autogather and hive.stats.column.autogather to true in the hive-site.xml file so that the statistics are computed automatically (see the sketch below).
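A sketch of the corresponding hive-site.xml entries:
<property>
  <name>hive.stats.autogather</name>
  <value>true</value>
</property>
<property>
  <name>hive.stats.column.autogather</name>
  <value>true</value>
</property>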
You can also run the following command manually to compute the table statistics.
ANALYZE TABLE <table name> COMPUTE STATISTICS
Hive Server 2
- Log in to the Ambari Admin Web UI.
- Navigate to: Hive > Configs > Advanced hive-env > hive-env template.
- Go to the end of the file and add the following lines:
Avoid JMX changes for Hive 1.x using the MR engine, as it has a bug that causes query failures when JMX is enabled.
if [ "$SERVICE" = "hiveserver2" ]; thenexport HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=8008"fi- To enable Basic Authentication in the JMX Remote Port, use the following parameters:
if [ "$SERVICE" = "hiveserver2" ]; thenexport HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file> -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=8008"fiChange the values of <> with appropriate values.
- To enable TLS/SSL on the JMX Remote Port, use the following parameters:
if [ "$SERVICE" = "hiveserver2" ]; thenexport HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file> -Dcom.sun.management.jmxremote.ssl=true -Dcom.sun.management.jmxremote.registry.ssl=true -Djavax.net.ssl.keyStore=</path/to/keystore.jks/file> -Djavax.net.ssl.keyStorePassword=<Keystore Password> -Djavax.net.ssl.trustStore=</path/to/truststore.jks/file> -Djavax.net.ssl.trustStorePassword=<Truststore Password> -Dcom.sun.management.jmxremote.port=8008"fiChange the values of <> with appropriate values.
Hive Meta Store
- Log in to the Ambari Admin Web UI.
- Navigate to: Hive > Configs > Advanced hive-env > hive-env template.
- Go to the end of the file and add the following lines:
if [ "$SERVICE" = "metastore" ]; thenexport HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=8009"fi- To enable Basic Authentication in the JMX Remote Port, use the following parameters:
if [ "$SERVICE" = "metastore" ]; thenexport HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file> -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=8008"fiChange the values of <> with appropriate values.
- To enable TLS/SSL on the JMX Remote Port, use the following parameters:
if [ "$SERVICE" = "metastore" ]; thenexport HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.access.file=</path/to/jmxremote.access/file> -Dcom.sun.management.jmxremote.password.file=</path/to/jmxremote.password/file> -Dcom.sun.management.jmxremote.ssl=true -Dcom.sun.management.jmxremote.registry.ssl=true -Djavax.net.ssl.keyStore=</path/to/keystore.jks/file> -Djavax.net.ssl.keyStorePassword=<Keystore Password> -Djavax.net.ssl.trustStore=</path/to/truststore.jks/file> -Djavax.net.ssl.trustStorePassword=<Truststore Password> -Dcom.sun.management.jmxremote.port=8008"fiChange the values of <> with appropriate values.
Place Hook Jars
Pulse hook JARs are included in the installation package. Additional configuration changes are required as described below.
JMX enablement should already be in place, similar to the HDP changes.
Update the following properties as per the installation details:
- ad.events.streaming.servers=(<Pulse IP>:19009)
- ad.cluster=(cluster name as specified in Pulse installation)
Navigate to: Hive > Configs > General and check that io.acceldata.hive.AdHiveHook is appended, comma-separated, under the following properties:
- hive.exec.failure.hooks
- hive.exec.pre.hooks
- hive.exec.post.hooks
ODP Tez
Pulse hook JARs are included in the installation package. Additional configuration changes are required as described below.
Update the following properties as per the installation details:
- ad.events.streaming.servers=(<Pulse IP>:19009)
- ad.cluster=(cluster name as specified in Pulse installation)
Navigate to: Tez > Configs and check that the property tez.history.logging.service.class is set to io.acceldata.hive.AdTezEventsNatsClient.
ODP Spark 2 & 3
- Log in to the Ambari Admin Web UI.
- Navigate to: Spark > Configs > Advanced spark2-metrics-properties.
- Go to the end of the file and add the following lines:
# Graphite sink class
*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
# Location of your Graphite instance
*.sink.graphite.host=localhost
*.sink.graphite.port=12003
*.sink.graphite.protocol=tcp
*.sink.graphite.prefix=spark.metrics
*.sink.graphite.period=20
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource
ODP Trino
To enable Trino JMX metrics, perform the following steps:
- Log in to the Ambari Admin Web UI.
- Navigate to: Trino > Configs > Advanced config-properties.
- In Coordinator Node Config and Worker Node Config, add the following parameters at the end of the file:
jmx.rmiregistry.port=9980
jmx.rmiserver.port=9981
Replace the port values with actual values.
Place Trino Hook JAR
- Pulse hook JARs are included in the installation package.
- To enable query statistics in Pulse, configure the Trino event listener.
- Trino’s event listener framework allows custom plugins to respond to query lifecycle events for advanced logging, debugging, and performance monitoring.
- The supported events are query creation and query completion.
- Each event provides session details, execution metrics, resource usage, and timelines.
Steps to configure and place the hook JAR:
- Navigate to the Trino plugin folder: /usr/odp/current/trino/plugin/
- Create the event listener directory ad-trino-event-listener/ inside the plugin folder.
- Place the hook JAR ad-trino-event-listener-1.0.jar in the ad-trino-event-listener/ directory.
- Create the event-listener.properties file. If not already present, create it in the directory /usr/odp/current/trino/conf/.
- Add the following properties to the event-listener.properties file:
event-listener.name=ad-trino-event-listener
ad.cluster=<your-cluster-name>
ad.events.streaming.servers=<Pulse-IP>:19009
Replace the values in <> with actual values.
- Configure the plugin directory path. If not already set, add the following line to the node.properties file (/usr/odp/current/trino/conf/node.properties):
plugin.dir=/usr/odp/current/trino/plugin
- Restart the Trino coordinator server.
All configurations must be applied on the Trino coordinator server.
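Assuming the default ODP layout above, the resulting files would sit roughly as follows:
/usr/odp/current/trino/
├── plugin/
│   └── ad-trino-event-listener/
│       └── ad-trino-event-listener-1.0.jar
└── conf/
    ├── event-listener.properties
    └── node.properties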
Additional Config Changes (Common for HDP & ODP)
Additional configuration changes are required if ACLs are enabled on services or if old service versions are running:
- YARN ACL - Check whether ACLs are enabled for YARN (yarn.acl.enable); if yes, add the property yarn.timeline-service.read.allowed.users=hdfs in custom yarn-site.xml and restart the YARN service. Here hdfs is the default user shared by the team; enter any other specific users created for Pulse.
- Kafka Protocol - On Ambari > Kafka, change the "PLAINTEXTSASL" listeners and inter-broker protocol to SASL_PLAINTEXT; also check the listeners value and update it to SASL_PLAINTEXT://localhost:6667
- Kafka ACL - Allow Kafka ACL permissions for the hdfs user; run the following commands as the kafka user:
/usr/hdp/current/kafka-broker/bin/kafka-acls.sh --authorizer-properties zookeeper.connect=<zk_hostname>:2181 --add --allow-principal User:hdfs --operation All --topic '*' --cluster
/usr/hdp/current/kafka-broker/bin/kafka-acls.sh --authorizer-properties zookeeper.connect=<zk_hostname>:2181 --add --allow-principal User:hdfs --operation All --group '*' --cluster
CDH 5.x & 6.x
Kafka and Zookeeper JMX are auto-enabled with CDH-based installation.
CDH Spark
- Under the Spark configurations, search for Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-defaults.conf.
- Add the following properties:
spark.metrics.conf.*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
spark.metrics.conf.*.sink.graphite.host=localhost
spark.metrics.conf.*.sink.graphite.port=12003
spark.metrics.conf.*.sink.graphite.protocol=tcp
spark.metrics.conf.*.sink.graphite.prefix=spark.metrics
spark.metrics.conf.*.sink.graphite.period=20
spark.metrics.conf.*.source.jvm.class=org.apache.spark.metrics.source.JvmSource
- Repeat the same steps for the Spark2 configurations: search for Spark Client Advanced Configuration Snippet (Safety Valve) for spark2-conf/spark-defaults.conf and add the preceding properties.
Refer to the additional note in the HDP 2.x & 3.x section under the Spark configuration changes.
CDH Hive
| Distro Version | Hive Version | Pulse Hook Jar Name |
|---|---|---|
| CDH 6.2.x | 2.1.x | ad-hive-hook_2.1.1_cdh6.2.1-assembly-1.2.3.jar |
| CDH 6.3.4 | 2.1.x | ad-hive-hook_cdh_3.0.0-assembly-1.2.3.jar |
For the above Hive versions, Pulse uses Hive hooks to capture query statistics, which requires the following configuration changes:
- Get the hive-hook jars (shared by the Acceldata team) as listed in the table above.
- Place the provided hook jars on all edge and HiveServer2 nodes under the local path /opt/acceldata.
- The hook directory must be readable and executable by all users.
- Under the Hive configurations, search for the Gateway Client Environment Advanced Configuration Snippet (Safety Valve) for hive-env.sh and add the following property:
AUX_CLASSPATH=${AUX_CLASSPATH}:/opt/acceldata/<AD HIVE 1.x or HIVE 2.x hook jar name>
- Under the Hive configurations, search for Hive Client Advanced Configuration Snippet (Safety Valve) for hive-site.xml and HiveServer2 Advanced Configuration Snippet (Safety Valve) for hive-site.xml, change the view to XML, and add the following properties:
<property>
  <name>hive.exec.failure.hooks</name>
  <value>io.acceldata.hive.AdHiveHook</value>
  <description>for Acceldata APM</description>
</property>
<property>
  <name>hive.exec.post.hooks</name>
  <value>io.acceldata.hive.AdHiveHook</value>
  <description>for Acceldata APM</description>
</property>
<property>
  <name>hive.exec.pre.hooks</name>
  <value>io.acceldata.hive.AdHiveHook</value>
  <description>for Acceldata APM</description>
</property>
Add the following new properties under the Advanced hive-site XML section:
- ad.events.streaming.servers=(<Pulse IP>:19009)
- ad.cluster=(cluster name as specified in Pulse installation)
Restart the affected Hive components and deploy the new client configuration.
CDH Sqoop
Place the hook jar in the Sqoop client classpath libraries on the given edge nodes.
CDP 7.x
Kafka and Zookeeper JMX are auto-enabled with a CDP-based installation. Make the same changes for Spark as described in the CDH section.
CDP Kafka
Under Additional Broker Java Options (broker_java_opts), replace -Dcom.sun.management.jmxremote.host=127.0.0.1 with -Dcom.sun.management.jmxremote.host=0.0.0.0.
CDP Hive
To see Hive table details with data on the UI, set hive.stats.autogather and hive.stats.column.autogather to true in the hive-site.xml file so that the statistics are computed automatically.
You can also run the following command manually to compute the table statistics.
ANALYZE TABLE <table name> COMPUTE STATISTICS
Under Hive -> Java Configuration Options for Hive Metastore Server, update the property with the following value:
{{JAVA_GC_ARGS}} -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=8009
Under Hive on Tez -> Java Configuration Options for HiveServer2, update the property with the following value:
{{JAVA_GC_ARGS}} -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=8008
| Distro Version | Hive Version | Tez Version | Pulse Hook Jar Name |
|---|---|---|---|
| CDP | 3.1.3 | 0.9.1 | ad-hive-hook_cdp_3.1.3-assembly-1.2.3.jar |
For the above Hive versions, Pulse uses Hive hooks to capture query statistics, which requires the following configuration changes:
- Get the hive-hook jars (shared by the Acceldata team) as listed in the table above.
- Place the provided hook jars on all edge and HiveServer2 nodes under the local path /opt/acceldata.
- The hook directory must be readable and executable by all users.
- Under the Hive component, search for the configuration Hive Service Environment Advanced Configuration Snippet (Safety Valve), and under the Hive on Tez component, search for the configuration Hive on Tez Service Environment Advanced Configuration Snippet (Safety Valve), and add the following property:
AUX_CLASSPATH=${AUX_CLASSPATH}:/opt/acceldata/ad-hive-hook_cdp_3.1.3-assembly-1.2.3.jar
- Under the Hive component, search for Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml, and under the Hive on Tez component, search for Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml, change the view to XML, and add the following properties:
<property>
  <name>ad.cluster</name>
  <value>[cluster_name]</value>
</property>
<property>
  <name>ad.events.streaming.servers</name>
  <value>[PULSE_IP]:19009</value>
</property>
<property>
  <name>hive.exec.failure.hooks</name>
  <value>io.acceldata.hive.AdHiveHook</value>
  <description>for Acceldata APM</description>
</property>
<property>
  <name>hive.exec.post.hooks</name>
  <value>io.acceldata.hive.AdHiveHook</value>
  <description>for Acceldata APM</description>
</property>
<property>
  <name>hive.exec.pre.hooks</name>
  <value>io.acceldata.hive.AdHiveHook</value>
  <description>for Acceldata APM</description>
</property>
- Restart the affected Hive components and deploy the new client configuration.
CDP Tez
- Get the hive-hook jars (shared by the Acceldata team) as listed in the table above.
- Log in to any HDFS client node and follow the steps below to add the Pulse hook jar inside the Tez tar:
Avoid clicking the "Upload Tez tar file to HDFS" action available under Tez.
# Create a directory
mkdir -p tez_pack/ && cd tez_pack
# Take a backup of the existing Tez tarball in HDFS /tmp
hdfs dfs -cp /user/tez/<tez_version>/tez.tar.gz /tmp
# Download the Tez tarball from HDFS to local; switch to an accessible user
hdfs dfs -get /user/tez/<tez_version>/tez.tar.gz .
# Unpack the tarball
tar -zxvf tez.tar.gz
# Copy the Pulse hook jar to the Tez libs
cp </location../../pulse_hook.jar> ./lib/
# Package the Tez tarball
tar -cvzf /tmp/tez.tar.gz .
# Upload it back and set the right permissions and ownership
hdfs dfs -put -f /tmp/tez.tar.gz /user/tez/<tez_version>/tez.tar.gz
hdfs dfs -chown tez:hadoop /user/tez/<tez_version>/tez.tar.gz
hdfs dfs -chmod 755 /user/tez/<tez_version>/tez.tar.gz
- Under the Tez component, search for Tez Client Advanced Configuration Snippet (Safety Valve) for tez-conf/tez-site.xml, change the view to XML, and add the following properties:
<property>
  <name>ad.cluster</name>
  <value>[cluster_name]</value>
</property>
<property>
  <name>ad.events.streaming.servers</name>
  <value>[PULSE_IP]:19009</value>
</property>
<property>
  <name>tez.history.logging.service.class</name>
  <value>io.acceldata.hive.AdTezEventsNatsClient</value>
</property>