CDP Deployment for Single KDC
This document provides a step-by-step process for deploying a single Pulse instance for Cloudera clusters with a single KDC.
Prerequisites
Keep the following information handy:
- CM URL (https://<Alias/FQDN of the CM URL>:<CM Port>)
- CM Username
- CM Password
- Spark History HDFS path & Spark3 History HDFS path
- Kafka Version
- HBase Version
- Hive Version
- Hive Metastore DB Connection URL
- Hive Metastore Database Name
- Hive Metastore DB Username
- Hive Metastore DB Password
- Oozie DB Name
- Oozie DB URL
- Oozie DB Username
- Oozie DB Password
- Kerberos Keytab
- krb5.conf file
- Principal
- Kerberos Username
- cacerts/jssecacerts
- YARN Scheduler Type
- Kafka Interbroker protocol
- Certificate File: cert.crt
- Certificate Key: cert.key
- CA Certificate: ca.crt (optional)
- Decide whether to keep the HTTP port (default: 4000) open
- Decide which port to use for HTTPS (default: 443)
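Before starting, it can help to confirm that the file-based prerequisites in the list above are actually present on the Pulse server. The following is a minimal sketch, not part of the Pulse tooling: the function name check_prereq_files is hypothetical, and the paths in the usage comment are examples to be replaced with your own.

```shell
#!/usr/bin/env bash
# Report which prerequisite files are missing or unreadable.
# Usage: check_prereq_files <path>...
# Prints one line per problem file and returns non-zero if any was found.
check_prereq_files() {
    local problems=0 path
    for path in "$@"; do
        # -f: must be a regular file; -r: must be readable by the current user.
        if [ ! -f "$path" ] || [ ! -r "$path" ]; then
            echo "MISSING or unreadable: $path"
            problems=1
        fi
    done
    return "$problems"
}

# Example invocation (example paths; substitute your own):
# check_prereq_files /data01/security/kerberos_cluster1.keytab \
#     /data01/security/krb5_cluster1.conf /data01/security/cacerts \
#     /path/to/cert.crt /path/to/cert.key
```

A non-zero return value means at least one file still has to be collected before continuing.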
Uninstallation
- To uninstall agents, follow the Cloudera Parcel Agent Uninstall doc.
- You must also remove the Pulse JARs and the configuration for Hive and Tez.
- Acceldata then runs the following commands to back up and uninstall the existing Pulse setup.
a. Create a backup directory.
    mkdir -p /data01/backup
b. For backup, copy the whole config and work directories.
    cp -R $AcceloHome/config /data01/backup/
    cp -R $AcceloHome/work /data01/backup/
c. Uninstall the existing Pulse setup by running the following command:
    accelo uninstall local
OUTPUT
    [root@nifihost1:data01 (ad-default)]$ accelo uninstall local
    You're about to uninstall the local AccelData setup. This will also DELETE all persistent data from the current node. However, NONE of the remote nodes will be affected. Please confirm your action [y/n]: : y
    WARN: Gauntlet is running in dry run mode. Disable this to delete indices from elastic and purge data from mongo DB
    Uninstalling the AccelData components from local machine ...
d. Log out of the terminal session.
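The backup in steps a and b can be wrapped in a small helper that refuses to continue unless both copies succeed, which is useful because the uninstall deletes persistent data. This is a minimal sketch under stated assumptions, not part of the Pulse tooling: the function name backup_pulse is hypothetical, and it assumes that diff is available and that the backup directory does not already hold older copies.

```shell
#!/usr/bin/env bash
# Copy the Pulse config and work directories into a backup directory and
# confirm that both copies match the source before anything is uninstalled.
# Usage: backup_pulse <accelo_home> <backup_dir>
backup_pulse() {
    local accelo_home=$1 backup_dir=$2 dir
    mkdir -p "$backup_dir" || return 1
    for dir in config work; do
        if [ ! -d "$accelo_home/$dir" ]; then
            echo "Source directory not found: $accelo_home/$dir" >&2
            return 1
        fi
        cp -R "$accelo_home/$dir" "$backup_dir/" || return 1
    done
    # Compare source and copy recursively as a basic sanity check.
    for dir in config work; do
        if ! diff -r "$accelo_home/$dir" "$backup_dir/$dir" >/dev/null; then
            echo "Backup of $dir differs from the source" >&2
            return 1
        fi
    done
    echo "Backup complete: $backup_dir"
}

# Example invocation:
# backup_pulse "$AcceloHome" /data01/backup && accelo uninstall local
```

Chaining the uninstall with && means it only runs when the backup has been verified.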
Download the Binaries and Docker Images and Load Them
- Download the JARs, hystaller, accelo binaries, and Docker images from the download links provided by Acceldata.
- Create the following directory for the Docker images and JARs:
    mkdir -p /data01/images
- Copy the binaries and tar files into the /data01/images folder.
    cp </path/to/binaries/tar> /data01/images
- Change the directory.
    cd /data01/images
- Extract the single tar file.
    tar xvf <name_of_tar_file>.tar
OUTPUT
    [root@nifihost1 images]# tar xvf pulse-333-beta.tar
    ./ad-alerts.tgz
    ./ad-connectors.tgz
    ./ad-dashplots.tgz
    ./ad-database.tgz
    ./ad-deployer.tgz
    ./ad-director.tgz
    ./ad-elastic.tgz
    ./ad-events.tgz
    ./ad-fsanalyticsv2-connector.tgz
    ./ad-gauntlet.tgz
    ./ad-graphql.tgz
    ./ad-hydra.tgz
    ./ad-impala-connector.tgz
    ./ad-kafka-0-10-2-connector.tgz
    ./ad-kafka-connector.tgz
    ./ad-ldap.tgz
    ./ad-logsearch-curator.tgz
    ./ad-logstash.tgz
    ./ad-notifications.tgz
    ./ad-oozie-connector.tgz
    ./ad-pg.tgz
    ./ad-pulsemon-ui.tgz
    ./ad-recom.tgz
    ./ad-sparkstats.tgz
    ./ad-sql-analyser.tgz
    ./ad-streaming.tgz
    ./ad-vminsert.tgz
    ./ad-vmselect.tgz
    ./ad-vmstorage.tgz
    ./accelo.linux
    ./admon
    ./hystaller
- Load the Docker images by running the following command:
    ls -1 *.tgz | xargs --no-run-if-empty -L 1 docker load -i
- Check if all the images are loaded into the server.
    docker images | grep 3.3.3
Config Cluster
- Validate all the hosts files.
- Create the acceldata directory by running the following commands:
    cd /data01/
    mkdir -p acceldata
- Copy the Spark hosts and Zookeeper hosts files to the acceldata directory by running the following command:
    cp </path/to/hosts_files> /data01/acceldata
- Place the accelo binary in the /data01/acceldata directory.
    cp </path/to/accelo/binary> /data01/acceldata
- Rename the accelo.linux binary to accelo and make it executable.
    mv /data01/acceldata/accelo.linux /data01/acceldata/accelo
    chmod +x /data01/acceldata/accelo
- Change the directory.
    cd /data01/acceldata
- Run the following command to do accelo init:
    ./accelo init
- Enter the appropriate answers when prompted.
- Source the ad.sh file.
    source /etc/profile.d/ad.sh
- Run the init command to provide the Pulse version.
    accelo init
OUTPUT
    [root@nifihost1:~ (ad-default)]$ accelo init
    Enter the AccelData ImageTag: : 3.3.3
    ✓ Done, AccelData Init Successful.
Provide the correct Pulse version; in this case, it is 3.3.3.
- Now run the accelo info command to get the initial info.
    accelo info
OUTPUT
    [root@nifihost1:~ (ad-default)]$ accelo info
    WARN: Gauntlet is running in dry run mode. Disable this to delete indices from elastic and purge data from mongo DB
    [ACCELDATA ASCII-art banner]
    Accelo CLI Version: 3.3.3-beta
    Accelo CLI Build Hash: 8ba4727f11e5b3f3902547585a37611b6ec74e7c
    Accelo CLI Build ID: 1700746329
    Accelo CLI Builder ID: ZEdjMmxrYUdGdWRGOWhZMk5sYkdSaEVLCg==
    Accelo CLI Git Branch Hash: TXdLaTlCVDFBdE56STNvPQo=
    AcceloHome: /data01/acceldata
    AcceloStack: ad-default
    AccelData Registry: 191579300362.dkr.ecr.us-east-1.amazonaws.com/acceldata
    AccelData ImageTag: 3.3.3-beta
    Active Cluster Name: NotFound
    AcceloConfig Mongo DB Retention days: 15
    AcceloConfig Mongo DB HDFS Reports Retention days: 15
    AccelConfig TSDB Retention days: 31d
    Number of AccelData stacks found in this node: 0
- Run the config cluster command to configure the cluster in Pulse.
    accelo config cluster
- Provide appropriate answers when prompted.
OUTPUT
    [root@pulsecdp01:acceldata (ad-default)]$ accelo config cluster
    INFO: Configuring the cluster ...
    INFO: Using default API Version v10 for CM API
    Is the 'Database Service' up and running? [y/n]: : n
    WARN: Gauntlet is running in dry run mode. Disable this to delete indices from elastic and purge data from mongo DB
    ✔ Cloudera
    Enter Your Cluster's Display Name: : cdp1
    Enter Cloudera URL (with http/https): : https://cdpssl01.acceldata.dvl:7183
    ✔ Enter Cloudera Username: : admin
    IMPORTANT: This password will be securely encrypted and stored in this machine.
    Enter Cloudera User Password: : *****
    Enter the cluster name to use (MUST be all lowercase & unique): : cdp1
    ERROR: stat /data01/acceldata/.activecluster: no such file or directory
    INFO: Creating Post dirs.
    ✔ Cluster1
    INFO: Using lower case for CDP Service name API
    INFO: Using lower case for CDP Service name API
    INFO: Using lower case for CDP Service name API
    INFO: Using lower case for CDP Service name API
    INFO: Using lower case for CDP Service name API
    INFO: Using lower case for CDP Service name API
    INFO: Using lower case for CDP Service name API
    INFO: Using lower case for CDP Service name API
    INFO: Using lower case for CDP Service name API
    INFO: Using lower case for CDP Service name API
    Enter the installed Kafka version (ex: 0.10.2): : 0.11.0
    Enter the installed HBase service version (ex: 0.9.4): : 0.9.4
    Enter the installed Hive service version (ex: 2.0.0): : 2.0.0
    ✓ Found Kerberos Realm: ADSRE.COM
    Enter the Spark History HDFS path: : /user/spark/applicationHistory
    ✔ Oozie DB URL: : jdbc:postgresql://cdpssl01.acceldata.dvl:7432/oozie_oozie_server
    Enter the Oozie DB Username: : oozie_oozie_server
    Enter the Oozie DB Password: : **********
    Enter the Oozie DB JODA Timezone (Example: Asia/Kolkata): : Asia/Kolkata
    ✔ Enter the hive metastore Database Name : : hive
    ✔ Hive Metastore PostgreSQL DB Connection URL: : jdbc:postgresql://cdpssl01.acceldata.dvl:7432/hive
    Enter the hive metastore DB Username : : hive
    ✔ Enter the hive metastore DB Password : : **********
    INFO: core-site.xml file has been updated
    INFO: hdfs-site.xml file has been updated
    ---------------------------Discovered configurations----------------------------------------
    ✓ Cluster Type: CDH
    ✓ CDH Version: 7.1.7
    ✓ Discovered Cluster Name: cdp1
    ✓ Discovered Services:
        ✓ PULSEHYDRAAGENT
        ✓ SOLR
        ✓ SPARK_ON_YARN
        ✓ KAFKA
        ✓ LIVY
        ✓ HUE
        ✓ HIVE_ON_TEZ
        ✓ HBASE
        ✓ QUEUEMANAGER
        ✓ RANGER
        ✓ IMPALA
        ✓ ATLAS
        ✓ ZOOKEEPER
        ✓ OOZIE
        ✓ HIVE
        ✓ YARN
        ✓ HDFS
    ✓ Yarn RM URI: https://cdpssl02.acceldata.dvl:8090,https://cdpssl03.acceldata.dvl:8090
    ✓ MapReduce Job History URI: https://cdpssl02.acceldata.dvl:19890
    ✗ Yarn ATS is not enabled
    ✓ HDFS Namenode URI: swebhdfs://nameservice1
    ✓ Hive Metastore URI: thrift://cdpssl02.acceldata.dvl:9083
    ✗ Hive LLAP is not enabled
    ✓ Spark History Server URIs: https://cdpssl02.acceldata.dvl:18488
    ✓ Impala URI: http://cdpssl04.acceldata.dvl:25000,http://cdpssl05.acceldata.dvl:25000,http://cdpssl01.acceldata.dvl:25000
    ✓ Kafka Broker URI: https://cdpssl04.acceldata.dvl:9093,https://cdpssl05.acceldata.dvl:9093,https://cdpssl03.acceldata.dvl:9093
    ✓ Zookeeper Server URI: http://cdpssl01.acceldata.dvl:2181,http://cdpssl02.acceldata.dvl:2181,http://cdpssl03.acceldata.dvl:2181
    Would you like to continue with the above configuration? [y/n]: : y
    Is Kerberos enabled in this cluster? [y/n]: : y
    ✓ Found Kerberos Realm: ADSRE.COM
    Enter your Kerberos keytab username (Must have required HDFS permissions): : hdfs
    INFO: min-reports is set to default value 10
    INFO: Purging old config files
    ✓ acceldata.conf file generated successfully.
    Setting up Kerberos Config
    Setting up kerberos..
    Enter the principal: : hdfs/cdpssl03.acceldata.dvl@ADSRE.COM
    Enter full path to the Keytab file (eg: /root/hdfs.keytab): : /data01/security/kerberos_cluster1.keytab
    Enter the krb5Conf file path: : /data01/security/krb5_cluster1.conf
    WARN: /data01/acceldata/config/users/passwd already being generated
    ✓ Done, Kerberos setup completed.
    INFO: Creating post config files
    INFO: Writing the .dist files
    INFO: Clustername : cdp1
    INFO: Performing PreCheck of Files
    Is HTTPS Enabled in the Cluster on UI Endpoint? [Y/N]: : Y
    Enter the Java Keystore cacerts File Path: : /data01/security/cacerts
    Enter the Java Keystore jsseCaCerts File Path: : /data01/security/cacerts
    INFO: Setting the active cluster
    WARN: Cannot find the pulse.yaml file, getting the values from acceldata.conf file
    WARN[1090] cannot find the spark on yarn thriftserver service ports
    WARN[1090] Atlas Server not installed
    WARN[1090] Hive Server Interactive not installed
    Creating hydra inventory
    ✔ Is the agent deployment Parcel Based? [Y/N] : : Y
    pulsecdp01.acceldata.dvl is the hostname of the Pulse Server, Is this correct? [Y/N]: : y
    ? Select the components you would like to install: Impala, Metastore, Hdfs, HiveServer2, Zookeeper, Yarn, Hbase
    Is Kerberos Enabled for Impala?: y
    Enter the JMX Port for hive_metastore: : 8009
    ✔ Enter the JMX Port for zookeeper_server: : 9010
    Enter the Kafka Broker Port: : 9092
    Do you want to enable Impala Agent: [Y/N]? : Y
    Would you like to setup LogSearch? [y/n]: : y
    ? Select the logs for components that are installed/enabled in your target cluster: kafka_server, yarn_timelinereader, impala_catalogd, yarn_timelineserver, hue_runcpserver, hive_server, oozie_jpa, ranger_audit, yarn_resourcemanager, hdfs_audit, oozie_error, hbase_regionserver, hue_error, impala_impalad, hdfs_datanode, yarn_nodemanager, mapred_historyserver, hbase_master, kafka_state_change, hdfs_namenode, kafka_server_gc, kafka_controller, kafka_err, yarn_application, kafka_log_cleaner, hive_server_interactive, oozie_audit, zookeeper, oozie_tomcat, hue_migrate, hue_access, syslog, oozie_ops, oozie_server
    ✓ Generated the vars.yml file successfully
    INFO: /data01/acceldata/work/cdp1/fsanalytics/update_fsimage.sh - ✓ Done
    INFO: /data01/acceldata/work/cdp1/fsanalytics/kinit_fsimage.sh - ✓ Done
    INFO: /data01/acceldata/work/cdp1/fsanalytics/update_fsimage_csv.sh - ✓ Done
    Configuring notifications
    ✓ Generated the notifications.yml file successfully
    Configuring notifications
    ✓ Generated the actions notifications.yml file successfully
    INFO: Please run 'accelo deploy core' to deploy APM core using this configuration.
Copy the License
Place the license file provided by Acceldata in the work directory.
    cp </path/to/license> /data01/acceldata/work
Deploy Core
Deploy the Pulse core components by running the following command:
    accelo deploy core
OUTPUT
Configure SSL For Connectors and Streaming
If you have TLS/SSL enforced for any of the Hadoop components in the target cluster, you have to bind-mount the Java truststore files inside the containers for the following Pulse services.
- ad-connectors
- ad-sparkstats
- ad-streaming
- ad-kafka-connector
- ad-kafka-0-10-2-connector
- ad-fsanalyticsv2-connector
For Kafka connectors, first, verify the version of Kafka running in your cluster, and then generate the configurations accordingly.
Only these services will establish connections to the corresponding Hadoop components of the cluster via the HTTPS URI.
Ensure that the permissions of these files are set to 0655, i.e., readable by all users.
You do not need both truststore files (cacerts and jssecacerts) for a target cluster. If only one of them is available, use that file and disregard the other.
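The two checks described above (which truststore file is available, and whether it is readable by all users) can be scripted before editing any service file. This is a minimal sketch, not part of the Pulse tooling: the function names is_world_readable and pick_truststore are hypothetical, and the sketch assumes both files, when present, sit in the same directory.

```shell
#!/usr/bin/env bash
# Succeeds if the file exists and is readable by "other" users.
is_world_readable() {
    [ -f "$1" ] && [ -n "$(find "$1" -maxdepth 0 -perm -004)" ]
}

# Print the host truststore file to mount as cacerts inside a container:
# prefer <dir>/cacerts, otherwise fall back to <dir>/jssecacerts.
# Usage: pick_truststore <security_dir>
pick_truststore() {
    local dir=$1 name
    for name in cacerts jssecacerts; do
        if [ -f "$dir/$name" ]; then
            if ! is_world_readable "$dir/$name"; then
                echo "WARN: $dir/$name is not readable by all users" >&2
            fi
            echo "$dir/$name"
            return 0
        fi
    done
    echo "No cacerts or jssecacerts file found in $dir" >&2
    return 1
}

# Example invocation:
# pick_truststore /data01/security
```

The printed path is the host side of the bind mount; the container-side path depends on the service image and is not covered by this sketch.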
AD-CONNECTORS & AD-SPARKSTATS
- Generate the ad-core-connectors configuration file if not present:
- Edit the file in path <$AcceloHome>/config/docker/addons/ad-core-connectors.yml and add the following lines under the volumes section of both ad-connectors and ad-sparkstats service blocks.
- If you only have the jssecacerts file available and not the cacerts file, you can mount the jssecacerts file as the cacerts file inside the container, as demonstrated below:
AD-STREAMING
- Generate the ad-core configuration file if not present:
- Edit the file in path <$AcceloHome>/config/docker/ad-core.yml and add the following lines under the volumes section of ad-streaming service block.
- If you only have the jssecacerts file available and not the cacerts file, you can mount the jssecacerts file as the cacerts file inside the container, as demonstrated below:
AD-FSANALYTICSV2-CONNECTOR
- Generate the ad-fsanalyticsv2-connector configuration file if not present:
- Edit the file in path <$AcceloHome>/config/docker/addons/ad-fsanalyticsv2-connector.yml and add the following lines under the volumes section of ad-fsanalyticsv2-connector.
- If you only have the jssecacerts file available and not the cacerts file, you can mount the jssecacerts file as the cacerts file inside the container, as demonstrated below:
AD-KAFKA-CONNECTOR
- Generate the ad-kafka-connector configuration file if not present:
- Edit the file in path <$AcceloHome>/config/docker/addons/ad-kafka-connector.yml and add the following lines under the volumes section of ad-kafka-connector.
- If you only have the jssecacerts file available and not the cacerts file, you can mount the jssecacerts file as the cacerts file inside the container, as demonstrated below:
AD-KAFKA-0-10-2-CONNECTOR
- Generate the ad-kafka-0-10-2-connector configuration file if not present:
- Edit the file in path <$AcceloHome>/config/docker/addons/ad-kafka-0-10-2-connector.yml and add the following lines under the volumes section of ad-kafka-0-10-2-connector.
- If you only have the jssecacerts file available and not the cacerts file, you can mount the jssecacerts file as the cacerts file inside the container, as demonstrated below:
Deploy Addons
Run the following command to deploy the Pulse addons, and then select the components that are needed for Spark standalone:
OUTPUT
Configure Alerts Notifications
- For setting the active cluster, run the following command:
- Configure the alerts notifications.
OUTPUT
- Set the cluster2 as the active cluster.
- Configure the alerts for the second cluster.
- Set the cluster3 as the active cluster.
- Configure the alerts for the third cluster.
- Restart the alerts notifications.
OUTPUT
Database Push Configuration
Run the following command to push config to db:
Configure Gauntlet
Updating the Gauntlet Crontab Duration
- Check whether the ad-core.yml file is present by running the following command:
- If the file is not present, generate it by running the following command:
- Edit the ad-core.yml file.
a. Open the file.
b. Update the CRON_TAB_DURATION env variable in the ad-gauntlet section.
This makes gauntlet run every two days at midnight.
c. The updated file will look something like this:
d. Save the file.
- Restart gauntlet service by running the following command:
Updating the Gauntlet Dry Run Mode
- Check whether the ad-core.yml file is present by running the following command:
- If the file is not present, generate it by running the following command:
- Edit the ad-core.yml file.
a. Open the file.
b. Update the DRY_RUN_ENABLE env variable in the ad-gauntlet section.
This will make the gauntlet delete the older elastic indices and MongoDB data.
c. The updated file will look something like this:
d. Save the file.
- Restart gauntlet service by running the following command:
Configuring Gauntlet for Multi Node and Multi Cluster Deployment
- Run the following command to generate the gauntlet config files:
- Change the directory to config/gauntlet/.
- Check whether all the files are present for all the clusters.
- Modify the gauntlet_elastic_<clustername>.yml file.
- Edit the elastic address in the file for multi node setup.
- Modify the elastic address for both clusters.
- Push the config to database.
- Restart the gauntlet service.
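The presence check in the steps above can be automated for the one file name this section gives, gauntlet_elastic_<clustername>.yml. This is a minimal sketch, not part of the Pulse tooling: the function name missing_gauntlet_configs is hypothetical, the cluster names in the usage comment are examples, and any other generated gauntlet files are not checked because their names are not listed here.

```shell
#!/usr/bin/env bash
# Print the clusters whose gauntlet_elastic_<clustername>.yml file is missing.
# Usage: missing_gauntlet_configs <gauntlet_config_dir> <clustername>...
# Returns non-zero if at least one file is missing.
missing_gauntlet_configs() {
    local dir=$1 cluster status=0
    shift
    for cluster in "$@"; do
        if [ ! -f "$dir/gauntlet_elastic_${cluster}.yml" ]; then
            echo "$cluster"
            status=1
        fi
    done
    return "$status"
}

# Example invocation (example cluster names):
# missing_gauntlet_configs "$AcceloHome/config/gauntlet" cdp1 cdp2
```

An empty result with a zero return value means every listed cluster has its elastic config file and the addresses can be edited.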
Updating MongoDB Cleanup and Compaction Frequency in Hours
By default, when dry run is disabled, MongoDB cleanup and compaction run once a day. To configure the frequency, follow the steps listed below:
- Run the following command:
- Answer the prompts. If you’re unsure about how many days you wish to retain, then proceed with the default values.
- When the following prompt comes up, specify the hours of the day during which you would like MongoDB clean up and compaction to run. The value must be a CSV of hours as per the 24 hour time notation.
- Run the following command. When gauntlet runs the next time, MongoDB clean up and compaction will run at the specified hours, once per hour.
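Because the prompt expects a CSV of hours in 24-hour notation, it may be worth validating the value before entering it. This is a minimal sketch with a hypothetical function name, valid_hours_csv; it assumes hours are written as one or two digits with no spaces, which is an assumption about the accepted input rather than documented behavior.

```shell
#!/usr/bin/env bash
# Succeeds if the argument is a comma-separated list of hours in 0-23.
# Usage: valid_hours_csv "0,6,12,18"
valid_hours_csv() {
    local csv=$1 hour
    # Reject empty input, stray commas, and anything that is not 1-2 digits.
    [[ $csv =~ ^[0-9]{1,2}(,[0-9]{1,2})*$ ]] || return 1
    local IFS=,
    for hour in $csv; do
        # 10# forces base 10 so that values such as "08" are not read as octal.
        (( 10#$hour <= 23 )) || return 1
    done
}

# Example invocation:
# valid_hours_csv "0,6,12,18" && echo "value looks valid"
```

For example, "0,6,12,18" passes, while "24" and "1,,2" are rejected.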
Enabling (TLS) HTTPS for Pulse Web UI Configuration
For details about the configuration, see Enable Native SSL/TLS for Pulse Web UI.
Set Up LDAP for Pulse UI
For details about the configuration, see Set up LDAP for Pulse UI.
For additional help, contact www.acceldata.force.com OR call our service desk +1 844 9433282
Copyright © 2026