CDP Multi-node, Multi-KDC, Multi-Cluster

This document provides a step-by-step process for deploying a single Pulse instance for multiple Cloudera clusters with multiple KDCs.

Prerequisites

Ensure you have the following information for both clusters:

  1. CM URL (https://<Alias/FQDN of the CM URL>:<CM Port>)
  2. CM Username
  3. CM Password
  4. Spark History HDFS path & Spark3 History HDFS path
  5. Kafka Version
  6. Hbase Version
  7. Hive Version
  8. Hive Metastore DB Connection URL
  9. Hive Metastore Database Name
  10. Hive Metastore DB Username
  11. Hive Metastore DB Password
  12. Oozie DB Name
  13. Oozie DB URL
  14. Oozie DB Username
  15. Oozie DB Password
  16. Kerberos Keytab
  17. krb5.conf file
  18. Principal
  19. Kerberos Username
  20. cacerts/jssecacerts
  21. YARN Scheduler Type
  22. Kafka Interbroker protocol

To enable HTTPS (TLS) for the Pulse Web UI using ad-proxy, ensure you have the following:

  1. Certificate File: cert.crt
  2. Certificate Key: cert.key
  3. CA Certificate: ca.crt (optional)
  4. Decide whether to keep the HTTP port (default: 4000) open.
  5. Decide on which port to use (default: 443)
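
The file requirements above can be checked before starting the deployment. The following is a minimal sketch, not part of the Pulse tooling: the function name check_tls_files is hypothetical, and it only verifies that cert.crt and cert.key are present and readable in a given directory while treating ca.crt as optional.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check for the ad-proxy TLS files.
# check_tls_files DIR
#   Prints one status line per file.
#   Returns 1 if a required file (cert.crt or cert.key) is missing.
check_tls_files() {
  local dir="$1" missing=0 f
  for f in cert.crt cert.key; do
    if [ -r "$dir/$f" ]; then
      echo "ok: $f"
    else
      echo "missing: $f"
      missing=1
    fi
  done
  # ca.crt is optional, so its absence is reported but is not an error.
  if [ -r "$dir/ca.crt" ]; then
    echo "ok: ca.crt"
  else
    echo "optional, not found: ca.crt"
  fi
  return "$missing"
}
```

For example, check_tls_files /path/to/certs can be run against the directory where the certificate files are staged; the path here is a placeholder.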

Uninstallation of Agents

Perform the following:

  1. To uninstall agents, you must follow the Cloudera Parcel Agent document.
  2. You must also remove the Pulse JARs and the configurations for Hive and Tez.
  3. The Acceldata team must then run the following commands to back up and uninstall the existing Pulse application.

a. Create a backup directory:

Bash
Copy

b. To back up, copy the entire config and work directories:

Bash
Copy

c. Uninstall the existing Pulse setup by running the following command:

Bash
Copy

OUTPUT

Bash
Copy
  1. Logout from the terminal session.
  2. Perform the aforementioned steps for all Pulse server nodes.
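
The backup in sub-steps a and b can be sketched as a small shell function. This is an illustration under stated assumptions: the function name backup_pulse is hypothetical, and it assumes the directories to preserve are named config and work directly under the Pulse home directory.

```bash
#!/usr/bin/env bash
# Hypothetical helper: back up the Pulse config and work directories.
# backup_pulse PULSE_HOME BACKUP_DIR
#   Creates BACKUP_DIR if needed and copies PULSE_HOME/config and
#   PULSE_HOME/work into it, preserving permissions and timestamps.
backup_pulse() {
  local home="$1" dest="$2" d
  mkdir -p "$dest" || return 1
  for d in config work; do
    if [ -d "$home/$d" ]; then
      cp -Rp "$home/$d" "$dest/" || return 1
    else
      # A missing directory is reported but does not abort the backup.
      echo "warning: $home/$d not found, skipping" >&2
    fi
  done
}
```

A call such as backup_pulse "$AcceloHome" /data01/backup would copy both directories; the backup path in this call is a placeholder, not a documented location.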

Download and Load the Binaries and Docker Images

To download and load the binaries and Docker images, perform the following:

  1. Download the jars, hystaller, accelo binaries and docker images from the download links provided by the Acceldata team.
  2. Move the Docker images and jars into the following directory:
Bash
Copy
  1. Copy the binaries and tar files into the /data01/images folder:
Bash
Copy
  1. Change the directory:
Bash
Copy
  1. Extract the single tar file:
Bash
Copy

OUTPUT

Bash
Copy
  1. Load the Docker images by running the following command:
Bash
Copy
  1. Ensure that all the images are loaded to the server by running the following command:
Bash
Copy
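
Loading several image archives can be done in a loop. The sketch below is an assumption-laden illustration: the function name load_images is hypothetical, the archive extensions (.tgz and .tar) are guesses about how the images are packaged, and the DOCKER variable exists only so the loop can be dry-run with a substitute command. The docker load -i call itself is the standard Docker CLI.

```bash
#!/usr/bin/env bash
# Hypothetical helper: load every image archive found in a directory.
# load_images DIR
#   Runs "docker load -i" on each *.tgz and *.tar file in DIR.
#   Set DOCKER=echo to print the commands instead of running them.
load_images() {
  local dir="$1" docker_bin="${DOCKER:-docker}" archive count=0
  for archive in "$dir"/*.tgz "$dir"/*.tar; do
    [ -e "$archive" ] || continue   # skip glob patterns that matched nothing
    "$docker_bin" load -i "$archive" || return 1
    count=$((count + 1))
  done
  echo "loaded $count archive(s)"
}
```

After the loop completes, docker images lists what is available on the server, which is one way to confirm that every expected image was loaded.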

Cluster Configuration

To configure the cluster, perform the following:

  1. Validate all the host files.
  2. Create the acceldata directory by running the following command:
Bash
Copy
  1. Place the accelo binary in the /data01/acceldata directory:
Bash
Copy
  1. Rename the accelo.linux binary to accelo:
Bash
Copy
  1. Change the directory:
Bash
Copy
  1. Run the following accelo init command:
Bash
Copy
  1. Enter the appropriate answers when prompted.
  2. Source the ad.sh file:
Bash
Copy
  1. To enter the Pulse version, run the init command:
Bash
Copy

OUTPUT

Bash
Copy

Provide the correct Pulse version; in this case, it is 3.3.3.

  1. To get the initial information, run the accelo info command:
Bash
Copy

OUTPUT

Bash
Copy
  1. To enable multi-KDC, perform the following:

a. Modify the accelo.yml file.

Bash
Copy

b. Change IsMutliKDCclusterEnabled to true.

Bash
Copy

c. Save the file.
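
The same change can be scripted instead of made in an editor. The sketch below assumes that the setting appears in accelo.yml on its own line in the form "IsMutliKDCclusterEnabled: <value>"; that layout, and the function name enable_multi_kdc, are assumptions rather than documented behavior, so inspect the file before relying on it.

```bash
#!/usr/bin/env bash
# Hypothetical helper: set IsMutliKDCclusterEnabled to true in a YAML file.
# enable_multi_kdc FILE
#   Assumes a line of the form "IsMutliKDCclusterEnabled: <value>".
#   Returns 1 if the key is not found, leaving the file untouched.
enable_multi_kdc() {
  local file="$1" tmp
  if ! grep -q '^[[:space:]]*IsMutliKDCclusterEnabled:' "$file"; then
    echo "IsMutliKDCclusterEnabled not found in $file" >&2
    return 1
  fi
  tmp=$(mktemp) || return 1
  # Rewrite only the value and keep the key's original indentation.
  if sed 's/^\([[:space:]]*IsMutliKDCclusterEnabled:\).*/\1 true/' "$file" > "$tmp"; then
    cat "$tmp" > "$file"   # overwrite in place so file permissions are kept
    rm -f "$tmp"
  else
    rm -f "$tmp"
    return 1
  fi
}
```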

  1. To configure the cluster in Pulse, run the config cluster command.
Bash
Copy
  1. Provide appropriate answers when prompted.
Bash
Copy
  1. Run the config cluster for the second cluster.
Bash
Copy
  1. Perform the following steps for both clusters.
  2. Edit the acceldata.conf file to apply the changes required for multi-node deployment.
Bash
Copy
  1. Update the elastic section of the connections collection.
Bash
Copy
  1. Save the file.
  2. Change the directory to work/<clustername>
Bash
Copy
  1. Create the override.yml file if not yet created.
Bash
Copy
  1. Enter the following code and edit as required:
Bash
Copy
  1. Save the file.

Copy the License

Place the license file provided by the Acceldata team in the work directory.

Bash
Copy

Deploy Core

  1. Deploy the Pulse core components by running the following command:
Bash
Copy

OUTPUT

Bash
Copy
  1. Push the configurations for all the clusters.
Bash
Copy

Configure SSL for Connectors and Streaming

If you have TLS/SSL enforced for any of the Hadoop components in the target cluster, you must mount the Java truststore files inside the following Pulse service containers:

  1. ad-connectors
  2. ad-sparkstats
  3. ad-streaming
  4. ad-kafka-connector
  5. ad-kafka-0-10-2-connector
  6. ad-fsanalyticsv2-connector

For Kafka connectors, verify the version of Kafka running in the cluster, and then generate the configurations accordingly based on that version.

These are the only services that will connect to the respective Hadoop components of the cluster over the HTTPS URI.

Ensure that the permissions of these files are set to 0655, that is, readable by all users.

It is not always necessary to have both files configured for a target cluster. Sometimes, you may only have one of the files available. In such cases, you can simply use the available file and disregard the other one.
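
Both conditions above (which truststore files exist, and whether they are readable by all users) can be checked with a short script. This is a sketch with hypothetical function names; it assumes the two files are named cacerts and jssecacerts and sit in a single directory passed as the argument.

```bash
#!/usr/bin/env bash
# world_readable FILE
#   Succeeds if owner, group, and others all have read permission.
world_readable() {
  [ -n "$(find "$1" -prune -type f -perm -444 2>/dev/null)" ]
}

# pick_truststores DIR
#   Reports which of cacerts and jssecacerts exist in DIR and whether
#   each one is readable by all users. Returns 1 if neither file exists.
pick_truststores() {
  local dir="$1" f found=0
  for f in cacerts jssecacerts; do
    [ -f "$dir/$f" ] || continue
    found=1
    if world_readable "$dir/$f"; then
      echo "usable: $f"
    else
      echo "not readable by all users: $f"
    fi
  done
  [ "$found" -eq 1 ]
}
```

If only jssecacerts is reported as usable, that is the case in which it is mounted in place of cacerts inside the containers.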

AD-CONNECTORS & AD-SPARKSTATS

Perform the following:

  1. Generate the ad-core-connectors configuration file:
Bash
Copy
  1. Edit the file in path <$AcceloHome>/config/docker/addons/ad-core-connectors.yml and add the following lines under the volumes section of both ad-connectors and ad-sparkstats service blocks.
Bash
Copy
  1. If you only have the jssecacerts file available and not the cacerts file, you can mount the jssecacerts file as the cacerts file inside the container as demonstrated below:
Bash
Copy

AD-STREAMING

Perform the following:

  1. Generate the ad-core configuration file:
Bash
Copy
  1. Edit the file in path <$AcceloHome>/config/docker/ad-core.yml and add the following lines under the volumes section of ad-streaming service block.
Bash
Copy
  1. If you only have the jssecacerts file available and not the cacerts file, you can mount the jssecacerts file as the cacerts file inside the container as demonstrated below:
Bash
Copy

AD-FSANALYTICSV2-CONNECTOR

Perform the following:

  1. Generate the ad-fsanalyticsv2-connector configuration file:
Bash
Copy
  1. Edit the file in path <$AcceloHome>/config/docker/addons/ad-fsanalyticsv2-connector.yml and add the following lines under the volumes section of ad-fsanalyticsv2-connector
Bash
Copy
  1. If you only have the jssecacerts file available and not the cacerts file, you can mount the jssecacerts file as the cacerts file inside the container as demonstrated below:
Bash
Copy

AD-KAFKA-CONNECTOR

Perform the following:

  1. Generate the ad-kafka-connector configuration file:
Bash
Copy
  1. Edit the file in path <$AcceloHome>/config/docker/addons/ad-kafka-connector.yml and add the following lines under the volumes section of ad-kafka-connector
Bash
Copy
  1. If you only have the jssecacerts file available and not the cacerts file, you can mount the jssecacerts file as the cacerts file inside the container as demonstrated below:
Bash
Copy

AD-KAFKA-0-10-2-CONNECTOR

Perform the following:

  1. Generate the ad-kafka-0-10-2-connector configuration file:
Bash
Copy
  1. Edit the file in path <$AcceloHome>/config/docker/addons/ad-kafka-0-10-2-connector.yml and add the following lines under the volumes section of ad-kafka-0-10-2-connector
Bash
Copy
  1. If you only have the jssecacerts file available and not the cacerts file, you can mount the jssecacerts file as the cacerts file inside the container as demonstrated below:
Bash
Copy

Deploy Add-ons

Bash
Copy

OUTPUT

Bash
Copy
Bash
Copy

Deploy the Pulse add-ons, and select the components that are needed for CDP Cluster2.

Bash
Copy

OUTPUT

Bash
Copy

Database Push Configuration

To push the configuration to the database, run the following code:

Bash
Copy

Updating Gauntlet in Dry Run Mode

To update Gauntlet in dry run mode, perform the following:

  1. Check if the ad-core.yml file is present or not by running the following command:
Bash
Copy
  1. If the above file is not present then generate it by running the following command:
Bash
Copy
  1. Edit the ad-core.yml file by performing the following:

a. Open the file.

Bash
Copy

b. Update the DRY_RUN_ENABLE environment variable in the ad-gauntlet section as shown below:

Bash
Copy

Note: This makes Gauntlet delete the older Elastic indices and MongoDB data.

c. The updated file must appear as shown below:

Bash
Copy

d. Save the file.

  1. Restart Gauntlet service by running the following command:
Bash
Copy

Configure Gauntlet

To update the Gauntlet crontab duration, perform the following:

  1. Check if the ad-core.yml file is present or not by running the following command:
Bash
Copy
  1. If the above file is not present then generate it by running the following command:
Bash
Copy
  1. Edit the ad-core.yml file by performing the following:

a. Open the file

Bash
Copy

b. Update the CRON_TAB_DURATION environment variable in the ad-gauntlet section as shown below:

Bash
Copy

Note: This makes Gauntlet run every two days at midnight.

The updated file must appear as shown below:

Bash
Copy

c. Save the file.

  1. Restart the Gauntlet service by running the following command:
Bash
Copy

Configuring Gauntlet for Multi-node and Multi-cluster Deployment

Perform the following:

  1. To generate the Gauntlet config files, run the following command:
Bash
Copy
  1. Change the directory to config/gauntlet/
Bash
Copy
  1. Check that the files are present for all the clusters:
Bash
Copy
  1. Open the gauntlet_elastic_<clustername>.yml file for editing:
Bash
Copy
  1. Edit the elastic address in the file for multi-node setup.
Bash
Copy
  1. Modify the Elastic address for both the clusters.
  2. Push the configuration to the database:
Bash
Copy
  1. Restart the Gauntlet service:
Bash
Copy

Updating MongoDB Clean Up and Compaction Frequency In Hours

By default, when dry run is disabled, MongoDB cleanup and compaction will occur once a day. To adjust the frequency, perform the following:

  1. Run the following command:
Bash
Copy
  1. Answer the following prompts. If you are unsure about the number of days you wish to retain, proceed with the default values.
Bash
Copy
  1. When presented with the following prompt, indicate the hours of the day when you want MongoDB cleanup and compaction to occur. The value must be a comma-separated list of hours in accordance with the 24-hour time notation.
Bash
Copy
  1. Execute the following command, and when Gauntlet runs the next time, MongoDB cleanup and compaction will be scheduled to run at the specified hours, once per hour:
Bash
Copy
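
Before entering the comma-separated list of hours at the prompt, the value can be sanity-checked. The function below is a hypothetical helper, not part of accelo; it assumes that the prompt expects plain integers from 0 to 23 separated by commas with no spaces.

```bash
#!/usr/bin/env bash
# Hypothetical validator for a comma-separated list of 24-hour values.
# valid_hours "0,6,12,18"
#   Succeeds only if every entry is an integer from 0 to 23.
valid_hours() {
  local list="$1" h
  [ -n "$list" ] || return 1
  case "$list" in
    *,|,*|*,,*) return 1 ;;        # reject empty entries such as "1,,2" or "1,"
  esac
  local IFS=','
  for h in $list; do
    case "$h" in
      ''|*[!0-9]*) return 1 ;;     # reject anything that is not all digits
    esac
    [ "${#h}" -le 2 ] || return 1  # at most two digits per entry
    # 10# forces base 10 so that "08" and "09" are not parsed as octal.
    [ "$((10#$h))" -le 23 ] || return 1
  done
}
```

For example, valid_hours "0,6,12,18" succeeds, while valid_hours "24" and valid_hours "1,,2" fail.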

Configure and Deploy FSAnalytics in the Second Pulse Server

To configure and deploy FSAnalytics in the second Pulse server, perform the following:

  1. Create the acceldata directory by running the following command:
Bash
Copy
  1. Place the accelo.linux binary in the /data01/acceldata directory:
Bash
Copy
  1. Rename the accelo.linux binary to accelo
Bash
Copy
  1. Change the directory.
Bash
Copy
  1. Run the following command to run accelo init.
Bash
Copy
  1. Provide appropriate answers when prompted.
  2. Source the ad.sh file
Bash
Copy
  1. To enter the Pulse version, run the init command:
Bash
Copy

OUTPUT

Bash
Copy

Note: Provide the correct Pulse version number; in this case, it is 3.3.3.

  1. Run accelo info to get the initial information.
Bash
Copy

OUTPUT

Bash
Copy
  1. Get the Pulse Master hostname and generate the Mongo URL by editing the below code:
Bash
Copy
  1. Encrypt the above string by running the following command and provide the string when prompted:
Bash
Copy
  1. Edit the ad.sh file to enable the Pulse standalone deployment by adding the following information to it.
Bash
Copy
  1. Replace the MONGO_URI with the encrypted string obtained from the encryption step above.
Bash
Copy
  1. Source the file.
Bash
Copy
  1. Now set the cluster.
Bash
Copy
  1. Copy the fsanalytics directory from the Pulse Master Server present in the below location:
Bash
Copy

Place the copied directory in the following location in the second cluster:

Bash
Copy

Note: Create the directory if it is not present.

  1. Copy the /krb/security directory from the Pulse Master Server present in the below location:
Bash
Copy

Place the copied directory in the following location in the second cluster:

Bash
Copy

Note: Create the directory if it is not present.

  1. Generate the ad-fsanalyticsv2-connector.yml.
Bash
Copy

OUTPUT

Bash
Copy
  1. Edit the file.
Bash
Copy
  1. Update the following environment variables under ad-fs-elastic:
  • MONGO_URI (Acceldata team will provide the right URI)
  • MONGO_ENCRYPTED=false
  • ES_HOST=<host_running_ES>
  • ES_PORT=19013
Bash
Copy
  1. Append the following hostname entry to /etc/hosts, and check that the /etc/hosts file is mounted under the volumes section of the file created above (<ACCELO_HOME>/config/docker/addons/ad-fsanalyticsv2-connector.yml):

    1. <PULSE_CORE_HOST> ad-streaming
  2. Run the deploy add-ons command and select FSAnalytics and FSElastic.

Bash
Copy

OUTPUT

Bash
Copy
  1. Since the FSAnalyticsV2 Connector has a port exposed to the outside, you will need to modify the port bound to the host. To do this, open the ad-fsanalyticsv2-connector.yml file.
Bash
Copy
  1. Update the port section of the file.
Bash
Copy
  1. Save the file.
  2. Set the cluster to the second cluster.
Bash
Copy
  1. Run the deploy add-ons command and select FSAnalyticsV2 Connector add-on.
Bash
Copy

OUTPUT

Bash
Copy
  1. Check if the two connectors are running or not.
Bash
Copy

OUTPUT

Bash
Copy
  1. Check if both the containers are bound to 19027 and 19029 ports respectively.
Bash
Copy
  1. To run fsa load, set the following:
Bash
Copy
  1. Set the cluster to the second cluster.
Bash
Copy
  1. Load the second cluster using the following fsa command:
Bash
Copy
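
The check that the two connector containers are bound to ports 19027 and 19029 can be automated by filtering docker ps output. The sketch below is an illustration: the function name containers_on_port is hypothetical, and it assumes input lines in the form "<name> <ports>", as produced by docker ps --format '{{.Names}} {{.Ports}}'.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list containers that publish a given host port.
# containers_on_port PORT
#   Reads "<name> <ports>" lines on stdin and prints the name of each
#   container whose port list contains ":PORT->", meaning that PORT is
#   the host side of a published port mapping.
containers_on_port() {
  local port="$1"
  awk -v pat=":${port}->" 'index($0, pat) { print $1 }'
}
```

Typical usage pipes the live container list into the function, for example docker ps --format '{{.Names}} {{.Ports}}' | containers_on_port 19029. Exactly one container name is expected for each of the two ports.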

Enabling (TLS) HTTPS for Pulse Web UI Configuration Using ad-proxy

Deployment and Configuration

For deployment and configuration, perform the following:

  1. Copy the cert.crt, cert.key and ca.crt (optional) files to $AcceloHome/config/proxy/certs location.
  2. Check if the ad-core.yml file is present or not.
Bash
Copy
  1. If the ad-core.yml file is not present, then generate the ad-core.yml file.
Bash
Copy

OUTPUT

Bash
Copy
  1. Modify the ad-core.yml file by performing the following:

a. Open the ad-core.yml file

Bash
Copy

b. Remove the ports: field in the ad-graphql section of ad-core.yml

Bash
Copy

c. The resulting ad-graphql section must appear as shown below:

Bash
Copy

d. Save the file.

  1. Restart the ad-graphql container:
Bash
Copy
  1. Ensure that the port is not exposed to the host.
Bash
Copy

OUTPUT

Bash
Copy
  1. Check if there are any errors in the ad-graphql container.
Bash
Copy
  1. To deploy the ad-proxy add-ons, run the following command and then select Proxy from the list and press enter.
Bash
Copy

OUTPUT

Bash
Copy
  1. Check if there are any errors in the ad-proxy container.
Bash
Copy
  1. You can now access the Pulse UI using https://<pulse-server-hostname>. The default port used is 443.

Configuration

If you wish to modify the SSL port to a different value, perform the following:

  1. Check if ad-proxy.yml file is present or not
Bash
Copy
  1. Generate the ad-proxy.yml file if it is not present.
Bash
Copy

OUTPUT

Bash
Copy
  1. To modify the ad-proxy.yml file, perform the following:

a. Open the ad-proxy.yml file

Bash
Copy

b. Change the host port in the ports list to the desired port.

Bash
Copy

The final file must appear as the following, if the host port is 6003:

Bash
Copy

c. Save the file.

  1. Restart the ad-proxy container
Bash
Copy
  1. Ensure that there aren’t any errors:
Bash
Copy
  1. You can now access the Pulse UI using https://<pulse-server-hostname>:6003.

Setup LDAP for the Pulse User Interface

To setup LDAP for the Pulse user interface, perform the following:

  1. Check if the ldap.conf is present or not.
Bash
Copy
  1. Run the configure command to generate the default ldap.conf if it is not already present:
Bash
Copy
  1. Expected output must appear as shown below:
Bash
Copy
  1. Edit the file in path $AcceloHome/config/ldap/ldap.conf.
Bash
Copy
  1. Configure the file with the below properties:

    1. LDAP FQDN: the FQDN where the LDAP server is running

      • host = [FQDN]
    2. If port 389 is being used, then set:

      • insecureNoSSL = true
    3. SSL root CA certificate

      • rootCA = [CERTIFICATE_FILE_PATH]
    4. bindDN: the DN used for the LDAP search; it must be a member of the admin group

    5. bindPW: the password to be entered in the database; it can be removed later once LDAP is enabled

    6. baseDN used for the user search

      • Eg: (cn=users, cn=accounts, dc=acceldata, dc=io)
    7. Filter used for the user search

      • Eg: (objectClass=person)
    8. baseDN used for the group search

      • Eg: (cn=groups, cn=accounts, dc=acceldata, dc=io)
    9. Group search: object class used for the group search

      • Eg: (objectClass=posixgroup)

Run the following command to check if the user has search entry access and group access in the LDAP directory:

Bash
Copy
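
One common way to run such a check is with the OpenLDAP ldapsearch client. The sketch below is an assumption, not the command Pulse itself uses: the function name ldap_user_search is hypothetical, it targets plain LDAP on port 389, and the LDAPSEARCH variable exists only so the generated arguments can be inspected without contacting a server. The flags used are standard ldapsearch options: -x for simple authentication, -H for the server URI, -D for the bind DN, -W to prompt for the bind password, and -b for the search base.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the OpenLDAP ldapsearch client.
# ldap_user_search HOST BIND_DN BASE_DN [FILTER]
#   Searches BASE_DN on ldap://HOST:389 as BIND_DN, prompting for the
#   bind password. FILTER defaults to (objectClass=person).
#   Set LDAPSEARCH to another command to inspect the arguments.
ldap_user_search() {
  local host="$1" bind_dn="$2" base_dn="$3"
  local filter="${4:-(objectClass=person)}"
  "${LDAPSEARCH:-ldapsearch}" -x \
    -H "ldap://${host}:389" \
    -D "$bind_dn" -W \
    -b "$base_dn" \
    "$filter"
}
```

To check group access, the same function can be called with the group baseDN and a group filter such as (objectClass=posixgroup). If entries are returned, the bind user has search access to that part of the directory.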
  1. If the file has already been generated, it will prompt for LDAP credentials to verify connectivity and configurations, as outlined in the steps below.
  2. Run the configure command:
Bash
Copy
  1. You are prompted to provide the LDAP user credentials:
Bash
Copy
  1. If the previous step was successful, then the following message is displayed:
Bash
Copy
  1. Press 'y' and then press Enter.
  2. Expected output must appear as shown below:
Bash
Copy
  1. Push the LDAP configuration by running the following command:
Bash
Copy
  1. Run the deploy add-on command.
Bash
Copy
  1. Select LDAP from the list shown and press Enter:
Bash
Copy

Expected output must appear as below:

Bash
Copy
  1. Run the restart command.
Bash
Copy
  1. Open Pulse on the web and create default roles.
  2. Create an ops role with the necessary permissions. Any user who logs in via LDAP is automatically assigned to this role.

Configure Alerts Notifications

To configure alerts notifications, perform the following:

  1. To set the active cluster, run the following command:
Bash
Copy
  1. Configure the alerts notifications by running the following command:
Bash
Copy

OUTPUT

Bash
Copy
  1. Set the cluster2 as the active cluster.
Bash
Copy
  1. Configure alerts for the second cluster as shown below:
Bash
Copy
  1. Set the cluster3 as the active cluster by running the following code:
Bash
Copy
  1. Configure the alerts for the third cluster as shown below:
Bash
Copy
  1. Restart the alerts notifications.
Bash
Copy

OUTPUT

Bash
Copy