Spark Standalone Multi-cluster

This document provides a step-by-step process for installing a single Pulse instance for multiple Spark Standalone clusters.

Prerequisites

Ensure the following are present:

  1. Spark hosts: Refer to steps 1 and 2 mentioned below the note.
  2. Zookeeper hosts files: Refer to step 3 mentioned below the note.
  3. Log locations
  4. Spark history server locations
  5. Certificates (if any for Spark history server)
  6. Docker version
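Prerequisite 6 can be checked mechanically. A minimal sketch, assuming a standard `major.minor.patch` Docker version string (the helper function name is ours, not part of Pulse):

```shell
# Return success iff a Docker version string satisfies the >= 20.10 prerequisite.
docker_version_ok() {
  local major minor rest
  major=${1%%.*}
  rest=${1#*.}
  minor=${rest%%.*}
  [ "$major" -gt 20 ] || { [ "$major" -eq 20 ] && [ "$minor" -ge 10 ]; }
}

# Usage (queries the running daemon):
#   docker_version_ok "$(docker version --format '{{.Server.Version}}')" && echo "Docker OK"
```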

Prerequisites for enabling (TLS) HTTPS for the Pulse Web UI using ad-proxy:

  1. Certificate file: cert.crt
  2. Certificate key: cert.key
  3. CA certificate: ca.crt (optional)
  4. Decide whether to keep the HTTP port (default: 4000) open.
  5. Decide which HTTPS port to use (default: 443).

The steps referenced in the prerequisites above are:

  1. Obtain the fully qualified domain names (FQDN) for the Spark Master URLs for both clusters and include them in the spark_<clustername>.hosts file.
  2. Retrieve the fully qualified domain names (FQDN) for the Spark History Server URLs for both clusters, and provide the URL when requested.
  3. Obtain the fully qualified domain names (FQDN) for the Zookeeper Server URLs for both clusters and place them in the zk_<clustername>.hosts file.
  4. Retrieve the log locations for the application and deployment logs, as well as the SPARK_HOME directory, for both clusters.
  5. Ensure that the Docker version is >= 20.10.x.
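The hosts files above can be sanity-checked before installation. A minimal sketch, assuming each file lists one FQDN per line with no blank lines (the function name is ours, not part of Pulse):

```shell
# Validate a Pulse hosts file: every line must be a single, non-empty hostname.
validate_hosts_file() {
  local line
  while IFS= read -r line; do
    case "$line" in
      '')     echo "blank line in $1"; return 1 ;;
      *" "*)  echo "unexpected space in '$line'"; return 1 ;;
    esac
  done < "$1"
  echo "ok"
}

# Example: validate_hosts_file spark_cluster1.hosts
```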

Uninstallation

To uninstall agents, perform the following:

  1. Run the hystaller uninstall command through your Ansible setup.
  2. Remove the Pulse Spark hook JARs, along with the related configurations, from the Spark master and worker nodes.
  3. The Acceldata team must then perform the following steps to back up and uninstall the existing Pulse application:
    1. Create a backup directory: mkdir -p /data01/backup
    2. As a backup, copy the entire config and work directories: cp -R $AcceloHome/config /data01/backup/ and cp -R $AcceloHome/work /data01/backup/
    3. Uninstall the existing Pulse setup by running the following command: accelo uninstall local


Executing this action will remove all files, folders, docker containers, docker images, and the entire Acceldata directory.

  4. Log out of the terminal session.
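The backup in step 3 can be wrapped in a small function so that both copies are verified before `accelo uninstall local` runs (the function name and the final check are ours):

```shell
# Back up the Pulse config and work directories, then confirm both copies exist.
backup_pulse() {
  # $1 = Pulse home (e.g. $AcceloHome), $2 = backup directory (e.g. /data01/backup)
  mkdir -p "$2" &&
  cp -R "$1/config" "$2/" &&
  cp -R "$1/work" "$2/" &&
  [ -d "$2/config" ] && [ -d "$2/work" ] &&
  echo "backup complete"
}

# Usage: backup_pulse "$AcceloHome" /data01/backup
```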

Download and Load Binaries and Docker Images

To download and load binaries and Docker images, perform the following:

When downloading the Pulse all-in-one TAR file, you must also download the hystaller binary separately for Pulse version 3.3.3 and perform the following:

  1. Download all the 3.3.3 binaries.
  2. Replace the hystaller binary with the one from the direct download link provided by the Acceldata team.

Then perform the following steps:

  1. Download the JARs, the hystaller and accelo binaries, and the Docker images from the download links provided by the Acceldata team.
  2. Move the Docker images and JARs into the /data01/images directory.
  3. Copy the binaries and TAR files into the /data01/images folder.
  4. Change to the /data01/images directory.
  5. Extract the single TAR file.
  6. Load the Docker images.
  7. Verify that all the images are loaded on the server.
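The load-and-verify steps above can be scripted. Because the exact image file names ship from Acceldata, the sketch below just emits a `docker load` for every image tarball it finds (dry-run style, so the loop is checkable without a Docker daemon):

```shell
# Emit one `docker load` command per image tarball in the given directory.
emit_docker_loads() {
  local img
  for img in "$1"/*.tar; do
    [ -e "$img" ] || continue     # directory had no tarballs
    echo "docker load -i $img"
  done
}

# Execute for real, then verify with `docker images`:
#   emit_docker_loads /data01/images | sh
```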

Configure the Cluster

To configure the cluster in Pulse, perform the following:

  1. Validate all the host files.
  2. Create the /data01/acceldata directory.
  3. Copy the Spark hosts and Zookeeper hosts files into the /data01/acceldata directory.
  4. Place the accelo binary in the /data01/acceldata directory.
  5. Rename the accelo.linux binary to accelo.
  6. Change to the /data01/acceldata directory.
  7. Run accelo init.
  8. Enter appropriate answers when prompted.
  9. Source the ad.sh file.
  10. Run the init command and provide the Pulse version.

When prompted, provide the correct Pulse version; in this case, it is 3.3.3.

  11. Run the accelo info command.
  12. To configure the cluster in Pulse, run the config cluster command.
  13. Provide the correct information when prompted.
  14. Run the config cluster command for all the clusters and provide the appropriate answers when prompted.
  15. Run the config cluster command for NiFi standalone and select standalone > nifi.
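The directory setup above reduces to a short staging sequence. A sketch (the execute-bit chmod is our assumption, and `accelo init` itself must still be run interactively):

```shell
# Stage the hosts files and the accelo binary into the acceldata directory.
prepare_accelo_dir() {
  # $1 = directory holding the downloads, $2 = target (e.g. /data01/acceldata)
  mkdir -p "$2" &&
  cp "$1"/spark_*.hosts "$1"/zk_*.hosts "$2"/ &&
  cp "$1"/accelo.linux "$2"/accelo &&   # place and rename in one step
  chmod +x "$2"/accelo &&               # assumption: binary needs the execute bit
  echo "staged"
}

# Then: cd /data01/acceldata && ./accelo init
```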

Copy the License

Place the license file provided by the Acceldata team in the Pulse work directory ($AcceloHome/work).

Deploy Pulse Core Components

Deploy the Pulse core components by running the core deploy command, and verify that the deployment completes without errors.

Deploy Add-ons

To deploy the Pulse add-ons, run the deploy addons command and select the required components for Spark standalone. Verify that the deployment completes without errors.

Configure Alerts Notifications

To configure alerts notifications, perform the following:

  1. Set the active cluster.
  2. Configure the alerts notifications.
  3. Set cluster2 as the active cluster.
  4. Configure the alerts for the second cluster.
  5. Set cluster3 as the active cluster.
  6. Configure the alerts for the third cluster.
  7. Restart the alerts notifications.

Database Push Configuration

Run the DB push config command to push the configuration to the database.

Configure the Override

  1. Change to the work/<clustername> directory.
  2. Open the override.yml file for editing.
  3. Add the required override configuration to the file.

Repeat the above steps for all clusters.

Deploy the Pulse Agents

Install the new Pulse version 3.3.3 agents on all cluster nodes. Copy the new hystaller file to /tmp or any other executable location on all cluster nodes, and then run the hystaller command on each node, adjusting it for your environment.
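Copying hystaller to every node is easy to script against a node list. The sketch below emits the transfer commands (dry-run style; the node-list file name is an assumption):

```shell
# Emit scp/ssh commands that stage hystaller on each node listed in a file.
stage_hystaller_cmds() {
  # $1 = file with one node hostname per line, $2 = local path to hystaller
  local host
  while IFS= read -r host; do
    [ -n "$host" ] || continue
    echo "scp $2 $host:/tmp/hystaller && ssh $host chmod +x /tmp/hystaller"
  done < "$1"
}

# Execute for real:  stage_hystaller_cmds cluster_nodes ./hystaller | sh
```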

Reconfig Cluster

  1. After completing the edits to the override files as outlined above, run the accelo reconfig cluster command.
  2. Run the DB push config command.

Adding Edge Nodes for Monitoring

These are edge nodes that are not part of the Spark Standalone cluster.

  1. Change to the work/<clustername> directory.
  2. Open the hydra_hosts_override.yml file for editing.
  3. Add the new host entry to the already existing hosts for Pulse to monitor.
  4. Run the accelo reconfig cluster command for clusters with edge nodes that require monitoring by Pulse. Alternatively, for comprehensive coverage, perform a reconfig cluster on all clusters.
  5. Verify that the hydra_hosts.yml file now contains the new hosts as well.

Configure Gauntlet

Updating the Gauntlet Crontab Duration

  1. Check whether the ad-core.yml file is present.
  2. If the file is not present, generate it.
  3. Edit the ad-core.yml file:

a. Open the file.

b. Update the CRON_TAB_DURATION env variable in the ad-gauntlet section.

This makes gauntlet run every 2 days at midnight.
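Assuming standard five-field cron syntax, "every 2 days at midnight" corresponds to an entry like the following (the surrounding keys are illustrative, not the exact ad-core.yml layout):

```yaml
# Illustrative ad-gauntlet fragment (structure assumed):
ad-gauntlet:
  environment:
    CRON_TAB_DURATION: "0 0 */2 * *"   # minute hour day-of-month month day-of-week
```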

c. Review the updated file; it should contain the new value.

d. Save the file.

  4. Restart the gauntlet service.

Updating the Gauntlet Dry Run Mode

  1. Check whether the ad-core.yml file is present.
  2. If the file is not present, generate it.
  3. Edit the ad-core.yml file:

a. Open the file.

b. Update the DRY_RUN_ENABLE env variable in the ad-gauntlet section. With dry run disabled, gauntlet deletes the older Elastic indices and MongoDB data.

c. Review the updated file; it should contain the new value.

d. Save the file.

  4. Restart the gauntlet service.

Updating MongoDB Cleanup and Compaction Frequency in Hours

By default, when dry run is disabled, MongoDB cleanup and compaction run once a day. To configure the frequency, follow the steps below.

  1. Run the gauntlet configuration command.
  2. Answer the prompts; if you are unsure how many days of data you wish to retain, proceed with the default values.
  3. When prompted, specify the hours of the day during which MongoDB cleanup and compaction should run. The value must be a comma-separated list of hours in 24-hour notation.
  4. Apply the configuration. The next time gauntlet runs, MongoDB cleanup and compaction will run at the specified hours, once per hour.
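The cleanup-hours prompt expects values such as 0,6,12,18 (run at midnight, 06:00, 12:00, and 18:00). A small checker for that format (ours, not part of accelo):

```shell
# Check that a value is a comma-separated list of hours in 24-hour notation.
valid_hours_csv() {
  local h IFS=','
  for h in $1; do
    case "$h" in
      ''|*[!0-9]*) echo "invalid"; return 1 ;;
    esac
    if [ "$h" -gt 23 ]; then echo "invalid"; return 1; fi
  done
  echo "valid"
}
```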

Enabling (TLS) HTTPS for Pulse Web UI Configuration Using ad-proxy

Deployment and Configuration

  1. Copy the cert.crt, cert.key, and ca.crt (optional) files to the $AcceloHome/config/proxy/certs location.
  2. Check whether the ad-core.yml file is present.
  3. If the ad-core.yml file is not present, generate it.
  4. Modify the ad-core.yml file:

a. Open the ad-core.yml file.

b. Remove the ports: field in the ad-graphql section of ad-core.yml.

c. Review the resulting ad-graphql section; it should no longer contain the ports: field.

d. Save the file.

  5. Restart the ad-graphql container.
  6. Check that the port is no longer exposed to the host.
  7. Check whether there are any errors in the ad-graphql container.
  8. To deploy the ad-proxy add-on, run the deploy addons command, select Proxy from the list, and press Enter.
  9. Check whether there are any errors in the ad-proxy container.
  10. You can now access the Pulse UI at https://<pulse-server-hostname>. By default, port 443 is used.

Configuration

If you want to change the SSL port to another port, follow the below steps:

  1. Check whether the ad-proxy.yml file is present.
  2. Generate the ad-proxy.yml file if it is not present.
  3. Modify the ad-proxy.yml file:

a. Open the ad-proxy.yml file.

b. Change the host port in the ports list to the desired port (for example, 6003).

c. Save the file.

  4. Restart the ad-proxy container.
  5. Check that there are no errors in the ad-proxy container.
  6. You can now access the Pulse UI at https://<pulse-server-hostname>:6003.
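For a host port of 6003, the ports entry would look roughly like this (the container-side port 443 and the surrounding keys are assumptions based on the default behavior described above):

```yaml
# Illustrative ad-proxy.yml fragment (structure assumed):
ad-proxy:
  ports:
    - "6003:443"   # host port 6003 -> container HTTPS port
```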

Set Up LDAP for Pulse UI

  1. Check whether the ldap.conf file is present.
  2. Run the accelo config ldap command to generate the default ldap.conf if it is not already present.
  3. Edit the file at $AcceloHome/config/ldap/ldap.conf.
  4. Configure the following properties in the file:
  • LDAP FQDN: the FQDN where the LDAP server is running

    • host = [FQDN]
  • If port 389 is being used:

    • insecureNoSSL = true
  • SSL root CA certificate:

    • rootCA = [CERTIFICATE_FILE_PATH]
  • bindDN: used for the LDAP search; must be a member of the admin group

  • bindPW: the password for the bindDN user; can be removed later once LDAP is enabled

  • baseDN used for the user search

    • Eg: (cn=users, cn=accounts, dc=acceldata, dc=io)
  • Filter used for the user search

    • Eg: (objectClass=person)
  • baseDN used for the group search

    • Eg: (cn=groups, cn=accounts, dc=acceldata, dc=io)
  • Object class used for the group search

    • Eg: (objectClass=posixgroup)

You can use the ldapsearch utility to check whether a user has search entry access and group access in the LDAP directory.
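A typical access check uses the standard ldapsearch utility. The sketch below builds the invocation; every DN and the filter are placeholders to replace with your directory's values:

```shell
# Print an ldapsearch command for verifying search access (simple bind, port 389).
ldap_check_cmd() {
  # $1 = LDAP host, $2 = bindDN, $3 = baseDN, $4 = search filter
  printf 'ldapsearch -x -H ldap://%s:389 -D "%s" -W -b "%s" "%s"\n' \
    "$1" "$2" "$3" "$4"
}

# Example (placeholders):
#   ldap_check_cmd ldap.example.io "uid=binduser,cn=users,cn=accounts,dc=acceldata,dc=io" \
#     "cn=users,cn=accounts,dc=acceldata,dc=io" "(objectClass=person)"
```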
  5. If the file is already generated, the command asks for the LDAP credentials to validate the connectivity and configuration, as described in the following steps.
  6. Run the accelo config ldap command.
  7. Enter the LDAP user credentials when prompted.
  8. If everything went correctly, a confirmation message is displayed.
  9. Press 'y' and then press 'Enter'.
  10. Push the LDAP config.
  11. Run the deploy addon command.
  12. Select LDAP from the list shown and press 'Enter'.
  13. Run the restart command.
  14. Open the Pulse Web UI and create the default roles.
  15. Add an ops role with the required access; all incoming users who log in via LDAP will automatically come under this role.

Spark Jars Placements and Spark Config Changes

Perform the following steps for all the Spark Cluster Nodes:

  1. Add the required configuration to the metrics.properties file for Spark TimeSeries data.
  2. Add the required configuration to the spark-defaults.conf file for the Events data.
  3. Place the ad-spark-hook.jar file in the designated directory on each node.
  4. Restart all the Spark services.
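Both additions follow Spark's standard configuration shapes. The fragment below shows that shape only; the actual class names and paths come from the Pulse distribution, so every <placeholder> must be replaced with the values Acceldata provides:

```properties
# metrics.properties - a metrics sink entry has the form *.sink.<name>.class=<class>
*.sink.pulse.class=<sink-class-from-pulse-distribution>

# spark-defaults.conf - register the event listener and put the hook jar on the classpath
spark.extraListeners=<listener-class-from-pulse-distribution>
spark.driver.extraClassPath=<path-to>/ad-spark-hook.jar
spark.executor.extraClassPath=<path-to>/ad-spark-hook.jar
```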

DotLog Download

We have introduced a feature that allows downloading of service logs in .log format. This file is not the original server log but an xlsx sheet merged into a .log format.

Perform the following to add a configurable parameter that enables this feature:

  1. Insert the dotLogFileDownload parameter into the feature flags property of the ad-graphql section in the file $Acceldata_Home/config/docker/ad-core.yml.
  2. Restart the ad-graphql service.

Perform the following to enable the new search options:

  1. Locate the "ad-graphql" section in the file $Acceldata_Home/config/docker/ad-core.yml and, under the "environment" key, add the required line.
  2. Restart the ad-graphql service.

Does a user in the Spark Standalone environment still see the Spark option in the left menu even after their access has been revoked from the role?

Create a different role that does not have Spark permission and assign that role to the user. Alternatively, you can leave it as is because even if the Spark entry is visible in the left navigation, the user will not be able to access it if access has been revoked from their role.

Are non-admin users in the Spark Standalone environment able to access Spark even though they have the appropriate role permissions for accessing Spark?

In the role edit window, click on "Select All" just below the Page permissions. Then, remove any permissions that you do not wish to grant and save the role. Any user assigned to this role should now have access to Spark in the Spark Standalone environment.

What is the reason for the absence of the Oozie workflow link between the Oozie workflow and the application ID in PULSE for a Spark job?

The Spark job's application ID is generated by the Oozie service. It appears in the Pulse UI only if it is available in Oozie's Web Service UI; otherwise, it is not displayed in Pulse.
