Upgrade from Version 2.2.1 to 3.1.0

This document describes the steps to migrate Pulse from version 2.2.1 to version 3.1.0. You must perform the steps mentioned in this document on all your clusters.

Backup Steps

  1. Take a backup of the Dashplots charts using the Export option.
  2. Take a backup of the Alerts using the Export option.

Migration Steps

  1. Requires Pulse server downtime.
  2. Requires re-installation of the Pulse agents running on all the cluster nodes.

Please plan your migrations accordingly.

  1. (Optional) Execute the following steps only on the standalone nodes of a multi-node Pulse deployment.

a. Generate the encrypted string for the mongodb://accel:<MONGO_PASSWORD>@<PULSE_MASTER_HOST>:27017 Mongo URI by executing the following command.

Copy the output of the above command.

b. Add the following environment variables to the /etc/profile.d/ad.sh file.


Once you execute the above steps, you must receive the output as shown in the following image.

c. Source the /etc/profile.d/ad.sh file by executing the following command.

  2. Stop the ad-streaming and ad-connector connectors by executing the following commands.
  3. Enter the Mongo container by executing the following command.
  4. Log in to the Mongo shell by executing the following command.
  5. Execute the following commands.
  6. Rename the collection by executing the following command.

You must get a response which says { "ok": 1 }.

  7. Exit the Mongo shell by executing the following command.
  8. Ensure that you are still in the ad-db container bash shell. Use the following command to export the past 7 days of data with the required fields from the tez_queries_details collection. You can refer to this link to convert a date to an epoch value.

You must get a response such as exported # record(s).
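For the date-to-epoch conversion mentioned above, the value can also be computed directly from the shell. A minimal sketch assuming GNU date (the variable name CUTOFF_EPOCH is illustrative, not from the original document):

```shell
# Epoch seconds for "7 days ago", usable as the lower bound of the
# 7-day window in the export described above (requires GNU date).
CUTOFF_EPOCH=$(date -u -d "7 days ago" +%s)
echo "$CUTOFF_EPOCH"

# A fixed date converts the same way:
date -u -d "2023-01-01 00:00:00" +%s   # prints 1672531200
```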

  9. Using the following command, import the data file generated by the preceding command into the yarn_tez_queries collection.

You must get a response such as # document(s) imported successfully. 0 document(s) failed to import.

  10. Execute the following command to complete the migration.

You must receive the following response.

  11. Download the new CLI with version 3.1.0.
  12. Execute the following steps to migrate to 3.0.0.

a. Execute the following CLI migrate command.

b. Based on whether you want to migrate as a root user or a non-root user, execute the commands from one of the following columns.

Non Root User

a. Disable all the Pulse services by executing the following command.

docker stop $(docker ps -aq --filter "network=ad-default") && docker rm $(docker ps -aq --filter "network=ad-default")

b. Change the ownership of all data directories to 1000:1000 by executing the following commands.

1. sudo chown -R 1000:1000 $AcceloHome/data

2. sudo chown -R 1000:1000 $AcceloHome/work/<ClusterName>/clusterkey

c. Execute the following migration command with the -s flag.

accelo migrate -v 3.0.0 -s

d. Execute the following command to uninstall the Pulse Hydra agent from all the currently active cluster nodes.

accelo uninstall remote

Root User

a. If the accelo CLI is going to be run as a root user, execute the following command.

accelo migrate -v 3.0.0

You must repeat steps 12.a, 12.b, and 12.c for all the clusters configured on the Pulse server, one by one.

  13. Execute the following steps to migrate to 3.1.0.

a. Execute the following CLI migrate command.

b. Based on whether you want to migrate as a root user or a non-root user, execute the commands from one of the following columns.

Non Root User

a. Disable all the Pulse services by executing the following command.

docker stop $(docker ps -aq --filter "network=ad-default") && docker rm $(docker ps -aq --filter "network=ad-default")

b. Change the ownership of all data directories to 1000:1000 by executing the following commands.

1. sudo chown -R 1000:1000 $AcceloHome/data

2. sudo chown -R 1000:1000 $AcceloHome/work/<ClusterName>/clusterkey

c. Execute the following migration command with the -s flag.

accelo migrate -v 3.1.0 -s

Root User

a. If the accelo CLI is going to be run as a root user, execute the following command.

accelo migrate -v 3.1.0

  14. Execute the following command to deploy the Pulse core components.
  15. Execute the following command to deploy the required add-ons.
  16. Execute the following command to reconfigure all the clusters configured in the Pulse server. The reconfigure command updates the configurations for all the clusters.
  17. Execute the following command for remote uninstallation.
  18. Execute the following command to deploy the Hydra agents for all the clusters configured in the Pulse server.
  19. (Optional) Execute the following commands to deploy auto-action playbooks, if you have the ad-director add-on component deployed.
  20. Execute the following command to update the HDFS dashboard data.
  21. Execute the steps in one of the following columns based on whether you are using a Pulse single-node or multi-node setup.

Single-node multi-Kerberos Pulse Setup: Update the spark.events.url section in the acceldata<clustername>.conf file with the following URL.

spark.events.url = "http://ad-sparkstats:19004/events"

Multi-Node multi-Kerberos Pulse Setup: Update the spark.events.url section in the acceldata<clustername>.conf file with the following URL.

spark.events.url = "http://<IP_WHERE_SPARKSTATS_CONTAINER_IS_RUNNING>:19004/events"
  22. Execute the following command to add the ad-events connection info to the acceldata.conf file and Mongo.

Deploy NATS Container

Pulse now uses the NATS queue. To enable the NATS queue, you must execute the following command.


Set Time Zone for Logs

The commands in this section help you update the environment variables so that they match the server time zone in JODA time format.

  1. Execute the following command if the ad-logsearch.yml file is not present in the $AcceloHome/config/docker/addons directory.
  2. Open the ad-logsearch.yml file from the $AcceloHome/config/docker/addons directory.
  3. Update the following environment variable's value to match the server's time zone in the JODA format, and add it to the ad-logstash section.
  4. Execute the following command to restart the Logstash container so that the above changes are applied.
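As an illustration only, assuming a docker-compose-style layout for ad-logsearch.yml and a hypothetical TZ variable name (the actual variable is whichever one the step above names), the ad-logstash section might look like:

```yaml
ad-logstash:
  environment:
    # Hypothetical variable name; set it to your server's time zone
    # using a JODA-style zone ID such as Asia/Kolkata or America/New_York.
    - TZ=Asia/Kolkata
```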

Oozie Connector Update

Add the following properties to the acceldata/config/acceldata<CLUSTER NAME>.conf file, under the oozie.connectors section. Select the column based on the database applicable to your setup.

mariadb/mysql

type = "mariadb/mysql"
user = "<DB_USERNAME>"
pass = "<ENCRYPTED_DB_PASSWORD>"
driver = "org.mariadb.jdbc.Driver"

PostgreSQL

type = "postgresql"
user = "<DB_USERNAME>"
pass = "<ENCRYPTED_DB_PASSWORD>"
driver = "org.postgresql.Driver"

Oracle

type = "oracle"
user = "<DB_USERNAME>"
pass = "<ENCRYPTED_DB_PASSWORD>"
driver = "oracle.jdbc.driver.OracleDriver"
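Assembled into the .conf file, a MariaDB/MySQL entry might look like the following sketch. The surrounding HOCON block layout and the connector label are assumptions; the four keys come from the table above:

```hocon
oozie.connectors {
  # "oozie-db" is a hypothetical connector label.
  oozie-db {
    type   = "mariadb/mysql"
    user   = "<DB_USERNAME>"
    pass   = "<ENCRYPTED_DB_PASSWORD>"   # encrypted password, not plain text
    driver = "org.mariadb.jdbc.Driver"
  }
}
```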

Pulse Hooks Deployment

Pulse now uses the NATS queue. To get the streaming data from NATS, you must configure and deploy the JAR files in various folders. This section describes the folders in which you must deploy various JAR files.

Acceldata Hive Hook for HDP2

  • Hook: Acceldata Hive Hook for HDP2

  • Binary: ad-hive-hook-HDP2-1.0.jar

  • Deployment: Place the above JAR file in the /opt/acceldata/ folder on the HDFS node.

  • Configuration Steps:

    • ad.events.streaming.servers (ad-events:4222)
    • ad.cluster (your cluster name, ex: ad_hdp3_dev)
    • hive.exec.failure.hooks (io.acceldata.hive.AdHiveHook)
    • hive.exec.post.hooks (io.acceldata.hive.AdHiveHook)
    • hive.exec.pre.hooks (io.acceldata.hive.AdHiveHook)
    • Export AUX_CLASSPATH=/opt/acceldata/ad-hook_HDP2-assembly-1.0.jar under the hive-env template and the hive-interactive-env template in the Hive service config.
  • Action items: You must add the ad.events.streaming.servers property under the Custom hive-site and Custom hive-interactive-site sections.
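Collected into property form, the HDP2 Hive hook settings above amount to the following sketch (ad_hdp3_dev stands in for your own cluster name, as in the list above):

```properties
# Custom hive-site and Custom hive-interactive-site
ad.events.streaming.servers=ad-events:4222
ad.cluster=ad_hdp3_dev
hive.exec.pre.hooks=io.acceldata.hive.AdHiveHook
hive.exec.post.hooks=io.acceldata.hive.AdHiveHook
hive.exec.failure.hooks=io.acceldata.hive.AdHiveHook

# hive-env template and hive-interactive-env template (shell export line)
export AUX_CLASSPATH=/opt/acceldata/ad-hook_HDP2-assembly-1.0.jar
```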

Acceldata Hive Hook for HDP3

  • Hook: Acceldata Hive Hook for HDP3

  • Binary: ad-hive-hook-HDP3-1.0.jar

  • Deployment: Place the above JAR file in the /opt/acceldata/ folder on the HDFS node.

  • Configuration Steps:

    • ad.events.streaming.servers (ad-events:4222)
    • ad.cluster (your cluster name, ex: ad_hdp3_dev)
    • hive.exec.failure.hooks (io.acceldata.hive.AdHiveHook)
    • hive.exec.post.hooks (io.acceldata.hive.AdHiveHook)
    • hive.exec.pre.hooks (io.acceldata.hive.AdHiveHook)
    • Export AUX_CLASSPATH=/opt/acceldata/ad-hook_HDP3-assembly-1.0.jar under the hive-env template and the hive-interactive-env template in the Hive service config.
  • Action items: You must add the ad.events.streaming.servers property under the Custom hive-site and Custom hive-interactive-site sections.

Acceldata Tez Hook for HDP2

  • Hook: Acceldata Tez Hook for HDP2

  • Binary: ad-tez-hook-HDP2-1.0.jar

  • Deployment: You must place the above JAR file in the tarball available in HDFS at /hdp/apps/2.6.5.0-292/tez/tez.tar.gz.

  • Configuration Steps:

    • ad.events.streaming.servers (ad-events:4222)
    • ad.cluster (your cluster name, ex: ad_hdp3_dev)
    • tez.history.logging.service.class (io.acceldata.hive.AdTezEventsNatsClient)
  • Action items: You must add the ad.events.streaming.servers property under the Custom tez-site section.
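In property form, the Tez hook settings above come down to this sketch (ad_hdp3_dev stands in for your own cluster name, as in the list above):

```properties
# Custom tez-site
ad.events.streaming.servers=ad-events:4222
ad.cluster=ad_hdp3_dev
tez.history.logging.service.class=io.acceldata.hive.AdTezEventsNatsClient
```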

Acceldata Tez Hook for HDP3

  • Hook: Acceldata Tez Hook for HDP3

  • Binary: ad-tez-hook-HDP3-1.0.jar

  • Deployment: You must place the above JAR file in the tarball available in HDFS at /hdp/apps/3.1.4.0-315/tez/tez.tar.gz.

  • Configuration Steps:

    • ad.events.streaming.servers (ad-events:4222)
    • ad.cluster (your cluster name, ex: ad_hdp3_dev)
    • tez.history.logging.service.class (io.acceldata.hive.AdTezEventsNatsClient)
  • Action items: You must add the ad.events.streaming.servers property under the Custom tez-site section.

Acceldata Spark Hook HDP2

  • Hook: Acceldata Spark Hook HDP2
  • Binary: ad-spark-hook-HDP2-1.0.jar
  • Deployment: You must place the above JAR file in the /opt/acceldata/ folder on the HDFS node.
  • Configuration Steps:
    • spark.ad.events.streaming.servers (ad-events:4222)
    • spark.ad.cluster (your cluster name, ex: ad_hdp3_dev)
    • You must use this JAR during Spark job submission. The command to submit the JAR file is as follows.
  • Action items:
    • You must add the spark.ad.events.streaming.servers property under Custom spark2-defaults.
    • You must modify the spark.extraListeners property. This property is located under the Advanced spark2-defaults. You must provide the value as io.acceldata.sparkhook.AdSparkHook.
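In property form, the Spark hook settings above come down to this sketch (ad_hdp3_dev stands in for your own cluster name, as in the list above):

```properties
# Custom spark2-defaults
spark.ad.events.streaming.servers=ad-events:4222
spark.ad.cluster=ad_hdp3_dev

# Advanced spark2-defaults
spark.extraListeners=io.acceldata.sparkhook.AdSparkHook
```

At submission time the hook JAR itself is typically passed along with spark-submit's --jars flag; that usage is an assumption here, since this copy of the document does not reproduce the original submission command.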

Acceldata Spark Hook HDP3

  • Hook: Acceldata Spark Hook HDP3
  • Binary: ad-spark-hook-HDP3-1.0.jar
  • Deployment: You must place the above JAR file in the /opt/acceldata/ folder on the HDFS node.
  • Configuration Steps:
    • spark.ad.events.streaming.servers (ad-events:4222)
    • spark.ad.cluster (your cluster name, ex: ad_hdp3_dev)
    • You must use this JAR during Spark job submission. The command to submit the JAR file is as follows.

Connector Level Configurations

This section describes the configurations you must perform on various connector microservices for Pulse deployments.

Sparkstats

You must generate the required connectors YAML file if it is not already available. You can execute the following command to generate the YAML file.


Configuration for Sparkstats

You must configure the ad-core-connectors.yml file. To accomplish this, you must add the following environment variable under ad-sparkstats.

NATS_SERVER_HOST_LIST=ad-events:4222
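In ad-core-connectors.yml, the addition might look like the following sketch; the surrounding YAML layout is an assumption, and only the service name and variable come from this document:

```yaml
ad-sparkstats:
  environment:
    - NATS_SERVER_HOST_LIST=ad-events:4222
```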

YARN Applications and Hive Queries

You must perform the following configuration in Pulse configurations, under individual collectors.


New Dashplots Version and Generic Reporting Feature

  • Splashboards and Dashplots are never automatically overwritten. Either delete the dashboard to acquire the new version, or set the OVERWRITE_SPLASHBOARDS and OVERWRITE_DASHPLOTS environment variables to overwrite the existing splashboard dashboards with the newer version.
  • To access the most recent dashboard, delete the HDFS Analytics dashboard from the splashboard studio and then refresh the configuration.
  1. Navigate to ad-core.yml.
  2. In the graphql section, set the OVERWRITE_SPLASHBOARDS and OVERWRITE_DASHPLOTS environment variables to true (the default value is false).

  1. Export all the dashplots that are not seeded by default to a file before performing the upgrade.
  2. Log in to the ad-pg_default docker container with the following command after the upgrade to 3.0.3.
  3. Copy, paste, and execute the snippet attached in the migration file as is, and press Enter to run it.
  4. Go to Dashplots studio and import the zip file exported in step 1 of this section, with the < 3.0.0 dashboard check box selected.
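The graphql environment variables described above might be set like this in ad-core.yml; the YAML layout here is an assumption, and only the variable names, values, and file name come from the document:

```yaml
graphql:
  environment:
    # Both variables default to false; true overwrites the
    # existing seeded dashboards with the newer versions.
    - OVERWRITE_SPLASHBOARDS=true
    - OVERWRITE_DASHPLOTS=true
```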

Troubleshooting

FSAnalytics Issue

After the upgrade, if you encounter the following exception in the fsanalytics connector when executing the fsa load command, execute the steps below to troubleshoot the issue.


Execute the following steps to resolve the above exception.

  1. Remove the ${ACCELOHOME}/data/fsanalytics/${ClusterName}/meta-store.dat file.
  2. Restart the ad-fsanalytics container using the following command.
  3. Execute the following command to generate the meta store data again.