Configure the Cluster
After deploying the databases and the core and add-on services, configure the cluster using the following steps.
- Create a JSON configuration file named inputs.json by copying the content below.
{
"standAloneDistroType": "",
"isEdgeNode": false,
"clusterType": "",
"clusterURL": "",
"username": "",
"password": "",
"clusterName": "",
"clusterDisplayName": "",
"clusterOriginalName": "",
"stackVersion": "",
"sparkVersion": "",
"spark2Version": "",
"spark3Version": "",
"kafkaVersion": "",
"hBaseVersion": "",
"hiveVersion": "",
"sparkHistoryHDFSPath": "",
"spark2HistoryHDFSPath": "",
"spark3HistoryHDFSPath": "",
"hiveMetaStoreDBName": "",
"hiveMetastoreDBURI": "",
"hiveMetastoreDBUsername": "",
"hiveMetastoreDBPassword": "",
"oozieDBName": "",
"oozieDBURL": "",
"oozieDBUsername": "",
"oozieDBPassword": "",
"oozieDBTimezone": "",
"isKerberosEnabled": false,
"kerberosRealm": "",
"kerberosKeytabUsername": "",
"kerberosPrinciple": "",
"isParcelBasedDeployment": false,
"SSHKeyAlgo": "",
"httpsEnabledInCluster": false,
"sshUser": "",
"hmsJMXPort": "",
"hs2JMXPort": "",
"zkJMXPort": "",
"zmJMXPort": "",
"schemaRegistryJMXPort": "",
"kafkaBrokerPort": "",
"kafkaLogDirs": "",
"kafkaJMXPort": "",
"mm2JMXPort": "",
"kafkaConnectJMXPort": "",
"cruiseControlJMXPort": "",
"kafka3BrokerPort": "",
"Kafka3JMXPort": "",
"kafka3LogDirs": "",
"kafkaSABootstrapServer": "",
"kraftControllerJMXPort": "",
"kafka3ConnectJMXPort": "",
"cruiseControl3JMXPort": "",
"kafkaMirrorMaker3JMXPort": "",
"rangerAdminJMXPort": "",
"rangerUserSyncJMXPort": "",
"rangerTagSyncJMXPort": "",
"rangerKMSJMXPort": "",
"trinoCoordinatorJMXPort": "",
"trinoWorkerJMXPort": "",
"installKapxy": false,
"enableNTPStats": "",
"enableLogSearch": false,
"enableImpalaAgent": false,
"isHiveServer2InteractiveKerberized": false,
"isImpalaKerberized": false,
"standAloneComponents": "",
"sASparkHistoryURL": "",
"zkEnabled": false,
"nifiZKEnabled": false,
"mm2Enabled": false,
"nifiRegistryInstalled": false,
"nifiRegistryURL": "",
"mm2Nodes": "",
"kafkaConnectEnabled": false,
"cruiseControlEnabled": false,
"cruiseControl3Enabled": false,
"sparkOnK8s": false
}
Replace the placeholder values with actual values specific to your Kubernetes deployment.
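Before submitting the file, it can help to confirm it parses as valid JSON and that no essential field was left blank. A minimal sketch; the required-field list below is illustrative, not the authoritative schema:

```python
import json

# Fields that are typically mandatory; adjust for your deployment
# (this tuple is an illustrative assumption, not the full schema).
REQUIRED = ("clusterType", "clusterURL", "username", "password", "clusterName")

def missing_fields(config: dict) -> list:
    """Return the required fields that are absent or left empty."""
    return [f for f in REQUIRED if not config.get(f)]

def check_inputs(path: str = "inputs.json") -> list:
    """Parse the file (raises ValueError on malformed JSON) and report gaps."""
    with open(path) as fh:
        return missing_fields(json.load(fh))

# Example on an in-memory payload:
demo = {"clusterType": "Cloudera", "clusterURL": "http://cm:7180",
        "username": "admin", "password": "", "clusterName": "cluster1"}
print(missing_fields(demo))  # ['password']
```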
- If LDAP-based login is required, copy the following file, update it with the appropriate values, and save it as ldap.conf.
ldap {
configuration {
# The LDAP host
host = "<host name>",
# The following field is required if using port 389.
insecureNoSSL = true
# insecureSkipVerify = true
rootCA = "/etc/dex/ldap.ca",
bindDN = "Administrator@DC.ADSRE.COM",
bindPW = "<bindPW>",
encryptedPassword = true,
specialSearch = false,
prefix = "",
suffix = "",
userSearch {
# Would translate to the query "(&(objectClass=person)(uid=<username>))"
baseDN = "DC=DC,DC=adsre,DC=com",
filter = "(objectClass=person)",
username = "sAMAccountName",
idAttr = "sAMAccountName",
emailAttr = "mail",
nameAttr = "name"
# Can be 'sub' or 'one'
scope = "sub"
}
groupSearch {
# Would translate to the query "(&(objectClass=group)(member=<user uid>))"
baseDN = "DC=DC,DC=adsre,DC=com",
filter = "(objectClass=group)",
# Use if full DN is needed and not available as any other attribute
# Will only work if "DN" attribute does not exist in the record
# userAttr: DN
userAttr = "DN",
groupAttr = "member",
nameAttr = "name"
# Can be 'sub' or 'one'
scope = "sub"
}
}
}
- Upload the krb5.conf and cacerts files (if applicable for the cluster).
Kerberos: Ensure the krb5.conf file is correctly configured.
curl --request PUT \
--url http://<Ingress API for Manager Server>/api/v1/mserver/upload/krb/<clustername> \
--header '<basic credentials>' \
--header 'Content-Type: multipart/form-data' \
--form krb5.conf=@krb5.conf \
--form kerberos.keytab=@kerberos.keytab
Cacerts: Ensure the cacerts file is correctly configured.
curl --request PUT \
--url http://<Ingress API for Manager Server>/api/v1/mserver/upload/krb/<clustername> \
--header '<basic credentials>' \
--header 'Content-Type: multipart/form-data' \
--form cacerts=@cacerts
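The `<basic credentials>` placeholder in these commands is typically an `Authorization` header carrying Base64-encoded `user:password` (HTTP Basic auth — an assumption here; substitute whatever scheme your Manager Server ingress enforces):

```python
import base64

def basic_auth_header(user: str, password: str) -> str:
    """Build the value for: --header 'Authorization: Basic <token>'."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Authorization: Basic {token}"

print(basic_auth_header("admin", "secret"))
# Authorization: Basic YWRtaW46c2VjcmV0
```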
- Upload the LDAP configuration file.
curl --request POST \
--url http://<Ingress API for Manager Server>/api/v1/mserver/config/ldap \
--header '<basic credentials>' \
--header 'Content-Type: multipart/form-data' \
--form ldap.conf=@ldap.conf
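If curl is unavailable, the same multipart uploads can be issued from Python's standard library. A sketch under the same endpoint and form-field names as the curl examples above (the host and auth details are placeholders):

```python
import urllib.request
import uuid

def multipart_body(files: dict) -> tuple:
    """Encode {form-field: (filename, bytes)} as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for field, (fname, data) in files.items():
        parts += [f"--{boundary}".encode(),
                  (f'Content-Disposition: form-data; name="{field}"; '
                   f'filename="{fname}"').encode(),
                  b"", data]
    parts += [f"--{boundary}--".encode(), b""]
    return b"\r\n".join(parts), f"multipart/form-data; boundary={boundary}"

body, ctype = multipart_body({"ldap.conf": ("ldap.conf", b"ldap { ... }")})
req = urllib.request.Request(
    "http://manager.example.com/api/v1/mserver/config/ldap",  # placeholder host
    data=body, method="POST",
    headers={"Content-Type": ctype})
# urllib.request.urlopen(req)  # send once the host and auth header are filled in
```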
- Configure the cluster by calling the cluster configuration API.
curl --request POST \
--url http://<Ingress API for the Manager Server>/api/v1/mserver/config/cluster \
--header '<basic credentials>' \
--header 'Content-Type: application/json' \
--data @inputs.json
- Push the configuration.
curl --request PUT \
--url http://<Ingress API for the Manager Server>/api/v1/mserver/dbpush/<clustername> \
--header '<basic credentials>'
- Configure the Alert notifications.
curl --request PUT \
--url http://<Ingress API for the Manager Server>/api/v1/mserver/alerts/update/configs/<clustername> \
--header '<basic credentials>' \
--data '{
"clusterName": "<clustername>",
"metricGroups": [
"common",
"druid",
"nifi",
"ntpd",
"anomaly",
"chrony",
"customApp"
],
"notifications": {
"jira": {
"enable": false,
"maxJiraMessageThreshold": 1,
"defaultSnoozeIntervalInSecs": 0,
"projectKey": "",
"username": "",
"jiraToken": "",
"url": "",
"issueType": "",
"priority": "",
"labels": []
},
"servicenow": {
"enable": false,
"maxServiceNowMessageThreshold": 1,
"defaultSnoozeIntervalInSecs": 0,
"tableNames": [],
"bearerToken": "",
"url": "",
"caller": ""
},
"action": {
"enable": false,
"maxMessageThreshold": 1,
"defaultSnoozeIntervalInSecs": 0
},
"telegram": {
"enable": false,
"maxTelegramMessageThreshold": 1,
"defaultSnoozeIntervalInSecs": 0,
"defaultBotToken": "",
"defaultChatIds": []
},
"line": {
"enable": false,
"maxLineMessageThreshold": 1,
"defaultSnoozeIntervalInSecs": 0,
"apiKeys": []
},
"slack": {
"enable": false,
"defaultIncomingWebhookUrls": "",
"maxSlackMessageThreshold": 1,
"defaultSnoozeIntervalInSecs": 0
},
"pagerduty": {
"enable": false,
"routingKey": "",
"maxPagerdutyIncidentThreshold": 1,
"defaultSnoozeIntervalInSecs": 0
},
"hangouts": {
"enable": false,
"chatroom": {
"webhookUrl": ""
},
"maxMsgThreshold": 1,
"defaultSnoozeIntervalInSecs": 0
},
"webhook": {
"enable": false,
"defaultWebhookUrls": "",
"maxMsgThreshold": 1,
"defaultSnoozeIntervalInSecs": 0
},
"fileLog": {
"enable": false,
"maxMsgThreshold": 1,
"defaultSnoozeIntervalInSecs": 0,
"logDirectory": "",
"logFileName": "",
"rollingFileSizeInStr": "",
"logFileUpperIndex": 0,
"logFileLowerIndex": 0
},
"email": {
"enable": true,
"defaultToEmailIds": "",
"maxEmailThreshold": 1,
"defaultSnoozeIntervalInSecs": 0
},
"opsgenie": {
"enable": false,
"maxMessageThreshold": 1,
"defaultSnoozeIntervalInSecs": 0,
"apiKey": ""
},
"quantum": {
"enable": false,
"maxQuantumMessageThreshold": 1,
"defaultSnoozeIntervalInSecs": 0,
"url": "",
"subscriptionKey": "",
"ticketType": "Reporting an Outage",
"state": "Queue",
"apporPlatform": "Pulse",
"reportedBy": "",
"assignedTo": ""
},
"xmatters": {
"enable": false,
"maxXMattersMessageThreshold": 1,
"defaultSnoozeIntervalInSecs": 0,
"url": "",
"alertType": "ALERT",
"version": "alertapi-0.1",
"userGroups": []
},
"bigpanda": {
"enable": false,
"maxBigPandaMessageThreshold": 1,
"defaultSnoozeIntervalInSecs": 0,
"url": "",
"appKey": "",
"authToken": "",
"mnemonic": "",
"overrideAlertOnlyIndicator": "AlertOnly"
},
"microsoftteams": {
"enable": false,
"maxTeamsMessageThreshold": 0,
"defaultSnoozeIntervalInSecs": 0,
"webhookUrl": ""
}
},
"timezone": {
"clientTimeZone": "UTC",
"clientDateTimeFormat": "dd-M-yyyy hh:mm:ss a"
},
"PauseNotificationInterval": {
"StartTime": "",
"EndTime": ""
}
}'
If you want to update an existing configuration, use the Fetch Config API to retrieve the current configuration stored in MongoDB.
curl --request GET \
--url http://<Ingress API for the Manager Server>/api/v1/mserver/alerts/fetch/configs/<clustername> \
--header '<basic credentials>'
The response contains the current alert configuration stored in the database. If no configuration exists in MongoDB, a default configuration will be returned in the response.
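The fetch-then-update cycle works best as: GET the current configuration, change only the fields you need, and PUT the whole document back. A sketch of the modify step (field names follow the payload above; the recipient address is illustrative):

```python
import copy

def enable_email(config: dict, recipients: str) -> dict:
    """Return a copy of an alert config with email notifications enabled."""
    updated = copy.deepcopy(config)  # leave the fetched original untouched
    email = updated.setdefault("notifications", {}).setdefault("email", {})
    email["enable"] = True
    email["defaultToEmailIds"] = recipients
    return updated

# Example on a trimmed-down fetched config:
current = {"clusterName": "cluster1",
           "notifications": {"email": {"enable": False, "defaultToEmailIds": ""}}}
updated = enable_email(current, "ops@example.com")
print(updated["notifications"]["email"])
# {'enable': True, 'defaultToEmailIds': 'ops@example.com'}
```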
- Configure the Actions notifications.
curl --request PUT \
--url http://<Ingress API for the Manager Server>/api/v1/mserver/actions/update/configs/<clustername> \
--header '<basic credentials>' \
--data '{
"clusterName": "sparkOnk8s",
"notifications": {
"jira": {
"enable": false,
"projectKey": "",
"username": "",
"jiraToken": "",
"url": "",
"issueType": "",
"priority": "",
"labels": []
},
"slack": {
"enable": false,
"defaultIncomingWebhookUrls": ""
},
"email": {
"enable": false,
"defaultToEmailIds": ""
},
"quantum": {
"enable": false,
"url": "",
"subscriptionKey": "",
"ticketType": "Reporting an Outage",
"state": "Queue",
"apporPlatform": "Pulse",
"reportedBy": "",
"assignedTo": ""
},
"xmatters": {
"enable": false,
"url": "",
"groupDetails": [],
"alertType": "ALERT",
"version": "alertapi-0.1"
},
"bigpanda": {
"enable": false,
"maxBigPandaMessageThreshold": 0,
"defaultSnoozeIntervalInSecs": 0,
"url": "",
"appKey": "",
"authToken": "",
"mnemonic": "",
"overrideAlertOnlyIndicator": ""
}
},
"timezone": {
"clientTimeZone": "UTC",
"clientDateTimeFormat": "dd-M-yyyy hh:mm:ss a"
},
"lastUpdateTime": 0
}'
If you want to update an existing configuration, use the Fetch Config API to retrieve the current configuration stored in MongoDB.
curl --request GET \
--url http://<Ingress API for the Manager Server>/api/v1/mserver/actions/fetch/configs/<clustername> \
--header '<basic credentials>'
The response contains the current action configuration stored in the database. If no configuration exists in MongoDB, a default configuration will be returned in the response.
Configure Cluster Endpoint API
This API configures a cluster within Pulse by submitting the required cluster-specific details.
Endpoint: /api/v1/mserver/config/cluster
Request Type: POST or PUT (based on implementation)
{
"standAloneDistroType": "Spark",
"isEdgeNode": false,
"clusterType": "Hadoop",
"clusterURL": "http://cluster-manager.example.com",
"username": "admin",
"password": "secret",
"clusterName": "cluster1",
"clusterDisplayName": "Cluster One",
"clusterOriginalName": "cluster1-original",
"stackVersion": "7.1.0",
"sparkVersion": "2.4.0",
"spark2Version": "2.4.0",
"spark3Version": "3.1.1",
"kafkaVersion": "2.8.0",
"hBaseVersion": "2.2.2",
"hiveVersion": "3.1.0",
"sparkHistoryHDFSPath": "/spark-history",
"spark2HistoryHDFSPath": "/spark2-history",
"spark3HistoryHDFSPath": "/spark3-history",
"hiveMetaStoreDBName": "metastore",
"hiveMetaStoreDBURI": "jdbc:mysql://localhost/metastore",
"hiveMetaStoreDBUsername": "hive",
"hiveMetaStoreDBPassword": "hivepass",
"oozieDBName": "oozie",
"oozieDBURL": "jdbc:mysql://localhost/oozie",
"oozieDBUsername": "oozie",
"oozieDBPassword": "ooziepass",
"oozieDBTimezone": "UTC",
"isKerberosEnabled": false,
"kerberosRealm": "EXAMPLE.COM",
"kerberosKeytabUsername": "hdfs",
"kerberosPrinciple": "hdfs/_HOST@EXAMPLE.COM",
"isParcelBasedDeployment": true,
"SSHKeyAlgo": "rsa",
"httpsEnabledInCluster": true,
"sshUser": "root",
"hmsJMXPort": "9083",
"hs2JMXPort": "10000",
"zkJMXPort": "2181",
"zmJMXPort": "9999",
"schemaRegistryJMXPort": "8081",
"kafkaBrokerPort": "9092",
"kafkaLogDirs": "/var/log/kafka",
"kafkaJMXPort": "9999",
"mm2JMXPort": "9100",
"kafkaConnectJMXPort": "8083",
"cruiseControlJMXPort": "9093",
"kafka3BrokerPort": "9094",
"Kafka3JMXPort": "9998",
"kafka3LogDirs": "/var/log/kafka3",
"kafkaSABootstrapServer": "localhost:9092",
"kraftControllerJMXPort": "9997",
"kafka3ConnectJMXPort": "8084",
"cruiseControl3JMXPort": "9095",
"kafkaMirrorMaker3JMXPort": "9101",
"rangerAdminJMXPort": "6080",
"rangerUserSyncJMXPort": "6081",
"rangerTagSyncJMXPort": "6082",
"rangerKMSJMXPort": "9292",
"trinoCoordinatorJMXPort": "8086",
"trinoWorkerJMXPort": "8087",
"installKapxy": false,
"enableNTPStats": "false",
"enableLogSearch": true,
"enableImpalaAgent": false,
"isHiveServer2InteractiveKerberized": false,
"isImpalaKerberized": false,
"standAloneComponents": "Spark,Nifi",
"sASparkHistoryURL": "http://history-server.example.com",
"zkEnabled": true,
"nifiZKEnabled": false,
"mm2Enabled": false,
"nifiRegistryInstalled": true,
"nifiRegistryURL": "http://nifi-registry.example.com",
"mm2Nodes": "node1,node2",
"kafkaConnectEnabled": true,
"cruiseControlEnabled": true,
"cruiseControl3Enabled": false,
"sparkOnK8s": false
}
Field | Description |
---|---|
standAloneDistroType | Accepted values: Spark , Nifi , Kafka . Only used for standalone cluster configuration. |
isEdgeNode | Accepted values: true , false . Defaults to false . |
standAloneComponents | Accepted values: Spark , Nifi , Kafka . For standalone deployments. |
clusterType | Accepted values: Ambari , Cloudera , Stand-Alone , Custom , None . |
clusterURL | Full URL of Ambari or Cloudera Manager (e.g., http://hostname:port ). |
username | Username for Ambari or Cloudera Manager. |
password | Password for Ambari or Cloudera Manager. |
clusterName | Used to create configs, DB names, directories, and files. |
clusterDisplayName | Name shown in the UI. |
sparkVersion, spark2Version, spark3Version | Versions of Spark in use. |
kafkaVersion, hBaseVersion, hiveVersion | Versions of Kafka, HBase, and Hive. |
sparkHistoryHDFSPath, spark2HistoryHDFSPath, spark3HistoryHDFSPath | HDFS paths for Spark history logs. |
hiveMetaStoreDBName, hiveMetaStoreDBURI, hiveMetaStoreDBUsername, hiveMetaStoreDBPassword | Hive Metastore DB connection details. |
oozieDBName, oozieDBURL, oozieDBUsername, oozieDBPassword, oozieDBTimezone | Oozie DB connection details. |
isKerberosEnabled | Enable or disable Kerberos for the cluster. |
kerberosRealm | Kerberos realm name (e.g., EXAMPLE.COM ). |
kerberosKeytabUsername | Username used in keytab. |
kerberosPrinciple | Kerberos principal name (note: the API field name is misspelled; logically it should be kerberosPrincipal ). |
isParcelBasedDeployment | Always set true for K8s deployment. |
SSHKeyAlgo | SSH key algorithm (e.g., rsa , ecdsa ). |
httpsEnabledInCluster | Enable or disable HTTPS for services. |
sshUser | SSH user for accessing nodes. |
hmsJMXPort, hs2JMXPort, zkJMXPort, zmJMXPort | JMX ports for Hive, Zookeeper, etc. |
schemaRegistryJMXPort, kafkaJMXPort, mm2JMXPort, kafkaConnectJMXPort, cruiseControlJMXPort | JMX ports for Kafka ecosystem components. |
kafka3BrokerPort, Kafka3JMXPort, kafka3LogDirs | Ports and log directories for Kafka 3. |
kafkaSABootstrapServer | Kafka bootstrap URL for standalone setups. |
kraftControllerJMXPort, kafka3ConnectJMXPort, cruiseControl3JMXPort, kafkaMirrorMaker3JMXPort | Kafka 3 JMX ports. |
rangerAdminJMXPort, rangerUserSyncJMXPort, rangerTagSyncJMXPort, rangerKMSJMXPort | JMX ports for Ranger components. |
trinoCoordinatorJMXPort, trinoWorkerJMXPort | JMX ports for Trino components. |
installKapxy | Enable Kapxy (Kafka monitoring agent). |
enableNTPStats | Enable NTP synchronization stats. |
enableLogSearch | Enable Log Search service. |
enableImpalaAgent | Enable Impala metric collection agent. |
isHiveServer2InteractiveKerberized | Whether HiveServer2 LLAP is Kerberized. |
isImpalaKerberized | Whether Impala is Kerberized. |
zkEnabled, nifiZKEnabled | Enable Zookeeper for Spark/Nifi standalone components. |
mm2Enabled | Enable MirrorMaker2 monitoring. |
nifiRegistryInstalled | Whether NiFi Registry is installed. |
nifiRegistryURL | URL of NiFi Registry (if installed). |
mm2Nodes | Nodes involved in MM2 replication. |
kafkaConnectEnabled | Enable Kafka Connect monitoring. |
cruiseControlEnabled, cruiseControl3Enabled | Enable Cruise Control (Kafka 2/3). |
sparkOnK8s | Whether Spark runs on Kubernetes instead of YARN; enables Spark observability on Kubernetes. |
sASparkHistoryURL | Spark history server URL for standalone setups. |
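Because several field names mix casing (for example Kafka3JMXPort next to kafka3LogDirs), a misspelled key in the payload can fail silently. One way to catch typos before submitting is to diff your payload's keys against the template's (illustrative sketch using a two-field excerpt of the template):

```python
def unknown_keys(payload: dict, template: dict) -> set:
    """Keys present in the payload but absent from the reference template."""
    return set(payload) - set(template)

# Excerpt of the inputs.json template keys (illustrative, not the full set):
template = {"clusterName": "", "Kafka3JMXPort": "", "kafka3LogDirs": ""}
payload = {"clusterName": "c1", "kafka3JMXPort": "9998"}  # wrong casing
print(unknown_keys(payload, template))
# {'kafka3JMXPort'}
```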
Reconfigure the Cluster
The Reconfig Cluster action is used to apply changes made to the cluster configuration after the initial setup.
You can use the following command to reconfigure the cluster.
curl --request POST \
--url http://<Ingress API for the Manager Server>/api/v1/mserver/reconfig/cluster/<clustername> \
--header '<basic credentials>' \
--header 'Content-Type: application/json' \
--data '{
"forcePull": true
}'
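For scripting, the same reconfigure call can be built with Python's standard library. A sketch — the host is a placeholder, and the request is constructed but not sent until credentials are added:

```python
import json
import urllib.request

def reconfig_request(base_url: str, cluster: str, force_pull: bool = True):
    """Build the POST request for the reconfig endpoint (not yet sent)."""
    return urllib.request.Request(
        f"{base_url}/api/v1/mserver/reconfig/cluster/{cluster}",
        data=json.dumps({"forcePull": force_pull}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST")

req = reconfig_request("http://manager.example.com", "cluster1")
# urllib.request.urlopen(req)  # send once the auth header is added
```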