Setting up the Environment

To deploy your Acceldata stack using Ambari, you must prepare your deployment environment accordingly.

Service Interoperability Prerequisite: For Hadoop services in ODP to work correctly, ensure that the /tmp volume is mounted with the exec option. If you are using a noexec mount, refer to the Troubleshooting ODP documentation.

Set Up Password-less SSH

To have Ambari server automatically install Ambari agents on all your cluster hosts, you must set up password-less SSH connections between the Ambari server host and all other hosts in the cluster. The Ambari server host uses SSH public key authentication to remotely access and install the Ambari agents.

You can choose to manually install an Ambari agent on each cluster host and register them with the target Ambari server. In this case, you do not need to generate and distribute SSH keys.

Perform the following steps:

  1. Generate public and private SSH keys on the Ambari server host.
  2. Copy the SSH Public Key (id_rsa.pub) to the root account on your target hosts.
  3. Add the SSH Public Key to the authorized_keys file on your target hosts.
  4. Depending on your version of SSH, you may need to set permissions on the .ssh directory (to 700) and the authorized_keys file in that directory (to 600) on the target hosts.
  5. From the Ambari server, make sure you can connect to each host in the cluster (<remote.target.host>) using SSH without having to enter a password.
  6. If the following warning message displays during your first connection, enter yes: Are you sure you want to continue connecting (yes/no)?
  7. Retain a copy of the SSH Private Key on the machine from which you will run the web-based Ambari Install Wizard.
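
The steps above can be sketched as a few shell helpers. This is a minimal sketch, not the documented procedure: it assumes an RSA key at the default ~/.ssh/id_rsa path and root logins on the target hosts, and the function names (generate_key, copy_pubkey, install_pubkey, check_login) are hypothetical.

```shell
# Hypothetical helpers sketching the password-less SSH steps above.

# Step 1 (Ambari server host): generate a key pair without a passphrase.
# Assumes the default key path ~/.ssh/id_rsa.
generate_key() {
  mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
  [ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
}

# Step 2 (Ambari server host): copy the public key to a target host.
# $1 = target hostname. This still prompts for the root password.
copy_pubkey() {
  scp "$HOME/.ssh/id_rsa.pub" "root@$1:/root/id_rsa.pub"
}

# Steps 3 and 4 (target host): append the key to authorized_keys and
# tighten permissions. $1 = copied public key, $2 = .ssh directory.
install_pubkey() {
  pubkey=$1
  ssh_dir=${2:-$HOME/.ssh}
  mkdir -p "$ssh_dir"
  cat "$pubkey" >> "$ssh_dir/authorized_keys"
  chmod 700 "$ssh_dir"
  chmod 600 "$ssh_dir/authorized_keys"
}

# Step 5 (Ambari server host): confirm that login needs no password.
# BatchMode=yes makes ssh fail instead of prompting.
check_login() {
  ssh -o BatchMode=yes "root@$1" true
}
```

Because BatchMode also suppresses the host key confirmation prompt, check_login is best run after the first interactive connection to each host.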

It is possible to use a non-root SSH account, if that account can execute sudo without entering a password.

Set Up Service User Accounts

Each service requires a service user account. The Ambari Cluster Install wizard creates new service user accounts, preserves any existing ones, and uses these accounts when configuring Hadoop services. Service user account creation applies to service user accounts on the local operating system and to LDAP/AD accounts.

Enable NTP on the Cluster and on the Browser Host

The clocks of all the nodes in your cluster and the machine that runs the browser through which you access the Ambari Web interface must be able to synchronize with each other.

Install the Network Time Protocol (NTP) service on each host and ensure that it starts on boot. The commands differ between RHEL/CentOS 7/8 and Ubuntu 20/22.
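
As an illustration, the installation could look like the sketch below. The package and service names (ntp, ntpd) are assumptions based on common defaults rather than values taken from this page, and the function names are hypothetical. On RHEL/CentOS 8 in particular, the ntp package is often unavailable and chrony is used instead, so check what your repositories provide.

```shell
# Sketch only: run as root on each host.

# RHEL/CentOS: install ntp, enable the ntpd service at boot, and start it.
# Assumption: the ntp package exists in your configured repositories.
enable_ntp_rhel() {
  yum install -y ntp &&
  systemctl enable ntpd &&
  systemctl start ntpd
}

# Ubuntu: install ntp, enable the ntp service at boot, and start it.
enable_ntp_ubuntu() {
  apt-get install -y ntp &&
  systemctl enable ntp &&
  systemctl start ntp
}
```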

Check DNS and NSCD

All hosts within your system should be configured for both forward and reverse DNS resolution.
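
One way to verify this requirement is to resolve a hostname to an address and then resolve that address back to a name. The helper below is a hedged sketch: check_dns is a hypothetical name, and it relies on getent, which follows the lookup order configured in /etc/nsswitch.conf (so it also sees /etc/hosts entries).

```shell
# Hypothetical helper: verify forward and reverse resolution for one host.
# $1 = fully qualified domain name to check.
check_dns() {
  fqdn=$1
  # Forward lookup: name -> first address returned.
  ip=$(getent hosts "$fqdn" | awk '{ print $1; exit }')
  [ -n "$ip" ] || { echo "forward lookup failed for $fqdn" >&2; return 1; }
  # Reverse lookup: address -> canonical name.
  name=$(getent hosts "$ip" | awk '{ print $2; exit }')
  [ "$name" = "$fqdn" ] || {
    echo "reverse lookup for $ip returned '$name'" >&2
    return 1
  }
  echo "$fqdn <-> $ip OK"
}
```

Running check_dns for every cluster hostname, on every host, shows quickly whether any node resolves a peer differently from the rest.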

If configuring DNS in this manner is not possible, edit the /etc/hosts file on each host in your cluster to include the IP address and fully qualified domain name (FQDN) of all your hosts. The following instructions provide a general overview and cover basic network setup for standard Linux hosts. Different Linux versions and distributions may require different commands and procedures, so consult the documentation for the specific operating system(s) used in your environment.

Hadoop heavily relies on DNS and conducts numerous DNS lookups during normal operations. To alleviate the strain on your DNS infrastructure, it is highly recommended to employ the Name Service Caching Daemon (NSCD) on Linux cluster nodes. This daemon caches host, user, and group lookups, resulting in improved resolution performance and reduced load on your DNS infrastructure.

Edit the Host File

  1. Using a text editor, open the hosts file on every host in your cluster.

For example: vi /etc/hosts

  2. Add a line for each host in your cluster. The line should consist of the IP address and the FQDN.
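
To illustrate the expected line format, the sketch below appends an "IP address, then FQDN" entry to a hosts file only when the FQDN is not already listed. The helper name add_host_entry and the example address and hostname are hypothetical.

```shell
# Hypothetical helper: append "<ip> <fqdn>" to a hosts file unless the
# FQDN already appears on a non-comment line.
# $1 = hosts file, $2 = IP address, $3 = FQDN.
add_host_entry() {
  hosts_file=$1 ip=$2 fqdn=$3
  if awk -v h="$fqdn" '
       $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
       END { exit !found }' "$hosts_file"; then
    return 0   # already present, nothing to do
  fi
  printf '%s %s\n' "$ip" "$fqdn" >> "$hosts_file"
}

# Example (as root), using a documentation-range address:
#   add_host_entry /etc/hosts 192.0.2.11 node1.example.com
```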

Do not remove the two loopback entries (the lines for 127.0.0.1 and ::1) from your hosts file. Removing or editing these lines may cause programs that require network functionality to fail.
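
After editing, a quick check can confirm that both loopback entries are still present. This is a sketch with a hypothetical helper name. On many distributions the entries look like "127.0.0.1 localhost.localdomain localhost" and "::1 localhost6.localdomain6 localhost6", but the exact aliases vary, so the check only looks at the address column.

```shell
# Hypothetical check: succeed only if the hosts file has both an IPv4
# (127.0.0.1) and an IPv6 (::1) loopback entry.
# $1 = hosts file (defaults to /etc/hosts).
has_loopback_entries() {
  awk '$1 == "127.0.0.1" { v4 = 1 }
       $1 == "::1"       { v6 = 1 }
       END { exit !(v4 && v6) }' "${1:-/etc/hosts}"
}

# Example:
#   has_loopback_entries /etc/hosts || echo "loopback entry missing" >&2
```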

Set the Hostname

  1. Check the current hostname by running the hostname -f command.

This should return the <fully.qualified.domain.name> of the host.

  2. If it does not, set the hostname on each host in your cluster to its <fully.qualified.domain.name> using the hostname command.
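
The two steps can be combined into one helper, sketched below under the assumption that you run it as root. The name set_and_check_hostname is hypothetical. Note that the plain hostname command changes only the running system, so the setting does not survive a reboot by itself.

```shell
# Hypothetical helper: set the hostname of the running system, then
# confirm that hostname -f reports the same value. $1 = FQDN.
set_and_check_hostname() {
  fqdn=$1
  hostname "$fqdn" || return 1
  actual=$(hostname -f)
  [ "$actual" = "$fqdn" ] || {
    echo "hostname -f returned '$actual', expected '$fqdn'" >&2
    return 1
  }
}

# Example (as root):
#   set_and_check_hostname node1.example.com
```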

Edit the Network Configuration File

  1. Using a text editor, open the network configuration file on every host and set the desired network configuration for each host.
  2. Modify the HOSTNAME property to set the fully qualified domain name.
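
As a sketch of the HOSTNAME change, the helper below rewrites the property in a key=value style file. The helper name is hypothetical, and the file location is an assumption: on RHEL/CentOS it is commonly /etc/sysconfig/network, while other distributions store the hostname elsewhere.

```shell
# Hypothetical helper: set HOSTNAME=<fqdn> in a key=value network file,
# replacing any existing HOSTNAME line.
# $1 = config file (assumed /etc/sysconfig/network on RHEL/CentOS),
# $2 = FQDN.
set_network_hostname() {
  conf=$1 fqdn=$2
  tmp=$(mktemp) || return 1
  # Keep every line except an existing HOSTNAME= entry.
  # grep exits non-zero when no lines remain, which is fine here.
  grep -v '^HOSTNAME=' "$conf" > "$tmp" || true
  printf 'HOSTNAME=%s\n' "$fqdn" >> "$tmp"
  # Write back in place so the file keeps its owner and mode.
  cat "$tmp" > "$conf"
  rm -f "$tmp"
}

# Example (as root):
#   set_network_hostname /etc/sysconfig/network node1.example.com
```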

Configuring iptables

To ensure that Ambari can communicate with the hosts it deploys to and manages during the setup process, you can temporarily disable iptables.

OS                 Command
RHEL/CentOS 7/8    systemctl disable firewalld
                   service firewalld stop
Ubuntu 20/22       sudo ufw disable
                   sudo iptables -X
                   sudo iptables -t nat -F
                   sudo iptables -t nat -X
                   sudo iptables -t mangle -F
                   sudo iptables -t mangle -X
                   sudo iptables -P INPUT ACCEPT
                   sudo iptables -P FORWARD ACCEPT
                   sudo iptables -P OUTPUT ACCEPT

After the setup is complete, you can restart iptables. However, if the security protocols in your environment do not allow disabling iptables, you can proceed with iptables enabled as long as all the required ports are open and available.

During the Ambari server setup process, Ambari checks whether iptables is running. If it is, a warning message prompts you to verify that the necessary ports are open and accessible. Additionally, the Host Confirm step of the Cluster Install Wizard issues a warning for each host where iptables is detected as running.

Disable SELinux and PackageKit and Check the umask Value

  1. For the Ambari setup to function, you must disable SELinux. On each host in your cluster, turn off SELinux enforcement for the current session (for example, by running setenforce 0).

To permanently disable SELinux, set SELINUX=disabled in /etc/selinux/config. This ensures that SELinux does not turn itself on after you reboot the machine.

  2. On an installation host running RHEL/CentOS with PackageKit installed, open /etc/yum/pluginconf.d/refresh-packagekit.conf using a text editor. Make the following change: enabled=0

PackageKit is not enabled by default on Ubuntu systems. Unless you have specifically enabled PackageKit, you may skip this step for an Ubuntu installation host.

  3. Check the umask value. UMASK, short for "User Mask" or "User file creation MASK," determines the default permissions that are assigned when a new file or folder is created on a Linux system. Many Linux distributions set the default umask value to 022. With this value, a new folder gets read (4), write (2), and execute (1) permissions for the owner (user) and read and execute permissions for the group and others, resulting in permissions of 755. A new file gets the same permissions without the execute bit, resulting in 644.

If the umask is set to 027, the owner keeps full permissions (7), the group gets read and execute permissions (5), and others have no permissions (0), so new folders are created as 750 and new files as 640.

Ambari and ODP support umask values of 022 (0022 is functionally equivalent) and 027 (0027 is functionally equivalent). These values must be set on all hosts.
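
The effect of the two supported umask values can be observed directly. The sketch below uses a hypothetical helper, show_modes, that creates a file and a directory under a given umask in a scratch location and prints the resulting permission strings.

```shell
# Create a file and a directory under the given umask and print their
# permission strings. The body runs in a subshell, so the caller's
# umask is left unchanged. $1 = umask value, for example 022.
show_modes() (
  umask "$1"
  dir=$(mktemp -d) || exit 1
  touch "$dir/file"
  mkdir "$dir/subdir"
  # Keep only the ten permission characters of each ls line.
  ls -ld "$dir/file" "$dir/subdir" | awk '{ print substr($1, 1, 10) }'
  rm -rf "$dir"
)

show_modes 022   # prints -rw-r--r-- (644) then drwxr-xr-x (755)
show_modes 027   # prints -rw-r----- (640) then drwxr-x--- (750)
```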

UMASK Examples:

  • Setting the umask for your current login session, for example: umask 0022
  • Checking your current umask: umask
  • Permanently changing the umask for all interactive users, for example by appending it to the system profile: echo umask 0022 >> /etc/profile
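
Returning to the SELinux and PackageKit steps in this section, both permanent changes amount to setting one key in a configuration file. The sketch below shows one way to script that; set_config_value is a hypothetical helper, and the commented commands assume you run them as root on each host.

```shell
# Hypothetical helper: set key=value in a config file, replacing an
# existing "key=" line or appending one if the key is absent.
# $1 = file, $2 = key, $3 = value.
set_config_value() {
  file=$1 key=$2 value=$3
  if grep -q "^$key=" "$file"; then
    tmp=$(mktemp) || return 1
    sed "s|^$key=.*|$key=$value|" "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Run as root on each host:
#   setenforce 0   # stop enforcing until the next reboot
#   set_config_value /etc/selinux/config SELINUX disabled
#   set_config_value /etc/yum/pluginconf.d/refresh-packagekit.conf enabled 0
```

The pattern anchors on the full key followed by "=", so setting SELINUX leaves a SELINUXTYPE line in the same file untouched.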

Download and Set up Database Connectors

Ambari and components such as Hive, Ranger, and Oozie rely on an operational database. When installing, you can use an existing database or let Ambari set up a new one. To ensure that Ambari can connect to your chosen database, download the required database drivers and connectors from the database vendor before the installation. Configure these connectors as part of your environment setup, as explained in the section Using an Existing or Installing a Default Database.

You must install one of PostgreSQL, Oracle, or MySQL; you do not need more than one. MySQL is recommended. Refer to Introduction to Open Source Data Platform (ODP) for the supported database versions.
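
Once a connector is downloaded, it is typically registered with Ambari through the ambari-server setup command and its --jdbc-db and --jdbc-driver options. The sketch below wraps that call. The helper name is hypothetical, and the JAR path is only an assumed example location, so substitute the path where you saved the driver.

```shell
# Assumption: example connector location; change it to match your host.
JDBC_DRIVER=${JDBC_DRIVER:-/usr/share/java/mysql-connector-java.jar}
# Command used to reach Ambari; overridable for dry runs.
AMBARI_SERVER=${AMBARI_SERVER:-ambari-server}

# Register a JDBC driver with Ambari.
# $1 = database type, for example mysql.
register_jdbc_driver() {
  [ -f "$JDBC_DRIVER" ] || {
    echo "driver not found: $JDBC_DRIVER" >&2
    return 1
  }
  "$AMBARI_SERVER" setup --jdbc-db="$1" --jdbc-driver="$JDBC_DRIVER"
}

# Example (as root on the Ambari server host):
#   register_jdbc_driver mysql
```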
