HDFS Router Federation

NameNodes have scalability limits because of the metadata overhead comprised of inodes (files and directories) and file blocks, the number of Datanode heartbeats, and the number of HDFS RPC client requests. The common solution is to split the filesystem into smaller subclusters, HDFS Federation, and provide a federated view ViewFs. The problem is how to maintain the split of the subclusters (e.g., namespace partition), which forces users to connect to multiple subclusters and manage the allocation of folders or files to them.

Key Advantages of Router-based Federation:

  1. Scalability:

    • Router-based federation can handle more namespaces and storage than ViewFs.
    • It efficiently distributes client requests across multiple namespaces, improving load management.
  2. Fault Isolation:

    • In case of a namespace failure, the rest of the system remains unaffected.
    • This isolation ensures continuous availability even in the event of individual failures, unlike ViewFs, where a failure in one mount point could potentially impact the entire system.
  3. Distributed Namespace:

    • Router-based federation allows the namespace to be distributed across multiple NameNodes.
    • This distribution helps in balancing loads, enhancing performance, and improving reliability.
  4. Security:

    • Provides a unified Access Control List (ACL) across namespaces.
    • Offers enhanced security features compared to ViewFs, ensuring robust data protection mechanisms.
  5. Flexibility:

    • Namespaces can be dynamically added or removed without affecting the system’s overall performance.
    • This flexibility allows administrators to scale and manage the system seamlessly without downtime.
  6. Routing Policies:

    • Supports multiple routing policies, such as local, random, and hash-based, enabling better load distribution across NameNodes.
    • Customizable routing policies ensure that resources are optimally utilized.
  7. High Availability (HA):

    • Supports High Availability configurations, enhancing system reliability and accessibility.
    • ViewFs lacks native support for HA, making router-based federation more robust for critical deployments.

Example flow

The simplest configuration deploys a Router on a machine not shared with NameNode (configuration conflict when shared with NN host HDFS-17356 ). The Router monitors the local NameNode and heartbeats the state to the State Store. When a regular DFS client contacts any of the Routers to access a file in the federated filesystem, the Router checks the Mount Table in the State Store (i.e., the local cache) to find out which subcluster contains the file. Then, it checks the Membership table in the State Store (i.e., the local cache) for the NameNode responsible for the subcluster. After it has identified the correct NameNode, the Router proxies the request. The client accesses DataNodes directly.

Deployment

By default, the Router is ready to take requests and monitor the NameNode in the local machine. It needs to know the State Store endpoint by setting dfs.federation.router.store.driver.class. The rest of the options are documented in hdfs-rbf-default.xml.

Managing Multiple Routers in HDFS Router-Based Federation

To manage multiple Routers, the user needs to install the Router on hosts that do not have NameNodes. Follow these detailed steps:

Steps:

  1. Cluster Setup:

    • Ensure that you have a minimum 6-node Ambari cluster.
    • Install the HDFS service and enable High Availability (HA).
    • Add at least one additional namespace (there must be at least two namespaces).
  2. Router Installation:

    • Add 2 Routers to the cluster on nodes that do not have NameNodes.
  3. Automatic Configuration in core-site.xml: The following configurations will be automatically added to the core-site.xml of the selected hosts during Router installation (this is managed by params_linux.py):

    • fs.defaultFS: ns-fed
    • hadoop.zk.address: zk_quorum
  4. Automatic Configuration in hdfs-site.xml: The following configurations will be automatically added to Custom hdfs-site.xml:

Bash
Copy

Key Consideration: This setup ensures that the NameNode is not restarted, preventing any configuration conflicts. Only the configuration files of the Router hosts will be affected.

Steps to Configure Router Federation:

  1. Enable NN HA.
  2. Adding additional nodes to the cluster provides an option to add routers on those hosts.
  1. Add at least one additional namespace (must have at least two namespaces).
  1. Select additional NNs for new namespace.
  1. Provide a new path for storing the edits under the new journal node edits directory
  1. A new HDFS Namespace is being deployed.
  1. Run the below to update journalnode.py on all the journal nodes; otherwise, the journal node start fails.
Bash
Copy
  1. DataNodes are not started after getting stopped as part of all services thus, NN start fails.
  1. Stop the the Ambari operation for NN start and start back the DNs from the other Ambari terminal.
  1. Bootstrap Standby fails if “/hadoop/hdfs/namenode/“ contains any files or folder.

Perform the below step on the NameNode which needs bootstrap and continue.

Bash
Copy

Continue once all the steps are completed.

  1. After all the service restart, confirm on the HDFS summary page that we have two HDFS namespace.
  1. Add the two Routers to the cluster on the nodes, which do not have NameNodes from action dropdown.
  1. Select the Router Hosts.
  1. Once the Installation is completed, restart the Router Services from the Host components list. Currently the HDFS summary Page does allow an option to restart HDFS routers.
  1. Set and tune the following properties with respect to HDFS router federation under, Advanced hdfs-rbf-site.
Bash
Copy
  1. For a kerberized cluster, the router entry will be added into auth_to_local of hdfs-core-site as per the below screenshot.
  1. Modify the `dfs.federation.router.kerberos.principal=`nn/_HOST@${realm} under Kerberos → Advanced → Edit.
  2. Once all the affected services of HDFS are restarted, you can access the router UI link to fetch the details and confirm the setup.

router http UI - <router-hostname>:50071

router jmx - <router-hostname>:50071/jmx

Example:

The sub clusters are monitored by router. For example, odpssl and ospssl1.

The routers Information of the cluster.

Router JMX

dfsrouter Operations

Mount table management

The federation admin tool supports managing the mount table. For example, to create three mount points and list them:

Bash
Copy

Quotas

The router-based federation supports global quota at mount table level. The mount table entries may spread multiple sub clusters and the global quota will be accounted across these sub clusters.

The federation admin tool supports setting quotas for specified mount table entries:

Bash
Copy

The above command means that we allow the path to have a maximum of 100 file/directories and use at most 1024 bytes storage space. The parameter for ssQuota supports multiple size-unit suffix (e.g. 1k is 1KB, 5m is 5MB). If no suffix is specified then bytes is assumed.

quota operation:

  • Set storage type quota for specified mount table entry
  • Remove quota for specified mount table entry
  • Remove storage type quota for specified mount table entry
  • Mount table cache refresh by manual command

Multiple subclusters

A mount point also supports mapping multiple sub clusters. For example, to create a mount point that stores files in sub clusters odpssl and ospssl1.

Bash
Copy

Disabling name services

To prevent accessing a name service (sub cluster), it can be disabled from the federation. For example, one can disable ns1, list it, and enable it again:

Bash
Copy

Router state dump

To diagnose the current state of the routers, you can use the dumpState command. It generates a text dump of the records in the State Store. Since it uses the configuration to find and read the state store, it is often the easiest to use the machine where the routers run. The command runs locally, so the routers do not have to be up to use this command.

Bash
Copy

Metrics

The Router and State Store statistics are exposed in metrics/JMX. These info will be very useful for monitoring. More metrics information can see be seen here, RBF Metrics, Router RPC Metrics, and State Store Metrics.

Type to search, ESC to discard
Type to search, ESC to discard
Type to search, ESC to discard
  Last updated