HDFS Analytics

Charts in HDFS Analytics

The following table describes the charts on the HDFS Analytics page.

| Chart Name | Description |
| --- | --- |
| HDFS Usage by User | The amount of HDFS storage used by each user in the cluster. |
| HDFS File Types | The frequency of each file type used in HDFS. |
| HDFS File Size | The number of files in each file size range: <1 KB, 1-10 KB, 10-128 KB, 128 KB-1 MB, 1-10 MB, 10-128 MB, 128 MB-1 GB, 1-10 GB, and >10 GB. |
| Small Files by User | The number of small files per user. |
| Disk Space by Replication | The disk space consumed by replicated files, calculated as the file size multiplied by its replication factor. |
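
The file size ranges and the replication calculation described above can be illustrated with a minimal sketch. It is not the product's implementation: the function names are hypothetical, binary units (1 KB = 1024 bytes) are assumed, and each range is assumed to include its lower bound and exclude its upper bound.

```python
from bisect import bisect_right
from collections import Counter

# Assumption: binary units (1 KB = 1024 bytes).
KB, MB, GB = 1024, 1024 ** 2, 1024 ** 3

# Boundaries between adjacent size ranges, in bytes.
BUCKET_BOUNDS = [1 * KB, 10 * KB, 128 * KB, 1 * MB, 10 * MB, 128 * MB, 1 * GB, 10 * GB]
BUCKET_LABELS = [
    "<1 KB", "1-10 KB", "10-128 KB", "128 KB-1 MB",
    "1-10 MB", "10-128 MB", "128 MB-1 GB", "1-10 GB", ">10 GB",
]


def size_bucket(size_bytes):
    """Return the label of the size range that contains size_bytes."""
    # Assumption: a size equal to a boundary falls into the higher range.
    return BUCKET_LABELS[bisect_right(BUCKET_BOUNDS, size_bytes)]


def file_size_histogram(sizes):
    """Count files per size range, keeping the ranges in display order."""
    counts = Counter(size_bucket(size) for size in sizes)
    return {label: counts.get(label, 0) for label in BUCKET_LABELS}


def disk_space_by_replication(files):
    """Sum file size * replication factor, grouped by replication factor.

    files is an iterable of (size_bytes, replication_factor) pairs.
    """
    usage = {}
    for size_bytes, replication in files:
        usage[replication] = usage.get(replication, 0) + size_bytes * replication
    return usage
```

For instance, two files of 100 and 50 bytes stored with a replication factor of 3 account for 450 bytes of disk space under this calculation.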
Small Files Dir

The directory paths that contain small files, grouped by the following:

File: The number of files in the directory path.

Size: The size of the files in the directory path.
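
The grouping described above can be sketched as follows. This is an illustration only: the function name and the `threshold_bytes` parameter are hypothetical, and the 1 MB default is an assumed cutoff for what counts as a small file, not a value taken from the product.

```python
import posixpath


def small_files_by_dir(files, threshold_bytes=1024 * 1024):
    """Group small files by parent directory.

    files is an iterable of (hdfs_path, size_bytes) pairs.
    Returns {directory: {"files": count, "size": total_bytes}}.
    """
    summary = {}
    for path, size_bytes in files:
        # Assumption: a file is "small" when it is below threshold_bytes.
        if size_bytes < threshold_bytes:
            directory = posixpath.dirname(path)
            entry = summary.setdefault(directory, {"files": 0, "size": 0})
            entry["files"] += 1
            entry["size"] += size_bytes
    return summary
```

The "files" value corresponds to the File grouping and the "size" value to the Size grouping.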

Data Temp

Data temperature is the number of files stored under each data storage policy. You can monitor files according to the following storage policies:

Hot: Datasets that were frequently accessed or used in the last seven days.

Warm: Datasets that were used less frequently, or accessed only a few times, in the past month.

Cold: Datasets that were accessed or used very rarely in the last three months.
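
As a rough illustration of the three categories, the sketch below classifies a dataset by the time since its last access. This is a deliberate simplification and an assumption: the definitions above also depend on how often the data is accessed, which this sketch ignores. The function name and the 7-day and 30-day cutoffs are hypothetical.

```python
from datetime import datetime, timedelta


def data_temperature(last_access, now):
    """Classify a dataset as "hot", "warm", or "cold" by last access time.

    Assumption: only the time since the last access is considered,
    not the access frequency.
    """
    age = now - last_access
    if age <= timedelta(days=7):
        return "hot"    # accessed within the last seven days
    if age <= timedelta(days=30):
        return "warm"   # accessed within the past month
    return "cold"       # not accessed for more than a month
```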

Last Modified File Count

This chart displays the number of files that were last modified before the time period selected in the global calendar, which is located in the top-right corner of the page. You can view the aggregate data or the data for each file individually. You can also view the number of modified files for specific paths.

For example, if you select Last 7 Days in the global calendar, this chart displays the number of files modified prior to the last 7 days.

When you hover over a data point on the chart, you can view the path and the number of files that were last modified before the time period selected in the global calendar.
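
The count shown for each path can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's implementation: the function name is hypothetical, files are grouped by their parent directory, and `days` stands in for the period selected in the global calendar.

```python
from collections import Counter
from datetime import datetime, timedelta
import posixpath


def last_modified_file_count(files, now, days=7):
    """Count files per directory that were last modified before the period.

    files is an iterable of (hdfs_path, modification_time) pairs.
    A file is counted when its modification time is earlier than
    now minus the given number of days.
    """
    cutoff = now - timedelta(days=days)
    counts = Counter()
    for path, modified in files:
        if modified < cutoff:
            counts[posixpath.dirname(path)] += 1
    return dict(counts)
```

With `days=7`, a file modified yesterday is excluded, while a file modified a month ago is counted under its directory.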

By default, the last-modified period is the last 7 days. To change this value, perform the following steps.

  1. Open the _$AcceloHome/config/acceldata_<clustername>.conf_ file.
  2. Set the DAYS field under fsanalytics.metastore in the _hdfs.connectors_ section to the desired value.

For example,

  3. Save and close the file.
  4. Execute the following command to push the changes to MongoDB.