Configure HDFS Small File Threshold

Pulse classifies an HDFS file as a small file based on a size threshold. To configure the threshold, set the variable hdfs.analytics.file.size = xxxxxx (in bytes) in the cluster configuration file. Files at or below this size are identified as small files and displayed on pages such as HDFS Analytics and HDFS Explorer.
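The "at or below the threshold" rule can be illustrated with a short sketch. This is not Pulse's implementation: the function names are hypothetical, and the default assumes that 1 MB means 1024 * 1024 bytes, which this page does not specify.

```python
# Illustrative only: hypothetical helpers, not part of Pulse.
# Assumption: the 1 MB default is read as 1024 * 1024 bytes.
ASSUMED_DEFAULT_THRESHOLD_BYTES = 1024 * 1024


def is_small_file(size_bytes, threshold_bytes=ASSUMED_DEFAULT_THRESHOLD_BYTES):
    """Return True when the file size is at or below the threshold."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return size_bytes <= threshold_bytes


def count_small_files(sizes, threshold_bytes=ASSUMED_DEFAULT_THRESHOLD_BYTES):
    """Count how many of the given file sizes fall at or below the threshold."""
    return sum(1 for size in sizes if is_small_file(size, threshold_bytes))
```

A file exactly at the threshold counts as small, so raising the threshold can only grow the set of files reported as small.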

Steps to configure the threshold:

  1. Open the _$AcceloHome/config/acceldata_<clustername>.conf_ file.
  2. Add the hdfs.analytics.file.size variable and set the file size in bytes. By default, the small file threshold is 1 MB.

The configured file size is pushed as a real-time metric under hdfs_root_analysis in the database.

  3. Save and close the file.
  4. Push the configuration changes. For the changes to appear on the FS Analytics page, also run the FSImage command to reprocess the reports.
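Because the variable takes a value in bytes, it may help to compute the number rather than type it by hand. The sketch below is a hypothetical helper, not an Acceldata tool. It assumes binary megabytes (1 MB = 1024 * 1024 bytes) and assumes the line takes the `hdfs.analytics.file.size = xxxxxx` form shown earlier on this page. Confirm both against your own cluster configuration file.

```python
# Hypothetical helper for building the configuration line; not an Acceldata tool.
CONFIG_KEY = "hdfs.analytics.file.size"  # variable name taken from this page
BYTES_PER_MB = 1024 * 1024               # assumption: binary megabytes


def threshold_line(megabytes):
    """Build the configuration line for a threshold given in megabytes."""
    if megabytes <= 0:
        raise ValueError("megabytes must be positive")
    size_bytes = int(megabytes * BYTES_PER_MB)
    return f"{CONFIG_KEY} = {size_bytes}"


if __name__ == "__main__":
    # Example: a 4 MB threshold
    print(threshold_line(4))  # prints: hdfs.analytics.file.size = 4194304
```

The printed line can then be pasted into the configuration file opened in step 1.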