ODP 3.3.6.3-1 Documentation
Apache Pinot Sizing Guide
Apache Pinot cluster hardware sizing depends on several factors, including:
- Data ingestion rate (real-time or batch)
- Query concurrency and latency requirements (Read and Write QPS)
- Data volume (raw + segment size post ingestion)
- Query complexity (filters, aggregations, joins)
- Use of features like star-tree indexing, inverted indexes, etc.
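The factors above can be folded into a back-of-envelope estimate. The sketch below is purely illustrative: the compression ratio, replication factor, per-node disk, and headroom figures are assumptions for demonstration, not Pinot defaults, and real sizing should be validated with a load test.

```python
import math

def estimate_server_nodes(raw_data_tb, compression_ratio=0.5, replication=2,
                          usable_disk_tb_per_node=2.0, disk_headroom=0.7):
    """Rough server count driven by data volume alone.

    All default values are illustrative assumptions:
    - compression_ratio: on-disk segment size vs. raw data
    - replication: number of segment copies kept in the cluster
    - disk_headroom: fraction of disk you allow segments to fill
    """
    on_disk_tb = raw_data_tb * compression_ratio * replication
    capacity_per_node_tb = usable_disk_tb_per_node * disk_headroom
    return math.ceil(on_disk_tb / capacity_per_node_tb)

print(estimate_server_nodes(10))  # 10 TB raw -> 8 nodes under these assumptions
```

Query concurrency and latency targets usually raise this number further; the disk-volume estimate is only a floor.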
Here’s a general guideline for hardware sizing across key Pinot components:
Real-time Ingestion Nodes (Server Nodes)
These handle consuming from Kafka (or another stream), indexing, and serving queries.
| Parameter | Guideline |
|---|---|
| CPU | 8–32 vCPUs per node (more cores for high ingest/query workloads) |
| Memory (RAM) | 32–128 GB (based on segment size in memory, indexes) |
| Disk (SSD recommended) | 1–2 TB NVMe SSD (ensure high IOPS; Pinot is I/O intensive) |
| Network | ≥10 Gbps (especially for high ingest rate and query throughput) |
| # of Nodes | Start with 3–5, scale based on data size and QPS |
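Memory on real-time servers is dominated by the in-memory (consuming) segments, roughly one per assigned stream partition, plus query-execution headroom. A hedged sketch of that arithmetic, with illustrative constants (consuming-segment size and headroom are assumptions, not measured Pinot figures):

```python
import math

def realtime_heap_gb(stream_partitions, consuming_segment_mb=200,
                     query_headroom_gb=8):
    """Rough per-node heap estimate for a real-time Pinot server.

    Assumptions (illustrative only):
    - each assigned stream partition holds one mutable consuming
      segment of ~consuming_segment_mb in memory
    - query_headroom_gb covers query execution and JVM overhead
    """
    consuming_gb = stream_partitions * consuming_segment_mb / 1024
    return math.ceil(consuming_gb + query_headroom_gb)

print(realtime_heap_gb(64))  # 64 partitions -> ~21 GB heap
```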
Offline Ingestion/Storage Nodes
Used for querying large volumes of historical data (HDFS/S3 segments loaded onto these nodes).
| Parameter | Guideline |
|---|---|
| CPU | 8–16 vCPUs |
| Memory | 32–64 GB |
| Disk (SSD preferred) | 1–4 TB (based on segment storage needs) |
| # of Nodes | 3–10+ (depends on data volume and retention) |
Broker Nodes
These handle query routing and aggregation across servers.
| Parameter | Guideline |
|---|---|
| CPU | 4–8 vCPUs |
| Memory | 16–32 GB |
| # of Nodes | 2–4 (scale based on concurrency and latency targets) |
Controller Nodes
Coordinate cluster metadata, segment assignment, and retention policies.
| Parameter | Guideline |
|---|---|
| CPU | 2–4 vCPUs |
| Memory | 8–16 GB |
| Disk | 100–200 GB |
| # of Nodes | 2 (HA via active-standby setup) |
Example: Sizing for 10 TB of data with moderate real-time ingestion and ~100 QPS
| Role | # of Nodes | CPU | RAM | Disk (SSD) |
|---|---|---|---|---|
| Server | 5 | 16 vCPUs | 64 GB | 2 TB |
| Broker | 3 | 8 vCPUs | 32 GB | 500 GB |
| Controller | 2 | 4 vCPUs | 16 GB | 100 GB |
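The example layout above can be totaled to get an overall capacity-request figure (useful for cloud cost estimates). This simply multiplies out the table rows:

```python
# Per-role specs copied from the example sizing table above.
roles = {
    "server":     {"nodes": 5, "vcpu": 16, "ram_gb": 64, "disk_tb": 2.0},
    "broker":     {"nodes": 3, "vcpu": 8,  "ram_gb": 32, "disk_tb": 0.5},
    "controller": {"nodes": 2, "vcpu": 4,  "ram_gb": 16, "disk_tb": 0.1},
}

total_vcpu = sum(r["nodes"] * r["vcpu"] for r in roles.values())
total_ram_gb = sum(r["nodes"] * r["ram_gb"] for r in roles.values())
total_disk_tb = round(sum(r["nodes"] * r["disk_tb"] for r in roles.values()), 1)

print(total_vcpu, total_ram_gb, total_disk_tb)  # 112 vCPUs, 448 GB RAM, 11.7 TB disk
```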
Additional Notes
- Pinot memory usage is influenced by segment loading and query execution.
- Disk: Prefer SSDs for segment scan speed, especially for star-tree or sorted indexes.
- Co-locating real-time and offline servers on the same nodes is possible, but avoid it in production if query latency is critical.
- Use auto-scaling in Kubernetes or YARN environments based on CPU and memory.
For more details on Pinot sizing, see Capacity Planning in Apache Pinot - Part 1 | StarTree.
Last updated on May 14, 2025