Impala

Dashboard

Impala is a massively parallel processing (MPP) SQL query execution engine that runs on the Hadoop platform. Pulse provides separate windows to view metrics and gain insights into Impala queries, tables, and daemons. Pulse also provides a dashboard that displays summary panels, Sankey diagrams with various metrics, and charts that display information about queries based on other criteria such as execution time.

Click Impala --> Dashboard in the left pane to access the Impala dashboard.

Note: The default time range is Last 24 hrs. To change the time range, click the down arrow in the time selection menu.

Summary Panel

The summary tiles display several aggregated values. Click the number in a tile to view detailed information about that metric.

| Metric Name | Description |
|---|---|
| Users | The total number of users. To view Impala Query Details, click the number. |
| # of Queries | The number of queries run during the selected timeframe. To view Impala Query Details, click the number. |
| Avg CPU Time | The average CPU time across all queries. |
| Avg per Host Peak Memory | The average peak memory usage per host. |
| Avg Admission Wait Time | The average time that elapsed between a query being submitted for admission and the admission completing. |
| Succeeded | The number of queries that executed successfully. To view Impala Query Details, click the number. |
| Running | The number of queries that are in progress. To view Impala Query Details, click the number. |
| Failed | The number of queries that failed to execute. To view Impala Query Details, click the number. |
| Killed | The number of queries that were killed. To view Impala Query Details, click the number. |
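
As an illustration of how these tiles relate to individual queries, the following minimal sketch aggregates a list of query records into the same summary values. The record fields (`user`, `status`, `cpu_time_ms`) and the status strings are hypothetical and chosen only for this example; they are not Pulse's actual data model.

```python
from collections import Counter

def summarize(queries):
    """Aggregate hypothetical query records into summary-tile values."""
    statuses = Counter(q["status"] for q in queries)
    count = len(queries)
    total_cpu = sum(q["cpu_time_ms"] for q in queries)
    return {
        "users": len({q["user"] for q in queries}),   # distinct users
        "queries": count,
        "avg_cpu_time_ms": total_cpu / count if count else 0.0,
        "succeeded": statuses["SUCCEEDED"],
        "running": statuses["RUNNING"],
        "failed": statuses["FAILED"],
        "killed": statuses["KILLED"],
    }

# Hypothetical sample records, for illustration only.
sample = [
    {"user": "alice", "status": "SUCCEEDED", "cpu_time_ms": 100},
    {"user": "alice", "status": "FAILED", "cpu_time_ms": 300},
    {"user": "bob", "status": "SUCCEEDED", "cpu_time_ms": 200},
]
summary = summarize(sample)
```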

Context Metric Distributions

The Context Metric Distributions panel displays a summary of jobs as a Sankey diagram. By default, the chart displays the distribution by Duration.

You can choose to display the distribution by the following metrics:

| Metric Name | Description |
|---|---|
| Duration | The time taken by the query. |
| Threads Total Time | The sum of thread CPU time, storage wait, and network wait times used by all threads of the query. |
| Thread CPU Time | The sum of the CPU time used by all threads of the query. |
| Per Node Peak Mem Usage | The peak memory usage per node. |
| HDFS Aggr Read Memory | The sum of HDFS bytes read from memory by the query. |

Impala Configuration Properties Summary Panel

The following table describes the details displayed in the Impala configuration properties summary panel:

| Metric | Description |
|---|---|
| Max Memory | The maximum amount of aggregate memory available across the cluster to all queries executing in this resource pool. |
| Max Running Queries | The maximum number of queries that can run concurrently in this resource pool. |
| Max Queued Queries | The maximum number of queries that can wait in the queue for this resource pool. |
| Queued Timeout | The amount of time, in milliseconds, that a query waits in the admission control queue for this pool before being canceled. |
| Minimum Query Memory Limit | The minimum per-host memory limit that Impala admission control chooses for queries in this resource pool. |
| Maximum Query Memory Limit | The maximum per-host memory limit that Impala admission control chooses for queries in this resource pool. |
| Clamp Memory Limit Query Option | Controls whether a memory limit that a user specifies as a query option, overruling Impala's decision on the memory limit, is bound by the pool limits. If set to TRUE, the memory limit is bound by the Minimum Query Memory Limit and the Maximum Query Memory Limit. If set to FALSE, the user-specified memory limit is not bound by these pool limits. |
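
The clamping behavior can be sketched as a simple bounds check. This is an illustrative sketch only, not Impala's implementation: the function name and parameters are hypothetical, values are in bytes, and it assumes that with clamping disabled the user-specified limit is used unchanged.

```python
def effective_mem_limit(requested, pool_min, pool_max, clamp=True):
    """Return the per-host memory limit after applying the pool bounds.

    requested: memory limit the user asked for (hypothetical input).
    pool_min / pool_max: Minimum and Maximum Query Memory Limit of the pool.
    clamp: stands in for the Clamp Memory Limit Query Option setting.
    """
    if not clamp:
        # Assumption: an unclamped request is passed through as-is.
        return requested
    # Bound the request to the [pool_min, pool_max] range.
    return max(pool_min, min(requested, pool_max))

GIB = 1024 ** 3
too_small = effective_mem_limit(1 * GIB, 2 * GIB, 8 * GIB)                 # raised to the minimum
too_large = effective_mem_limit(16 * GIB, 2 * GIB, 8 * GIB)                # lowered to the maximum
unclamped = effective_mem_limit(16 * GIB, 2 * GIB, 8 * GIB, clamp=False)   # left unchanged
```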

Other Impala Charts

The following charts are also displayed on the Impala Dashboard.

| Chart Name | Description |
|---|---|
| Query execution count | The number of queries executed on the overall Impala cluster. |
| Average query time | The average query execution time and the total query execution time of queries executed on the overall Impala cluster. |
| Top 20 Users (by query) | The top 20 users that ran the highest number of queries within the selected timeframe. By default, you can see the top 20 users for the last 24 hours. |
| Top 20 Tables (by query) | The top 20 tables that were accessed within the selected timeframe. By default, you can see the top 20 tables for the last 24 hours. |
| Queries Chart | Displays the following metrics for a particular pool. Total Timed Out: the total number of queries that timed out in the queue. Total Rejected: the total number of queries in the queue that were rejected. Total Admitted: the total number of queries admitted into the queue. |
| Avg Wait Time in Queue Chart | The average waiting time of the queries in the queue for a particular pool. |
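
To make the "Top 20" ranking concrete, here is a minimal sketch that counts queries per user and keeps the most active ones. The `user` field and the sample records are hypothetical; how Pulse actually computes the chart is not described here.

```python
from collections import Counter

def top_users(queries, limit=20):
    """Return (user, query_count) pairs, highest count first."""
    counts = Counter(q["user"] for q in queries)
    return counts.most_common(limit)

# Hypothetical records for queries run within the selected timeframe.
records = [{"user": "alice"}, {"user": "bob"}, {"user": "alice"}]
ranking = top_users(records)
```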

Resource Pools

You can view data specific to a resource pool. To view data on the dashboard for a particular pool, perform the following steps:

  1. Click to view the list of resource pools. Click to hide the pool.
  2. Click on the name of the pool. The data corresponding to the selected pool is displayed in the dashboard.
  3. (Optional) You can search for the name of the pool by using the search box.

Impala Metrics

You can use the following Impala coordinator metrics to create dashboards and alerts.

| Metric | Description |
|---|---|
| impala_thrift_server_beeswax_frontend_timedout_cnxn_requests | The number of Beeswax API connection requests to this Impala Daemon that have timed out waiting to be set up. |
| impala_thrift_server_beeswax_frontend_connection_setup_queue_size | The number of Beeswax API connections to this Impala Daemon that have been accepted and are waiting to be set up. |
| impala_thrift_server_hiveserver2_http_frontend_connection_setup_time | The amount of time clients of the HiveServer2 HTTP API spent waiting for the connection to be set up. |
| impala_thrift_server_hiveserver2_http_frontend_timedout_cnxn_requests | The number of HiveServer2 HTTP API connection requests to this Impala Daemon that have timed out waiting to be set up. |
| thread_manager_running_threads | The number of running threads in this process. |
| catalog_server_client_cache_total_clients | The total number of clients in the Catalog Server client cache. |
| tmp_file_mgr_scratch_space_bytes_used_dir_0 | The current total spilled bytes for a single scratch directory. |
| impala_thrift_server_beeswax_frontend_svc_thread_wait_time | The amount of time clients of the Beeswax API spent waiting for service threads. |
| catalog_server_client_cache_clients_in_use | The number of clients currently in use by the Catalog Server client cache. |
| impala_thrift_server_hiveserver2_frontend_total_connections | The total number of HiveServer2 API connections made to this Impala Daemon over its lifetime. |
| impala_thrift_server_hiveserver2_http_frontend_svc_thread_wait_time | The amount of time clients of the HiveServer2 HTTP API spent waiting for service threads. |
| kudu_client_version | A version string identifying the Kudu client. |
| tzdata_path | The path to the time zone database. |
| mem_tracker_process_bytes_freed_by_last_gc | The amount of memory freed by the last memory tracker garbage collection. |
| thread_manager_total_threads_created | The number of threads created over the lifetime of the process. |
| tmp_file_mgr_active_scratch_dirs_list | The set of all active scratch directories for spilling to disk. |
| impala_thrift_server_beeswax_frontend_connection_setup_time | The amount of time clients of the Beeswax API spent waiting for the connection to be set up. |
| tmp_file_mgr_scratch_space_bytes_used_high_water_mark | The high water mark for spilled bytes across all scratch directories. |
| impala_thrift_server_hiveserver2_frontend_connection_setup_time | The amount of time clients of the HiveServer2 API spent waiting for the connection to be set up. |
| statestore_subscriber_registration_id | The most recent registration ID for this subscriber with the statestore. Set to 'N/A' if no registration has been completed. |
| mem_tracker_process_bytes_over_limit | The amount of memory by which the process was over its memory limit the last time the memory limit was encountered. |
| mem_tracker_process_limit | The process memory tracker limit. |
| external_data_source_class_cache_misses | The number of cache misses in the External Data Source Class Cache. |
| process_start_time | The local start time of the process. |
| statestore_subscriber_statestore_client_cache_total_clients | The total number of StateStore subscriber clients in the Impala Daemon's client cache. These clients are for communication from this role to the StateStore. |
| impala_thrift_server_hiveserver2_http_frontend_total_connections | The total number of HiveServer2 HTTP API connections made to this Impala Daemon over its lifetime. |
| request_pool_service_resolve_pool_duration_ms | The time, in milliseconds, spent resolving request pools. |
| impala_thrift_server_hiveserver2_http_frontend_connection_setup_queue_size | The number of HiveServer2 HTTP API connections to this Impala Daemon that have been accepted and are waiting to be set up. |
| tmp_file_mgr_active_scratch_dirs | The number of active scratch directories for spilling to disk. |
| tmp_file_mgr_scratch_space_bytes_used | The current total spilled bytes across all scratch directories. |
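
As one possible way to turn these metrics into alerts, the sketch below compares two successive metric samples. It is illustrative only: the sample dictionaries, the function name, and the 0.9 utilization threshold are assumptions made for this example, and it assumes the timed-out request metrics behave as counters that only increase. It does not use any Pulse API.

```python
# Metrics from the table above that are treated here as ever-increasing counters.
TIMEOUT_COUNTERS = (
    "impala_thrift_server_beeswax_frontend_timedout_cnxn_requests",
    "impala_thrift_server_hiveserver2_http_frontend_timedout_cnxn_requests",
)

def evaluate_alerts(previous, current, cache_threshold=0.9):
    """Return alert messages derived from two hypothetical metric samples."""
    alerts = []
    # Alert when new connection requests timed out since the last sample.
    for name in TIMEOUT_COUNTERS:
        delta = current.get(name, 0) - previous.get(name, 0)
        if delta > 0:
            alerts.append(f"{name} increased by {delta}")
    # Alert when most Catalog Server client cache entries are in use.
    total = current.get("catalog_server_client_cache_total_clients", 0)
    in_use = current.get("catalog_server_client_cache_clients_in_use", 0)
    if total > 0 and in_use / total >= cache_threshold:
        alerts.append(f"catalog client cache utilization {in_use}/{total}")
    return alerts

# Hypothetical samples, for illustration only.
prev_sample = {TIMEOUT_COUNTERS[0]: 2}
curr_sample = {
    TIMEOUT_COUNTERS[0]: 5,
    "catalog_server_client_cache_total_clients": 20,
    "catalog_server_client_cache_clients_in_use": 19,
}
alerts = evaluate_alerts(prev_sample, curr_sample)
```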