Tez

Tez Dashboard

Apache Tez is a framework that enables you to build YARN-based big data processing applications for better execution speed.

Using Pulse, you can monitor the queries executed in Tez and the tables they use.

Click Tez > Dashboard in the left pane to access the Tez dashboard. The dashboard consists of summary panels, a Sankey diagram, and charts that display information about queries and other related metrics.

The default time range is Last 24 hrs. To view statistics for a custom date range, click the icon and select a time frame and time zone of your choice.

Summary Panel

The summary tiles display several aggregated values. You can click the number on each tile to view detailed information about that metric.

Metric Name: Description
Users: The total number of users.
# of Queries: The number of queries run during the selected time frame.
Avg CPU Allocated: The average CPU time across all queries.
Avg Memory Allocated: The average amount of memory allocated across all queries.
Succeeded: The number of queries that executed successfully.
Running: The number of queries that are in progress.
Failed: The number of queries that failed to execute.
Killed: The number of queries that were killed.
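
As an illustration of how these tiles relate to the underlying queries, the following minimal Python sketch aggregates a list of query records into the same values. The QueryRecord fields, the status strings, and the summarize function are hypothetical names chosen for this example; they are not part of Pulse or Tez.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean


@dataclass
class QueryRecord:
    # Hypothetical record shape; the real Pulse data model is not shown here.
    user: str
    status: str  # assumed values: "SUCCEEDED", "RUNNING", "FAILED", "KILLED"
    cpu_seconds: float
    memory_mb: float


def summarize(queries):
    """Aggregate query records into the values shown on the summary tiles."""
    status_counts = Counter(q.status for q in queries)
    return {
        "users": len({q.user for q in queries}),  # distinct users
        "queries": len(queries),
        # Guard against an empty time range, where mean() would raise.
        "avg_cpu_seconds": mean(q.cpu_seconds for q in queries) if queries else 0.0,
        "avg_memory_mb": mean(q.memory_mb for q in queries) if queries else 0.0,
        "succeeded": status_counts["SUCCEEDED"],
        "running": status_counts["RUNNING"],
        "failed": status_counts["FAILED"],
        "killed": status_counts["KILLED"],
    }
```

Calling summarize on the records that fall inside the selected time frame would yield one value per tile. A status that never occurs is reported as 0 because Counter returns 0 for missing keys.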

Context Metric Distributions

The Context Metric Distributions panel displays the summary of jobs as a Sankey diagram. It shows how the selected queue flows to users and then to queries.

The following screenshot is an example of a Context Metric Distributions Sankey chart of the last 24 hours displayed by Duration.

Sankey Diagram

To see the distribution in numbers, hover over the Sankey chart. You can gather the following information from the chart.

In the Queue category, you can observe the following:

  • 100% of queries are running in the default queue.

From the Users category, you can gather the following information:

  • 71.43% of queries are run by 3 users.
  • 14.29% of queries are run by 4 users.
  • 8.57% of queries are run by 3 users.
  • 5.71% of queries are run by 2 users.

From the Queries category, you can gather the following information:

  • 25 queries (71.43%) took between 6.19 and 10.76 seconds to execute.
  • 5 queries (14.29%) took between 17.95 and 22.12 seconds to execute.
  • 3 queries (8.57%) took between 23.12 and 25.83 seconds to execute.
  • 2 queries (5.71%) took between 11.34 and 12.13 seconds to execute.
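
The percentages in this example follow from the query counts: the four duration buckets contain 25 + 5 + 3 + 2 = 35 queries in total, and each share is the bucket count divided by 35. A short Python sketch of that arithmetic, using only the numbers from the example (the shares helper is a hypothetical name):

```python
# Query counts per duration bucket, taken from the example above.
bucket_counts = {
    "6.19 s to 10.76 s": 25,
    "17.95 s to 22.12 s": 5,
    "23.12 s to 25.83 s": 3,
    "11.34 s to 12.13 s": 2,
}


def shares(counts):
    """Return each bucket's share of the total as a percentage, rounded to 2 places."""
    total = sum(counts.values())
    return {bucket: round(100 * n / total, 2) for bucket, n in counts.items()}


percentages = shares(bucket_counts)
for bucket, pct in percentages.items():
    print(f"{bucket}: {bucket_counts[bucket]} queries ({pct}%)")
```

The computed shares are 71.43%, 14.29%, 8.57%, and 5.71%, which match the values shown when hovering over the chart.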

You can view the Sankey chart by Duration, VCores, or Memory by selecting the option on the right side of the Context Metric Distributions panel. The following clip demonstrates this.

Other Tez Charts

The following charts are also displayed on the Tez Dashboard.

Chart Name: Description
VCore Usage: The number of virtual cores (VCores) used by a queue in the cluster.
Memory Usage: The amount of memory used by a queue in the cluster.
Query Execution Count: The number of queries executed within a time frame.
Average Query Time: The average time taken to execute queries. This metric also displays the Total Execution Time.
Top 20 Users (By Query): The 20 users that executed the highest number of queries.
Top 20 Tables (By Query): The 20 tables used by the highest number of queries.
Total Connections Hive_Metastore: The total number of established connections to the Hive Metastore over a specified time period. You can change the status of connections for the chart by clicking the status drop-down and selecting one of these options: Established, Listen, Close_wait, etc.
Total Connections Hive_Server2: The total number of established connections to Hive Server2 over a specified time period. You can change the status of connections for the chart by clicking the status drop-down and selecting one of these options: Established, Listen, Close_wait, etc.
Top 10 Connections Hive_Metastore: This bar chart ranks the top 10 connections to the Hive Metastore based on the number of established connections from different hosts. You can change the status of connections for the chart by clicking the status drop-down and selecting one of these options: Established, Listen, Close_wait, etc.
Top 10 Connections Hive_Server2: This bar chart ranks the top 10 connections to Hive Server2 based on the number of established connections from different hosts. You can change the status of connections for the chart by clicking the status drop-down and selecting one of these options: Established, Listen, Close_wait, etc.

Hive_Metastore and Hive_Server2 Connection Details
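
The ranking behind the Top 10 Connections charts can be approximated as follows. This is a minimal sketch, assuming connection data is available as (host, status) pairs; the top_hosts function, the sample host names, and the data shape are hypothetical and do not reflect how Pulse collects these metrics.

```python
from collections import Counter


def top_hosts(connections, status="Established", n=10):
    """Rank hosts by the number of connections they hold in the given status."""
    counts = Counter(host for host, conn_status in connections if conn_status == status)
    return counts.most_common(n)


# Hypothetical sample data: one (host, status) pair per connection.
sample = [
    ("host-a", "Established"),
    ("host-a", "Established"),
    ("host-a", "Established"),
    ("host-b", "Established"),
    ("host-b", "Close_wait"),
    ("host-c", "Listen"),
]

ranking = top_hosts(sample)
```

Passing a different status, such as "Close_wait", mirrors selecting another option in the status drop-down.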

Queues

In the Queues panel, you can see the root queue, the default queue, and any custom queues defined by the cluster administrator.

root: This is a predefined queue that is the parent of the available queues in your cluster. This queue uses 100% of resources.

default: A designated queue defined by the administrator. This queue contains jobs that do not have a queue allocated.

To view memory capacity allocated to or used by resources on a queue, click the queue in the Queues tab.
