Spark Thrift

Spark Thrift Dashboard

The Spark Thrift Dashboard provides an overview of the Spark Thrift service, which enables JDBC and ODBC clients to execute Spark SQL queries.
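As a point of reference, the following is a minimal sketch of a JDBC client for the Thrift server. The Spark Thrift server implements the HiveServer2 protocol, so it is reached with the standard Hive JDBC driver and URL scheme; the host name, database, credentials, and table name below are placeholders, and 10000 is simply Spark's default Thrift port.

```java
// Minimal sketch of a JDBC client for the Spark Thrift server.
// Requires the hive-jdbc driver (and its dependencies) on the classpath.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SparkThriftClient {
    public static void main(String[] args) throws Exception {
        // Modern drivers self-register; older hive-jdbc versions may need this.
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Placeholder host, port, database, and credentials.
        String url = "jdbc:hive2://thrift-host.example.com:10000/default";
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             Statement stmt = conn.createStatement();
             // Placeholder query; any Spark SQL statement works here.
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM my_table")) {
            while (rs.next()) {
                System.out.println("row count: " + rs.getLong(1));
            }
        }
    }
}
```

Each such connection and query shows up in the dashboard's user, application, and query metrics described below.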

To view the Spark Thrift Dashboard, click Spark Thrift > Dashboard. The dashboard consists of summary panels, a Sankey diagram with various metrics, and charts that display job information based on criteria such as memory and core utilization.

The default time range is Last 24 hrs. To view statistics for a custom date range, click the time range icon and select a time frame and timezone of your choice.

Summary Panels

The summary panels display the following job metrics:

  • Users: The total number of users.
  • # of Applications: The total number of applications.
  • Avg. CPU Allocated: The average CPU time allocated across all jobs.
  • Avg. Memory Allocated (MB): The average amount of memory allocated across all jobs, in megabytes.

Charts in Spark Thrift

Context Metric Distributions

The Context Metric Distributions panel displays the summary of jobs as a Sankey diagram.

By default, the chart displays the distribution by Duration. You can choose to display the distribution by Input Data, Output Data, Shuffle Reads, or Shuffle Writes from the drop-down list.

Core Usage by Locality

The Core Usage by Locality chart displays core usage broken down by the following locality types. The chart also displays the Core Used and Core Wasted values (in %). A configuration sketch for tuning locality follows the list.

  • Process Local: The tasks in this locality are run within the same process as the source data.
  • Node Local: The tasks in this locality are run on the same machine as the source data.
  • Rack Local: The tasks in this locality are run in the same rack as the source data.
  • Any: The tasks in this locality are run anywhere else in the cluster, not on the same node or rack as the source data.
  • No pref: The tasks in this locality have no locality preference.
  • Idle: Cores that are idle, that is, not running any tasks.
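If a large share of core time lands in the Any bucket, one standard lever is Spark's locality-wait settings, which control how long the scheduler holds a task while waiting for a better-placed slot. The sketch below uses the stock spark.locality.wait properties with their default values as a starting point; the values are illustrative only, and for a Thrift server these would normally be supplied at startup (for example, via spark-defaults.conf) rather than set in application code.

```java
import org.apache.spark.SparkConf;

public class LocalityTuning {
    public static void main(String[] args) {
        // How long the scheduler waits for a slot at each locality level
        // before falling back to the next one (process -> node -> rack -> any).
        // 3s is Spark's default; raising it favors locality over latency.
        SparkConf conf = new SparkConf()
            .set("spark.locality.wait", "3s")          // base wait for every level
            .set("spark.locality.wait.process", "3s")  // wait for a process-local slot
            .set("spark.locality.wait.node", "3s")     // wait for a node-local slot
            .set("spark.locality.wait.rack", "3s");    // wait for a rack-local slot
        System.out.println(conf.toDebugString());
    }
}
```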

Zooming in on Core Usage

You can take a closer look at core usage by zooming in on any section of the graph's timeline.

To zoom in, click and drag the mouse pointer across the section of the timeline you want to magnify. The second graph shows a closer view of the section you selected.

Other Charts in Spark Thrift

The following charts are also displayed on the Spark Thrift Dashboard:

  • VCore Usage: The number of virtual cores used by a queue in the cluster.
  • Memory Usage: The amount of memory used by a queue in the cluster during a particular timeframe.
  • Query Duration Distribution: The number of queries, grouped by duration.
  • Query Execution Count: The number of queries executed within a timeframe.
  • Average Query Time: The average time taken to execute queries. This chart also displays the Total Execution Time.
  • Top 20 Users (By Query): The 20 users who executed the highest number of queries.
  • Top 20 Tables (By Query): The 20 tables against which the highest number of queries were executed.
  • Storage Memory: The amount of storage memory used by the Spark Thrift application, including Used Memory and Total Memory (a configuration sketch follows the list).
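The Total Memory value in the Storage Memory chart is bounded by Spark's unified memory settings. As an illustration only (the values shown are Spark's defaults, not recommendations), the sketch below sets the two standard properties that determine that bound.

```java
import org.apache.spark.SparkConf;

public class StorageMemoryConfig {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            // Fraction of (JVM heap - 300 MB reserved) shared by
            // execution and storage; 0.6 is Spark's default.
            .set("spark.memory.fraction", "0.6")
            // Portion of that unified region protected for cached/storage
            // data against eviction by execution; 0.5 is the default.
            .set("spark.memory.storageFraction", "0.5");
        System.out.println(conf.toDebugString());
    }
}
```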

Spark SQL

The Spark SQL panel displays the list of thrift servers and their state, either Connected or Disconnected. You can filter the data displayed on the page by clicking the thrift server you want to view.
