YuniKorn leverages Prometheus to record metrics. The metrics system keeps track of the scheduler's critical execution paths to reveal potential performance bottlenecks. Currently, these metrics fall into three categories:
- scheduler: generic metrics of the scheduler, such as allocation latency and the number of applications.
- queue: each queue has its own metrics subsystem, which tracks the queue's status.
- event: records the various changes to events in YuniKorn.
all metrics are declared in
|Total number of attempts to allocate containers. State of the attempt includes
released. Increase only.
|Total number of application submissions. State of the attempt includes
rejected. Increase only.
|Total number of applications. State of the application includes
|Total number of nodes. State of the node includes
|Total resource usage of node, by resource name.
|Latency of the main scheduling routine, in milliseconds.
|Latency of sorting all nodes, in milliseconds.
|Latency of node condition checks for container allocations, such as placement constraints, in milliseconds.
|Latency of preemption condition checks for container allocations, in milliseconds.
|Queue application metrics. State of the application includes
|Queue resource metrics. State of the resource includes
|Total events created.
|Total events channeled.
|Total events not channeled.
|Total events processed.
|Total events stored.
|Total events not stored.
|Total events collected.
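For the metrics marked "Increase only", the absolute value is usually less informative than how fast it grows between two scrapes. The following is a minimal sketch of that calculation, assuming two samples of the same counter with timestamps in seconds. The function name and the sample values are hypothetical and are not part of YuniKorn.

```python
def counter_rate(prev_value, prev_time, curr_value, curr_time):
    """Per-second growth of an increase-only counter between two scrapes."""
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be ordered by increasing time")
    increase = curr_value - prev_value
    if increase < 0:
        # A drop means the process restarted and the counter began again
        # from zero, so everything counted so far is new growth.
        increase = curr_value
    return increase / elapsed


# Hypothetical samples: 100 allocations at t=0s, 160 allocations at t=30s.
print(counter_rate(100.0, 0.0, 160.0, 30.0))
```

Treating a drop as a restart from zero is a simplifying assumption; a monitoring system may handle resets with more care.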
YuniKorn metrics are collected through the Prometheus client library and exposed via the scheduler's RESTful service. Once the scheduler has started, the metrics can be accessed at the endpoint http://localhost:9080/ws/v1/metrics.
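To inspect the endpoint without a Prometheus server, the response can be fetched and parsed directly. The sketch below assumes the endpoint returns the standard Prometheus text exposition format. The helper names `fetch_metrics` and `parse_metrics` are hypothetical, and the parser is deliberately simplified: it ignores timestamps and assumes label values contain no commas or escaped quotes.

```python
import urllib.request

# Endpoint from the text above; adjust host and port for your deployment.
METRICS_URL = "http://localhost:9080/ws/v1/metrics"


def fetch_metrics(url=METRICS_URL, timeout=5):
    """Download the raw metrics text from the scheduler's REST service."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_metrics(text):
    """Turn exposition-format lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        labels = {}
        if "{" in line:
            open_idx = line.index("{")
            close_idx = line.rindex("}")
            name = line[:open_idx]
            for pair in line[open_idx + 1:close_idx].split(","):
                if pair.strip():
                    key, _, val = pair.partition("=")
                    labels[key.strip()] = val.strip().strip('"')
            value = float(line[close_idx + 1:].split()[0])
        else:
            parts = line.split()
            name, value = parts[0], float(parts[1])
        samples.append((name, labels, value))
    return samples
```

With a scheduler running locally, `parse_metrics(fetch_metrics())` returns one tuple per exposed sample, which can then be filtered by metric name or label.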
Aggregate Metrics to Prometheus
It's simple to set up a Prometheus server to scrape YuniKorn metrics periodically. Follow these steps:
- Set up Prometheus (see the Prometheus documentation for details).
- Configure the Prometheus scrape rules. A sample configuration:

    scrape_configs:
      - job_name: 'yunikorn'
        metrics_path: '/ws/v1/metrics'
        static_configs:
          - targets: ['docker.for.mac.host.internal:9080']

- Start Prometheus:

    docker pull prom/prometheus:latest
    docker run -p 9090:9090 -v /path/to/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus

Use docker.for.mac.host.internal instead of localhost if you are running Prometheus in a local Docker container on macOS. Once Prometheus has started, open its web UI at http://localhost:9090/graph. You'll see all available metrics from