What's next (future development)
- Gang scheduling
- Preemption phase 2
- Support spot instances for Spark scheduling
- Support for Kubernetes 1.17 and later
In this version, the Apache YuniKorn (Incubating) community focused on improving stability and configuration handling. The main features delivered in this release include:
- Core scheduler cache removal YUNIKORN-317
- Logging and tracing enhancement using OpenTracing YUNIKORN-387
- Kubernetes API upgrade to 1.16 YUNIKORN-373
- Application tracking API and CRD YUNIKORN-201
- Application and task priority support YUNIKORN-1
- Web UI refurbishment YUNIKORN-320
v0.9.0 (28 August 2020)
This release ships a number of improvements focused on the user experience.
Resource Quota Management
In this version, YuniKorn provides a seamless way to manage resource quotas for a Kubernetes cluster; it can work as an alternative to the namespace resource quota. There are three main advantages of using this feature compared to the namespace resource quota:
- The namespace resource quota counts resources at the admission phase, regardless of whether the pod actually uses them. This can lead to namespace resources not being used efficiently.
- The namespace resource quota is flat; it does not support hierarchical resource quota management.
- The resource quota admission controller rejects pods as soon as the quota is exceeded, which increases the complexity of the client-side code.
The resource quota management provided by YuniKorn is more efficient and simpler to set up, and it provides job queues to handle common scheduling ordering requirements.
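As a sketch of what hierarchical quota management can look like, the fragment below shows a YuniKorn-style queue configuration with guaranteed and maximum resources per queue. The queue names and resource values are purely illustrative, and the exact units and fields should be checked against the YuniKorn configuration documentation for the version in use:

```yaml
# Illustrative queue hierarchy with per-queue quotas (values are examples only)
partitions:
  - name: default
    queues:
      - name: root
        queues:
          - name: team-a
            resources:
              guaranteed:
                memory: 100000
                vcore: 10000
              max:
                memory: 200000
                vcore: 20000
          - name: team-b
            resources:
              max:
                memory: 50000
                vcore: 5000
```

Unlike a flat namespace ResourceQuota, quotas here can be nested, and enforcement happens at scheduling time rather than at admission.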
Job Ordering Policy: StateAware (optimized FIFO)
The StateAware application sorting policy orders jobs in a queue in FIFO order and schedules them one by one, subject to a condition: the scheduler waits for each application to enter a runnable state before releasing the next. This avoids a common race condition when submitting lots of batch jobs, e.g. Spark, to a single namespace (or cluster). By enforcing a certain ordering of jobs, it also makes scheduling more predictable. More explanation of this feature can be found in the documentation.
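The ordering behaviour described above can be modelled with a toy simulation. This is not YuniKorn's actual implementation; class and state names are illustrative. The idea is simply that apps stay in FIFO order, and an app behind a not-yet-running app must wait:

```python
from collections import deque

class App:
    def __init__(self, app_id):
        self.app_id = app_id
        self.state = "new"  # "new" until its first pod starts, then "running"

class StateAwareQueue:
    """Toy model of StateAware ordering: FIFO, but only the first
    not-yet-running application is released to the scheduler; the
    apps behind it wait until it reaches a runnable state."""
    def __init__(self):
        self.apps = deque()

    def submit(self, app):
        self.apps.append(app)

    def next_schedulable(self):
        # Apps that are already running keep scheduling; the first
        # "new" app is released, and everything behind it waits.
        for app in self.apps:
            if app.state == "running":
                continue
            if app.state == "new":
                return app
        return None

q = StateAwareQueue()
a, b = App("spark-1"), App("spark-2")
q.submit(a)
q.submit(b)
assert q.next_schedulable() is a  # spark-2 must wait behind spark-1
a.state = "running"
assert q.next_schedulable() is b  # once spark-1 runs, spark-2 is released
```

This is what prevents many simultaneously submitted Spark jobs from each grabbing a driver pod's worth of resources and deadlocking the queue.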
Work with the cluster-autoscaler
In this release, YuniKorn has been tested heavily to work well with the Kubernetes cluster-autoscaler, bringing maximum elasticity to the Kubernetes cluster. Several bugs were fixed and improvements made in this release.
Event cache system
In this release, an efficient event cache system is added to the scheduler. It caches key scheduling events in an in-memory store and publishes them to the Kubernetes event system when needed. More scheduling events are visible directly from Kubernetes by using the kubectl command, which greatly improves usability and debuggability.
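The buffering pattern behind such an event cache can be sketched as follows. This is a simplified illustration, not YuniKorn's code: events are recorded in memory and only pushed out in a batch, with the `publish` callable standing in for a Kubernetes event client:

```python
import time

class EventCache:
    """Toy sketch of an in-memory event cache: scheduling events are
    buffered and flushed to an external event sink on demand, instead
    of being published individually on every state change."""
    def __init__(self):
        self._events = []

    def record(self, obj, reason, message):
        self._events.append({"object": obj, "reason": reason,
                             "message": message, "ts": time.time()})

    def flush(self, publish):
        # publish is a callable standing in for the K8s event client
        for event in self._events:
            publish(event)
        count = len(self._events)
        self._events.clear()
        return count

published = []
cache = EventCache()
cache.record("pod/spark-driver", "Scheduling", "queued in root.default")
cache.record("pod/spark-driver", "Scheduled", "assigned to node-1")
assert cache.flush(published.append) == 2
assert published[0]["reason"] == "Scheduling"
```

Batching like this keeps the hot scheduling path cheap while still making the interesting events visible via `kubectl describe`/`kubectl get events`.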
More comprehensive web UI
The YuniKorn UI provides a better centralized view for resource management. A nodes page has been added to the UI to display detailed node information for the cluster. The apps page has been enhanced; it now provides a search box to find apps by queue or application ID.
v0.8.0 (4 May 2020)
This release ships a fully functional resource scheduler for Kubernetes with a number of useful features that empower users to run Big Data workloads on K8s. See more at Release Notes.
- Communication protocols between RM and scheduler-shim.
- gRPC interfaces.
- Scheduler plugin interfaces.
- Hierarchy queues with min/max resource quotas.
- Resource fairness between queues, users and apps.
- Cross-queue preemption based on fairness.
- Fair/Bin-packing scheduling policies.
- Placement rules (auto queue creation/mapping).
- Customized resource types (like GPU) scheduling support.
- Rich placement constraints support.
- Automatically map incoming container requests to queues by policies.
- Node partition: partition cluster to sub-clusters with dedicated quota/ACL management.
- Configuration hot-refresh.
- Stateful recovery.
- Metrics framework.
- Support for K8s predicates, such as pod affinity/anti-affinity and node selectors.
- Support Persistent Volumes, Persistent Volume Claims, etc.
- Load scheduler configuration from configmap dynamically (hot-refresh).
- 3rd-party operator/controller integration, pluggable app discovery.
- Helm chart support.
- Cluster overview page with brief info about the cluster.
- Read-only application view, including app info and task breakdown info.
- Read-only queue view, displaying queue structure, queue resource, usage info dynamically.
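To give a flavour of the placement rules feature listed above (auto queue creation/mapping), the fragment below sketches a rule that maps incoming pods to a queue named after their namespace and creates the queue on demand. The exact rule names and fields are an assumption here and should be verified against the YuniKorn placement-rules documentation:

```yaml
# Illustrative placement rule: map each pod to a queue named after
# its namespace, creating the queue automatically if it does not exist.
partitions:
  - name: default
    placementrules:
      - name: tag
        value: namespace
        create: true
```

With a rule like this, applications need no queue annotation at all; the scheduler derives the queue from the pod's namespace.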