Run NVIDIA GPU Jobs

Yunikorn with NVIDIA GPUs

This guide gives an overview of how to set up the NVIDIA Device Plugin, which enables users to run GPU workloads with Yunikorn. For more details, see NVIDIA device plugin for Kubernetes.

Prerequisite

Before following the steps below, Yunikorn must be deployed on a Kubernetes cluster that has GPUs.

Install NVIDIA Device Plugin

Add the nvidia-device-plugin helm repository.

helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm repo update
helm repo list

Verify the latest release version of the plugin is available.

helm search repo nvdp --devel
NAME CHART VERSION APP VERSION DESCRIPTION
nvdp/nvidia-device-plugin 0.14.1 0.14.1 A Helm chart for ...

Deploy the device plugin.

kubectl create namespace nvidia
helm install nvidia-device-plugin nvdp/nvidia-device-plugin \
--namespace nvidia \
--create-namespace \
--version 0.14.1

Check the status of the pods to ensure the NVIDIA device plugin is running.

kubectl get pods -A

NAMESPACE NAME READY STATUS RESTARTS AGE
kube-flannel kube-flannel-ds-j24fx 1/1 Running 1 (11h ago) 11h
kube-system coredns-78fcd69978-2x9l8 1/1 Running 1 (11h ago) 11h
kube-system coredns-78fcd69978-gszrw 1/1 Running 1 (11h ago) 11h
kube-system etcd-katlantyss-nzxt 1/1 Running 3 (11h ago) 11h
kube-system kube-apiserver-katlantyss-nzxt 1/1 Running 4 (11h ago) 11h
kube-system kube-controller-manager-katlantyss-nzxt 1/1 Running 3 (11h ago) 11h
kube-system kube-proxy-4wz7r 1/1 Running 1 (11h ago) 11h
kube-system kube-scheduler-katlantyss-nzxt 1/1 Running 4 (11h ago) 11h
nvidia nvidia-device-plugin-1659451060-c92sb 1/1 Running 1 (11h ago) 11h
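
A running pod does not by itself show that the nodes advertise GPUs to the scheduler. As a minimal sketch, the node list can be filtered for a non-zero `nvidia.com/gpu` allocatable value. The helper name `gpu_nodes` is hypothetical (it is not part of kubectl or the plugin), and the sketch assumes the two-column `NAME GPU` table produced by the `custom-columns` invocation shown in the comment.

```shell
# Hypothetical helper (an assumption, not part of kubectl or the plugin):
# reads a "NAME GPU" table on stdin, skips the header row, and prints the
# names of nodes whose GPU column is a positive integer. Nodes without the
# resource show "<none>" in that column and are skipped.
gpu_nodes() {
  awk 'NR > 1 && $2 ~ /^[0-9]+$/ && $2 + 0 > 0 { print $1 }'
}

# Example invocation against a live cluster (not executed here):
#   kubectl get nodes \
#     -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu' \
#     | gpu_nodes
```

If the output is empty, the device plugin has probably not registered the resource on any node yet.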

Testing NVIDIA Device Plugin

Create a gpu test yaml file.

# gpu-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  restartPolicy: Never
  containers:
    - name: cuda-container
      image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubi8
      resources:
        limits:
          nvidia.com/gpu: 1 # requesting 1 GPU
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule

Deploy the application.

kubectl apply -f gpu-pod.yaml

Check the pod status to ensure the app completed successfully.

kubectl get pod gpu-pod

NAME READY STATUS RESTARTS AGE
gpu-pod 0/1 Completed 0 9d

Check the result.

kubectl logs gpu-pod

[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
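
For scripted checks, the same result can be tested without reading the log by eye. This is only a sketch: the helper name `vectoradd_passed` is hypothetical, and it assumes the sample prints the `Test PASSED` line shown above.

```shell
# Hypothetical helper (an assumption for illustration): succeeds only if
# the log text on stdin contains the "Test PASSED" line printed by the
# vectoradd sample.
vectoradd_passed() {
  grep -q 'Test PASSED'
}

# Example invocation against a live cluster (not executed here):
#   kubectl logs gpu-pod | vectoradd_passed && echo "GPU test OK"
```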

Enable GPU Time-Slicing (Optional)

GPU time-slicing allows multiple tenants to share a single GPU. To learn how GPU time-slicing works, refer to Time-Slicing GPUs in Kubernetes. This page covers ways to enable GPU scheduling in Yunikorn using the NVIDIA GPU Operator.

Configuration

Specify multiple configurations in a ConfigMap as in the following example.

# time-slicing-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: time-slicing-config
  namespace: nvidia
data:
  a100-40gb: |-
    version: v1
    sharing:
      timeSlicing:
        resources:
          - name: nvidia.com/gpu
            replicas: 8
          - name: nvidia.com/mig-1g.5gb
            replicas: 2
          - name: nvidia.com/mig-2g.10gb
            replicas: 2
          - name: nvidia.com/mig-3g.20gb
            replicas: 3
          - name: nvidia.com/mig-7g.40gb
            replicas: 7
  rtx-3070: |-
    version: v1
    sharing:
      timeSlicing:
        resources:
          - name: nvidia.com/gpu
            replicas: 8

note

If the nodes do not have a100-40gb or rtx-3070 GPUs, modify the YAML file to match the GPU types that are present. For example, suppose the local Kubernetes cluster has only rtx-2080ti GPUs. The rtx-2080ti does not support MIG, so it cannot replace the a100-40gb entry. It does support time-slicing, so it can replace the rtx-3070 entry.

info

MIG support was added to Kubernetes in 2020. Refer to Supporting MIG in Kubernetes for details on how this works.

Create a ConfigMap in the operator namespace.

kubectl create namespace nvidia
kubectl create -f time-slicing-config.yaml

Install NVIDIA GPU Operator

Add the nvidia-gpu-operator helm repository.

helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm repo list

Enable shared access to GPUs with the NVIDIA GPU Operator.

  • For a fresh install of the NVIDIA GPU Operator with time-slicing enabled:

    helm install gpu-operator nvidia/gpu-operator \
    -n nvidia \
    --set devicePlugin.config.name=time-slicing-config
  • To enable time-slicing dynamically when the GPU Operator is already installed:

    kubectl patch clusterpolicy/cluster-policy \
    -n nvidia --type merge \
    -p '{"spec": {"devicePlugin": {"config": {"name": "time-slicing-config"}}}}'

Applying the Time-Slicing Configuration

There are two methods:

  • Across the cluster

    Patch the cluster policy with the time-slicing ConfigMap name and the default configuration.

    kubectl patch clusterpolicy/cluster-policy \
    -n nvidia --type merge \
    -p '{"spec": {"devicePlugin": {"config": {"name": "time-slicing-config", "default": "rtx-3070"}}}}'
  • On certain nodes

    Label the node with the required time-slicing configuration in the ConfigMap.

    kubectl label node <node-name> nvidia.com/device-plugin.config=rtx-3070
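
When several nodes need the same configuration, the label command can be generated per node and reviewed before it is run. The sketch below only prints commands and does not contact the cluster. The helper name `label_commands` and the node names `node-a` and `node-b` are hypothetical. The `--overwrite` flag is added on the assumption that a node may already carry an older value for this label.

```shell
# Hypothetical helper (an assumption for illustration): prints one
# "kubectl label node" command per node name. The first argument is the
# key in the time-slicing ConfigMap (for example rtx-3070); the remaining
# arguments are node names. Nothing is executed.
label_commands() {
  config=$1
  shift
  for node in "$@"; do
    printf 'kubectl label node %s nvidia.com/device-plugin.config=%s --overwrite\n' \
      "$node" "$config"
  done
}

# Example (prints two commands for review):
#   label_commands rtx-3070 node-a node-b
```

Printing first keeps the step reviewable. Once the output looks right, it can be piped to `sh` to apply the labels.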

Once the GPU Operator is installed and time-slicing is configured, check the status of the pods to ensure all the containers are running and the validation is complete.

kubectl get pods -n nvidia
NAME                                                          READY   STATUS      RESTARTS   AGE
gpu-feature-discovery-qbslx                                   2/2     Running     0          20h
gpu-operator-7bdd8bf555-7clgv                                 1/1     Running     0          20h
gpu-operator-node-feature-discovery-master-59b4b67f4f-q84zn   1/1     Running     0          20h
gpu-operator-node-feature-discovery-worker-n58dv              1/1     Running     0          20h
nvidia-container-toolkit-daemonset-8gv44                      1/1     Running     0          20h
nvidia-cuda-validator-tstpk                                   0/1     Completed   0          20h
nvidia-dcgm-exporter-pgk7v                                    1/1     Running     1          20h
nvidia-device-plugin-daemonset-w8hh4                          2/2     Running     0          20h
nvidia-device-plugin-validator-qrpxx                          0/1     Completed   0          20h
nvidia-operator-validator-htp6b                               1/1     Running     0          20h

Verify that the time-slicing configuration is applied successfully.

kubectl describe node <node-name>
...
Capacity:
  nvidia.com/gpu: 8
...
Allocatable:
  nvidia.com/gpu: 8
...
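
The value 8 follows from the configuration: with time-slicing, each physical GPU is advertised as `replicas` schedulable `nvidia.com/gpu` units, so one GPU with `replicas: 8` appears as 8. A minimal sketch of that arithmetic is below. The helper name `expected_gpu_count` is hypothetical, and the sketch assumes every GPU on the node uses the same `replicas` value and that the resource is not renamed.

```shell
# Hypothetical helper (an assumption for illustration): expected
# nvidia.com/gpu count advertised by a node under time-slicing.
#   $1 = number of physical GPUs on the node
#   $2 = replicas value from the time-slicing configuration
expected_gpu_count() {
  physical_gpus=$1
  replicas=$2
  echo $(( physical_gpus * replicas ))
}

# Example comparison against a live node (not executed here):
#   actual=$(kubectl get node <node-name> \
#     -o jsonpath='{.status.capacity.nvidia\.com/gpu}')
#   [ "$actual" = "$(expected_gpu_count 1 8)" ] && echo "time-slicing applied"
```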

Testing GPU Time-Slicing

Create a workload test file plugin-test.yaml.

# plugin-test.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nvidia-plugin-test
  labels:
    app: nvidia-plugin-test
spec:
  replicas: 5
  selector:
    matchLabels:
      app: nvidia-plugin-test
  template:
    metadata:
      labels:
        app: nvidia-plugin-test
    spec:
      tolerations:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
      containers:
        - name: dcgmproftester11
          image: nvidia/samples:dcgmproftester-2.1.7-cuda11.2.2-ubuntu20.04
          command: ["/bin/sh", "-c"]
          args:
            - while true; do /usr/bin/dcgmproftester11 --no-dcgm-validation -t 1004 -d 300; sleep 30; done
          resources:
            limits:
              nvidia.com/gpu: 1
          securityContext:
            capabilities:
              add: ["SYS_ADMIN"]

Create a deployment with multiple replicas.

kubectl apply -f plugin-test.yaml

Verify that all five replicas are running.

  • In pods

    kubectl get pods
    NAME                                  READY   STATUS    RESTARTS   AGE
    nvidia-plugin-test-677775d6c5-bpsvn   1/1     Running   0          8m8s
    nvidia-plugin-test-677775d6c5-m95zm   1/1     Running   0          8m8s
    nvidia-plugin-test-677775d6c5-9kgzg   1/1     Running   0          8m8s
    nvidia-plugin-test-677775d6c5-lrl2c   1/1     Running   0          8m8s
    nvidia-plugin-test-677775d6c5-9r2pz   1/1     Running   0          8m8s
  • In node

    kubectl describe node <node-name>
    ...
    Allocated resources:
    (Total limits may be over 100 percent, i.e., overcommitted.)
    Resource Requests Limits
    -------- -------- ------
    ...
    nvidia.com/gpu 5 5
    ...
  • In the NVIDIA System Management Interface

    nvidia-smi
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  NVIDIA GeForce ...  On   | 00000000:01:00.0  On |                  N/A |
    | 46%   86C    P2   214W / 220W |   4297MiB /  8192MiB |    100%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    |    0   N/A  N/A   1776886      C   /usr/bin/dcgmproftester11         764MiB |
    |    0   N/A  N/A   1776921      C   /usr/bin/dcgmproftester11         764MiB |
    |    0   N/A  N/A   1776937      C   /usr/bin/dcgmproftester11         764MiB |
    |    0   N/A  N/A   1777068      C   /usr/bin/dcgmproftester11         764MiB |
    |    0   N/A  N/A   1777079      C   /usr/bin/dcgmproftester11         764MiB |
    +-----------------------------------------------------------------------------+
  • In Yunikorn UI applications