How to run GPU Workloads on a K8s Cluster with kURL

Josh De Winne
Sep 15, 2022

Sometimes your Kubernetes application requires GPU-enabled nodes to run certain workloads. This is a typical requirement for applications that rely on AI or machine learning algorithms. When using kURL for an on-prem Kubernetes installation, it is possible to enable GPU support with some minor changes to the kURL spec. This blog post will guide you through the steps needed to enable GPUs.

IMPORTANT NOTE: This blog is an experimental example of kURL usage. While the use of the tomlConfig flag in containerd is currently supported by Replicated, this example is not currently an officially supported or tested use case with kURL. If you would like to see this become a supported use case for kURL, please reach out via our Alpha/Beta program or your account team.

Tutorial: create a GPU-enabled compute instance

For example, in GCE you can easily create a GPU-enabled VM using the following command. We tested this on a GCE instance of type n1-standard-8 with an nvidia-tesla-t4 GPU and an 80GB disk.

[.pre]export INSTANCE=gpu-demo   # any instance name will do
export IMAGE_PROJECT=ubuntu-os-cloud
export IMAGE_FAMILY=ubuntu-2204-lts
export MACHINE_TYPE=n1-standard-8
export ZONE=us-west2-b
# GPU instances don't support live migration, so the maintenance policy must be TERMINATE
gcloud compute instances create $INSTANCE --boot-disk-size=80GB --boot-disk-type=pd-ssd --image-project=$IMAGE_PROJECT \
--image-family=$IMAGE_FAMILY --machine-type=$MACHINE_TYPE \
--zone $ZONE \
--accelerator=count=1,type=nvidia-tesla-t4 \
--maintenance-policy=TERMINATE[.pre]

Install Nvidia drivers

As a first step, once the GCE instance has started, we need to install the NVIDIA drivers to be able to make use of the nvidia-tesla-t4. Log in to the instance using SSH and install the NVIDIA CUDA drivers:

Add linux headers:

[.pre]sudo apt-get install linux-headers-$(uname -r)[.pre]

Capture distribution:

[.pre]distribution=$(. /etc/os-release;echo $ID$VERSION_ID | sed -e 's/\.//g') \
&& wget https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64/cuda-$distribution.pin \
&& sudo mv cuda-$distribution.pin /etc/apt/preferences.d/cuda-repository-pin-600[.pre]
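The sed in the pipeline above strips the dot from the version so the value matches the layout of NVIDIA's apt repositories. A quick way to see what it produces (on the Ubuntu 22.04 image used above it yields ubuntu2204):

```shell
# Same capture as above, shown on its own; the sed removes the "." from
# VERSION_ID so "ubuntu" + "22.04" becomes "ubuntu2204".
distribution=$(. /etc/os-release; echo $ID$VERSION_ID | sed -e 's/\.//g')
echo "$distribution"
```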

Add gpg keys:

[.pre]sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64/3bf863cc.pub \
&& echo "deb https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64 /" | sudo tee /etc/apt/sources.list.d/cuda.list[.pre]

Install cuda drivers:

[.pre]sudo apt-get update \
&& sudo apt-get -y install cuda-drivers[.pre]
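Before moving on, it's worth sanity-checking the driver. This guarded check is our own sketch, not part of the original instructions; a reboot may be needed before nvidia-smi can reach the driver:

```shell
# Hypothetical sanity check: confirm the driver is live and can see the GPU.
# A reboot may be required after installing cuda-drivers.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
else
  echo "nvidia-smi not found - reboot or re-check the driver install"
fi
```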

Install Nvidia Container runtime

Once the drivers are installed, we need to make sure the container runtime used by Kubernetes can also make use of the GPU. This is achieved by installing the NVIDIA container runtime. On the instance, execute the commands below:

Capture distribution:

[.pre]distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list[.pre]
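Note that, unlike the CUDA repo step, this capture keeps the dot in the version (ubuntu22.04 rather than ubuntu2204), because that's how the nvidia-docker repository is laid out:

```shell
# No sed here: on Ubuntu 22.04 this yields "ubuntu22.04", with the dot kept.
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
echo "$distribution"
```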

Install nvidia-container-runtime:

[.pre]sudo apt-get update \
&& sudo apt-get install -y nvidia-container-runtime[.pre]

Install kURL

Now that we have the GPU drivers and the nvidia container runtime installed, we can install Kubernetes using the kURL project. By default, kURL makes use of the containerd runtime. However, it is possible to patch the containerd configuration and make use of the nvidia container runtime if needed. The kURL specification below is an example of how to patch containerd and use the nvidia runtime where possible:

[.pre]apiVersion: cluster.kurl.sh/v1beta1
kind: Installer
metadata:
  name: "gpu"
spec:
  containerd:
    version: 1.5.x
    tomlConfig: |
      [plugins]
        [plugins."io.containerd.grpc.v1.cri"]
          [plugins."io.containerd.grpc.v1.cri".containerd]
            default_runtime_name = "nvidia"
            [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
              [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
                privileged_without_host_devices = false
                runtime_engine = ""
                runtime_root = ""
                runtime_type = "io.containerd.runc.v1"
                [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
                  BinaryName = "/usr/bin/nvidia-container-runtime"
                  SystemdCgroup = true
  ekco:
    version: latest
  kotsadm:
    version: latest
    disableS3: true
  kubernetes:
    version: 1.23.x
  longhorn:
    version: 1.3.x
  registry:
    version: 2.8.x
  weave:
    version: 2.6.x[.pre]

If you want to install kURL with the patched config, you can use the installer below. It takes about 10 minutes to bootstrap the Kubernetes node.

kURL install with config.toml patch:

[.pre]curl -sSL https://kurl.sh/<your-installer-id> | sudo bash[.pre]

When the installation is complete you'll see output like below:

kURL installation complete message
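At this point you can optionally confirm that the tomlConfig patch landed and the node is healthy. These checks are a sketch of our own, assuming kURL's default config path of /etc/containerd/config.toml:

```shell
# Hypothetical post-install checks, run on the kURL node.
# 1. containerd should now default to the nvidia runtime:
if [ -f /etc/containerd/config.toml ]; then
  grep default_runtime_name /etc/containerd/config.toml
else
  echo "no /etc/containerd/config.toml on this machine"
fi
# 2. the node should report Ready:
if command -v kubectl >/dev/null 2>&1; then
  kubectl get nodes
fi
```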

Deploy a GPU App

Next we'll deploy a GPU application. In order to do so, we need the nvidia-device-plugin. This is a Helm chart, so for convenience we've packaged everything as a Replicated Application.

Browse to the Kotsadm endpoint shown in your kURL output, and accept the self-signed security certificate. When asked to log in, copy the password from the kURL output and log in.

Log in to GPU screen

Once you're logged in, it will ask for a license file. You can use the license from this blog post.

Upload your license file screen

Once the license is uploaded, it will ask which test application should be deployed. You can go with the defaults for now. We'll deploy the Video Analytics app later.

Configure GPU screen

Preflight checks ensure that the environment we deploy into meets the requirements.

Preflight checks collecting information

If everything looks OK, press Continue, and the first pod requiring the GPU will be deployed. You can check by going to Troubleshoot and generating a support bundle.

Troubleshoot analyzing GPU

Once finished, you should see an Analyzer tile showing "Yes, GPU workloads are working".
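For reference, a workload that exercises the GPU simply requests the nvidia.com/gpu resource exposed by the device plugin. The minimal pod spec below is an illustration of ours; the image and names are assumptions, not taken from the packaged application:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vectoradd
spec:
  restartPolicy: OnFailure
  containers:
    - name: cuda-vectoradd
      # Sample CUDA image (assumed); any image running a CUDA binary works here.
      image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubuntu20.04
      resources:
        limits:
          nvidia.com/gpu: 1  # scheduled only on nodes where the device plugin exposes a GPU
```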

GPU operator test working

Deploy Deepstream

Now that we know the GPU is working, let's deploy a more interesting app: the Deepstream video analytics app.

In the Application installer, go to Config, uncheck Run a simple gpu pod, and check Deploy deepstream video analytics. This will generate a new version. Wait for the preflights to be successful, and hit Deploy.

GPU examples screen

If you go to the Dashboard, you can click on Open deepstream. It will open a new web page, and video analytics on a traffic stream will start. (You might have to disable your ad blocker.)

Video analytics on a traffic stream

Add k8slove

As a last step, go back to the config screen, and check the box to add some k8s love.

K8s love GPU example

Deploy the new version, and once everything is properly running, you can click on Open deepstream again. Have fun!

K8s love video stream

Additional info