Sometimes your Kubernetes application requires GPU-enabled nodes to run certain workloads. This is a typical requirement for applications that rely on AI or machine learning algorithms. When using kURL for an on-prem Kubernetes installation, it is possible to enable GPU support with some minor changes to the kURL spec. This blog post will guide you through the steps needed to enable GPUs.
IMPORTANT NOTE: This blog is an experimental example of kURL usage. While the use of the tomlConfig flag in containerd is currently supported by Replicated, this example is not currently an officially supported or tested use case with kURL. If you would like to see this become a supported use case for kURL, please reach out via our Alpha/Beta program or your account team.
For example, in GCE you can easily create a GPU-enabled VM using the following command. We tested this on a GCE instance of type n1-standard-8 with an nvidia-tesla-t4 GPU and an 80GB disk.
[.pre]gcloud compute instances create $INSTANCE --boot-disk-size=80GB --boot-disk-type=pd-ssd \
--image-project=$IMAGE_PROJECT --image-family=$IMAGE_FAMILY --machine-type=$MACHINE_TYPE \
--accelerator=type=nvidia-tesla-t4,count=1 --maintenance-policy=TERMINATE \
--zone=$ZONE[.pre]
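The command above assumes a handful of environment variables. Something like the following matches the setup described in this post; the instance name and zone are hypothetical placeholders, and the image project/family are assumptions for an Ubuntu 20.04 boot disk (the apt-get commands later in this post assume a Debian/Ubuntu image):

```shell
INSTANCE=kurl-gpu-demo          # hypothetical instance name
IMAGE_PROJECT=ubuntu-os-cloud   # assumed: Ubuntu image project
IMAGE_FAMILY=ubuntu-2004-lts    # assumed: Ubuntu 20.04 LTS image family
MACHINE_TYPE=n1-standard-8      # machine type tested in this post
ZONE=us-central1-a              # hypothetical zone; pick one with T4 availability
echo "$MACHINE_TYPE in $ZONE"
```

Note that not every zone offers nvidia-tesla-t4 accelerators, so check availability for the zone you choose.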
As a first step, once the GCE instance has started, we need to install the NVIDIA drivers to be able to make use of the nvidia-tesla-t4. Log in to the instance using SSH and install the NVIDIA CUDA drivers:
Add linux headers:
[.pre]sudo apt-get install linux-headers-$(uname -r)[.pre]
[.pre]distribution=$(. /etc/os-release;echo $ID$VERSION_ID | sed -e 's/\.//g') \
&& wget https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64/cuda-$distribution.pin \
&& sudo mv cuda-$distribution.pin /etc/apt/preferences.d/cuda-repository-pin-600[.pre]
Add gpg keys:
[.pre]sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64/7fa2af80.pub \
&& echo "deb http://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64 /" | sudo tee /etc/apt/sources.list.d/cuda.list[.pre]
Install cuda drivers:
[.pre]sudo apt-get update \
&& sudo apt-get -y install cuda-drivers[.pre]
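Before moving on, it is worth sanity-checking the driver install. The nvidia-smi utility ships with the driver package, so a quick check like the one below (run on the GCE instance) should list the Tesla T4; the guard makes it safe to run even where no GPU is present:

```shell
# Verify the NVIDIA driver install; nvidia-smi is installed by cuda-drivers.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi
else
  echo "nvidia-smi not found - driver install did not complete"
fi
```

If nvidia-smi reports "NVIDIA-SMI has failed" rather than a device table, a reboot of the instance is usually needed to load the kernel module.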
Once the drivers are installed, we need to make sure our container runtime for k8s can also make use of the GPU. This is achieved by installing the Nvidia container runtime. On the instance execute the commands below:
[.pre]distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list[.pre]
[.pre]sudo apt-get update \
&& sudo apt-get install -y nvidia-container-runtime[.pre]
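As with the drivers, a quick check confirms the runtime landed on the PATH before we hand it to Kubernetes. The nvidia-container-runtime binary is a wrapper around runc, so it accepts runc's --version flag:

```shell
# Confirm the NVIDIA container runtime is installed before running kURL.
if command -v nvidia-container-runtime >/dev/null 2>&1; then
  nvidia-container-runtime --version
else
  echo "nvidia-container-runtime not found"
fi
```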
Now that we have the GPU drivers and the NVIDIA container runtime installed, we can install Kubernetes using the kURL project. By default kURL will make use of the containerd runtime. However, there is a way to patch the containerd configuration and make use of the NVIDIA container runtime if needed. The configuration below is an example of how to patch containerd and make use of the nvidia runtime where possible:
[.pre][plugins."io.containerd.grpc.v1.cri".containerd]
  default_runtime_name = "nvidia"
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
    privileged_without_host_devices = false
    runtime_engine = ""
    runtime_root = ""
    runtime_type = "io.containerd.runc.v1"
    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
      BinaryName = "/usr/bin/nvidia-container-runtime"
      SystemdCgroup = true[.pre]
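In a kURL spec, this containerd patch is supplied through the containerd add-on's tomlConfig field mentioned in the note at the top of this post. The sketch below shows the shape of such a spec; the installer name and the "latest" version pins are assumptions (pin concrete versions in practice), and the TOML body is abbreviated to the default-runtime line:

```yaml
apiVersion: cluster.kurl.sh/v1beta1
kind: Installer
metadata:
  name: kurl-gpu            # hypothetical installer name
spec:
  kubernetes:
    version: latest         # assumed; pin a specific version in practice
  containerd:
    version: latest         # assumed
    tomlConfig: |
      [plugins."io.containerd.grpc.v1.cri".containerd]
        default_runtime_name = "nvidia"
```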
If you want to install kURL with the patched config, you can use the installer below. Bootstrapping the Kubernetes node takes about 10 minutes.
kURL install with config.toml patch:
[.pre]curl -sSL https://kurl.sh/kurl-gpu | sudo bash[.pre]
When the installation is complete you'll see output like below:
Next we'll deploy a GPU application. In order to do so, we need the nvidia-device-plugin. This is a Helm chart, so for convenience, we've packaged everything as a Replicated application.
Browse to the Kotsadm endpoint shown in your kURL output and accept the self-signed security certificate. When prompted to log in, copy the password from the kURL output and log in.
Once you're logged in, it will ask for a license file. You can use the license from this blog post.
Once the license is uploaded, it will ask which test application should be deployed. You can go with the defaults for now. We'll deploy the Video Analytics app later.
Preflight checks ensure that the environment we deploy into meets the requirements.
If everything looks OK, press Continue, and the first pod requiring the GPU will be deployed. You can verify this by going to Troubleshoot and generating a support bundle.
Once finished, you should see an Analyzer tile showing Yes, GPU workloads are working.
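If you prefer to verify the GPU from the command line instead of (or in addition to) the support bundle, a small pod that requests an nvidia.com/gpu resource works too. This is a sketch; the pod name and the image tag are assumptions:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test        # hypothetical name
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:11.0.3-base-ubuntu20.04   # assumed image/tag
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1     # served by the nvidia-device-plugin
```

After `kubectl apply -f gpu-smoke-test.yaml`, `kubectl logs gpu-smoke-test` should show the Tesla T4 in the nvidia-smi device table.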
Now that we know the GPU is working, let's deploy a more interesting app: the DeepStream Video Analytics app.
In the Application installer, go to Config, uncheck Run a simple gpu pod, and check Deploy deepstream video analytics. This will generate a new version. Wait for the preflights to be successful, and hit Deploy.
If you go to the Dashboard, you can click Open deepstream. A new webpage will open, and video analytics on a traffic stream will start. (You might have to disable your ad blocker.)
As a last step, go back to the config screen, and check the box to add some k8s love.
Deploy the new version, and once everything is properly running, you can click on Open deepstream again. Have fun!