Proxmox vGPU – v3

Intro

Running a vGPU has become incredibly easy nowadays. In fact, the few steps that need to be taken can easily be incorporated into a shell script. This includes verifying the Proxmox version, downloading and installing the appropriate drivers, and even checking for a compatible GPU. All of these tasks can be accomplished by downloading and running a single shell script. That’s exactly what I did – I wrote a script to simplify the process.

The script is updated frequently, based on the debug.log files that readers submit. See the changelog.

Proxmox

Whether you’re running Proxmox 7.4 or 8.x, this script will automatically check for and install all necessary packages, download and build the remaining packages, and edit the configuration files, all on its own.
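
To give an idea of what such a version check can look like, here is a minimal sketch. It is not lifted from the installer; the function names parse_pve_version and is_supported_pve are made up for this example, and it assumes pveversion prints a line of the form pve-manager/<version>/<hash>.

```shell
#!/bin/sh
# Extract the version field from a pveversion-style line such as
# "pve-manager/8.0.4/d258a813cfa6b390 (running kernel: 6.2.16-15-pve)".
parse_pve_version() {
    printf '%s\n' "$1" | cut -d/ -f2
}

# Return 0 for Proxmox 7.4 and newer (including 8.x), 1 otherwise.
is_supported_pve() {
    major=${1%%.*}              # text before the first dot
    rest=${1#*.}                # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of the remainder
    if [ "$major" -ge 8 ]; then return 0; fi
    if [ "$major" -eq 7 ] && [ "$minor" -ge 4 ]; then return 0; fi
    return 1
}

# Only query the host when pveversion is actually available.
if command -v pveversion >/dev/null 2>&1; then
    version=$(parse_pve_version "$(pveversion)")
    if is_supported_pve "$version"; then
        echo "Proxmox $version is supported"
    else
        echo "Proxmox $version is older than 7.4"
    fi
fi
```

On a machine without Proxmox the last block is simply skipped, so the two functions can be tried out anywhere.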

Check GPU

All tests have been conducted on an Nvidia 1060 6GB and an Nvidia 2070 Super 8GB, running on Proxmox 7.4 and 8.x. The hardware requirements remain the same as in previous versions of vgpu_unlock, and the more VRAM your GPU has onboard, the better.

Before doing anything, let’s check if your GPU is compatible. Type in the chip your GPU uses (for example 1060 or 2080).

If that returns a compatible GPU, we can proceed.

When running two Nvidia GPUs, you need to exclude one of them (reserving it for passthrough only) and use just one for vGPU purposes.
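
If you are not sure which Nvidia cards the host actually sees, something like the sketch below can help. It filters the output of lspci -nn on NVIDIA's PCI vendor ID (10de) and the VGA/3D controller classes (0300 and 0302). The function name list_nvidia_gpus is my own, and this only lists the cards; it does not tell you whether a chip is supported by vgpu_unlock.

```shell
#!/bin/sh
# Read `lspci -nn` output on stdin and keep only NVIDIA display adapters.
# [0300] = VGA compatible controller, [0302] = 3D controller, 10de = NVIDIA.
list_nvidia_gpus() {
    grep -E '\[(0300|0302)\]' | grep -i '\[10de:'
}

if command -v lspci >/dev/null 2>&1; then
    gpu_count=$(lspci -nn | list_nvidia_gpus | wc -l | tr -d ' ')
    echo "Nvidia GPUs found: $gpu_count"
    if [ "$gpu_count" -gt 1 ]; then
        echo "More than one Nvidia GPU: keep one of them for passthrough only"
    fi
fi
```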

Step 1

The initial step, which you need to perform on your own (if you haven’t already), is to enable VT-d/IOMMU in the BIOS. For Intel systems, look for the term VT-d, and for AMD systems, look for IOMMU. Enable this feature, and then save and exit the BIOS.

When that’s done, boot up the server, log in to the Proxmox host using SSH and download the script:

curl -O https://raw.githubusercontent.com/wvthoog/proxmox-vgpu-installer/main/proxmox-installer.sh && chmod +x proxmox-installer.sh

And launch it

./proxmox-installer.sh

Yep, that’s right, a single Bash script designed to handle everything. It is divided into two steps, with a reboot of the Proxmox server in between. Let’s begin with Step 1.

When you first launch the script, it will display the base menu. From here, you can select the option that fits your requirements:

  • New vGPU Installation: Select this option if you don’t have any Nvidia (vGPU) drivers installed.
  • Upgrade vGPU Installation: Select this option if you have a previous Nvidia (vGPU) driver installed and want to upgrade it.
  • Remove vGPU Installation: Select this option if you want to remove a previous Nvidia (vGPU) driver from your system.
  • License vGPU: Select this option if you want to license the vGPU using FastAPI-DLS (ignore for now)

For demonstration purposes I’ve chosen option 1: “New vGPU Installation”.

Let the script proceed with updating the system, downloading repositories, building vgpu_unlock-rs, and making changes to various configuration files. Once the process is complete, press “y” to reboot the system.

Step 2

After the server has finished rebooting, log in once more using SSH. Run the script again using the same command as in Step 1.

./proxmox-installer.sh

A configuration file (config.txt) has been automatically created to keep track of the current step.
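
I won’t go into the exact contents of config.txt here, but the idea of keeping track of a step across a reboot is simple enough to sketch. Everything in this example is hypothetical: the file name config.txt.example, the step= key and the function names are chosen for illustration and may not match what the installer writes.

```shell
#!/bin/sh
# Hypothetical state file; the real config.txt may use a different layout.
STATE_FILE="${STATE_FILE:-./config.txt.example}"

# Remember which step should run next.
save_step() {
    printf 'step=%s\n' "$1" > "$STATE_FILE"
}

# Print the saved step, or 1 when no state file exists yet.
load_step() {
    if [ -f "$STATE_FILE" ]; then
        sed -n 's/^step=//p' "$STATE_FILE"
    else
        echo 1
    fi
}
```

A script built this way would call save_step 2 right before asking for the reboot, and load_step at startup to decide where to continue.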

In this step, the script checks if VT-d or IOMMU is properly loaded and verifies the presence of an Nvidia card in your system. Then it displays a menu allowing you to choose which driver version to download. For Proxmox 8.x, you need to download version 16.x, and for Proxmox 7.x, download either version 16.x or 15.x.
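
If you want to verify the IOMMU yourself, one common way is to look at /sys/kernel/iommu_groups, which contains one numbered directory per group once the IOMMU is active. The sketch below counts them. I’m not claiming the installer checks it this way; count_iommu_groups is just a name picked for this example.

```shell
#!/bin/sh
# Count the IOMMU groups in the given directory
# (defaults to the kernel's sysfs location).
count_iommu_groups() {
    dir="${1:-/sys/kernel/iommu_groups}"
    if [ -d "$dir" ]; then
        ls "$dir" | wc -l | tr -d ' '
    else
        echo 0
    fi
}

iommu_group_count=$(count_iommu_groups)
if [ "$iommu_group_count" -gt 0 ]; then
    echo "IOMMU is active ($iommu_group_count groups)"
else
    echo "No IOMMU groups found, check the BIOS setting from Step 1"
fi
```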

The script will download the vGPU host driver from a Megadownload repository I’ve found and patch it. It will then install and load the patched driver. Finally, the script will present you with two URLs: one for Windows and another for Linux. These are the GRID (guest) drivers for your VMs. Write down or copy both URLs. You’ll need them later to install the Nvidia drivers in your VMs.

And that’s it, the host vGPU driver is now installed, concluding the installation on the server side. If there were any errors, please refer to the debug.log file in the directory you launched the script from:

cat debug.log
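
The log can get long, so instead of reading all of it you can pull out only the suspicious lines. A small sketch, assuming that problems show up as lines containing “error” or “fail” (that is a guess on my part, not a documented log format); show_log_errors is a hypothetical helper name.

```shell
#!/bin/sh
# Print the lines of a log file that mention an error or a failure,
# prefixed with their line numbers.
show_log_errors() {
    log="${1:-debug.log}"
    if [ ! -f "$log" ]; then
        echo "No log file at $log"
        return 1
    fi
    grep -inE 'error|fail' "$log" || echo "No error lines found in $log"
}
```

Run it as show_log_errors debug.log from the directory that holds the log.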

We can now proceed to add a vGPU to a VM.

Licensing

For now, just say “n” to any request to license the vGPU.

I will update this part once I’m satisfied that the script handles the FastAPI-DLS installation correctly (it can’t be installed on Proxmox 7, since that runs Debian Bullseye).

VM Install

At the last step of the installation process the script instructs you to run the mdevctl types command. This command lists all the vGPU types you have at your disposal.
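
The list is long, so it can help to filter it down to the Q profiles. The sketch below does that with awk. It assumes the usual layout of the output, where each type ID (nvidia-NNN) sits on its own line and is followed by a “Name:” line; if your output looks different, the patterns need adjusting. list_q_profiles is a name I made up for this example.

```shell
#!/bin/sh
# Read `mdevctl types` output on stdin and print "<type> -> <name>"
# for every profile whose name ends in Q.
list_q_profiles() {
    awk '
        /^[ \t]*nvidia-[0-9]+[ \t]*$/ { type = $1 }
        /^[ \t]*Name:/ && /Q[ \t]*$/ {
            sub(/^[ \t]*Name:[ \t]*/, "")
            print type " -> " $0
        }
    '
}

if command -v mdevctl >/dev/null 2>&1; then
    mdevctl types | list_q_profiles
fi
```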

The mdev type you choose depends largely (but not entirely) on the amount of VRAM you have available. For example, if you have an Nvidia 2070 Super with 8GB of VRAM, you can split it into these Q profiles:

nvidia-259 offers 2x 4GB
nvidia-257 offers 4x 2GB
nvidia-256 offers 8x 1GB
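
The arithmetic behind those numbers is plain integer division of the card’s VRAM by the framebuffer size of the profile. A minimal sketch (max_instances is a hypothetical helper, and the result is an upper bound based on VRAM alone):

```shell
#!/bin/sh
# How many vGPU instances of a given framebuffer size fit into the VRAM?
# Both arguments are in MB.
max_instances() {
    total_mb=$1
    profile_mb=$2
    echo $(( total_mb / profile_mb ))
}

# The 8GB card from the example above, split into 4GB, 2GB and 1GB profiles.
for size in 4096 2048 1024; do
    echo "${size} MB profile: $(max_instances 8192 "$size") instances"
done
```

For the 8GB card this gives 2, 4 and 8 instances, matching the three profiles listed above.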

Choose the profile that suits your needs and then follow these steps in the Proxmox web GUI:

  1. Click on the VM you want to assign a vGPU to
  2. Click on the Hardware tab
  3. At the top click on Add and select PCI Device
  4. Select Raw Device and select the Nvidia GPU (it should be listed as a Mediated Device)
  5. Now select the desired profile in MDev Type
  6. Click Add to assign it to your VM
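
If you prefer the command line, the same assignment can be made with qm set and the hostpci0 option. The sketch below only prints the command, so nothing is changed until you run the printed line yourself on the Proxmox host. The VM ID 100 and the PCI address 01:00.0 are placeholders, and build_qm_command is a helper name invented for this example.

```shell
#!/bin/sh
# Build the qm command that attaches a mediated device (vGPU) to a VM.
# Printing it instead of executing it keeps this a dry run.
build_qm_command() {
    vmid=$1
    pci_addr=$2
    mdev_type=$3
    printf 'qm set %s -hostpci0 %s,mdev=%s\n' "$vmid" "$pci_addr" "$mdev_type"
}

# Placeholder values: replace the VM ID and PCI address with your own.
build_qm_command 100 01:00.0 nvidia-259
```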

And you’re done.

The vGPU is now assigned to the VM, and you’re ready to launch the VM and install the Nvidia GRID (guest) drivers.

Linux

To install the guest driver, first, update the system.

sudo apt update && sudo apt dist-upgrade

After updating the system, proceed to install the kernel headers, which are required for the Nvidia driver installation.

sudo apt install linux-headers-$(uname -r) 

Next, download the Nvidia driver using the URL you copied in Step 2 of the installation process on the Proxmox side.

I’m using the guest driver for host version 16.1. Make sure you download the version that matches your host driver:
wget https://storage.googleapis.com/nvidia-drivers-us-public/GRID/vGPU16.1/NVIDIA-Linux-x86_64-535.104.05-grid.run

Once downloaded, make the file executable and install it using the following commands:

chmod +x NVIDIA-Linux-x86_64-535.104.05-grid.run
sudo ./NVIDIA-Linux-x86_64-535.104.05-grid.run --dkms

Replace <NVIDIA-Linux-x86_64-535.104.05-grid.run> with the actual name of the downloaded driver file.
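
To avoid typing the file name by hand, you can derive it from the URL. A small sketch that prints the three commands for a given URL without running them; driver_filename is a hypothetical helper, and the URL is the 16.1 example from above.

```shell
#!/bin/sh
# Strip everything up to the last slash to get the file name from a URL.
driver_filename() {
    printf '%s\n' "${1##*/}"
}

driver_url="https://storage.googleapis.com/nvidia-drivers-us-public/GRID/vGPU16.1/NVIDIA-Linux-x86_64-535.104.05-grid.run"
driver_file=$(driver_filename "$driver_url")

# Print the commands so they can be reviewed before running them.
echo "wget $driver_url"
echo "chmod +x $driver_file"
echo "sudo ./$driver_file --dkms"
```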

After the installation is complete, verify that the vGPU is running by issuing the following command: nvidia-smi

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  GRID RTX6000-4Q                On  | 00000000:01:00.0 Off |                  N/A |
| N/A   N/A    P8              N/A /  N/A |      4MiB /  4096MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

This will display the Nvidia System Management Interface and confirm that the vGPU is active and running properly.
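
For a quick check without the full table, nvidia-smi can print just the GPU name and memory as CSV. The sketch below reads that output and flags names that start with GRID, as in the output above. That prefix is an assumption based on my own driver version and may differ on other releases; report_vgpu is a name made up for this example.

```shell
#!/bin/sh
# Read "name, memory" CSV lines on stdin and report whether the name
# looks like a vGPU profile such as "GRID RTX6000-4Q".
report_vgpu() {
    while IFS=, read -r name memory; do
        memory=${memory# }      # drop the space after the comma
        case $name in
            GRID*) echo "vGPU detected: $name with $memory" ;;
            *)     echo "Not a GRID vGPU profile: $name" ;;
        esac
    done
}

if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.total --format=csv,noheader | report_vgpu
fi
```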

Windows

If you have a previous Nvidia driver installed, remove it completely using a tool like Display Driver Uninstaller (DDU) before proceeding.

I’m using the guest driver for host version 16.1. Make sure the guest driver you download matches your host driver version, then proceed with the installation.

Tips and Tricks

Script Arguments

The script can be launched with some additional parameters. These are

  • --debug
    • Will not suppress stdout/stderr messages. Output of commands will be displayed on screen. No debug.log file will be created.
  • --step
    • Will force the script to start at a particular step. For example, --step 2 will launch the script at step 2.
  • --url
    • Will use a custom URL to download the host vGPU driver. Must be in .run format. (For example: https://example.com/NVIDIA-Linux-x86_64-535.104.06-vgpu-kvm.run)
  • --file
    • Will use a custom file to install the host vGPU driver. Must be in .run format. (For example: NVIDIA-Linux-x86_64-535.104.06-vgpu-kvm.run)
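
For those curious how options like these are typically handled in a shell script, here is a generic sketch of a parser for the four arguments. It is an illustration, not a copy of the installer’s code; the variable names DEBUG, STEP, URL and FILE and the function parse_args are assumptions.

```shell
#!/bin/sh
# Hypothetical option parser; variable names are illustrative only.
parse_args() {
    DEBUG=false
    STEP=1
    URL=""
    FILE=""
    while [ $# -gt 0 ]; do
        case $1 in
            --debug)
                DEBUG=true
                ;;
            --step|--url|--file)
                # These options need a value after them.
                if [ $# -lt 2 ]; then
                    echo "Missing value for $1" >&2
                    return 1
                fi
                case $1 in
                    --step) STEP=$2 ;;
                    --url)  URL=$2 ;;
                    --file) FILE=$2 ;;
                esac
                shift
                ;;
            *)
                echo "Unknown argument: $1" >&2
                return 1
                ;;
        esac
        shift
    done
}

parse_args --debug --step 2
echo "debug=$DEBUG step=$STEP"
```

With the installer itself, the equivalent call would be ./proxmox-installer.sh --debug --step 2.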

When the --debug argument is omitted, all stdout/stderr messages will be written to the debug.log file. If you encounter any errors, review them by running:

cat debug.log
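
The behaviour described above boils down to a small wrapper around every command: show the output in debug mode, otherwise append it to the log. A minimal sketch of that pattern, with run_cmd, DEBUG and LOG_FILE as hypothetical names (the installer may implement it differently):

```shell
#!/bin/sh
DEBUG=${DEBUG:-false}
LOG_FILE=${LOG_FILE:-debug.log}

# Run a command; print its output in debug mode,
# otherwise append stdout and stderr to the log file.
run_cmd() {
    if [ "$DEBUG" = true ]; then
        "$@"
    else
        "$@" >> "$LOG_FILE" 2>&1
    fi
}
```

Calling run_cmd apt update with DEBUG=false keeps the terminal quiet and leaves the full output in debug.log.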

Credits

Big thanks to everyone involved in developing and maintaining this neat piece of software.

For additional support join the GPU Unlocking Discord server thanks to Krutav Shah

ToDo

  • When using --url, download vGPU zip files and extract them, not just .run files
  • Check /etc/modules and all files in /etc/modprobe.d/ for conflicting lines
  • When using two identical GPUs, exclude one of them using IOMMU groups

Changelog

  • 2023-11-15: Even more bug fixes, added checks, removed step 3 and fixed licensing
  • 2023-11-9: Bug fixes and typos
  • 2023-11-2: Initial upload of the script

Troubleshoot

If you run into problems with the installation, I advise running the script and selecting “Remove vGPU Installation”. Then reboot the Proxmox server and start over.

If that didn’t help and you still encounter problems, please help me refine the script by posting your debug.log to pastebin.com and sharing the URL in the comment section, or by mailing me directly using the form on the About Me page.
