Intro
Running a vGPU on Proxmox has become remarkably easy. The few steps involved (verifying the Proxmox version, downloading and installing the appropriate drivers, even checking for a compatible GPU) can all be rolled into a single shell script. That's exactly what I did: I wrote a script to simplify the process.
Proxmox
Whether you're running Proxmox 7.4 or 8.x, this script will automatically check for and install all necessary packages, download and build additional components, and edit configuration files, all on its own.
Check GPU
All tests have been conducted on an Nvidia 1060 6GB and an Nvidia 2070 Super 8GB, running on Proxmox 7.4 and up (8.x). The hardware requirements remain the same as in previous versions of vgpu_unlock, and the more VRAM your GPU has onboard, the better.
Before doing anything, let's check whether your GPU is compatible. Type in the chip your GPU uses (for example, 1060 or 2080).
If that results in a compatible GPU, we can proceed.
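As a rough sketch of what such a check can look like (this is not the installer's actual code; the chip-family mapping here is my own simplification), a small Bash lookup:

```shell
#!/bin/bash
# Hypothetical compatibility lookup -- NOT the installer's actual code.
# vgpu_unlock targets consumer Maxwell/Pascal/Turing cards; the 3000 and
# 4000 series are not supported.
gpu_supported() {
  case "$1" in
    9*|10*|16*|20*) echo "likely supported (Maxwell/Pascal/Turing)" ;;
    30*|40*)        echo "not supported (Ampere/Ada)" ;;
    *)              echo "unknown chip" ;;
  esac
}

gpu_supported 1060   # likely supported (Maxwell/Pascal/Turing)
gpu_supported 4090   # not supported (Ampere/Ada)
```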
Step 1
The initial step, which you need to perform on your own (if you haven't already), is to enable VT-d/IOMMU in the BIOS. On Intel systems look for the term VT-d; on AMD systems look for IOMMU. Enable this feature, then save and exit the BIOS.
When that's done, boot up the server, log in to the Proxmox host using SSH, and download the script:
curl -O https://raw.githubusercontent.com/wvthoog/proxmox-vgpu-installer/main/proxmox-installer.sh && chmod +x proxmox-installer.sh
And launch it
./proxmox-installer.sh
Yep, that’s right, a single Bash script designed to handle everything. It is divided into two steps, with the Proxmox server requiring a reboot between each step. Let’s begin with Step 1.
When you first launch the script, it will display the base menu. From here, you can select the option that fits your requirements:
- New vGPU Installation: Select this option if you don’t have any Nvidia (vGPU) drivers installed.
- Upgrade vGPU Installation: Select this option if you have a previous Nvidia (vGPU) driver installed and want to upgrade it.
- Remove vGPU Installation: Select this option if you want to remove a previous Nvidia (vGPU) driver from your system.
- License vGPU: Select this option if you want to license the vGPU using FastAPI-DLS (ignore for now)
For demonstration purposes I've chosen option 1: "New vGPU Installation".
Let the script proceed with updating the system, downloading repositories, building vgpu_unlock-rs, and making changes to various configuration files. Once the process is complete, press “y” to reboot the system.
Step 2
After the server has finished rebooting, log in once more using SSH. Run the script again using the same command as in Step 1.
./proxmox-installer.sh
A configuration file (config.txt) has been automatically created to keep track of the current step.
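For illustration, step tracking like this can be done with just a few lines of Bash (a sketch; the real config.txt format may differ):

```shell
#!/bin/bash
# Sketch of two-phase step tracking across a reboot; the real script's
# config.txt format may differ. Uses a temp dir so nothing is clobbered.
CONFIG_FILE="$(mktemp -d)/config.txt"

save_step() { echo "step=$1" > "$CONFIG_FILE"; }

load_step() {
  # Default to step 1 when no config file exists yet
  if [ -f "$CONFIG_FILE" ]; then
    sed -n 's/^step=//p' "$CONFIG_FILE"
  else
    echo 1
  fi
}

echo "Fresh run starts at step $(load_step)"   # Fresh run starts at step 1
save_step 2                                    # ...reboot happens here...
echo "Resuming at step $(load_step)"           # Resuming at step 2
```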
In this step, the script checks whether VT-d or IOMMU is properly loaded and verifies the presence of an Nvidia card in your system. It then displays a menu allowing you to choose which driver version to download. For Proxmox 8.x, download version 16.x; for Proxmox 7.x, download either version 16.x or 15.x.
The script will download the vGPU host driver from a Megadownload repository I've found and patch it. It will then install and load the patched driver. Finally, the script will present you with two URLs: one for Windows and another for Linux. These are the GRID (guest) drivers for your VMs. Write down or copy both of these URLs; you'll need them later to install the Nvidia drivers in your VMs.
And that's it: the host vGPU driver is now installed, concluding the installation on the server side. If there were any errors, please refer to the debug.log file in the directory from which you launched the script:
cat debug.log
We can now proceed to add a vGPU to a VM.
Licensing
I'll update this part once I'm satisfied the script handles the installation of FastAPI-DLS correctly (it can't be installed on Proxmox 7, since that runs Debian Bullseye).
VM Install
At the last step of the installation process the script instructs you to issue the mdevctl types command. This command will present you with all the different vGPU types you have at your disposal.
The mdev type you choose depends largely (but not entirely) on the amount of VRAM you have available. For example, if you have an Nvidia 2070 Super with 8GB of VRAM, you can split it into these Q profiles:
- nvidia-259 offers 2x 4GB
- nvidia-257 offers 4x 2GB
- nvidia-256 offers 8x 1GB
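These numbers are simply the card's total VRAM divided by the per-instance framebuffer, which is easy to verify:

```shell
#!/bin/bash
# Sanity-check the profile table above: instance count = total VRAM
# divided by the per-instance framebuffer (both in MB).
instances() {
  local total_mb=$1 per_vm_mb=$2
  echo $(( total_mb / per_vm_mb ))
}

echo "$(instances 8192 4096)x 4GB"   # 2x 4GB (nvidia-259)
echo "$(instances 8192 2048)x 2GB"   # 4x 2GB (nvidia-257)
echo "$(instances 8192 1024)x 1GB"   # 8x 1GB (nvidia-256)
```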
Choose the profile that suits your needs and then follow these steps in the Proxmox web GUI:
- Click on the VM you want to assign a vGPU to
- Click on the Hardware tab
- At the top click on Add and select PCI Device
- Select Raw Device and select the Nvidia GPU (should say that it’s Mediated Device)
- Now select the desired profile in MDev Type
- Click Add to assign it to your VM
And you’re done.
The vGPU is now assigned to the VM, and you’re ready to launch the VM and install the Nvidia GRID (guest) drivers.
Linux
To install the guest driver, first, update the system.
sudo apt update && sudo apt dist-upgrade
After updating the system, proceed to install the kernel headers, which are required for the Nvidia driver installation.
sudo apt install linux-headers-$(uname -r)
Next, download the Nvidia driver using the URL you copied in Step 2 of the installation process on the Proxmox side:
wget https://storage.googleapis.com/nvidia-drivers-us-public/GRID/vGPU16.1/NVIDIA-Linux-x86_64-535.104.05-grid.run
Once downloaded, make the file executable and install it using the following commands:
chmod +x NVIDIA-Linux-x86_64-535.104.05-grid.run
sudo ./NVIDIA-Linux-x86_64-535.104.05-grid.run --dkms
Replace NVIDIA-Linux-x86_64-535.104.05-grid.run with the actual name of the downloaded driver file.
After the installation is complete, verify that the vGPU is running by issuing the following command:
nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05 Driver Version: 535.104.05 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 GRID RTX6000-4Q On | 00000000:01:00.0 Off | N/A |
| N/A N/A P8 N/A / N/A | 4MiB / 4096MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
This will display the Nvidia System Management Interface and confirm that the vGPU is active and running properly.
Windows
If you have a previous Nvidia driver installed, remove it completely using a tool like Display Driver Uninstaller (DDU) before proceeding.
Download the correct driver and proceed with the installation.
Tips and Tricks
Script Arguments
The script can be launched with some additional parameters. These are
- --debug
- Does not suppress stdout/stderr messages; the output of commands is displayed on screen. No debug.log file is created.
- --step
- Forces the script to start at a particular step. For example, --step 2 launches the script at step 2.
- --url
- Uses a custom URL to download the host vGPU driver. Must be in .run format. (For example: https://example.com/NVIDIA-Linux-x86_64-535.104.06-vgpu-kvm.run)
- --file
- Uses a custom file to install the host vGPU driver. Must be in .run format. (For example: NVIDIA-Linux-x86_64-535.104.06-vgpu-kvm.run)
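For --url and --file, the ".run format" requirement boils down to a simple extension check. A hypothetical sketch (the real script's argument parsing may differ):

```shell
#!/bin/bash
# Hypothetical sketch of the ".run format" requirement behind --url and
# --file; not the real script's parsing.
valid_run_source() {
  case "$1" in
    *.run) return 0 ;;
    *)     echo "error: expected a .run file, got: $1" >&2; return 1 ;;
  esac
}

valid_run_source "NVIDIA-Linux-x86_64-535.104.06-vgpu-kvm.run" && echo "accepted"
valid_run_source "driver.zip" 2>/dev/null || echo "rejected"
```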
When the --debug argument is omitted, all stdout/stderr messages are written to the debug.log file. If you encounter any errors, review them by running
cat debug.log
Credits
Big thanks to everyone involved in developing and maintaining this neat piece of software.
- DualCoder for the original vgpu_unlock
- mbilker for the fast Rust version of vgpu_unlock
- PolloLoco for hosting all the patches and his excellent guide
For additional support join the GPU Unlocking Discord server thanks to Krutav Shah
ToDo
- When using --url, download vGPU zip files and extract them, not just .run files
- Check /etc/modules and all files in /etc/modprobe.d/ for conflicting lines
- When using two identical GPUs, exclude one of them using IOMMU groups
- Always write CONFIG_FILE to $HOME
Changelog
- 2023-11-15: Even more bug fixes, added checks, removed step 3 and fixed licensing
- 2023-11-9: Bug fixes and typos
- 2023-11-2: Initial upload of the script
Troubleshoot
When encountering problems with the installation, I advise running the script and selecting "Remove vGPU Installation". Reboot the Proxmox server and start over.
If that didn't help and you still encounter problems, please help me refine the script further by posting your debug.log to pastebin.com and sharing the URL in the comment section, or by mailing me directly using the form on the About Me page.
Thanks for a great guide! I followed it and the installation worked, but I get an empty return on mdevctl types.
What could cause this?
I did not license my system, should I?
Then something went wrong; curious to know what it is. Can you post a pastebin.com of your debug.log file? It's in the directory from which you launched proxmox-installer.sh.
https://pastebin.com/aUqkYDfu
I blacklisted the driver. The GPU is in use by a VM at the moment, but when I tried to detach it from the VM and run the script, it also did not work.
Thanks for helping!
What does /var/log/nvidia-installer.log report? I thought I'd caught all of the exceptions, but apparently not 😉
Perhaps you can mail me using the form on the About Me page.
My problem is solved now! With some amazing help from Wim himself we upgraded Proxmox to version 8 and we made it all work in both Windows and Linux VMs.
Again thanks for the help!
Glad to have helped. Made me aware of some problems the script currently has that need to be changed
Helps a lot. Can this script be used on multiple GPUs? If so, how?
I have two GPUs. How do I set the config to make the script run successfully?
I've seen the v2 page; can v3 do the same "multiple GPUs" thing mentioned in v2?
Thanks!!!!
It's a trade-off between extra functionality and making the script too big. I initially added a function to configure the vGPU through TOML; that doubled the script size. But I agree, checking for multiple GPUs would be a nice feature. I'll add that, along with checking for conflicting lines in all config files (/etc/modules and /etc/modprobe.d), next.
Can’t wait to see that.
And is there any way to use this script on multiple GPUs right now (such as changing the config)?
I've tried to modify /etc/modprobe.d/vfio.conf and then run the script, but it didn't work.
Does the script virtualize all GPUs?
You need to exclude one of them. List all GPUs:
lspci | grep -i nvidia
Select the one to exclude by probing the PCI ID of that bus (first 4 characters)
lspci -n -s 2b:00
Copy those PCI IDs and edit /etc/modprobe.d/vfio.conf like this:
options vfio-pci ids=10de:1c03,10de:10f1
Update initramfs
update-initramfs -u
Reboot
Thanks for the help!!! Though the solution you gave only fits different GPUs, it inspired me to find a way to isolate two identical GPUs. Now I've successfully isolated one RTX 2080 Ti for passthrough and another for vGPU. You did save my life!!!!
Glad to have helped. Could you share your solution? It could be useful for when I'm updating the script.
the GPU that I want to isolate is in iommu group 15, so list group 15
sudo dmesg | grep "iommu group 15"
[ 0.789601] pci 0000:80:03.0: Adding to iommu group 15
[ 0.789652] pci 0000:80:03.1: Adding to iommu group 15
[ 0.790038] pci 0000:81:00.0: Adding to iommu group 15
[ 0.790050] pci 0000:81:00.1: Adding to iommu group 15
[ 0.790061] pci 0000:81:00.2: Adding to iommu group 15
[ 0.790072] pci 0000:81:00.3: Adding to iommu group 15
then
echo "vfio-pci" > /sys/bus/pci/devices/0000:81:00.0/driver_override
echo "vfio-pci" > /sys/bus/pci/devices/0000:81:00.1/driver_override
echo "vfio-pci" > /sys/bus/pci/devices/0000:81:00.2/driver_override
echo "vfio-pci" > /sys/bus/pci/devices/0000:81:00.3/driver_override
echo "vfio-pci" > /sys/bus/pci/devices/0000:80:03.0/driver_override
echo "vfio-pci" > /sys/bus/pci/devices/0000:80:03.1/driver_override
echo "0000:81:00.0" > /sys/bus/pci/drivers/vfio-pci/bind
echo "0000:81:00.1" > /sys/bus/pci/drivers/vfio-pci/bind
echo "0000:81:00.2" > /sys/bus/pci/drivers/vfio-pci/bind
echo "0000:81:00.3" > /sys/bus/pci/drivers/vfio-pci/bind
echo "0000:80:03.0" > /sys/bus/pci/drivers/vfio-pci/bind
echo "0000:80:03.1" > /sys/bus/pci/drivers/vfio-pci/bind
update-initramfs -u
reboot
You can see more information here: https://wiki.archlinuxcn.org/wiki/PCI_passthrough_via_OVMF
Nice, will try to incorporate that into the script.
Hi, thanks so much for this guide!
I’m running into this error when I attempt to start up a windows 11 VM with the vGPU attached:
```
swtpm_setup: Not overwriting existing state file.
kvm: -device vfio-pci,sysfsdev=/sys/bus/mdev/devices/00000000-0000-0000-0000-000000000102,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0: vfio 00000000-0000-0000-0000-000000000102: error getting device from group 18: Input/output error
Verify all devices in group 18 are bound to vfio- or pci-stub and not already in use
stopping swtpm instance (pid 1479) due to QEMU startup error
TASK ERROR: start failed: QEMU exited with code 1
```
Any advice on how to proceed?
Does mdevctl types work?
And can you post the config file of your VM in /etc/pve/qemu-server/?
mdevctl types works, I get this output:
```
root@proxmox:~# mdevctl types
0000:01:00.0
nvidia-46
Available instances: 24
Device API: vfio-pci
Name: GRID P40-1Q
Description: num_heads=4, frl_config=60, framebuffer=1024M, max_resolution=5120×2880, max_instance=24
nvidia-47
Available instances: 12
Device API: vfio-pci
Name: GRID P40-2Q
Description: num_heads=4, frl_config=60, framebuffer=2048M, max_resolution=7680×4320, max_instance=12
nvidia-48
Available instances: 8
Device API: vfio-pci
Name: GRID P40-3Q
Description: num_heads=4, frl_config=60, framebuffer=3072M, max_resolution=7680×4320, max_instance=8
nvidia-49
Available instances: 6
Device API: vfio-pci
Name: GRID P40-4Q
```
Here’s the config file:
```
balloon: 6144
bios: ovmf
boot: order=scsi0;net0
cores: 4
cpu: host
cpuunits: 1024
efidisk0: local-lvm:vm-102-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:01:00.0,mdev=nvidia-60,pcie=1,x-vga=1
machine: pc-q35-8.0
memory: 8192
meta: creation-qemu=8.0.2,ctime=1700901198
name: windows
net0: virtio=3A:FF:78:C4:84:60,bridge=vmbr0,firewall=1
numa: 1
ostype: win11
scsi0: local-lvm:vm-102-disk-1,cache=writeback,iothread=1,replicate=0,size=120G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=a4980977-8415-4b7b-96bd-5e124a73f3db
sockets: 1
tpmstate0: local-lvm:vm-102-disk-2,size=4M,version=v2.0
usb0: host=0c45:5011
usb1: host=046d:c077
vga: none
vmgenid: 50083200-a774-4715-ab07-0d77aa6844e6
```
Stop your VM and run: journalctl -u nvidia-vgpud.service -f
Then start your VM again, and post the output.
Have you created a custom profile for that VM ? (TOML file)
Just to clarify, the VM wasn’t able to start at all (so I didn’t have to stop it). It gives this status in Proxmox: “stopped: start failed: QEMU exited with code 1”
I haven’t created a custom profile — “/etc/vgpu_unlock/profile_override.toml” is empty.
Here’s the journalctl output
```
root@proxmox:~# journalctl -u nvidia-vgpud.service -f
Dec 05 20:24:05 proxmox nvidia-vgpud[853]: BAR1 Length: 0x100
Dec 05 20:24:05 proxmox nvidia-vgpud[853]: Frame Rate Limiter enabled: 0x1
Dec 05 20:24:05 proxmox nvidia-vgpud[853]: Number of Displays: 4
Dec 05 20:24:05 proxmox nvidia-vgpud[853]: Max pixels: 16384000
Dec 05 20:24:05 proxmox nvidia-vgpud[853]: Display: width 5120, height 2880
Dec 05 20:24:05 proxmox nvidia-vgpud[853]: License: GRID-Virtual-PC,2.0;Quadro-Virtual-DWS,5.0;GRID-Virtual-WS,2.0;GRID-Virtual-WS-Ext,2.0
Dec 05 20:24:05 proxmox nvidia-vgpud[853]: PID file unlocked.
Dec 05 20:24:05 proxmox nvidia-vgpud[853]: PID file closed.
Dec 05 20:24:05 proxmox nvidia-vgpud[853]: Shutdown (853)
Dec 05 20:24:05 proxmox systemd[1]: nvidia-vgpud.service: Deactivated successfully.
```
I left that running then tried starting the VM, but it looks like it gives the same error and failed to start again.
Here’s the output from the VM:
```
swtpm_setup: Not overwriting existing state file.
kvm: -device vfio-pci,sysfsdev=/sys/bus/mdev/devices/00000000-0000-0000-0000-000000000102,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0: vfio 00000000-0000-0000-0000-000000000102: error getting device from group 18: Input/output error
Verify all devices in group 18 are bound to vfio- or pci-stub and not already in use
stopping swtpm instance (pid 4367) due to QEMU startup error
TASK ERROR: start failed: QEMU exited with code 1
```
Hmmm... strange. Everything you've posted so far looks OK. Can you contact me directly using the contact form on the About Me page?
Thank you so much for this. Best Proxmox vGPU guide on the internet bar none in such a lovely script.
Nice script, worked flawlessly. Big thanks. I needed it after updating proxmox to version 8.
Thank you. This guide and install script is awesome. I have a quadro p2000. It works fine as vgpu. Default is non-SR-IOV mode in host mode. Is it matter on performance? Is it possible to enable?
Could you please elaborate? I haven't heard about a non-SR-IOV mode, to be honest.
This is a part of my report. At bottom you see host mode is non SR-IOV.
https://freeimage.host/i/J5rA9LP
But the Windows VM via Parsec is working fine, despite the nagging about the Nvidia licence. The Nvidia driver notified me that without a licence, performance is limited. Whatever, I am very grateful. Thank you.
Licensing should also work now, just try it.
Licensing works. Thank you. How could I set expiring period in fastAPI? Or just run it again when expired?
It's valid for 6 months. Just run it again to renew.
Hi. May I ask again? What do you think? The P106-100 gpu card could work with this patch? That is almost same as a 1060 gpu. And soooo cheap these cards.
The card has a different PCI ID (10DE:1C07 instead of something like 10DE:1C06 for a GTX 1060 6GB), so the mapping in vgpu_unlock-rs would not work. Maybe editing the code could make this work, since it's a compatible GPU chip.
Hello
The script is simple, so I’m using it very well.
But I have one problem.
The TU104-based EVGA 2080 Rev.a 8GB will not be vgpu_unlock.
Can this problem be solved by any chance?
Have a nice day!
As in doesn’t work ?
Did the script give any errors ? (check debug.log)
```
ERROR: An error occurred while performing the step: “Building kernel modules”. See /var/log/nvidia-installer.log for details.
ERROR: An error occurred while performing the step: “Checking to see whether the nvidia kernel module was successfully built”. See /var/log/nvidia-installer.log for details.
ERROR: The nvidia kernel module was not created.
ERROR: Installation has failed. Please see the file ‘/var/log/nvidia-installer.log’ for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at http://www.nvidia.com.
Failed to start nvidia-vgpud.service: Unit nvidia-vgpud.service not found.
Failed to start nvidia-vgpu-mgr.service: Unit nvidia-vgpu-mgr.service not found.
```
I tested two cards, 1660s and 2080.
After removing the 2080 card, vgpu was installed normally.
You probably have to pass one GPU through to one VM directly and use the other for vGPU purposes.
See how to set it up here, and open the drop-down "Multiple GPUs".
The method you told me works fine.
Not just 2080, My 3060 crap is now available.
Thank you!!
Have a Nice day!
Simple and useful. Thank you.
Is such a thing possible for the 3000 series, or is there a possibility? I'm new to this type of KVM system.
I also saw a post like this "https://github.com/Kaydax/proxmox-ve-anti-detection?tab=readme-ov-file" on GitHub, and I'm sure it would be very useful in games. Could you combine this with your own guide and create a fork?
(Created by Google Translate.)
This patch is for games that detect whether they're running in a VM, correct?
Then that's something you should implement on your own; this script is purely meant for getting the vGPU running in Proxmox.
I'm new to the Proxmox universe. I hope someone like you, who gives nice and simple explanations, can explain this https://github.com/zhaodice/proxmox-ve-anti-detection post in a simpler way.
(Created by Google Translate.)
Sure, I can do that. What this patch does is rewrite the device IDs a hypervisor (Proxmox) normally assigns to a VM. A game will apparently scan for these device IDs, know that it is running in a VM, and trigger a warning. With this patch you can circumvent that.
I don’t currently have an install of Proxmox to test this on but I had a few basic verifications I was wondering if someone could do.
#1: This will only split the GPU 2-way, 4-way, or 8-way, correct? (You can't do 3 users, like a 6GB card split into 3x 2GB sessions.) Is there a minimum RAM/power, or could even a 3GB 1060 be split 8 ways for non-demanding use?
#2: The system has to be rebooted every 24 hours (another page on this talked about the unlicensed GRID driver only working for a day at a time).
#3: This still only works to split one GPU in the system, even if you were to have more.
Thanks in advance if anyone has any input as I may not get back to this page for a few weeks until I have a working Proxmox first 🙂
#1: Yes it can. It will split up the card based on the amount of VRAM you have available and the profile you choose, so 6GB of VRAM can be split into 3x 2GB.
#2: That's right, unless you use Oscar Krause's method; then you can use it for 90 days.
#3: Yes, only one GPU is supported in vGPU mode. The rest of your GPUs have to be passed through directly to the VMs.
Sorry to bother, I just don't yet have a computer to experiment with. 🙂 What are the possible profiles? Is it basically 2- to 8-way splitting as options, or does anything go above 8? Presumably it's always the same amount of RAM (and % of GPU time) for each session, no unequal splits?
It goes above 8 as well, like 12 or 24. But you can also mix and match using a custom TOML file. So let's say 8GB is available; then you can have one 4GB and two 2GB profiles.
Thank you for your contributions my friend.
I’m new to Proxmox. I tried it with a 1050 Ti GPU and the script worked fine.
How can I edit the profiles? Below are the profiles created by the script, how can I add new ones? For example, I want to give memory like 256 MB and 512 MB.
root@pov1:~# mdevctl types
0000:01:00.0
nvidia-58
Available instances: 0
Device API: vfio-pci
Name: GRID P40-6A
Description: num_heads=1, frl_config=60, framebuffer=6144M, max_resolution=1280×1024, max_instance=4
nvidia-59
Available instances: 0
Device API: vfio-pci
Name: GRID P40-8A
Description: num_heads=1, frl_config=60, framebuffer=8192M, max_resolution=1280×1024, max_instance=3
nvidia-60
Available instances: 0
Device API: vfio-pci
Name: GRID P40-12A
Description: num_heads=1, frl_config=60, framebuffer=12288M, max_resolution=1280×1024, max_instance=2
nvidia-61
Available instances: 0
Device API: vfio-pci
Name: GRID P40-24A
Description: num_heads=1, frl_config=60, framebuffer=24576M, max_resolution=1280×1024, max_instance=1
nvidia-62
Available instances: 23
Device API: vfio-pci
Name: GRID P40-1B
Description: num_heads=4, frl_config=45, framebuffer=1024M, max_resolution=5120×2880, max_instance=24
I guess you’re going to need to create two custom profiles using toml. Located in
/etc/vgpu_unlock/profile_override.toml
Where you have to override two default profiles, like for example these two:
[profile.nvidia-48] # 384MB
num_displays = 1
display_width = 1920
display_height = 1080
max_pixels = 2073600
framebuffer = 0x14000000
framebuffer_reservation = 0x4000000
[profile.nvidia-49] # 512MB
num_displays = 1
display_width = 1920
display_height = 1080
max_pixels = 2073600
framebuffer = 0x1A000000
framebuffer_reservation = 0x6000000
Then restart the nvidia-vgpu services:
systemctl restart nvidia-vgpud.service
systemctl restart nvidia-vgpu-mgr.service
And assign these profiles to your VMs.
I haven't tested this, but it 'should' work. By the way, you can't go lower than 384MB from what I've read.
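A quick way to double-check such overrides is to confirm that framebuffer plus framebuffer_reservation adds up to the intended profile size:

```shell
#!/bin/bash
# Check that framebuffer + framebuffer_reservation in a profile override
# adds up to the intended profile size (hex values from the TOML above).
profile_total_mb() {
  echo $(( ($1 + $2) / 1024 / 1024 ))
}

echo "$(profile_total_mb $((0x14000000)) $((0x4000000)))MB"   # 384MB
echo "$(profile_total_mb $((0x1A000000)) $((0x6000000)))MB"   # 512MB
```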
It works, I’m grateful.
nano /etc/vgpu_unlock/profile_override.toml
——————————————————————
[profile.nvidia-48] # 384MB
num_displays = 1
display_width = 1920
display_height = 1080
max_pixels = 2073600
framebuffer = 0x14000000
framebuffer_reservation = 0x4000000
[profile.nvidia-49] # 512MB
num_displays = 1
display_width = 1920
display_height = 1080
max_pixels = 2073600
framebuffer = 0x1A000000
framebuffer_reservation = 0x6000000
————————————————————————-
systemctl restart nvidia-vgpud.service
systemctl restart nvidia-vgpu-mgr.service
Hi, my host installation went successfully, but I have this error in the VM (tried both Debian Bookworm and Ubuntu 22.04):
make[2]: Entering directory ‘/usr/src/linux-headers-6.1.0-18-amd64’
MODPOST /tmp/selfgz609/NVIDIA-Linux-x86_64-535.104.05-grid/kernel/Module.symvers
ERROR: modpost: GPL-incompatible module nvidia.ko uses GPL-only symbol ‘__rcu_read_lock’
ERROR: modpost: GPL-incompatible module nvidia.ko uses GPL-only symbol ‘__rcu_read_unlock’
make[3]: *** [/usr/src/linux-headers-6.1.0-18-common/scripts/Makefile.modpost:126: /tmp/selfgz609/NVIDIA-Linux-x86_64-535.104.05-grid/kernel/Module.symvers] Error 1
That seems to be a bug in the (guest OS) driver, which will need to be patched.
Give me some time to simulate your VM environment and I'll get a patch ready.
Ok, thank you. It was from clean debian install with only ssh server and build-essential installed manually so it will be easy to reproduce. Also ubuntu was official minimal desktop install with added build-essential packages
Before trying to patch it myself, could you try one of the newer vGPU drivers in your VM from Google’s repository
I’ve tried every driver from 15.0 to 16.3 and there is the same error all the time
I've successfully installed on an older kernel, 5.15.0-94, on Ubuntu 20.04. It seems like the problem is with newer kernels like 6.1 on Debian.
I have something interesting: I've successfully installed driver 535.154.05 on an openSUSE Tumbleweed VM which has kernel 6.7.4. I will try Fedora and see what happens, but I see that Debian 12 and Ubuntu 22.04 still don't work. So it's not related to the newest kernel.
That was my initial thought, that it would be related to kernel 6.1+
Preparing an Ubuntu 22.04 tonight, and will report my findings.
Well, the solution was easier than I thought. Just use gcc-12 instead of 11, like so:
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 12
then run:
sudo ./NVIDIA-Linux-x86_64-535.104.05-grid.run --dkms
Hey,
first thanks for your script – it worked like a charm for my 1080TI. I just got my other server up and running (4090). Installation went through without any errors. However, my mdevctl gives me no output. Any idea?
The 3000 and 4000 series are not supported by vgpu_unlock; only 2000 and below.
Oh, dang – I see. And there’s no way around it? Did some research, but I am definitely not into that topic as you are. Same probably applies to A6000 then, right? Have the exact same behaviour there.
No, the A6000 is already vGPU capable and does not need patching. So it would work natively.
Hi,
Thanks a lot for the script and your work 🙂
All went well with my Proxmox 8.1.10 and a Tesla P4 card (vGPU driver 535.104.06). I can create two Windows 10 VM with nvidia-47 profiles (GRID-P40-2Q) and 573.13 Windows driver.
I have installed the licence server using your script, the token is in the right directory on my Windows machines (C:\Program Files\NVIDIA Corporation\vGPU Licensing\ClientConfigToken\) but the “& nvidia-smi -q | Select-String “License”” returns the following message:
License Status : Unlicensed (Restricted)
The Log.NVDisplay.Container.exe log file is full of “NLS initialized” messages.
Any clue ?
Thanks a lot
I think you may have a driver mismatch. You have 16.1 (535.104.06) installed on Proxmox, and you should install 537.13 in your Windows VM from here
Otherwise take a look here for some minor debugging
Hello
Sorry that’s a typo: i have the 537.13 version in my Windows 10 VM (I installed the same version as the one in your link).
Will try with a Linux VM and report here.
Thanks
Guillaume
Tried from scratch with a new Proxmox 8.1.10 build with the same Tesla P4 (no need for vGPU unlock, I guess, as that's a supported vGPU card). I did a manual installation, without your script, of the 535.154.02 (16.3) host driver using the Proxmox tutorial.
nvidia-smi is reporting the Tesla P4, i can add a nvidia-65 mDev profile to my Windows 10 VM (GRID-P4-4Q) and install the 538.15 Windows drivers as stated here:
https://git.collinwebdesigns.de/oscar.krause/fastapi-dls#setup-client
Then I installed a FastAPI-DLS server on an Ubuntu 23.10 machine, downloaded the client token, and restarted the NvContainerLocalSystem service. After some time I got the "Licensed (Expiry: 2024-7-5 13:38:38 GMT)" message, so I guess that's OK!
I don't know what the problem was when using the script ...
Time to try the licence server with my VMware homelab! I have to find a P100 too, as the P4 is very limited with only 8GB of VRAM.
Thanks a lot
Guillaume
Very nice, Guillaume. I am working on a new version of the script, but due to time constraints it hasn't been released yet. Will do so soon. Anyway, I think FastAPI-DLS is the way to go for small non-commercial homelab setups.
Thanks a lot for your work and your help
One last question, please. In my VM my display adapter is detected as a GRID P4-4Q, so I can't install GeForce Experience or Quadro Experience. And as Moonlight requires it, I am screwed ... Any way to spoof it to a Quadro?
Regards
Guillaume
Maybe this still works
https://wvthoog.nl/proxmox-7-vgpu-v2/#Assign_a_spoofed_Mdev_through_CLI
I successfully used the script (v16.1) to split my Tesla P4 card and shared it between two Win10 VMs. I also have an Ubuntu VM running Ollama (webui), but I'm unable to get the applications to use the installed Linux driver. The Linux driver seems to be installed correctly, as 'nvidia-smi' shows the expected output.
The 'Ollama + webui' applications run fine using the CPUs, but when I try to create the container with Nvidia GPU settings, it gives the error 'cannot find suitable gpu'. Does anyone know if it's even possible to use a vGPU for this?
That should be possible. I'm running LM Studio and TabbyML on an Ubuntu server as well (with an 8GB vGPU) and they work perfectly fine.
Hi Guillaume,
did you finally get it to work with Moonlight?
I also have the same problem: the GeForce or Quadro Experience software doesn't like the GRID driver.
Do I need to spoof it?
Will an A2000 in the host do it out of the box?
Will an A2000 card also be shown as an A2000 inside the VM?
Is the limitation to one vGPU a limitation of the script, or of the tech? I was hoping to have multiple P4s in a dual-socket 7910 and deploy a bunch of gaming VMs.
That's a limitation of the tech (vgpu_unlock) behind the script; only one GPU is supported. But your GPUs (P4) are natively supported by the vGPU driver (without the need for vgpu_unlock), so just install the driver and add licensing and you're good to go on all of the P4s.
Hi, Wim van ‘t Hoog. Will this also work with multiple P100s? Thanks
Yes, that card is also natively supported by the vGPU driver (without patching). Multiple P100 cards are supported at once; just add licensing and you’re set.
Thank you for putting this all together! I can also see that you are very ‘hands-on’ helpful. This is a great script and I also really like your ideas for future licensing and updates.
Is it possible to update the Script with Grid 17.0?
Can’t wait!
new proxmox, new ubuntu, new vgpu?
Hello,
I have trouble installing the patched Nvidia driver on Proxmox 8 with kernel 6.5.
make[3]: *** [scripts/Makefile.build:251: /tmp/selfgz696542/NVIDIA-Linux-x86_64-535.104.06-vgpu-kvm-custom/kernel/nvidia/nvlink_linux.o] Error 1
In file included from <command-line>:
././include/linux/kconfig.h:5:10: fatal error: generated/autoconf.h: No such file or directory
    5 | #include <generated/autoconf.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[3]: *** [scripts/Makefile.build:251: /tmp/selfgz696542/NVIDIA-Linux-x86_64-535.104.06-vgpu-kvm-custom/kernel/nvidia/procfs_nvswitch.o] Error 1
make[3]: *** [scripts/Makefile.build:251: /tmp/selfgz696542/NVIDIA-Linux-x86_64-535.104.06-vgpu-kvm-custom/kernel/nvidia/i2c_nvswitch.o] Error 1
make[3]: Target '/tmp/selfgz696542/NVIDIA-Linux-x86_64-535.104.06-vgpu-kvm-custom/kernel/' not remade because of errors.
make[2]: *** [/usr/src/linux-headers-6.5.13-5-pve/Makefile:2039: /tmp/selfgz696542/NVIDIA-Linux-x86_64-535.104.06-vgpu-kvm-custom/kernel] Error 2
make[2]: Target 'modules' not remade because of errors.
make[1]: *** [Makefile:234: __sub-make] Error 2
make[1]: Target 'modules' not remade because of errors.
make[1]: Leaving directory '/usr/src/linux-headers-6.5.13-5-pve'
make: *** [Makefile:82: modules] Error 2
ERROR: The nvidia kernel module was not created.
Any idea how to fix that?
I think that is because you are missing the kernel headers. Install those and the problem should go away:
apt install pve-headers-`uname -r`
I think the issue might have to do with the NVIDIA driver and pve-manager versions. According to the docs, Proxmox 8.1.4 is validated against 535.154.02, while the installer script at the moment only offers options up to 535.104.06. I’m encountering issues running Proxmox 8.2.2.
Running the install for pve-headers returns that they are already installed, but as ‘proxmox-headers-xx’ instead of ‘pve-headers-xx’, so maybe the package name is hard-coded somewhere as ‘pve’ and the naming difference is causing the error.
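A quick way to check which naming variant is actually installed on a given node (purely diagnostic; note that on newer Proxmox 8 releases the `pve-headers-*` names are, as far as I know, transitional packages pointing at `proxmox-headers-*`, so both can show up):

```shell
# List any kernel-header package matching the running kernel,
# regardless of whether it is named pve-headers or proxmox-headers
dpkg -l | grep -E "(pve|proxmox)-headers-$(uname -r)"

# Confirm the headers directory the NVIDIA installer actually builds against exists
ls -d "/usr/src/linux-headers-$(uname -r)" 2>/dev/null || echo "headers missing"
```
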
Will take that into account when releasing the new version of the script. It’s taking a bit longer than expected due to the amount of changes I’ve made.
apt reinstall pve-headers-`uname -r` did the trick; now it seems to work. But I can’t get ffmpeg to use CUDA inside a VM.
[AVHWDeviceContext @ 0x5611dfe93c00] cu->cuCtxCreate(&hwctx->cuda_ctx, desired_flags, hwctx->internal->cuda_device) failed -> CUDA_ERROR_NOT_SUPPORTED: operation not supported
Hi.
Could you help me? I had a working PVE setup with a Quadro P2000 using this vGPU install. I bought a 2080 Ti, uninstalled vGPU in PVE, replaced the Quadro with the 2080 Ti, and then ran the install again. Everything completed, but after reboot mdevctl sees nothing. The GPU is present, the installer detected it too, and the PCI device shows up in PVE. If not necessary, I would rather not reinstall PVE. Have you any idea what’s wrong? Maybe the kernel driver wasn’t removed completely? How could I remove it manually? After the install, the nvidia-smi command isn’t recognized.
Which Proxmox version and what do the debug messages say ?
journalctl -u nvidia-vgpud.service -n 100
journalctl -u nvidia-vgpu-mgr.service -n 100
and post the output of your debug.log to pastebin
Latest Proxmox version (8.2.2) with the 6.8.4-2-pve kernel. I think the kernel version is the problem. First I’ll try an older kernel version.
Hi. /var/log/nvidia-installer.log:
https://pastebin.com/yLv8NLhw
Maybe helpful:
16.5
https://mega.nz/file/MFgnCbjL#t4bbRBaTiVk3v4tPgGjqJpVIoMPoOpXKaTL8GTfGDU0
17.0
https://mega.nz/file/kJgxVJyL#dJIuuyalYf3NHIyzsgXQpd-gyB4gGrprtdbYtVfIvDE
17.1
https://mega.nz/file/AIYnBDpY#9EEwqfwkX0PrNSyaIsfVEhvK43UQyxEaKcYOXHwVDew
Are these patched versions?
Can you help me ? https://pastebin.com/4RJhF0u9
I have tried to get this set up several times now and I’m still having the issue of no mdevctl types showing. I am using a Tesla P100 and have tried the 16.1 and 16.0 drivers, on 8.2.2 and 8.1.2, with the same result. After looking at the log, it looks like the Nvidia driver failed to install.
Debug.log
https://pastebin.com/w1BPSbtg
Nvidia-installer.log
https://pastebin.com/MutE35GM
Well, it appears it might be an Nvidia install issue, after seeing some of the other comments that rolled in with similar errors. I’ll wait in case it’s a patch or kernel version issue.
I downgraded the kernel and it seems to work now.
Hello Mr. Wim,
First of all, thank you for all your efforts in this project I appreciate your time to do this freely and I am impressed and I wish nothing but the best ahead.
I have tried to use your script three times, all on fresh installs: once on the latest Proxmox 8.2, then on 8.1 (but the script updated everything to 8.2), and the last time I did not let it update and upgrade, so it stayed on 8.1.
All three times I get blank output for mdevctl types. The debug.log says: “ERROR: An error occurred while performing the step: ‘Building kernel modules’. See /var/log/nvidia-installer.log for details.” I checked the other comments for the same issue and used whatever suggestions you gave, but to no avail. I searched the internet as well, and it is very hard to find anything about this issue.
So I am sorry to ask for more of your time, but could you possibly help me figure out my issue?
This is a link to the log:
https://pastebin.com/wsV6ViXn
P.S. I am new to Proxmox, but I do try my best to learn and understand more as I go through this amazing experience.
I had the same errors. I found a fix on the web and it works for me: I think the previous kernel is supported, so you can go back to the previous kernel version.
I did this:
apt install proxmox-kernel-6.5.13-5-pve-signed
apt install proxmox-headers-6.5.13-5-pve
Then you can run the vGPU install script.
It works for me, I hope it helps you.
I forgot this line, do this after apt installs:
proxmox-boot-tool kernel pin 6.5.13-5-pve
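Putting those steps together in order, with a verification at the end. Note that 6.5.13-5-pve is simply the kernel that worked for this commenter; a different point release may be current by the time you read this:

```shell
# Install the known-good 6.5 kernel and its matching headers
apt install proxmox-kernel-6.5.13-5-pve-signed
apt install proxmox-headers-6.5.13-5-pve

# Pin it so the 6.8 kernel is not selected again on the next boot
proxmox-boot-tool kernel pin 6.5.13-5-pve

# Check which kernels are installed and which one is pinned, then reboot
proxmox-boot-tool kernel list
reboot

# After the reboot, confirm the running kernel before re-running the vGPU script
uname -r   # should report 6.5.13-5-pve
```
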