Can't bind/unbind GPU from nvidia to vfio-pci properly on demand without reboot (Ubuntu 22 QEMU KVM OVMF)

Question

I am running into an odd issue. I have been struggling to get GPU passthrough to work properly to a Windows 11 VM and I've finally found something that works but it's not as ideal as I'm hoping for. Essentially if I add to /etc/modprobe.d/vfio.conf my PCI ids options vfio-pci ids=10de:2684,10de:22ba VFIO binds on startup and I can use it great for GPU passthrough. But if I then try to reattach the GPU to nvidia drivers I can't appear to use it with pytorch (although nvidia-smi works fine).

If I remove that vfio.conf file and reboot, the GPU is bound to nvidia and torch works great but when I try to unbind from nvidia and bind to vfio-pci, when I launch the VM I get Error Code 43 on the Nvidia driver and the following error in libvirt logs:

2024-04-09T15:38:49.796258Z qemu-system-x86_64: -device vfio-pci,host=0000:01:00.0,id=hostdev0,bus=pci.5,addr=0x0: Failed to mmap 0000:01:00.0 BAR 1. Performance may be slow
2024-04-09T15:39:07.971124Z qemu-system-x86_64: vfio_region_write(0000:01:00.0:region1+0x8c, 0x1,4) failed: Cannot allocate memory

It's really odd, because from all my inspections it appears the GPU is properly isolated but it appears I can't pass it through to the GPU without explicitly binding to vfio via /etc/modprobe.d/vfio.conf, and when I do that I can't seem to properly bind it back to nvidia. Once again, everything looks fine when I rebind it to nvidia but torch can't detect the GPU anymore. Any ideas?

My workaround works ok for now, but it requires reboot if I want to launch my VM. I'd ideally like to be able to bind/unbind my nvidia GPU on demand when I want to switch between using it on the host vs in the windows 11 VM. Example bind to VFIO script:

#!/bin/bash
set -x
Stop display manager
systemctl stop display-manager
Unbind VTconsoles: might not be needed
echo 0 > /sys/class/vtconsole/vtcon0/bind
echo 0 > /sys/class/vtconsole/vtcon1/bind
Unload NVIDIA kernel modules
modprobe -r nvidia_drm 
modprobe -r nvidia_modeset
modprobe -r nvidia_uvm 
modprobe -r nvidia
Detach GPU devices from host
Use your GPU and HDMI Audio PCI host device
sudo virsh nodedev-detach pci_0000_01_00_0
sudo virsh nodedev-detach pci_0000_01_00_1
Load vfio module
modprobe vfio-pci

If I run

lspci -nnk -d 10de:2684
lspci -nnk -d 10de:22ba

It looks properly bound to vfio-pci:

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2684] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device [1458:40e5]
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:22ba] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device [1458:40e5]
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel

If I reboot with vfio.conf applied and inspect things it looks the same but oddly does work when launching my Windows 11 VM:

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2684] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device [1458:40e5]
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:22ba] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device [1458:40e5]
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel

But if I then unbind from vfio and bind to nvidia:

#!/bin/bash
set -x
Attach GPU devices to host
Use your GPU and HDMI Audio PCI host device
sudo virsh nodedev-reattach pci_0000_01_00_0
sudo virsh nodedev-reattach pci_0000_01_00_1
Unload vfio module
modprobe -r vfio-pci
#stop race condition
sleep 2
Load NVIDIA kernel modules
modprobe nvidia
modprobe nvidia_modeset
modprobe nvidia_uvm
modprobe nvidia_drm
Bind VTconsoles: might not be needed
echo 1 > /sys/class/vtconsole/vtcon0/bind
echo 1 > /sys/class/vtconsole/vtcon1/bind

nvidia-smi works fine:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090        Off | 00000000:01:00.0 Off |                  Off |
|  0%   49C    P0              67W / 450W |      0MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

But when I run something in Docker that uses pytorch:

RuntimeError: Torch is not able to use GPU

Even worse, when I try to rebind to vfio it works as if I hadn't enabled vfio.conf and I get the same error on launching the Windows 11 VM:

2024-04-09T16:04:45.089687Z qemu-system-x86_64: -device vfio-pci,host=0000:01:00.0,id=hostdev0,bus=pci.5,addr=0x0: Failed to mmap 0000:01:00.0 BAR 1. Performance may be slow
2024-04-09T16:04:55.682373Z qemu-system-x86_64: vfio_region_write(0000:01:00.0:region1+0x8c, 0x1,4) failed: Cannot allocate memory

It feels pretty clear to me that something is still using nvidia somehow, even though it is using vfio-pci kernel driver and lsof /dev/nvidia0 returns a blank string. Any ideas? I'm going a bit crazy here!

score 1 · Accepted Answer · answered Apr 11 '24 at 21:29

After more investigation I found this is the same issue: https://www.reddit.com/r/VFIO/s/ASQ3Bx3RGq

Related issue tracker: https://gitlab.freedesktop.org/drm/amd/-/issues/2794

Basically it comes down to the fact that I was trying to use my iGPU on my AMD CPU on the host and a NVIDIA GPU for passthrough. Apparently this is due to caching techniques being different on AMD and NVIDIA and the kernel not gracefully handling that right now. This repo has tips on how to patch your Kernel: https://github.com/Kinsteen/win10-gpu-passthrough

Because I don't really care much about using my host display when using the VM, I just went ahead and blacklisted amdgpu and just kill my display-manager before launching my Windows 11 VM. That works great.

Before this I tried to upgrade to Ubuntu 24 because I heard this issue was fixed in Kernel 6.6 and Ubuntu 24 uses Kernel 6.8, but alas I ended up with the same issue. So single GPU passthrough seemed like the best option for now.

Can't bind/unbind GPU from nvidia to vfio-pci properly on demand without reboot (Ubuntu 22 QEMU KVM OVMF)

Stop display manager

Unbind VTconsoles: might not be needed

Unload NVIDIA kernel modules

Detach GPU devices from host

Use your GPU and HDMI Audio PCI host device

Load vfio module

Attach GPU devices to host

Use your GPU and HDMI Audio PCI host device

Unload vfio module

Load NVIDIA kernel modules

Bind VTconsoles: might not be needed

1 Answers1