* Fischer, Anna (anna.fischer(a)hp.com) wrote:
> So, when setting a breakpoint for the exit() call I'm getting a
> bit closer to figuring where it kills my guest.
Thanks, this helps clarify what is happening.
> Breakpoint 1, exit (status=1) at exit.c:99
> 99 {
> Current language: auto
> The current source language is "auto; currently c".
> (gdb) bt
> #0 exit (status=1) at exit.c:99
> #1 0x0000000000470c6e in assigned_dev_pci_read_config (d=0x259c6f0, address=64, len=4)
assigned_dev_pci_read_config(..., 64, 4)
                                  ^^
This is a libvirt issue. When you use virt-manager, it has libvirtd
fork/exec qemu-kvm. libvirtd will drop privileges and run qemu-kvm as
user qemu (or perhaps root if you've edited qemu.conf). Regardless of
the user, it clears capabilities. Reading PCI config space beyond just
the header requires CAP_SYS_ADMIN. The above is reading the first 4
bytes of device-dependent config space (offset 64, just past the
64-byte standard header), and the kernel is returning 0 because qemu
doesn't have CAP_SYS_ADMIN.
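
To make the failure mode concrete, here is a rough sketch of such a read
from userspace. It assumes config space is read through a sysfs-style
config file (something like /sys/bus/pci/devices/<BDF>/config); the helper
name read_pci_config is made up for illustration and this is not qemu's
actual code. The point is only that a read at offset 64 can come back
with 0 bytes even though the open succeeded.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not qemu's code): read `len` bytes at `offset`
 * from a PCI config file. Returns the number of bytes actually read,
 * which may be fewer than `len` (including 0), or -1 on error. */
ssize_t read_pci_config(const char *path, off_t offset, void *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* pread() reads at an absolute offset without moving the file position. */
    ssize_t n = pread(fd, buf, len, offset);
    close(fd);
    return n;
}
```

A caller that treats anything other than a full 4-byte result at offset
64 as fatal would bail out in the way the backtrace shows.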
Basically, this means that device assignment w/ libvirt will break
MSI/MSI-X because qemu will never be able to see that the host device
has those PCI capabilities. This, in turn, renders VF device assignment
useless (since a VF is required to support MSI and/or MSI-X).
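
For reference, a minimal sketch of why a header-only view hides MSI and
MSI-X. It walks the standard PCI capability list over a config-space
snapshot; find_cap and the `valid` parameter are names I made up for
illustration, and this is not the qemu implementation. The register
offsets and capability IDs are the standard PCI ones.

```c
#include <stddef.h>
#include <stdint.h>

#define PCI_STATUS           0x06  /* status register, low byte */
#define PCI_STATUS_CAP_LIST  0x10  /* device implements a capability list */
#define PCI_CAPABILITY_LIST  0x34  /* offset of the first capability */
#define PCI_CAP_ID_MSI       0x05
#define PCI_CAP_ID_MSIX      0x11

/* Walk the capability list in a config-space snapshot of which only the
 * first `valid` bytes could be read. Returns the offset of `cap_id`, or 0
 * if it is absent or the list runs past the readable region. */
int find_cap(const uint8_t *cfg, size_t valid, uint8_t cap_id)
{
    if (valid < 0x40 || !(cfg[PCI_STATUS] & PCI_STATUS_CAP_LIST))
        return 0;

    uint8_t pos = cfg[PCI_CAPABILITY_LIST] & 0xfc;
    int ttl = 48;  /* bound the walk in case the list loops */

    while (pos >= 0x40 && ttl-- > 0) {
        if ((size_t)pos + 2 > valid)
            return 0;              /* capability lies beyond what we can see */
        if (cfg[pos] == cap_id)
            return pos;
        pos = cfg[pos + 1] & 0xfc; /* follow the next pointer */
    }
    return 0;
}
```

Capabilities start at offset 0x40 or later, so with only the 64-byte
header readable (valid == 64) the walk fails on the very first entry,
whatever the device actually supports.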
Granting CAP_SYS_ADMIN for each qemu instance that does device assignment
would render the privilege reduction useless (CAP_SYS_ADMIN is the
kitchen sink catchall of the Linux capability system).
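
If you want to confirm what a given qemu process ended up with, one
option is to look at the CapEff line in /proc/<pid>/status. A small
sketch, assuming the usual "CapEff:<tab><hex mask>" layout;
status_has_sys_admin is a made-up helper name, and CAP_SYS_ADMIN is
capability number 21.

```c
#include <stdlib.h>
#include <string.h>

#define CAP_SYS_ADMIN_BIT 21  /* CAP_SYS_ADMIN's number in linux/capability.h */

/* Given the text of /proc/<pid>/status, return 1 if the effective
 * capability mask includes CAP_SYS_ADMIN, 0 if it does not, and -1 if
 * no CapEff line was found. */
int status_has_sys_admin(const char *status_text)
{
    const char *p = strstr(status_text, "CapEff:");
    if (!p)
        return -1;

    /* strtoull skips the leading whitespace and parses the hex mask. */
    unsigned long long mask = strtoull(p + strlen("CapEff:"), NULL, 16);
    return (int)((mask >> CAP_SYS_ADMIN_BIT) & 1);
}
```

A process whose capabilities were cleared shows an all-zero mask here,
even when it runs as root.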
Hmmph...