[I'm going to repeat this for the last time: please keep the list on CC.]
On 9/29/22 10:15, 陈新隆 wrote:
> As in the picture below, `cat /run/libvirt/qemu/<domain>.xml` outputs
> <device alias='ua-gpu-gpu0'/>, but `virsh dumpxml <domain>` does not
> output the gpu hostdev.
It's usually better to paste the text itself rather than post a picture
of text.
So, the first 'virsh dumpxml | grep gpu' shows empty output. Then, 'virsh
dumpxml --inactive | grep gpu' shows a line where the 'ua-gpu-gpu0' alias
is declared for something. Without context lines it is hard to tell for
what. Then, you grep /run/libvirt/qemu/$dom.xml, where the 'gpu' string
appears in two capabilities and in one <device alias=''/>.
Now, as I've said before, it's not uncommon for the live XML and the
inactive XML to be different. So something has hot-unplugged a device
with the ua-gpu-gpu0 alias. And since there are no context lines, I can't
confirm what device it was.
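To get the missing context, one option is to look up which device element carries the alias instead of grepping a single line. Below is a minimal sketch in Python using only the standard library. The function name `devices_with_alias` and the sample XML are made up for illustration; real input would be the output of 'virsh dumpxml --inactive <domain>'.

```python
import xml.etree.ElementTree as ET


def devices_with_alias(domain_xml, alias):
    """Return the serialized device elements whose <alias name=.../> matches."""
    devices = ET.fromstring(domain_xml).find("devices")
    if devices is None:
        return []
    matches = []
    for dev in devices:  # direct children: <disk>, <interface>, <hostdev>, ...
        node = dev.find("alias")
        if node is not None and node.get("name") == alias:
            matches.append(ET.tostring(dev, encoding="unicode").strip())
    return matches


# Hypothetical sample, NOT taken from the reporter's domain.
SAMPLE_INACTIVE = """
<domain type='kvm'>
  <name>demo</name>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x3b' slot='0x00' function='0x0'/>
      </source>
      <alias name='ua-gpu-gpu0'/>
    </hostdev>
  </devices>
</domain>
"""

for dev in devices_with_alias(SAMPLE_INACTIVE, "ua-gpu-gpu0"):
    print(dev)  # prints the whole owning element, here a <hostdev>
```

A quicker alternative on the command line is grep's -C option, which prints a number of lines of context around each match.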
And lastly, the XML file under /run is considered libvirt internal. I can
try to explain some stuff, but knowing libvirt internals is a must here.
Those <flag/> elements are what we call QEMU capabilities (flags that
libvirt looks for when determining what features QEMU is capable of).
The fact that they have the 'gpu' substring is just a coincidence. Then
the <device/> elements are a list of device aliases used by QEMU devices.
Again, internal.
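To illustrate why a plain substring match is misleading here, the sketch below reports which kind of element each 'gpu' hit belongs to. The sample document is invented and only modeled on the description above (two capability flags plus one device alias). Its layout and flag names are assumptions, not a reference for the internal file format, which may change at any time.

```python
import xml.etree.ElementTree as ET


def classify_matches(xml_text, needle):
    """Return (tag, attributes) for each element with an attribute value containing needle."""
    hits = []
    for elem in ET.fromstring(xml_text).iter():  # document order
        if any(needle in value for value in elem.attrib.values()):
            hits.append((elem.tag, dict(elem.attrib)))
    return hits


# Hypothetical fragment; element layout and flag names are assumed for illustration.
SAMPLE_STATUS = """
<domstatus state='running'>
  <qemuCaps>
    <flag name='virtio-gpu-virgl'/>
    <flag name='virtio-gpu.max_outputs'/>
  </qemuCaps>
  <devices>
    <device alias='ua-gpu-gpu0'/>
  </devices>
</domstatus>
"""

for tag, attrs in classify_matches(SAMPLE_STATUS, "gpu"):
    print(tag, attrs)
```

On this sample, two of the three hits are <flag/> elements that say nothing about devices assigned to the domain.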
> Since the domain is running, both commands represent the live XML of
> the domain. Why are there two different outputs about gpu devices?
> Which output should I follow?
NO, I think I made that clear. Since the domain is running, 'virsh
dumpxml' gives you the live XML and 'virsh dumpxml --inactive' gives you
the inactive XML. These two can differ completely; the only things they
are required to have in common are the domain name and the domain UUID.
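To make the live/inactive difference concrete, here is a rough sketch that lists device aliases present in the inactive XML but absent from the live XML. The helper names and both sample documents are hypothetical; real input would come from 'virsh dumpxml <domain>' and 'virsh dumpxml --inactive <domain>'. Note that a device found only in the inactive XML may have been hot-unplugged from the running domain, or may have been added to the persistent configuration only; this comparison cannot tell which.

```python
import xml.etree.ElementTree as ET


def device_aliases(domain_xml):
    """Map alias name -> device tag for each direct child of <devices> with an alias."""
    devices = ET.fromstring(domain_xml).find("devices")
    result = {}
    if devices is not None:
        for dev in devices:
            node = dev.find("alias")
            if node is not None and node.get("name"):
                result[node.get("name")] = dev.tag
    return result


def only_in_inactive(live_xml, inactive_xml):
    """Return aliased devices that the inactive XML has but the live XML lacks."""
    live = device_aliases(live_xml)
    inactive = device_aliases(inactive_xml)
    return {name: tag for name, tag in inactive.items() if name not in live}


# Hypothetical samples: the live XML has lost the hostdev, the inactive XML still has it.
LIVE = """
<domain type='kvm' id='7'>
  <name>demo</name>
  <devices>
    <disk type='file' device='disk'>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
    </disk>
  </devices>
</domain>
"""

INACTIVE = """
<domain type='kvm'>
  <name>demo</name>
  <devices>
    <disk type='file' device='disk'>
      <target dev='vda' bus='virtio'/>
    </disk>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <alias name='ua-gpu-gpu0'/>
    </hostdev>
  </devices>
</domain>
"""

print(only_in_inactive(LIVE, INACTIVE))
```

The comparison keys on aliases because, as far as I know, only user-defined aliases (the 'ua-' ones) are kept in the inactive XML, so auto-generated ones such as 'virtio-disk0' show up on the live side only.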
Have you read that wiki page I suggested a couple of e-mails ago?
If you think a device disappears from your domain, then you need to talk
to the developers of the software you are using (if I recall correctly,
you mentioned KubeVirt?).
> image.png
> This is a Windows VM. After connecting to the VNC server, taskmgr and
> the display adapter list show that there is no gpu device now.
Yep, so something hot-unplugged the device. I suspect the management
software on top of libvirt. Libvirt does not detach any device on its
own. And since this is NVIDIA I would not be surprised if it was a
licensing issue, or something similar.
Michal