[Please keep the list on CC for the benefit of others]
On 9/28/22 10:13, 陈新隆 wrote:
> I'm using `virsh dumpxml <domain name>` to check the xml. When I
> first execute this command, there are two hostdev (GPU) elements in the
> xml, but the next time I execute the same command these two hostdev
> elements have disappeared. During this time, the vm didn't restart or
> stop. Also, within the vm the `lspci | grep -i nvidia` command did not
> print any GPU info. So I was wondering if there's a mechanism in libvirtd
> that detaches the hostdev element without the vm stopping or restarting.
> This problem doesn't happen often, so I am looking for a way to
> reproduce it.
>
> Can you help me with these questions:
>
> 1. If I edit the xml manually and then detach the hostdev elements from
> the xml by invoking libvirtd APIs, will the vm apply it immediately
> without a stop or restart? Or must I restart the vm to apply the latest
> xml?
I'm not sure what you mean. Editing XML manually is different to using
libvirt APIs to detach hostdevs. Here's how it works:
1) a domain is defined (say using virsh define file.xml), libvirt parses
this XML, keeps it in memory and stores it "somewhere" (it's under
/etc/libvirt/qemu/ but we do not want users to hand edit those files,
as libvirt reads them only on libvirtd/virtqemud restart). This
is called inactive XML, because it reflects the inactive state of the
domain. Sometimes it's also called config XML.
2) when the domain is started (e.g. virsh start), libvirt creates a copy
of inactive XML, populates it with runtime information and saves it
elsewhere (/run/libvirt/qemu/, but again, we do not want users to hand
edit those files). This copy is referred to as live XML.
3) users can alter the live XML using APIs (e.g. to hotplug a device or
hotunplug it). The inactive XML can be altered by providing an altered XML
and defining it again (here, the domain name and UUID must match the
already existing domain).
4) upon domain shutoff, the live XML is thrown away, and finally
5) the inactive XML is never thrown away, until virsh undefine is called.
Now, you can see that there's no real connection between live and
inactive XMLs and changing one has no effect on the other, except when
the domain is cold booted again. Live and inactive XMLs can vary wildly.
Therefore, you can have inactive XML with two <hostdev/>-s, and live
XML with no <hostdev/> at all. NB, hotplug APIs can also be used to alter
inactive XML (virsh attach-device --config / virsh detach-device
--config / ...) or both at the same time (virsh attach-device --config
--live / virsh detach-device --config --live / ...).
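To see whether the two XMLs have diverged, they can be compared directly. Below is a minimal Python sketch, not something libvirt ships: it assumes virsh is available where you run it, and the helper names (hostdev_sources, dumpxml, missing_from_live) are made up for illustration. It lists host PCI addresses that appear as <hostdev/> in the inactive XML (virsh dumpxml --inactive) but not in the live XML (plain virsh dumpxml on a running domain).

```python
import subprocess
import xml.etree.ElementTree as ET


def hostdev_sources(domain_xml):
    """Return the host-side PCI addresses of all PCI <hostdev/> elements."""
    root = ET.fromstring(domain_xml)
    found = set()
    for hostdev in root.findall("./devices/hostdev[@type='pci']"):
        # <source><address/> is the device on the host; the other
        # <address type='pci'/> is where it shows up inside the guest.
        addr = hostdev.find("./source/address")
        if addr is None:
            continue
        found.add("{}:{}:{}.{}".format(
            addr.get("domain", "0x0000"),
            addr.get("bus"),
            addr.get("slot"),
            addr.get("function"),
        ))
    return found


def dumpxml(domain, inactive=False):
    """Fetch domain XML via virsh (assumes virsh is installed and usable)."""
    cmd = ["virsh", "dumpxml", domain]
    if inactive:
        cmd.append("--inactive")
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def missing_from_live(inactive_xml, live_xml):
    """Host PCI addresses configured in inactive XML but absent from live XML."""
    return hostdev_sources(inactive_xml) - hostdev_sources(live_xml)
```

For a running domain (the name "mydomain" is a placeholder), missing_from_live(dumpxml("mydomain", inactive=True), dumpxml("mydomain")) returns the configured devices that the running domain does not have. A non-empty result is consistent with either a hot-unplug or the domain having been started with a different configuration; it does not tell the two apart.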
And what you describe sounds as if those two <hostdev/>-s you saw at
domain startup were hotunplugged. The fact that even 'lspci' run from
inside the domain can't find them only supports this theory.
And no, libvirt never tries to bring inactive and live XMLs together. It
has no intelligence built in to do that and we, developers, do not want
such a thing either. We might change something that the user specifically
wanted and I believe nobody likes those "smart" tools that get in your way.
> 2. After a host device is hot-unplugged, will libvirtd be aware of it
> and remove the related `hostdev` element from the xml?
Yes. I believe the reasoning can be seen in the previous block of my reply.
On Mon, Sep 26, 2022 at 10:10 PM Michal Prívozník <mprivozn(a)redhat.com> wrote:
On 9/26/22 15:06, 陈新隆 wrote:
>
> <https://stackoverflow.com/posts/73854544/timeline>
>
> I'm using Kubevirt to manage my virtual machine instances. When I use
> Kubevirt to create a vm (with two GPUs), kubevirt generates a libvirt
> guest domain xml for this vm which includes two GPUs. The domain xml is
> as follows:
>
> <hostdev mode='subsystem' type='pci' managed='no'>
>   <driver name='vfio'/>
>   <source>
>     <address domain='0x0000' bus='0x83' slot='0x00' function='0x0'/>
>   </source>
>   <alias name='ua-gpu-gpu0'/>
>   <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
> </hostdev>
> <hostdev mode='subsystem' type='pci' managed='no'>
>   <driver name='vfio'/>
>   <source>
>     <address domain='0x0000' bus='0x84' slot='0x00' function='0x0'/>
>   </source>
>   <alias name='ua-gpu-gpu1'/>
>   <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
> </hostdev>
>
> No one ever edited this domain xml, but these two `hostdev` elements
> disappeared from the domain xml. During this time, I ran
> the `gpu_burn` command to do a stress test on these two GPUs.
>
> My question is :
>
> * When will libvirtd change the guest domain xml?
> * Why did libvirtd delete these two `hostdev` elements from the domain xml?
>
Libvirt does not remove anything from domain XML (except for the
elements it does not understand, but this is not the case). My suspicion
is that you're looking at live XML instead of inactive XML or vice
versa. Libvirt allows guests to be defined (i.e. libvirt manages their
inactive definition). However, a guest can be started with wildly
different configuration (e.g. without those two <hostdev/>-s). OR, they
might have been hot-unplugged.
I still recommend reading this link:
Michal