On 10/31/22 2:21 PM, Edward Haas wrote:
On Mon, Oct 31, 2022 at 6:55 PM Andrea Bolognani <abologna(a)redhat.com> wrote:
On Mon, Oct 31, 2022 at 04:32:27PM +0200, Edward Haas wrote:
> That discussion mentioned that a guest PCI address may change in two
> cases:
> - The PCI topology changes.
> - The machine type changes.
>
> Usually, the machine type is not expected to change, especially if one
> wants to allow migrations between nodes.
> I would hope to argue this should not be problematic in practice,
> because guest images would be made per a specific machine type.
The machine type might not change from q35 to i440fx and vice versa,
but since the domain XML is constructed every time a KubeVirt VM is
started, the machine type might be q35-6.0 on one boot and q35-7.0
the next one if a KubeVirt upgrade that comes with a new version of
QEMU has happened in between.
This is unlikely to make a difference in terms of PCI addresses seen
in the guest OS, but it's still not accurate to say that the machine
type will not change.
Thank you for the clarification.
It makes me wonder now what the actual implications of a machine type
change are.
Live migration is a separate matter, as the machine type will
definitely not change while the VM is running.
> Regarding the PCI topology, I am not sure I understand what changes
> need to occur to the domxml for a defined guest PCI address to change.
> The only thing that I can think of is a scenario where hotplug/unplug
> is used, but even then I would expect existing devices to preserve
> their PCI address and the plug/unplug device to have a reserved address
> managed by the one acting on it (the management system).
>
> Could you please help clarify in which scenarios the PCI topology can
> cause a mess to the naming of interfaces in the guest?
A change in libvirt (again, due to a KubeVirt upgrade in between two
boots of the same VM) might result in different PCI addresses being
assigned to devices despite the same input XML.
We generally try fairly hard to avoid this kind of situation, but we
can only really guarantee stable PCI addresses for the lifetime of a
VM that has been defined and can't promise that the same input XML
will result in the same guest ABI when using different versions of
libvirt.
I would expect the PCI addresses that have been explicitly set in the
domxml [2] to be honored. We cannot assume that?
*If* the PCI address has been set in the original XML, that address will
be honored any and every time a new domain is defined from that XML.
Alternately, if a domain is defined once (without explicitly specifying
any PCI addresses) and then run multiple times from the same definition,
libvirt will auto-generate PCI addresses at initial definition time, and
then use those same addresses each time the domain is run.
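To make the "explicitly set" case concrete, here is a minimal sketch of how one could check which interfaces in a domain XML carry an explicit PCI address, using the `<address type='pci' .../>` element described in [2]. The XML fragment, the MAC addresses, and the helper name `interface_pci_addresses` are made up for illustration; only Python's standard library is used, not libvirt itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain XML fragment; MACs and address values are made up.
DOMAIN_XML = """
<domain type='kvm'>
  <name>example</name>
  <devices>
    <interface type='bridge'>
      <mac address='52:54:00:aa:bb:01'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='52:54:00:aa:bb:02'/>
      <source bridge='br0'/>
      <model type='virtio'/>
    </interface>
  </devices>
</domain>
"""


def interface_pci_addresses(domain_xml):
    """Map each interface MAC to its explicit PCI address string, or None."""
    result = {}
    for iface in ET.fromstring(domain_xml).iterfind("./devices/interface"):
        mac = iface.find("mac").get("address")
        addr = iface.find("address[@type='pci']")
        if addr is None:
            # No explicit address: libvirt would choose one at define time.
            result[mac] = None
        else:
            # Attribute values are hex strings such as '0x01'.
            result[mac] = "{:04x}:{:02x}:{:02x}.{:x}".format(
                int(addr.get("domain"), 16),
                int(addr.get("bus"), 16),
                int(addr.get("slot"), 16),
                int(addr.get("function"), 16),
            )
    return result


addresses = interface_pci_addresses(DOMAIN_XML)
```

In this fragment the first interface is pinned and the second is left for libvirt to place, which is the situation the rest of this mail is about.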
The issue is that no management application, including KubeVirt, is
explicitly setting the PCI addresses of devices (and we believe that
hands-off practice should continue), *AND* KubeVirt is re-defining the
domain each time it is run (without querying libvirt for, and so never
saving, the PCI addresses that were assigned to the devices). So each
time the domain is stopped, all the PCI address info from that run is
thrown away. And each time the domain is re-started (by re-defining it
from the original XML that has no PCI address info), libvirt starts from
scratch assigning addresses based on the information it receives from
KubeVirt. And if the conditions have changed, then addresses are
assigned differently.
The potential situation Andrea described, where the PCI addresses could
be changed merely due to an upgrade of KubeVirt/libvirt/qemu from one
run to the next in spite of being fed the same (address-less) XML, is
actually extremely rare (I don't remember such a case), but theoretically
it could happen. The more common change would be if a device was added
or removed during one run of the guest, and then remained added/removed
the next time it was run. That could change the PCI addresses of one or
more of the remaining devices, depending on their ordering in the XML.
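The effect of a removed device can be illustrated with a toy model. To be clear, this is NOT libvirt's actual allocation algorithm; it only assumes that devices without an address get the next free slot in XML order, which is enough to show why a removal can shift the devices that follow. The function name and device names are hypothetical.

```python
def assign_addresses(devices, first_slot=1):
    """Toy allocator (not libvirt's real logic).

    `devices` maps a device name to an explicit slot number, or to None
    when no address was given. Unaddressed devices receive the next free
    slot, in the order they appear (dicts preserve insertion order).
    """
    used = {slot for slot in devices.values() if slot is not None}
    assigned = {}
    next_slot = first_slot
    for name, slot in devices.items():
        if slot is None:
            while next_slot in used:
                next_slot += 1
            slot = next_slot
            used.add(slot)
        assigned[name] = slot
    return assigned


# First run: three devices, none with an explicit address.
first_run = assign_addresses({"net0": None, "disk0": None, "net1": None})

# disk0 was unplugged during the first run and stays removed afterwards.
second_run = assign_addresses({"net0": None, "net1": None})

# Same two devices, but with the first run's addresses carried over.
pinned = assign_addresses({"net0": 1, "net1": 3})
```

In this model `net1` lands on a different slot in `second_run` than in `first_run`, while `pinned` keeps it where it was.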
So, libvirt provides two avenues to maintaining stable PCI addresses
(and thus, network device names) across multiple runs of a domain:
either define once and run many times, or else query the XML of the
running domain and use that XML (containing PCI addresses) the next time
the domain is started. KubeVirt doesn't use either of these (and if
memory serves me correctly, it really can't due to its design). And
delegating management of PCI addresses to KubeVirt is pushing too much
complexity out to KubeVirt.
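For reference, the second avenue (reusing the addresses from the running domain's XML) could look roughly like the sketch below. It is only an illustration under stated assumptions: the live XML is assumed to have been obtained separately (for example with `virsh dumpxml`), interfaces are matched by MAC address (which assumes the template pins MACs), and the function name `carry_over_pci_addresses` and the sample XML are hypothetical. It is not something KubeVirt does today.

```python
import xml.etree.ElementTree as ET


def carry_over_pci_addresses(template_xml, live_xml):
    """Copy interface PCI addresses from live_xml into template_xml.

    Interfaces are matched by MAC address. Interfaces in the template that
    already have an <address> element, or no <mac>, are left untouched.
    """
    live_addrs = {}
    for iface in ET.fromstring(live_xml).iterfind("./devices/interface"):
        mac = iface.find("mac")
        addr = iface.find("address[@type='pci']")
        if mac is not None and addr is not None:
            live_addrs[mac.get("address")] = addr

    root = ET.fromstring(template_xml)
    for iface in root.iterfind("./devices/interface"):
        mac = iface.find("mac")
        if mac is None or iface.find("address") is not None:
            continue
        addr = live_addrs.get(mac.get("address"))
        if addr is not None:
            # Append a copy so the live tree is not modified.
            iface.append(ET.Element("address", dict(addr.attrib)))
    return ET.tostring(root, encoding="unicode")


# Made-up sample data: an address-less template and the XML of a past run.
TEMPLATE = (
    "<domain><devices><interface type='bridge'>"
    "<mac address='52:54:00:aa:bb:01'/>"
    "</interface></devices></domain>"
)
LIVE = (
    "<domain><devices><interface type='bridge'>"
    "<mac address='52:54:00:aa:bb:01'/>"
    "<address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>"
    "</interface></devices></domain>"
)

merged = carry_over_pci_addresses(TEMPLATE, LIVE)
```

Even this small sketch has to decide how to match devices between the two documents, which hints at the complexity that would end up in the management layer.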
I mainly referred to that input option, not to the expectation that the
generated configuration (of the domxml) be identical between different
versions.
[2] https://libvirt.org/formatdomain.html#device-addresses
--
Andrea Bolognani / Red Hat / Virtualization