On 10/21/19 9:02 AM, Andrea Bolognani wrote:
On Sat, 2019-10-19 at 02:18 -0400, Laine Stump wrote:
> Commit 01ca4010d86 (libvirt v5.1.0) moved address reservation for
> hotplugged interface devices up to an earlier point in
> qemuDomainAttachNetDevice() because some function called when
> hotplugging on aarch64 needed to know the address type, and failed
> when it was "none".
I don't see anything in the original commit suggesting the change
was connected to aarch64? In fact, for the most part I would expect
aarch64/virt guests to go down the very same code paths as x86/q35
guests.
The secret info is in the emails surrounding the V1 patch that turned
into the V2 patch that was eventually pushed:
https://www.redhat.com/archives/libvir-list/2018-December/msg00439.html
https://www.redhat.com/archives/libvir-list/2018-December/msg00460.html
I think the issue is that in the VIR_DOMAIN_NET_TYPE_VHOSTUSER case of
the switch statement in qemuDomainAttachNetDevice(), it is calling
qemuDomainSupportsNicdev(), and qemuDomainSupportsNicdev has this code:
/* non-virtio ARM nics require legacy -net nic */
if (((def->os.arch == VIR_ARCH_ARMV6L) ||
(def->os.arch == VIR_ARCH_ARMV7L) ||
(def->os.arch == VIR_ARCH_AARCH64)) &&
net->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO &&
net->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI)
return false;
return true;
So if you were attempting to hotplug a vhostuser device on aarch64 with
no address specified in the XML, it would return false, which would
cause qemuDomainAttachNetDevice() to fail and log an error.
> This unfortunately caused a regression, because it also made PCI
> address reservation happen before we noticed that the device was a
> *hostdev* interface. Those interfaces are hotplugged by just calling
> out to qemuDomainAttachHostdevDevice() - that function would then also
> attempt to reserve the *same PCI address* that had just been reserved
> in qemuDomainAttachNetDevice().
And of course at the time we didn't have, and after this patch still
don't have, test suite coverage for that O:-)
I'll choose to pretend I didn't see that.
> The solution is to move the bit of code that short-circuits out to
> virDomainHostdevAttach() up *even earlier* so that no PCI address has
> been allocated by the time it's called.
>
> Resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1744523
> Signed-off-by: Laine Stump <laine(a)redhat.com>
The prevailing style seems to be
...
been allocated by the time it's called.
https://bugzilla.redhat.com/show_bug.cgi?id=1744523
Signed-off-by: Laine Stump <laine(a)redhat.com>
Feel free to change it or leave it as-is, I personally don't feel
very strongly either way.
Remember when BZes weren't listed *at all* in commit messages?
Stumperidge Farms remembers. I guess the debate about this happened on
one of the many patch threads that I didn't get around to reading (I
remember seeing some private messages, but I thought those were only wrt
downstream distro patches, not upstream). I can see the point with
patches that are *in support* of some patch that fixes a bug (and
therefore shouldn't claim "resolves"), but I think it's nice to have an
explicit indicator in a patch that means "immediately prior to this
patch, the bug still existed. After this patch, the bug no longer
exists". Yeah yeah, I know it's more complicated than that.
Anyway, if the winds are currently blowing in the direction of "don't
say "Resolves:" then fine, I won't says "Resolves:".
> +++ b/src/qemu/qemu_hotplug.c
> + case VIR_DOMAIN_NET_TYPE_HOSTDEV:
> + /* this switch is skipped by hostdev interfaces */
Maybe something like
/* hostdev interfaces have been dealt with earlier */
instead?
Sure.
With at least the reference to aarch64 removed (unless of course I
have misunderstood the situation and the original commit was really
aarch64-related),
Still want me to remove that, or add extra comments describing the
original problem more thoroughly? (seems kind of the wrong place for
that - that documentation train already sailed)
Reviewed-by: Andrea Bolognani <abologna(a)redhat.com>