On Mon, Jul 01, 2024 at 03:42:29PM +0200, Mikulas Patocka wrote:
On Mon, 1 Jul 2024, Daniel P. Berrangé wrote:
> On Sun, Jun 30, 2024 at 08:49:48PM +0200, Mikulas Patocka wrote:
> >
> >
> > On Sun, 30 Jun 2024, Tejun Heo wrote:
> >
> > > Do you happen to know why libvirt is doing that? There are many other
> > > implications to configuring the system that way and I don't think we
want to
> > > design kernel behaviors to suit topology information fed to VMs which can
be
> > > arbitrary.
> > >
> > > Thanks.
> >
> > I don't know why. I added users(a)lists.libvirt.org to the CC.
> >
> > How should libvirt properly advertise "we have 16 threads that are
> > dynamically scheduled by the host kernel, so the latencies between them
> > are changing and unpredictable"?
>
> NB, libvirt is just control plane, the actual virtual hardware exposed
> is implemented across QEMU and the KVM kernel mod. Guest CPU topology
> and/or NUMA cost information is the responsibility of QEMU.
>
> When QEMU's virtual CPUs are floating freely across host CPUs there's
> no perfect answer. The host admin needs to make a tradeoff in their
> configuration
>
> They can optimize for density, by allowing guest CPUs to float freely
> and allow CPU overcommit against host CPUs, and the guest CPU topology
> is essentially a lie.
>
> They can optimize for predictable performance, by strictly pinning
> guest CPUs 1:1 to host CPUs, and minimize CPU overcommit, and have
> the guest CPU topology 1:1 match the host CPU topology.
The problem that we have here is that the commit
63c5484e74952f60f5810256bd69814d167b8d22 ("workqueue: Add multiple
affinity scopes and interface to select them") changes the behavior of
unbound workqueues, so that work items are only executed on CPUs that
share last level cache with the task that submitted them.
If there are 16 virtual CPUs that are freely floating across physical
CPUs, virt-manager by default selects a topology where it advertises 16
sockets, 1 CPU per socket, 1 thread per CPU. The result is that the
unbound workqueues are no longer unbound, they can't move work across
sockets and they are bound to just one virtual CPU, causing dm-crypt
performance degradation. (the crypto operations are no longer
parallelized).
Whose bug is this? Is it a bug in virt-manager because it advertises
invalid topology? Is this a bug in that patch 63c5484e7495 because it
avoids moving work items across sockets?
It is hard to call it is a bug in anything. The Linux patch is reasonable
in honouring the CPU topology. The hypervisor is reasonable to exposing
sockets=N,cores=1 as there's no right answer for the topology choice, and
it has no idea how it may or may not impact guest OS behaviour or perf.
Letting CPUs float freely is sensible default behaviour too. All of them
conspire to have a perf impact here, but the deployment is not seeking
to maximise performance, rather to maximise flexibility & density.
None the less, I'd suggest that virt-manager should be a explicitly
asking for sockets=1,cores=N, as that has broader guest OS compatibility.
By chance that would also help this scenario, but that woudn't be a
driving factor as we can't pick defaults based on the needs of particular
versions of a particular guest kernel.
With regards,
Daniel
--
|:
https://berrange.com -o-
https://www.flickr.com/photos/dberrange :|
|:
https://libvirt.org -o-
https://fstop138.berrange.com :|
|:
https://entangle-photo.org -o-
https://www.instagram.com/dberrange :|