On Thu, Jun 17, 2021 at 14:03:44 +0200, David Hildenbrand wrote:
On 17.06.21 13:18, Michal Prívozník wrote:
> On 6/17/21 11:44 AM, David Hildenbrand wrote:
> > On 17.03.21 12:57, Michal Privoznik wrote:
> > > v3 of:
> > >
> > > https://listman.redhat.com/archives/libvir-list/2021-February/msg00961.html
> > >
> > >
> > > diff to v2:
> > > - Dropped code that forbade use of virtio-mem and memballoon at the same
> > > time;
> > > - This meant that I had to adjust memory accounting,
> > > qemuDomainSetMemoryFlags() - see patches 11/15 and 12/15 which are
> > > new.
> > > - Fixed small nits raised by Peter in his review of v2
> > >
> > >
> >
> > Hi Michal, do you have a branch somewhere that I can easily checkout to
> > play with it/test it?
> >
>
> Yes:
>
> https://gitlab.com/MichalPrivoznik/libvirt/-/tree/virtio_mem_v4
>
> There were some comments in Peter's review and I really should fix my
> code according to them and merge/send v4.
>
> Michal
>
Thanks! I started with a single virtio-mem device.

1. NUMA requirement

Right now, one can only configure "maxMemory" when NUMA is specified;
otherwise libvirt reports "error: unsupported configuration: At least
one numa node has to be configured when enabling memory hotplug".

I recall this is a limitation of older QEMU, which would not create ACPI SRAT
tables otherwise. In newer QEMU, this is no longer the case. As soon as "maxmem"
is specified on the QEMU cmdline, we fake a single NUMA node:

  hw/core/numa.c:numa_complete_configuration()

  "Enable NUMA implicitly by adding a new NUMA node automatically"
  -> mc->auto_enable_numa_with_memdev / mc->auto_enable_numa_with_memhp

  mc->auto_enable_numa_with_memdev (slots=0) is set since 5.1 on x86-64 and arm64
  mc->auto_enable_numa_with_memhp (slots>0) is set since 2.11 on x86-64 and 4.1 on arm64

So in theory, with newer QEMU on x86-64 and arm64 we could drop that
limitation in libvirt (this might eventually require some changes
to the "node" specification handling). ppc64 shouldn't care as there
is no ACPI.

The main reason for the check is exactly what you are describing: QEMU
fakes a NUMA node when none is configured, so if libvirt did not enforce
the requirement you'd get a discrepancy between the config and what QEMU
exposes.
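
The rule discussed above can be sketched in a few lines of Python. This is
only an illustration of the behaviour as described in this thread, not QEMU
or libvirt code: MachineFlags, qemu_fakes_numa_node() and
config_matches_guest() are hypothetical names, and the condition is
simplified to "maxmem given, no NUMA node configured, matching flag set".

```python
from dataclasses import dataclass


@dataclass
class MachineFlags:
    """Hypothetical stand-in for the two QEMU machine-class flags."""
    auto_enable_numa_with_memhp: bool   # covers maxmem with slots > 0
    auto_enable_numa_with_memdev: bool  # covers maxmem with slots = 0


def qemu_fakes_numa_node(flags, numa_nodes, maxmem_set, slots):
    """Return True if QEMU would add one implicit NUMA node (simplified)."""
    if numa_nodes > 0 or not maxmem_set:
        # NUMA was configured explicitly, or memory hotplug is not in use.
        return False
    if slots > 0:
        return flags.auto_enable_numa_with_memhp
    return flags.auto_enable_numa_with_memdev


def config_matches_guest(flags, numa_nodes, maxmem_set, slots):
    """False when the guest would see a NUMA node the config never declared."""
    return not qemu_fakes_numa_node(flags, numa_nodes, maxmem_set, slots)
```

With both flags set and no NUMA node in the config, config_matches_guest()
returns False, which is the discrepancy the libvirt check is there to avoid.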
[...]
3. "memory" item handling

Whenever I edit the XML and set e.g. "<memory unit='GiB'>4</memory>",
it's silently converted back to "20 GiB". Maybe that's just always
implicitly calculated from the NUMA spec and the defined devices.

In cases where you've got NUMA configured, the <memory> element is
recalculated from the sizes of the NUMA nodes: when you change the
value, there isn't any obvious algorithm for picking which NUMA nodes
to pull from.
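
A minimal sketch of that recalculation, assuming the total is simply the sum
of the NUMA node sizes plus the sizes of the defined memory devices (the
guess made in the question above). The function name and the example sizes
are hypothetical and are not taken from libvirt or from the configuration
being tested in this thread.

```python
KIB_PER_GIB = 1024 * 1024


def recalculated_memory_kib(numa_nodes_kib, memory_devices_kib=()):
    """Derive the <memory> total from NUMA nodes and memory devices.

    Any value the user typed into <memory> is ignored here, because
    there is no obvious way to decide which node a difference should
    be taken from.
    """
    return sum(numa_nodes_kib) + sum(memory_devices_kib)


# Hypothetical layout: two 8 GiB NUMA nodes and one 4 GiB memory device.
total_kib = recalculated_memory_kib(
    [8 * KIB_PER_GIB, 8 * KIB_PER_GIB],
    [4 * KIB_PER_GIB],
)
total_gib = total_kib // KIB_PER_GIB
```

Under these assumptions, editing <memory> to 4 GiB has no lasting effect:
the value is derived again from the nodes and devices on the next parse.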