On Thu, Jul 20, 2017 at 03:03:25PM +0200, Wim ten Have wrote:
> On Thu, 20 Jul 2017 11:10:31 +0100
> "Daniel P. Berrange" <berrange@redhat.com> wrote:
> > On Thu, Jul 20, 2017 at 11:29:26AM +0200, Wim Ten Have wrote:
> > > From: Wim ten Have <wim.ten.have@oracle.com>
> > >
> > > The QEMU driver can erroneously allocate more vcpus to a domain
> > > than there are cpus in the domain if the <numa> element is used
> > > to describe <cpu> element topology. Fix this by calculating
> > > the number of cpus described in the <numa> element of a <cpu>
> > > element and comparing it to the number of vcpus. If the number
> > > of vcpus is greater than the number of cpus, cap the number of
> > > vcpus to the number of cpus.
> > >
> > > This patch also adds a small libvirt documentation update under
> > > the "CPU Allocation" section.
> > >
> > > Signed-off-by: Wim ten Have <wim.ten.have@oracle.com>
> > > ---
> > >  docs/formatdomain.html.in | 9 ++++++++-
> > >  src/conf/domain_conf.c    | 14 +++++++++++---
> > >  2 files changed, 19 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> > > index 07208ee..3c85307 100644
> > > --- a/docs/formatdomain.html.in
> > > +++ b/docs/formatdomain.html.in
> > > @@ -501,7 +501,14 @@
> > >      <dt><code>vcpu</code></dt>
> > >      <dd>The content of this element defines the maximum number of virtual
> > >        CPUs allocated for the guest OS, which must be between 1 and
> > > -      the maximum supported by the hypervisor.
> > > +      the maximum supported by the hypervisor. If a <code>numa</code>
> > > +      element is contained within the <code>cpu</code> element, the
> > > +      number of virtual CPUs to be allocated is compared with the sum
> > > +      of the <code>cpus</code> attributes across the <code>cell</code>
> > > +      elements within the <code>numa</code> element. If the number of
> > > +      virtual CPUs is greater than the sum of the <code>cpus</code>
> > > +      attributes, the number of virtual CPUs is capped at the sum of
> > > +      the <code>cpus</code> attributes.
> > >      <dl>
> > >        <dt><code>cpuset</code></dt>
> > >        <dd>
> > > diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
> > > index 958a5b7..73d7f4f 100644
> > > --- a/src/conf/domain_conf.c
> > > +++ b/src/conf/domain_conf.c
> > > @@ -3844,12 +3844,20 @@ virDomainDefPostParseMemory(virDomainDefPtr def,
> > >      unsigned long long numaMemory = 0;
> > >      unsigned long long hotplugMemory = 0;
> > >
> > > -    /* Attempt to infer the initial memory size from the sum NUMA memory sizes
> > > -     * in case ABI updates are allowed or the <memory> element wasn't specified */
> > > +    /* Attempt to infer the initial memory size from the sum NUMA memory
> > > +     * sizes in case ABI updates are allowed or the <memory> element
> > > +     * wasn't specified. Also cap the vcpu count to the sum of NUMA cpus
> > > +     * allocated for all nodes. */
> > >      if (def->mem.total_memory == 0 ||
> > >          parseFlags & VIR_DOMAIN_DEF_PARSE_ABI_UPDATE ||
> > > -        parseFlags & VIR_DOMAIN_DEF_PARSE_ABI_UPDATE_MIGRATION)
> > > +        parseFlags & VIR_DOMAIN_DEF_PARSE_ABI_UPDATE_MIGRATION) {
> > > +        unsigned int numaVcpus = 0;
> > > +
> > > +        if ((numaVcpus = virDomainNumaGetCPUCountTotal(def->numa)))
> > > +            virDomainDefSetVcpus(def, numaVcpus);
> > > +
> > >          numaMemory = virDomainNumaGetMemorySize(def->numa);
> > > +    }
> > >
> > AFAICT, this scenario is simply a user configuration mistake, and so we
> > should report an error message to them to fix it.
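> > Something along these lines, perhaps (just an untested sketch, exact
> > placement and message wording aside, reusing virDomainNumaGetCPUCountTotal
> > from your patch together with the existing virDomainDefGetVcpusMax helper):
> >
> >     unsigned int numaVcpus = virDomainNumaGetCPUCountTotal(def->numa);
> >
> >     /* Reject configs where the <numa> <cell> cpus don't cover every
> >      * vCPU, instead of silently capping the vcpu count. */
> >     if (numaVcpus && numaVcpus != virDomainDefGetVcpusMax(def)) {
> >         virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> >                        _("%u vcpus required, but the <numa> <cell> "
> >                          "elements only enumerate %u cpus"),
> >                        virDomainDefGetVcpusMax(def), numaVcpus);
> >         return -1;
> >     }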
> Not to push, but I think this is the correct way ... O:-).
> Ok, perhaps the documentation note/commit message should be slightly
> rewritten, as well as the altered comment on the specific code section.
>
> The change is _NOT_ changing the final guest-provided 'maxvcpus' but
> 'vcpus' instead. This means that it adds the "current='#count'"
> attribute under the vcpu element if that count is less than the
> provided vcpu 'maxvcpus' count. If equal there is no issue, and if
> larger there is indeed a correct error message (exceptional condition).
>
> Imagine a simple domain description _WITHOUT_ <numa> and the <vcpu>
> element description below;
>
>   <vcpu placement='static' current='4'>16</vcpu>
>
> This nicely creates a guest with 4 domain CPUs added, where the
> administrator can "hot-add" an additional 12 CPUs, making the full
> 'maxvcpus'-defined 16. "hot-add" by virsh;
>
>   virsh # setvcpus kvm26 16
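>
> (The resulting maximum and current counts can be verified with e.g.;
>
>   virsh # vcpucount kvm26
>
> which lists the maximum and current vcpu values for the config and
> live states.)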
> Without the fix, under the former <numa> domain description libvirt
> would bring the whole '16' vcpus to the guest, and if there was a
> 'current' value given, all vcpus beyond the <numa> <cell ...>
> assignments went to NUMA-node0 for that matter :-( so wrong;
>
>   <vcpu placement='static'>16</vcpu>
>   ..
>   <numa>
>     <cell id='0' cpus='0' memory='262144' unit='KiB'/>
>     <cell id='1' cpus='1' memory='262144' unit='KiB'/>
>     <cell id='2' cpus='2' memory='262144' unit='KiB'/>
>     <cell id='3' cpus='3' memory='262144' unit='KiB'/>
>   </numa>
This is saying that we have 16 CPU sockets, but you're only
assigning NUMA affinity to 4 of the sockets. IMHO that is
a configuration mistake. All 16 CPU sockets need to have
affinity assigned, so that when they're later brought online
they can be placed suitably on the host.
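E.g. something along these lines (only a sketch; the cpu ranges and
memory sizes are illustrative) would give all 16 vCPUs affinity:

  <vcpu placement='static' current='4'>16</vcpu>
  ..
  <numa>
    <cell id='0' cpus='0-3' memory='262144' unit='KiB'/>
    <cell id='1' cpus='4-7' memory='262144' unit='KiB'/>
    <cell id='2' cpus='8-11' memory='262144' unit='KiB'/>
    <cell id='3' cpus='12-15' memory='262144' unit='KiB'/>
  </numa>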
> With the fix, its post-configuration action will now nicely rewrite
> the <vcpu ... current='#count'> element as shown below;
>
>   <vcpu placement='static' current='4'>16</vcpu>
>   ..
>   <numa>
>     <cell id='0' cpus='0' memory='262144' unit='KiB'/>
>     <cell id='1' cpus='1' memory='262144' unit='KiB'/>
>     <cell id='2' cpus='2' memory='262144' unit='KiB'/>
>     <cell id='3' cpus='3' memory='262144' unit='KiB'/>
>   </numa>
IMHO that is still just as broken as the previous config,
because it isn't saying how to set affinity on the remaining
12 vCPUs that can be brought online.
In case "current='#count'" description is given where it does not
match the <numa> <cell ... #cpus> count then such is corrected to
cap the sum of all <numa> <cell ... #cpus>.
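> For example (illustrative only), with the four single-cpu cells from
> my example above, a description of;
>
>   <vcpu placement='static' current='8'>16</vcpu>
>
> would be rewritten to "current='4'", the sum of the cells' cpus
> attributes.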
> Perhaps that occurrence should be rejected with an error message too,
> even when it is not exceeding the domain administrator's written
> <vcpu> 'maxvcpus' element, being '16' within my example above.
> Exceeding that is a hard limit and is of course rejected with an
> error.
>
> I will not argue further ... O:-) I'll wait and if it is a _NO_ then
> it is a _NO_!
Regards,
Daniel
--
|:
https://berrange.com -o-
https://www.flickr.com/photos/dberrange :|
|:
https://libvirt.org -o-
https://fstop138.berrange.com :|
|:
https://entangle-photo.org -o-
https://www.instagram.com/dberrange :|