On Thu, Jul 20, 2017 at 14:17:29 +0100, Daniel Berrange wrote:
On Thu, Jul 20, 2017 at 03:03:25PM +0200, Wim ten Have wrote:
> On Thu, 20 Jul 2017 11:10:31 +0100
> "Daniel P. Berrange" <berrange(a)redhat.com> wrote:
>
> > On Thu, Jul 20, 2017 at 11:29:26AM +0200, Wim Ten Have wrote:
> > > From: Wim ten Have <wim.ten.have(a)oracle.com>
> > >
> > > The QEMU driver can erroneously allocate more vpus to a domain
> > > than there are cpus in the domain if the <numa> element is used
> > > to describe <cpu> element topology. Fix this by calculating
> > > the number of cpus described in the <numa> element of a <cpu>
> > > element and comparing it to the number of vcpus. If the number
> > > of vcpus is greater than the number of cpus, cap the number of
> > > vcpus to the number of cpus.
> > >
> > > This patch also adds a small libvirt documentation update under
> > > the "CPU Allocation" section.
> > >
> > > Signed-off-by: Wim ten Have <wim.ten.have(a)oracle.com>
> > > ---
> > > docs/formatdomain.html.in | 9 ++++++++-
> > > src/conf/domain_conf.c | 14 +++++++++++---
> > > 2 files changed, 19 insertions(+), 4 deletions(-)a
[...]
> Without the fix under former <numa> domain description
libvirt would
> bring whole '16' vcpus to the guest, and if there was a current value
> given all by current on top of the <numa> <cell ... to NUMA-node0
> for that matter :-( so wrong;
>
> <vcpu placement='static'>16</vcpu>
> ..
> <numa>
> <cell id='0' cpus='0' memory='262144'
unit='KiB'/>
> <cell id='1' cpus='1' memory='262144'
unit='KiB'/>
> <cell id='2' cpus='2' memory='262144'
unit='KiB'/>
> <cell id='3' cpus='3' memory='262144'
unit='KiB'/>
> </numa>
This is saying that we have 16 CPU sockets, but you're only
assigning NUMA affinity to 4 of the sockets. IMHO that is
a configuration mistake. All 16 CPU sockets need to have
affinity assigned, so that when they're later brought online
they can be placed suitably on the host.
>
> With the fix, its post configuration action will now nicely rewrite
> the <vcpu ... current='#count'> element as shown below;
>
> <vcpu placement='static' current='4'>16</vcpu>
> ..
> <numa>
> <cell id='0' cpus='0' memory='262144'
unit='KiB'/>
> <cell id='1' cpus='1' memory='262144'
unit='KiB'/>
> <cell id='2' cpus='2' memory='262144'
unit='KiB'/>
> <cell id='3' cpus='3' memory='262144'
unit='KiB'/>
> </numa>
IMHO that is still just as broken as the previous config,
because it isn't saying how to set affinity on the remaining
12 vCPUs that can be brought online.
Yes, exactly. The reason for leaving this in the 'gray zone' was that we
did not check this previously and qemu just picked one of the nodes and
dumped the rest of the vcpus there.
The correct way would be to enforce full mapping in NUMA.
Note that this needs to be done in the validation callbacks rather than
the post parse callbacks, since there's real risk of VMs vanishing.
Validation callbacks are triggered only at start of the VM so provide an
error.
Peter