On Wed, Apr 03, 2019 at 01:48:33PM -0400, Cole Robinson wrote:
> On 3/26/19 4:06 PM, Allen, John wrote:
> > For pinned vcpus, vcpupin will report inaccurate affinity values on machines
> > with high core counts (256 cores in my case). The problem is produced as
> > follows:
> >
> > $ virsh vcpupin myguest 0 4
> >
> > $ virsh vcpupin myguest 0
> >
> >  VCPU   CPU Affinity
> > ---------------------------
> >  0      4,192,194,196-197
> >
> > Running taskset on the qemu threads shows the correct affinity, so this seems
> > to be a reporting problem. Strangely, the value "192" is significant. If I pin
> > a cpu greater than 192, the problem no longer appears.
> >
> > I believe the cause of the problem in my case is that in this case in
> > src/conf/domain_conf.c:virDomainDefGetVcpuPinInfoHelper:
> >
> >     ...
> >     if (vcpu && vcpu->cpumask)
> >         bitmap = vcpu->cpumask;
> >     ...
> >
> > vcpu->cpumask is "shortened" in that it is only long enough to contain the
> > last set bit in the mask. However, when we go to copy the mask to the buffer
> > that is returned, we use the masklen passed to the function which is the
> > "full" masklen with a bit for each cpu. So it seems virBitmapToDataBuf copies
> > some extra data past the end of the bitmask. Why the "192" value is always
> > set and I typically see similar bogus bits set is still unknown.
> >
> > What is the function meant to assume in this case? Is it sane to assume that
> > the bitmask is the full length of the buffer here and it's the responsibility
> > of the setter of vcpu->cpumask to provide the length of the bitmap we're
> > expecting? Or should we assume that we may receive a shortened bitmask here
> > and expand the bitmask before copying to the buffer?
> >
Hi Cole,
Sorry for the delayed response. I am just getting back from vacation. I have
provided the information below. However, I believe I have an understanding of
the problem and I will be submitting a patch later today.
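
To illustrate the failure mode I suspect, here is a minimal standalone sketch.
It is not libvirt code and not the patch itself: toy_bitmap, toy_copy_naive,
toy_copy_bounded and toy_bit_is_set are hypothetical names made up for this
example, and the over-allocated storage with a junk byte is contrived so that
the naive copy has defined behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a bitmap; this is NOT libvirt's virBitmap. */
struct toy_bitmap {
    const unsigned char *bytes; /* backing storage */
    size_t len;                 /* number of bytes that really belong to the map */
};

/*
 * Naive copy: trusts the caller's buffer length (the "full" masklen) and
 * ignores src->len. This is only safe when the storage behind src->bytes
 * happens to be at least buflen bytes long; whatever lies past src->len is
 * copied into the result as if it were part of the mask.
 */
void toy_copy_naive(const struct toy_bitmap *src,
                    unsigned char *buf, size_t buflen)
{
    assert(src != NULL && buf != NULL);
    memcpy(buf, src->bytes, buflen);
}

/*
 * Bounded copy: zero the whole output buffer first, then copy only the
 * bytes that belong to the (possibly shortened) bitmap.
 */
void toy_copy_bounded(const struct toy_bitmap *src,
                      unsigned char *buf, size_t buflen)
{
    size_t n;

    assert(src != NULL && buf != NULL);
    n = src->len < buflen ? src->len : buflen;
    memset(buf, 0, buflen);
    memcpy(buf, src->bytes, n);
}

/* Return 1 if the given bit is set in buf, 0 otherwise (or if out of range). */
int toy_bit_is_set(const unsigned char *buf, size_t buflen, size_t bit)
{
    if (bit / 8 >= buflen)
        return 0;
    return (buf[bit / 8] >> (bit % 8)) & 1;
}
```

For example, take 32 bytes of storage (256 bits) where byte 0 has bit 4 set,
byte 24 holds the junk value 0x35, and the bitmap's len is 1. With a 32-byte
output buffer, toy_copy_naive reports bits 4, 192, 194, 196 and 197 as set,
while toy_copy_bounded reports only bit 4.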

> I didn't dig into the code much but I can try and help figure it
> out.
>
> Can you also provide:
>
> * libvirt version

libvirt 5.3.0

> * host distro

Ubuntu 18.04

> * output of 'virsh nodeinfo'

CPU model:           x86_64
CPU(s):              256
CPU frequency:       1739 MHz
CPU socket(s):       1
Core(s) per socket:  64
Thread(s) per core:  2
NUMA cell(s):        2
Memory size:         131833228 KiB

> * output of 'virsh nodecpumap'

CPUs present: 256
CPUs online: 256
CPU map:
yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

> Thanks,
> Cole