On Thu, 2 Jul 2015 15:18:46 +0100
"Daniel P. Berrange" <berrange(a)redhat.com> wrote:
On Thu, Jul 02, 2015 at 04:02:58PM +0200, Henning Schild wrote:
> Hi,
>
> I am currently looking into realtime VMs using libvirt. My first
> starting point was reserving a couple of cores with isolcpus and
> later tuning the affinity to place my vcpus on the reserved pcpus.
>
> My first observation was that libvirt ignores isolcpus. Affinity
> masks of newly started qemu processes default to all CPUs and are
> not inherited from libvirtd. A comment in the code suggests that
> this is done on purpose.
Ignore realtime + isolcpus for a minute. It is not unreasonable for
the system admin to decide that system services should be restricted
to run on a certain subset of CPUs. If we let VMs inherit the CPU
pinning of libvirtd, we would accidentally be confining VMs to that
subset of CPUs too. With the new cgroups layout, libvirtd lives in a
separate cgroups tree, /system.slice, while VMs live in
/machine.slice. So for both these reasons, when starting VMs, we
explicitly ignore any affinity libvirtd has and set the VMs' mask to
allow any CPU.
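If you want to check where things actually landed on a given host,
something along these lines will show it (paths depend on the
systemd/cgroup setup, so treat them as illustrative):

  # cgroup membership of libvirtd itself
  cat /proc/$(pidof libvirtd)/cgroup
  # cgroup membership of one of the qemu processes (substitute a real pid)
  cat /proc/<qemu pid>/cgroup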
Sure, that was my first guess as well. Still, I wanted to raise the
topic again from the realtime POV.
I am using a pretty recent libvirt from git but have not come across
system.slice yet. That might be a matter of how libvirtd is
configured/invoked.
> After that I changed the code to use only the available cpus by
> default. But taskset was still showing all 'f's on my qemus. Then I
> traced my change down to sched_setaffinity, assuming that some other
> mechanism might have reverted my hack, but it is still in place.
From the libvirt POV, we can't tell whether the admin set isolcpus
because they want to reserve those CPUs only for VMs, or because
they want to stop VMs from using those CPUs by default. As such,
libvirt does not try to interpret isolcpus at all; it leaves it up to
a higher-level app to decide on this policy.
I know, you have to tell libvirt that the reservation is actually for
libvirt. My idea was to introduce a config option in libvirt and maybe
sanity check it by looking at whether the pcpus are actually reserved.
Rik recently posted a patch to allow easy programmatic checking of
isolcpus via sysfs.
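Such a sanity check could be as simple as the following (the sysfs
name is assumed from that posting, I have not verified it here):

  # what the kernel was booted with
  grep -o 'isolcpus=[^ ]*' /proc/cmdline
  # with the proposed sysfs interface, roughly:
  cat /sys/devices/system/cpu/isolated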
In the case of OpenStack, the /etc/nova/nova.conf allows a config
setting 'vcpu_pin_set' to say what set of CPUs VMs should be allowed
to run on, and nova will then update the libvirt XML when starting
each guest.
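Roughly like this, with made-up values for illustration:

  # /etc/nova/nova.conf
  [DEFAULT]
  vcpu_pin_set = 4-15

  # which then ends up as something like this in the guest XML
  <vcpu placement='static' cpuset='4-15'>4</vcpu>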
I see. Would it not still make sense to have that setting centrally
in libvirt? I am thinking of people who are not using nova but virsh
or virt-manager.
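Right now such users end up pinning each guest individually, e.g.
along these lines (cpu numbers made up):

  virsh vcpupin vm1 0 4
  virsh vcpupin vm1 1 5
  virsh emulatorpin vm1 4-5

or the equivalent <cputune>/<vcpupin> elements in the domain XML,
which is exactly the kind of per-guest repetition a central setting
could avoid.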
> Libvirt is setting up cgroups, and now my suspicion is that cgroups
> and taskset might not work well together.
> > /sys/fs/cgroup/cpu/machine.slice/machine-qemu\x2dvm1.scope/vcpu0# cat cpuacct.usage_percpu
> > 247340587 50851635 89631114 23383025 412639264 1241965 55442753
> > 19923 14093629 15863859 27403280 1292195745 82031088 53690508
> > 135826421 124915000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>
> Looks like the last 16 cores are not used.
>
> But if I use taskset to ask for the affinity mask, I get all 32 CPUs.
>
> > taskset -p `cat tasks`
> > pid 12905's current affinity mask: ffffffff
>
> I know this is not strictly a libvirt question but also a kernel
> one; still, you guys can probably point me to what I am missing
> here.
>
> > Linux 3.18.11+ #4 SMP PREEMPT RT
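One thing that might help narrow it down is comparing what the
affinity syscall reports with what the cpuset controller actually
allows, e.g. (path taken from your output with the cpu controller
swapped for cpuset; adjust to your mount layout):

  # affinity mask as sched_getaffinity reports it
  taskset -p 12905
  # vs. the cpu list the cpuset cgroup imposes
  cat /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dvm1.scope/cpuset.cpus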
BTW, I dropped Osier from the CC list, since he no longer works
at Red Hat.
Yeah, the reply from my mailserver suggested that.
Henning