On Mon, May 09, 2022 at 05:02:07PM +0200, Michal Privoznik wrote:
The Linux kernel offers a way to mitigate side channel attacks on
Hyper Threads (e.g. MDS and L1TF). Long story short, userspace can define
groups of processes (aka trusted groups) and only processes within one
group can run on sibling Hyper Threads. The group membership is
automatically preserved on fork() and exec().
Now, there is one scenario which I don't cover in my series and I'd like
to hear proposals: if there are two guests with an odd number of vCPUs they
can no longer run on sibling Hyper Threads because my patches create
separate group for each QEMU. This is a performance penalty. Ideally, we
would have a knob inside domain XML that would place two or more domains
into the same trusted group. But since there's no pre-existing example
(of sharing a piece of information between two domains) I've failed to
come up with something usable.
Right now users have two choices:
- Run with SMT enabled. 100% of CPUs available. VMs are vulnerable
- Run with SMT disabled. 50% of CPUs available. VMs are safe
What core scheduling gives is somewhere in between, depending on
the vCPU count. If we assume all guests have an even vCPU count then
- Run with SMT enabled + core scheduling. 100% of CPUs available.
100% of CPUs are used, VMs are safe
This is the ideal scenario, and probably a fairly common one too,
as IMHO even vCPU counts are likely to be typical.
If we assume the worst case, of entirely 1 vCPU guests then we have
- Run with SMT enabled + core scheduling. 100% of CPUs available.
50% of CPUs are used, VMs are safe
This feels highly unlikely though, as all except tiny workloads
want > 1 vCPU.
With entirely 3 vCPU guests then we have
- Run with SMT enabled + core scheduling. 100% of CPUs available.
75% of CPUs are used, VMs are safe
With entirely 5 vCPU guests then we have
- Run with SMT enabled + core scheduling. 100% of CPUs available.
83% of CPUs are used, VMs are safe
If we have a mix of even- and odd-numbered vCPU guests, with mostly
even-numbered, then I think utilization will be high enough that
almost no one will care about the last few %.
While we could try to come up with a way to express sharing of
cores between VMs, I don't think it's worth it, in the absence of
someone presenting compelling data why it'll be needed in a
non-niche use case. Bear in mind that users can also resort to
pinning VMs explicitly to get sharing.
In terms of defaults I'd very much like us to default to enabling
core scheduling, so that we have a secure deployment out of the box.
The only caveat is that this does have the potential to be interpreted
as a regression for existing deployments in some cases. Perhaps we
should make it a meson option for distros to decide whether to ship
with it turned on out of the box or not?
I don't think we need core scheduling to be a VM XML config option,
because security is really a host level matter IMHO, such that it
doesn't make sense to have both secure & insecure VMs co-located.
With regards,
Daniel
--
|: https://berrange.com      -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org       -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|