> In short: there is no (live) migration support for nested VMX yet. So as
> soon as your guest is using VMX itself ("nVMX"), this is not expected to
> work.
Hi David, thanks for getting back to us on this.
Hi Florian,
(somebody please correct me if I'm wrong)
I see your point, except the issue Kashyap and I are describing does
not occur with live migration, it occurs with savevm/loadvm (virsh
managedsave/virsh start in libvirt terms, nova suspend/resume in
OpenStack lingo). And it's not immediately self-evident that the
limitations for the former also apply to the latter. Even for the live
migration limitation, I've been unsuccessful at finding documentation
that warns users to not attempt live migration when using nesting, and
this discussion sounds like a good opportunity for me to help fix
that.
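Concretely, the sequence that triggers this for us is roughly the
following (the domain name is made up):

  # L1 domain, itself running one or more L2 guests
  virsh managedsave l1-guest   # save state to disk and stop the domain
  virsh start l1-guest         # restore from the managed save image

It is after that restore that the L1 guest misbehaves.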
Just to give an example,
https://www.redhat.com/en/blog/inception-how-usable-are-nested-kvm-guests
from just last September talks explicitly about how "guests can be
snapshot/resumed, migrated to other hypervisors and much more" in the
opening paragraph, and then talks at length about nested guests —
without ever pointing out that those very features aren't expected to
work for them. :)
Well, nesting is still controlled by a kernel module parameter ("nested")
that is disabled by default. So things should be expected to be shaky. :)
While running nested guests usually works fine, migrating a nested
hypervisor is the problem.
Especially see e.g.
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/...
"However, note that nested virtualization is not supported or
recommended in production user environments, and is primarily intended
for development and testing. "
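For anyone following along, whether nesting is enabled on the L0 host can
be checked via that module parameter, and enabled persistently with a
modprobe.d snippet (the file name below is arbitrary):

  # check current state (use kvm_amd instead of kvm_intel on AMD hosts)
  cat /sys/module/kvm_intel/parameters/nested

  # enable persistently, then reload the module (with no VMs running)
  echo "options kvm_intel nested=1" > /etc/modprobe.d/kvm-nested.conf
  modprobe -r kvm_intel && modprobe kvm_intel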
So to clarify things, could you enumerate the currently known
limitations when enabling nesting? I'd be happy to summarize those and
add them to the
linux-kvm.org FAQ so others are less likely to hit
their head on this issue. In particular:
The general problem is that migration of an L1 will not work when it is
running L2, i.e. when L1 is using VMX ("nVMX").
Migrating an L2 should work as before.
The problem is that, in order for L1 to make use of VMX to run L2, L0 has
to run L2 and simulate VMX for L1 (nested VMX, a.k.a. nVMX). This requires
additional state information about L1 ("nVMX" state), which is not
properly migrated when migrating L1. Therefore, after migration, the CPU
state of L1 might be corrupted, resulting in L1 crashes.
In addition, certain VMX features might be missing on the target, which
also still has to be handled via the CPU model in the future.
L0 should hopefully not crash; I hope that you are not seeing that.
- Is
https://fedoraproject.org/wiki/How_to_enable_nested_virtualization_in_KVM
still accurate in that -cpu host (libvirt "host-passthrough") is the
strongly recommended configuration for the L2 guest?
- If so, are there any recommendations for how to configure the L1
guest with regard to CPU model?
You have to expose the VMX feature to your L1 ("nested hypervisor"); that
is usually done automatically by using the "host-passthrough" or
"host-model" value. If you're using a custom CPU model, you have to
enable it explicitly.
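To illustrate in libvirt domain XML terms for the L1 guest, something like
either of the following should do (the custom model name below is just an
example):

  <!-- pass the host CPU through, including VMX -->
  <cpu mode='host-passthrough'/>

  <!-- or, with a custom model, require vmx explicitly -->
  <cpu mode='custom' match='exact'>
    <model>Haswell-noTSX</model>
    <feature policy='require' name='vmx'/>
  </cpu>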
- Is live migration with nested guests _always_ expected to break on
all architectures, and if not, which are safe?
x86 VMX: running nested guests works, migrating nested hypervisors does
not work
x86 SVM: running nested guests works, migrating nested hypervisors does
not work (somebody correct me if I'm wrong)
s390x: running nested guests works, migrating nested hypervisors works
power: running nested guests works only via KVM-PR ("trap and emulate").
Migrating nested hypervisors therefore works. But we are not using
hardware virtualization for L1->L2. (my latest status)
arm: running nested guests is in the works (my latest status); migration
is therefore also not possible.
- Idem, for savevm/loadvm?
savevm/loadvm is not expected to work correctly on an L1 if it is
running L2 guests. It should work on L2 however.
- With regard to the problem that Kashyap and I (and Dennis, the
kernel.org bugzilla reporter) are describing, is this expected to work
any better on AMD CPUs? (All reports are on Intel)
No, remember that they are also still missing migration support for the
nested SVM state.
- Do you expect nested virtualization functionality to be adversely
affected by KPTI and/or other Meltdown/Spectre mitigation patches?
Not an expert on this. I think it should be affected in a similar way to
ordinary guests :)
Kashyap, can you think of any other limitations that would benefit
from improved documentation?
We should certainly document what I have summarized here properly in a
central place!
Cheers,
Florian
--
Thanks,
David / dhildenb