[libvirt] [PATCH v3] qemu: Add check for whether KVM nesting is enabled

Support for nested KVM is handled via a kernel module configuration parameters values for kvm_intel, kvm_amd, kvm_hv (PPC), or kvm (s390). While it's possible to fetch the kmod config values via virKModConfig, unfortunately that is the static value and we need to get the current/dynamic value from the kernel file system. So this patch adds a new API virHostKVMSupportsNesting that will search the 3 kernel modules to get the nesting value and check if it is 'Y' (or 'y' just in case) or '1' to return a true/false whether the KVM kernel supports nesting. We need to do this in order to handle cases where adjustments to the value are made after libvirtd is started to force a refetch of the latest QEMU capabilities since the correct CPU settings need to be made for a guest to add the "vmx=on" to/for the guest config. Signed-off-by: John Ferlan <jferlan@redhat.com> NB to be removed before push - I got data from: (IBM Z) https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/htm... (PPC slide 131) https://events.linuxfoundation.org/wp-content/uploads/2017/12/Taking-it-to-t... Signed-off-by: John Ferlan <jferlan@redhat.com> --- v2: https://www.redhat.com/archives/libvir-list/2018-November/msg00955.html Changes from code review... - Rename variables/API's to KVMSupportsNested - Movement of logic to check/set the 'nested' to inside locations that ensure KVM was enabled (via capability). - Change of logic to not use virKModConfig and instead look at the running kernel value for /sys/module/*/parameters/nested where * is kvm_intel, kvm_amd, kvm_hv, or kvm src/qemu/qemu_capabilities.c | 54 ++++++++++++++++++++++++++++++++++++ 1 file changed, 54 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 20a1a0c201..bef92a679f 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -558,6 +558,7 @@ struct _virQEMUCaps { virObject parent; bool usedQMP; + bool kvmSupportsNesting; char *binary; time_t ctime; @@ -1530,6 +1531,7 @@ virQEMUCapsPtr virQEMUCapsNewCopy(virQEMUCapsPtr qemuCaps) return NULL; ret->usedQMP = qemuCaps->usedQMP; + ret->kvmSupportsNesting = qemuCaps->kvmSupportsNesting; if (VIR_STRDUP(ret->binary, qemuCaps->binary) < 0) goto error; @@ -3589,6 +3591,9 @@ virQEMUCapsLoadCache(virArch hostArch, virQEMUCapsInitHostCPUModel(qemuCaps, hostArch, VIR_DOMAIN_VIRT_KVM); virQEMUCapsInitHostCPUModel(qemuCaps, hostArch, VIR_DOMAIN_VIRT_QEMU); + if (virXPathBoolean("boolean(./kvmSupportsNesting)", ctxt) > 0) + qemuCaps->kvmSupportsNesting = true; + ret = 0; cleanup: VIR_FREE(str); @@ -3808,6 +3813,9 @@ virQEMUCapsFormatCache(virQEMUCapsPtr qemuCaps) if (qemuCaps->sevCapabilities) virQEMUCapsFormatSEVInfo(qemuCaps, &buf); + if (qemuCaps->kvmSupportsNesting) + virBufferAddLit(&buf, "<kvmSupportsNesting/>\n"); + virBufferAdjustIndent(&buf, -2); virBufferAddLit(&buf, "</qemuCaps>\n"); @@ -3848,6 +3856,41 @@ virQEMUCapsSaveFile(void *data, } +/* Check the kernel module parameters 'nested' file to determine if enabled + * + * Intel: 'kvm_intel' uses 'Y' + * AMD: 'kvm_amd' uses '1' + * PPC64: 'kvm_hv' uses 'Y' + * S390: 'kvm' uses '1' + */ +static bool +virQEMUCapsKVMSupportsNesting(void) +{ + static char const * const kmod[] = {"kvm_intel", "kvm_amd", + "kvm_hv", "kvm"}; + VIR_AUTOFREE(char *) value = NULL; + int rc; + size_t i; + + for (i = 0; i < ARRAY_CARDINALITY(kmod); i++) { + VIR_FREE(value); + rc = virFileReadValueString(&value, "/sys/module/%s/parameters/nested", + kmod[i]); + if (rc == -2) + continue; + if (rc < 0) { + virResetLastError(); + return false; + } + + if (value[0] == 'Y' || value[0] == 'y' || value[0] == '1') + return true; + } + + return false; +} + + static bool virQEMUCapsIsValid(void *data, void *privData) @@ -3856,6 +3899,7 @@ virQEMUCapsIsValid(void *data, virQEMUCapsCachePrivPtr priv = privData; bool kvmUsable; struct stat sb; + bool kvmSupportsNesting; if (!qemuCaps->binary) return true; @@ -3933,6 +3977,14 @@ virQEMUCapsIsValid(void *data, qemuCaps->kernelVersion); return false; } + + kvmSupportsNesting = virQEMUCapsKVMSupportsNesting(); + if (kvmSupportsNesting != qemuCaps->kvmSupportsNesting) { + VIR_DEBUG("Outdated capabilities for '%s': kvm kernel nested " + "value changed from %d", + qemuCaps->binary, qemuCaps->kvmSupportsNesting); + return false; + } } return true; @@ -4576,6 +4628,8 @@ virQEMUCapsNewForBinaryInternal(virArch hostArch, if (VIR_STRDUP(qemuCaps->kernelVersion, kernelVersion) < 0) goto error; + + qemuCaps->kvmSupportsNesting = virQEMUCapsKVMSupportsNesting(); } cleanup: -- 2.17.2

ping? Thanks - John On 11/28/18 8:55 PM, John Ferlan wrote:
Support for nested KVM is handled via a kernel module configuration parameters values for kvm_intel, kvm_amd, kvm_hv (PPC), or kvm (s390). While it's possible to fetch the kmod config values via virKModConfig, unfortunately that is the static value and we need to get the current/dynamic value from the kernel file system.
So this patch adds a new API virHostKVMSupportsNesting that will search the 3 kernel modules to get the nesting value and check if it is 'Y' (or 'y' just in case) or '1' to return a true/false whether the KVM kernel supports nesting.
We need to do this in order to handle cases where adjustments to the value are made after libvirtd is started to force a refetch of the latest QEMU capabilities since the correct CPU settings need to be made for a guest to add the "vmx=on" to/for the guest config.
Signed-off-by: John Ferlan <jferlan@redhat.com>
NB to be removed before push - I got data from:
(IBM Z) https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/htm...
(PPC slide 131) https://events.linuxfoundation.org/wp-content/uploads/2017/12/Taking-it-to-t...
Signed-off-by: John Ferlan <jferlan@redhat.com> ---
v2: https://www.redhat.com/archives/libvir-list/2018-November/msg00955.html
Changes from code review... - Rename variables/API's to KVMSupportsNested - Movement of logic to check/set the 'nested' to inside locations that ensure KVM was enabled (via capability). - Change of logic to not use virKModConfig and instead look at the running kernel value for /sys/module/*/parameters/nested where * is kvm_intel, kvm_amd, kvm_hv, or kvm
src/qemu/qemu_capabilities.c | 54 ++++++++++++++++++++++++++++++++++++ 1 file changed, 54 insertions(+)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 20a1a0c201..bef92a679f 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -558,6 +558,7 @@ struct _virQEMUCaps { virObject parent;
bool usedQMP; + bool kvmSupportsNesting;
char *binary; time_t ctime; @@ -1530,6 +1531,7 @@ virQEMUCapsPtr virQEMUCapsNewCopy(virQEMUCapsPtr qemuCaps) return NULL;
ret->usedQMP = qemuCaps->usedQMP; + ret->kvmSupportsNesting = qemuCaps->kvmSupportsNesting;
if (VIR_STRDUP(ret->binary, qemuCaps->binary) < 0) goto error; @@ -3589,6 +3591,9 @@ virQEMUCapsLoadCache(virArch hostArch, virQEMUCapsInitHostCPUModel(qemuCaps, hostArch, VIR_DOMAIN_VIRT_KVM); virQEMUCapsInitHostCPUModel(qemuCaps, hostArch, VIR_DOMAIN_VIRT_QEMU);
+ if (virXPathBoolean("boolean(./kvmSupportsNesting)", ctxt) > 0) + qemuCaps->kvmSupportsNesting = true; + ret = 0; cleanup: VIR_FREE(str); @@ -3808,6 +3813,9 @@ virQEMUCapsFormatCache(virQEMUCapsPtr qemuCaps) if (qemuCaps->sevCapabilities) virQEMUCapsFormatSEVInfo(qemuCaps, &buf);
+ if (qemuCaps->kvmSupportsNesting) + virBufferAddLit(&buf, "<kvmSupportsNesting/>\n"); + virBufferAdjustIndent(&buf, -2); virBufferAddLit(&buf, "</qemuCaps>\n");
@@ -3848,6 +3856,41 @@ virQEMUCapsSaveFile(void *data, }
+/* Check the kernel module parameters 'nested' file to determine if enabled + * + * Intel: 'kvm_intel' uses 'Y' + * AMD: 'kvm_amd' uses '1' + * PPC64: 'kvm_hv' uses 'Y' + * S390: 'kvm' uses '1' + */ +static bool +virQEMUCapsKVMSupportsNesting(void) +{ + static char const * const kmod[] = {"kvm_intel", "kvm_amd", + "kvm_hv", "kvm"}; + VIR_AUTOFREE(char *) value = NULL; + int rc; + size_t i; + + for (i = 0; i < ARRAY_CARDINALITY(kmod); i++) { + VIR_FREE(value); + rc = virFileReadValueString(&value, "/sys/module/%s/parameters/nested", + kmod[i]); + if (rc == -2) + continue; + if (rc < 0) { + virResetLastError(); + return false; + } + + if (value[0] == 'Y' || value[0] == 'y' || value[0] == '1') + return true; + } + + return false; +} + + static bool virQEMUCapsIsValid(void *data, void *privData) @@ -3856,6 +3899,7 @@ virQEMUCapsIsValid(void *data, virQEMUCapsCachePrivPtr priv = privData; bool kvmUsable; struct stat sb; + bool kvmSupportsNesting;
if (!qemuCaps->binary) return true; @@ -3933,6 +3977,14 @@ virQEMUCapsIsValid(void *data, qemuCaps->kernelVersion); return false; } + + kvmSupportsNesting = virQEMUCapsKVMSupportsNesting(); + if (kvmSupportsNesting != qemuCaps->kvmSupportsNesting) { + VIR_DEBUG("Outdated capabilities for '%s': kvm kernel nested " + "value changed from %d", + qemuCaps->binary, qemuCaps->kvmSupportsNesting); + return false; + } }
return true; @@ -4576,6 +4628,8 @@ virQEMUCapsNewForBinaryInternal(virArch hostArch,
if (VIR_STRDUP(qemuCaps->kernelVersion, kernelVersion) < 0) goto error; + + qemuCaps->kvmSupportsNesting = virQEMUCapsKVMSupportsNesting(); }
cleanup:

On 11/29/18 2:55 AM, John Ferlan wrote:
Support for nested KVM is handled via a kernel module configuration parameters values for kvm_intel, kvm_amd, kvm_hv (PPC), or kvm (s390). While it's possible to fetch the kmod config values via virKModConfig, unfortunately that is the static value and we need to get the current/dynamic value from the kernel file system.
So this patch adds a new API virHostKVMSupportsNesting that will search the 3 kernel modules to get the nesting value and check if it is 'Y' (or 'y' just in case) or '1' to return a true/false whether the KVM kernel supports nesting.
We need to do this in order to handle cases where adjustments to the value are made after libvirtd is started to force a refetch of the latest QEMU capabilities since the correct CPU settings need to be made for a guest to add the "vmx=on" to/for the guest config.
Ah, so if nested KVM is not enabled that qemu might report vmx:false and if it is enabled it reports vmx:true? ACK in that case. Michal

On 12/13/18 8:29 AM, Michal Privoznik wrote:
On 11/29/18 2:55 AM, John Ferlan wrote:
Support for nested KVM is handled via a kernel module configuration parameters values for kvm_intel, kvm_amd, kvm_hv (PPC), or kvm (s390). While it's possible to fetch the kmod config values via virKModConfig, unfortunately that is the static value and we need to get the current/dynamic value from the kernel file system.
So this patch adds a new API virHostKVMSupportsNesting that will search the 3 kernel modules to get the nesting value and check if it is 'Y' (or 'y' just in case) or '1' to return a true/false whether the KVM kernel supports nesting.
We need to do this in order to handle cases where adjustments to the value are made after libvirtd is started to force a refetch of the latest QEMU capabilities since the correct CPU settings need to be made for a guest to add the "vmx=on" to/for the guest config.
Ah, so if nested KVM is not enabled that qemu might report vmx:false and if it is enabled it reports vmx:true?
Right... If I followed the breadcrumbs in the bz (private/locked 1645139) the kernel parameter was changed after libvirtd was started. Then a L0 guest was created using virt-manager and started. When looking at that guest the 'vmx' was not on, thus no nesting allowed. Even restarting libvirtd didn't help because none of the conditions used to determine whether we should re-read the capabilities was present. Only after clearing the capabilities cache did the "proper" thing happen (see comment 14 in the bz - thanks to Pavel for the details - I'm not sure I would have come up with that given the details I saw in the case). John
ACK in that case.
Michal
participants (2)
-
John Ferlan
-
Michal Privoznik