[RFC PATCH 0/2] Handle physical address bits

Hello everyone,

These patches let one specify how wide the physical addresses are, in bits.

Using current QEMU's default of 40 may limit the amount of RAM the guest can see (which is the reason why, in our packages, we bump that to 42; and as far as I've understood from reading some old mailing list threads on the subject, other downstreams do something similar).

It can also cause other problems, such as the one described in:
https://bugzilla.redhat.com/show_bug.cgi?id=1578278#c5

Basically, the VM thinks, and reports to the user, that L1TF is unmitigated because, while its RAM may fit in MAX_PHYS_ADDR (e.g., equal to 42 or 40), it does not fit in MAX_PHYS_ADDR/2, which is necessary for PTE inversion to be effective.

I've also mentioned this during my KVM Forum talk (slides 81, 82, from here: http://xenbits.xen.org/people/dariof/talks/Virtual%20Topology%20Friend%20or%...)

The series alleviates the problem by providing a user with an easy way to either specify an arbitrary number of physical address bits for the VM (with, e.g., <maxphysaddr mode='emulate' bits='42'/>) or just use the same number of bits as the host (with <maxphysaddr mode='passthrough'/>).

This is, in theory, already possible, but only in a hack-ish way, such as adding:

  <qemu:commandline>
    <qemu:arg value='-cpu'/>
    <qemu:arg value='host,host-phys-bits=on'/>
  </qemu:commandline>

But this is super inconvenient. :-)

I have not made host-phys-bits=on get added automatically when host-passthrough is used as the CPU model, as I think that actually belongs in QEMU.

Basically, this works and can be used very much like the <cache/> element.

Ah, so, the series is RFC basically because the QEMU capability test is failing. I think I may be failing at understanding how things should work. Or maybe I "just" need to regenerate the files against which the capabilities themselves are checked (and if this is the case, I need to understand how).

Well, I'll keep looking into this, and try to figure it out (but, if anyone has any hint, that's much appreciated :-P).

Feedback welcome.

Thanks and Regards
---
Dario Faggioli (2):
      Add support for specifying max physical address size.
      qemu: Add support for max physical address size

 src/qemu/qemu_capabilities.c                       |    2 +
 src/qemu/qemu_capabilities.h                       |    1 +
 src/qemu/qemu_command.c                            |   28 +++++++++++
 src/qemu/qemu_domain.c                             |   46 +++++++++++++++++++
 .../cpu-phys-bits-emulate.args                     |   29 ++++++++++++
 .../cpu-phys-bits-emulate.xml                      |   20 ++++++++
 .../cpu-phys-bits-emulate2.args                    |   30 ++++++++++++
 .../cpu-phys-bits-emulate2.xml                     |   20 ++++++++
 .../cpu-phys-bits-emulate3.err                     |    1 +
 .../cpu-phys-bits-emulate3.xml                     |   20 ++++++++
 .../cpu-phys-bits-passthrough.args                 |   29 ++++++++++++
 .../cpu-phys-bits-passthrough.xml                  |   20 ++++++++
 .../cpu-phys-bits-passthrough2.err                 |    1 +
 .../cpu-phys-bits-passthrough2.xml                 |   20 ++++++++
 .../cpu-phys-bits-passthrough3.err                 |    1 +
 .../cpu-phys-bits-passthrough3.xml                 |   20 ++++++++
 tests/qemuxml2argvtest.c                           |    7 +++
 17 files changed, 295 insertions(+)
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-emulate.args
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-emulate.xml
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-emulate2.args
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-emulate2.xml
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-emulate3.err
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-emulate3.xml
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-passthrough.args
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-passthrough.xml
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-passthrough2.err
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-passthrough2.xml
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-passthrough3.err
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-passthrough3.xml

--
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)
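To make the MAX_PHYS_ADDR/2 condition from the cover letter concrete, here is a minimal C sketch. It assumes that "MAX_PHYS_ADDR/2" means half of the 2^bits physical address space, and that guest RAM is laid out contiguously from address 0 (real guests have holes in their memory map, so this is only an approximation). The function names are hypothetical and are not part of libvirt, QEMU or the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half of the physical address space addressable with `bits` bits.
 * The precondition keeps the shift within the range of uint64_t. */
static uint64_t half_phys_space(unsigned int bits)
{
    assert(bits >= 1 && bits <= 63);
    return UINT64_C(1) << (bits - 1);
}

/* True if RAM that ends at `ram_bytes` (contiguous from 0, which is an
 * assumption of this sketch) stays within the lower half of the
 * physical address space. */
static bool ram_fits_lower_half(uint64_t ram_bytes, unsigned int bits)
{
    return ram_bytes <= half_phys_space(bits);
}
```

Under these assumptions the lower half ends at 512 GiB with 40 bits and at 2 TiB with 42 bits, so a guest with 1 TiB of RAM passes the check only in the second case.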

This patch introduces the:

  <maxphysaddr mode='passthrough'/>
  <maxphysaddr mode='emulate' bits='42'/>

sub element of /domain/cpu.

Purpose is being able to have a guest see the physical address size exactly as the host does (if mode='passthrough' is used) or any size the user wants it to see (if mode='emulate' is used and bits='' is specified).

This can be useful if the VM needs to have a large amount of memory.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
---
 docs/formatdomain.rst                              |   21 ++++++++
 docs/schemas/cputypes.rng                          |   19 +++++++
 src/conf/cpu_conf.c                                |   52 ++++++++++++++++++++
 src/conf/cpu_conf.h                                |   17 +++++++
 src/libvirt_private.syms                           |    2 +
 .../genericxml2xmlindata/cpu-phys-bits-emulate.xml |   20 ++++++++
 .../cpu-phys-bits-passthrough.xml                  |   20 ++++++++
 tests/genericxml2xmltest.c                         |    3 +
 8 files changed, 154 insertions(+)
 create mode 100644 tests/genericxml2xmlindata/cpu-phys-bits-emulate.xml
 create mode 100644 tests/genericxml2xmlindata/cpu-phys-bits-passthrough.xml

diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
index ae635bedff..851391011f 100644
--- a/docs/formatdomain.rst
+++ b/docs/formatdomain.rst
@@ -1207,6 +1207,7 @@ following collection of elements. :since:`Since 0.7.5`
      <vendor>Intel</vendor>
      <topology sockets='1' dies='1' cores='2' threads='1'/>
      <cache level='3' mode='emulate'/>
+     <maxphysaddr mode='emulate' bits='42'/>
      <feature policy='disable' name='lahf_lm'/>
    </cpu>
    ...
@@ -1223,6 +1224,7 @@ following collection of elements. :since:`Since 0.7.5`

    <cpu mode='host-passthrough' migratable='off'>
      <cache mode='passthrough'/>
+     <maxphysaddr mode='passthrough'/>
      <feature policy='disable' name='lahf_lm'/>
    ...

@@ -1446,6 +1448,25 @@ In case no restrictions need to be put on CPU model and its features, a simpler
          The virtual CPU will report no CPU cache of the specified level (or no
          cache at all if the ``level`` attribute is missing).

+``maxphysaddr``
+   :since:`Since 6.10.0` the ``maxphysaddr`` element specifies the size in bits
+   of the physical addresses. The default behavior is that the vCPUs will see
+   what is configured by default in the hypervisor itself.
+
+   ``mode``
+      The following values are supported:
+
+      ``passthrough``
+         The number of physical address bits reported by the host CPU will be
+         passed through to the virtual CPUs
+      ``emulate``
+         The hypervisor will define a specific value for the number of bits
+         of physical addresses via the ``bits`` attribute, which is mandatory.
+
+   ``bits``
+      The number of bits of the physical addresses that the vCPUs should see,
+      if the ``mode`` attribute is set to ``emulate``.
+
 Guest NUMA topology can be specified using the ``numa`` element. :since:`Since
 0.9.8`

diff --git a/docs/schemas/cputypes.rng b/docs/schemas/cputypes.rng
index aea6ff0267..232115226a 100644
--- a/docs/schemas/cputypes.rng
+++ b/docs/schemas/cputypes.rng
@@ -299,6 +299,22 @@
     </element>
   </define>

+  <define name="cpuMaxPhysAddr">
+    <element name="maxphysaddr">
+      <attribute name="mode">
+        <choice>
+          <value>emulate</value>
+          <value>passthrough</value>
+        </choice>
+      </attribute>
+      <optional>
+        <attribute name="bits">
+          <ref name="unsignedInt"/>
+        </attribute>
+      </optional>
+    </element>
+  </define>
+
   <define name="hostcpu">
     <element name="cpu">
       <element name="arch">
@@ -410,6 +426,9 @@
         <optional>
           <ref name="cpuCache"/>
         </optional>
+        <optional>
+          <ref name="cpuMaxPhysAddr"/>
+        </optional>
       </interleave>
     </element>
   </define>
diff --git a/src/conf/cpu_conf.c b/src/conf/cpu_conf.c
index 7778e01131..a151bbf45b 100644
--- a/src/conf/cpu_conf.c
+++ b/src/conf/cpu_conf.c
@@ -83,6 +83,11 @@ VIR_ENUM_IMPL(virCPUCacheMode,
               "disable",
 );

+VIR_ENUM_IMPL(virCPUMaxPhysAddrMode,
+              VIR_CPU_MAX_PHYS_ADDR_MODE_LAST,
+              "emulate",
+              "passthrough",
+);

 virCPUDefPtr virCPUDefNew(void)
 {
@@ -128,6 +133,7 @@ virCPUDefFree(virCPUDefPtr def)
     if (g_atomic_int_dec_and_test(&def->refs)) {
         virCPUDefFreeModel(def);
         VIR_FREE(def->cache);
+        VIR_FREE(def->addr);
         VIR_FREE(def->tsc);
         VIR_FREE(def);
     }
@@ -250,6 +256,11 @@ virCPUDefCopyWithoutModel(const virCPUDef *cpu)
         *copy->cache = *cpu->cache;
     }

+    if (cpu->addr) {
+        copy->addr = g_new0(virCPUMaxPhysAddrDef, 1);
+        *copy->addr = *cpu->addr;
+    }
+
     if (cpu->tsc) {
         copy->tsc = g_new0(virHostCPUTscInfo, 1);
         *copy->tsc = *cpu->tsc;
@@ -670,6 +681,38 @@ virCPUDefParseXML(xmlXPathContextPtr ctxt,
         def->cache->mode = mode;
     }

+    if (virXPathInt("count(./maxphysaddr)", ctxt, &n) < 0) {
+        return -1;
+    } else if (n > 1) {
+        virReportError(VIR_ERR_XML_ERROR, "%s",
+                       _("at most one CPU maximum physical address bits "
+                         "element may be specified"));
+        return -1;
+    } else if (n == 1) {
+        int bits = -1;
+        g_autofree char *strmode = NULL;
+        int mode;
+
+        if (virXPathBoolean("boolean(./maxphysaddr[1]/@bits)", ctxt) == 1 &&
+            (virXPathInt("string(./maxphysaddr[1]/@bits)", ctxt, &bits) < 0 ||
+             bits < 0)) {
+            virReportError(VIR_ERR_XML_ERROR, "%s",
+                           _("CPU maximum physical address bits < 0"));
+            return -1;
+        }
+
+        if (!(strmode = virXPathString("string(./maxphysaddr[1]/@mode)", ctxt)) ||
+            (mode = virCPUMaxPhysAddrModeTypeFromString(strmode)) < 0) {
+            virReportError(VIR_ERR_XML_ERROR, "%s",
+                           _("missing or invalid CPU maximum physical "
+                             "address bits mode"));
+            return -1;
+        }
+
+        def->addr = g_new0(virCPUMaxPhysAddrDef, 1);
+        def->addr->bits = bits;
+        def->addr->mode = mode;
+    }
     *cpu = g_steal_pointer(&def);
     return 0;
 }
@@ -841,6 +884,15 @@ virCPUDefFormatBuf(virBufferPtr buf,
         virBufferAddLit(buf, "/>\n");
     }

+    if (def->addr) {
+        virBufferAddLit(buf, "<maxphysaddr ");
+        if (def->addr->bits != -1)
+            virBufferAsprintf(buf, "bits='%d' ", def->addr->bits);
+        virBufferAsprintf(buf, "mode='%s'",
+                          virCPUMaxPhysAddrModeTypeToString(def->addr->mode));
+        virBufferAddLit(buf, "/>\n");
+    }
+
     for (i = 0; i < def->nfeatures; i++) {
         virCPUFeatureDefPtr feature = def->features + i;

diff --git a/src/conf/cpu_conf.h b/src/conf/cpu_conf.h
index 3ef14b7932..e7bbe916a3 100644
--- a/src/conf/cpu_conf.h
+++ b/src/conf/cpu_conf.h
@@ -117,6 +117,22 @@ struct _virCPUCacheDef {
     virCPUCacheMode mode;
 };

+typedef enum {
+    VIR_CPU_MAX_PHYS_ADDR_MODE_EMULATE,
+    VIR_CPU_MAX_PHYS_ADDR_MODE_PASSTHROUGH,
+
+    VIR_CPU_MAX_PHYS_ADDR_MODE_LAST
+} virCPUMaxPhysAddrMode;
+
+VIR_ENUM_DECL(virCPUMaxPhysAddrMode);
+
+typedef struct _virCPUMaxPhysAddrDef virCPUMaxPhysAddrDef;
+typedef virCPUMaxPhysAddrDef *virCPUMaxPhysAddrDefPtr;
+struct _virCPUMaxPhysAddrDef {
+    int bits; /* -1 for unspecified */
+    virCPUMaxPhysAddrMode mode;
+};
+

 typedef struct _virCPUDef virCPUDef;
 typedef virCPUDef *virCPUDefPtr;
@@ -140,6 +156,7 @@ struct _virCPUDef {
     size_t nfeatures_max;
     virCPUFeatureDefPtr features;
     virCPUCacheDefPtr cache;
+    virCPUMaxPhysAddrDefPtr addr;
     virHostCPUTscInfoPtr tsc;
     virTristateSwitch migratable; /* for host-passthrough mode */
 };
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 95e50835ad..8629f81d33 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -117,6 +117,8 @@ virCPUDefParseXMLString;
 virCPUDefRef;
 virCPUDefStealModel;
 virCPUDefUpdateFeature;
+virCPUMaxPhysAddrModeTypeFromString;
+virCPUMaxPhysAddrModeTypeToString;
 virCPUModeTypeToString;


diff --git a/tests/genericxml2xmlindata/cpu-phys-bits-emulate.xml b/tests/genericxml2xmlindata/cpu-phys-bits-emulate.xml
new file mode 100644
index 0000000000..0b1b0f1672
--- /dev/null
+++ b/tests/genericxml2xmlindata/cpu-phys-bits-emulate.xml
@@ -0,0 +1,20 @@
+<domain type='kvm'>
+  <name>foo</name>
+  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+  <memory unit='KiB'>219136</memory>
+  <currentMemory unit='KiB'>219136</currentMemory>
+  <vcpu placement='static'>1</vcpu>
+  <os>
+    <type arch='i686' machine='pc'>hvm</type>
+    <boot dev='hd'/>
+  </os>
+  <cpu mode='host-passthrough'>
+    <maxphysaddr bits='42' mode='emulate'/>
+  </cpu>
+  <clock offset='utc'/>
+  <on_poweroff>destroy</on_poweroff>
+  <on_reboot>restart</on_reboot>
+  <on_crash>destroy</on_crash>
+  <devices>
+  </devices>
+</domain>
diff --git a/tests/genericxml2xmlindata/cpu-phys-bits-passthrough.xml b/tests/genericxml2xmlindata/cpu-phys-bits-passthrough.xml
new file mode 100644
index 0000000000..cce676eaa6
--- /dev/null
+++ b/tests/genericxml2xmlindata/cpu-phys-bits-passthrough.xml
@@ -0,0 +1,20 @@
+<domain type='kvm'>
+  <name>foo</name>
+  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+  <memory unit='KiB'>219136</memory>
+  <currentMemory unit='KiB'>219136</currentMemory>
+  <vcpu placement='static'>1</vcpu>
+  <os>
+    <type arch='i686' machine='pc'>hvm</type>
+    <boot dev='hd'/>
+  </os>
+  <cpu mode='host-passthrough'>
+    <maxphysaddr mode='passthrough'/>
+  </cpu>
+  <clock offset='utc'/>
+  <on_poweroff>destroy</on_poweroff>
+  <on_reboot>restart</on_reboot>
+  <on_crash>destroy</on_crash>
+  <devices>
+  </devices>
+</domain>
diff --git a/tests/genericxml2xmltest.c b/tests/genericxml2xmltest.c
index 5110bfba86..6c1c6e699f 100644
--- a/tests/genericxml2xmltest.c
+++ b/tests/genericxml2xmltest.c
@@ -256,6 +256,9 @@ mymain(void)

     DO_TEST_BACKUP_FULL("backup-pull-internal-invalid", true);

+    DO_TEST("cpu-phys-bits-emulate");
+    DO_TEST("cpu-phys-bits-passthrough");
+
     virObjectUnref(caps);
     virObjectUnref(xmlopt);
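As a rough illustration of what the enum table and the formatter in the patch above do, here is a standalone C sketch. It does not use libvirt's VIR_ENUM_IMPL or virBuffer helpers; the type and function names (`max_phys_addr`, `mode_from_string`, `format_max_phys_addr`) are hypothetical stand-ins chosen for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the enum and struct the patch adds. */
typedef enum { MODE_EMULATE, MODE_PASSTHROUGH, MODE_LAST } max_phys_addr_mode;

typedef struct {
    int bits;                 /* -1 for unspecified, as in the patch */
    max_phys_addr_mode mode;
} max_phys_addr;

/* String table in the same order as the enum values. */
static const char *mode_names[MODE_LAST] = { "emulate", "passthrough" };

/* Map a mode string to its enum value; returns -1 if it is unknown. */
static int mode_from_string(const char *s)
{
    for (int i = 0; i < MODE_LAST; i++) {
        if (strcmp(s, mode_names[i]) == 0)
            return i;
    }
    return -1;
}

/* Format the element in the attribute order the patch uses:
 * optional bits first, then mode. Returns the result of snprintf. */
static int format_max_phys_addr(char *out, size_t len, const max_phys_addr *a)
{
    assert(a->mode < MODE_LAST);
    if (a->bits != -1)
        return snprintf(out, len, "<maxphysaddr bits='%d' mode='%s'/>",
                        a->bits, mode_names[a->mode]);
    return snprintf(out, len, "<maxphysaddr mode='%s'/>", mode_names[a->mode]);
}
```

Formatting `{ 42, MODE_EMULATE }` with this sketch gives the same element that appears in the cpu-phys-bits-emulate.xml test file, which is why that file lists `bits` before `mode`.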

On Thu, Oct 29, 2020 at 03:55:36PM +0000, Dario Faggioli wrote:
> This patch introduces the:
>
>   <maxphysaddr mode='passthrough'/>
>   <maxphysaddr mode='emulate' bits='42'/>
>
> sub element of /domain/cpu.

That makes sense as a location.

> Purpose is being able to have a guest see the physical address size exactly as the host does (if mode='passthrough' is used) or any size the user wants it to see (if mode='emulate' is used and bits='' is specified).
>
> This can be useful if the VM needs to have a large amount of memory.

I notice that QEMU recently (well, in 2018) introduced a further option, "host-phys-bits-limit". This says to use the host phys bits, but cap them at a certain level. I don't see that this does anything that we can't already do with the existing "phys-bits" setting though - it merely automates the lookup of the host bits. So I don't think we particularly need host-phys-bits-limit in libvirt.

What I think we do want though is for the <capabilities> XML to report the host CPU's phys bits, so mgmt apps can query this and use it to decide on a suitable guest limit.

We have APIs for checking guest ABI compatibility, which should probably validate these new settings somewhere.
Regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
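The two ideas in the review above (looking up the host's phys bits and capping them at a limit) can be sketched in C as follows. This is only an illustration: the function names are hypothetical, the first helper merely decodes a register value obtained elsewhere (on x86, CPUID leaf 0x80000008 reports the physical address width in EAX bits 7:0), and treating a limit of 0 as "no limit" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Decode the physical address width from the EAX value returned by
 * CPUID leaf 0x80000008 (bits 7:0). The caller supplies the value. */
static unsigned int phys_bits_from_cpuid_eax(uint32_t eax)
{
    return eax & 0xff;
}

/* Use the host bits, capped at `limit`. A limit of 0 is treated as
 * "no limit" (an assumption of this sketch). */
static unsigned int capped_phys_bits(unsigned int host_bits, unsigned int limit)
{
    assert(host_bits > 0);
    if (limit != 0 && host_bits > limit)
        return limit;
    return host_bits;
}
```

A management application that knows the host value, for example from the <capabilities> XML proposed above, could do the same capping itself and pass the result as an explicit bits value, which is the point the review makes.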

On Wed, 2020-11-04 at 13:10 +0000, Daniel P. Berrangé wrote:
> What I think we do want though is for the <capabilities> XML to report the host CPU's phys bits, so mgmt apps can query this and use it to decide on a suitable guest limit.

Right. I'll add a patch for this.

> We have APIs for checking guest ABI compatibility, which should probably validate these new settings somewhere.

Yes, this whole capabilities thing is what I'm struggling with the most, currently. But I'll figure it out, I'm sure :-)

Thanks and Regards
--
Dario Faggioli, Ph.D
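The mapping onto -cpu parameters that the second patch of the series describes, together with its validation rules (bits is mandatory with emulate and not allowed with passthrough), can be sketched in standalone C. The names `phys_mode` and `phys_bits_cpu_suffix` are hypothetical; the actual patch uses libvirt's virBuffer and error-reporting helpers instead.

```c
#include <assert.h>
#include <stdio.h>

typedef enum { PHYS_EMULATE, PHYS_PASSTHROUGH } phys_mode;

/* Write the -cpu suffix for the given mode into `out`.
 * `bits` is -1 when unspecified. Returns 0 on success, -1 on an
 * invalid combination or a too-small buffer. */
static int phys_bits_cpu_suffix(char *out, size_t len, phys_mode mode, int bits)
{
    int n;

    assert(out != NULL && len > 0);
    switch (mode) {
    case PHYS_PASSTHROUGH:
        if (bits != -1)
            return -1;      /* bits must not be given with passthrough */
        n = snprintf(out, len, ",host-phys-bits=on");
        break;
    case PHYS_EMULATE:
        if (bits == -1)
            return -1;      /* bits is mandatory with emulate */
        n = snprintf(out, len, ",phys-bits=%d", bits);
        break;
    default:
        return -1;
    }
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Returning an error code for the invalid combinations keeps the caller from building a command line out of a rejected configuration.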

This patch maps /domain/cpu/maxphysaddr into -cpu parameters:

 - <maxphysaddr mode='passthrough'/> becomes host-phys-bits=on
 - <maxphysaddr mode='emulate' bits='42'/> becomes phys-bits=42

Passthrough mode can only be used if the chosen CPU model is 'host-passthrough'.

The feature is available since QEMU 2.7.0.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
---
 src/qemu/qemu_capabilities.c                       |    2 +
 src/qemu/qemu_capabilities.h                       |    1 +
 src/qemu/qemu_command.c                            |   28 ++++++++++++
 src/qemu/qemu_domain.c                             |   46 ++++++++++++++++++++
 tests/qemuxml2argvdata/cpu-phys-bits-emulate.args  |   29 +++++++++++++
 tests/qemuxml2argvdata/cpu-phys-bits-emulate.xml   |   20 +++++++++
 tests/qemuxml2argvdata/cpu-phys-bits-emulate2.args |   30 +++++++++++++
 tests/qemuxml2argvdata/cpu-phys-bits-emulate2.xml  |   20 +++++++++
 tests/qemuxml2argvdata/cpu-phys-bits-emulate3.err  |    1 +
 tests/qemuxml2argvdata/cpu-phys-bits-emulate3.xml  |   20 +++++++++
 .../cpu-phys-bits-passthrough.args                 |   29 +++++++++++++
 .../qemuxml2argvdata/cpu-phys-bits-passthrough.xml |   20 +++++++++
 .../cpu-phys-bits-passthrough2.err                 |    1 +
 .../cpu-phys-bits-passthrough2.xml                 |   20 +++++++++
 .../cpu-phys-bits-passthrough3.err                 |    1 +
 .../cpu-phys-bits-passthrough3.xml                 |   20 +++++++++
 tests/qemuxml2argvtest.c                           |    7 +++
 17 files changed, 295 insertions(+)
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-emulate.args
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-emulate.xml
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-emulate2.args
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-emulate2.xml
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-emulate3.err
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-emulate3.xml
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-passthrough.args
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-passthrough.xml
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-passthrough2.err
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-passthrough2.xml
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-passthrough3.err
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-passthrough3.xml

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index a67fb785b5..70adb423f1 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -603,6 +603,7 @@ VIR_ENUM_IMPL(virQEMUCaps,
               "virtio-balloon.free-page-reporting",
               "block-export-add",
               "netdev.vhost-vdpa",
+              "host-phys-bits",
     );


@@ -1679,6 +1680,7 @@ static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsMaxCPU[] = {
     { "unavailable-features", QEMU_CAPS_CPU_UNAVAILABLE_FEATURES },
     { "kvm-no-adjvtime", QEMU_CAPS_CPU_KVM_NO_ADJVTIME },
     { "migratable", QEMU_CAPS_CPU_MIGRATABLE },
+    { "host-phys-bits", QEMU_CAPS_CPU_PHYS_BITS },
 };

 static virQEMUCapsObjectTypeProps virQEMUCapsObjectProps[] = {
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index 047ba8a0ee..0fe97d2fd1 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -583,6 +583,7 @@ typedef enum { /* virQEMUCapsFlags grouping marker for syntax-check */
     QEMU_CAPS_VIRTIO_BALLOON_FREE_PAGE_REPORTING, /*virtio balloon free-page-reporting */
     QEMU_CAPS_BLOCK_EXPORT_ADD, /* 'block-export-add' command is supported */
     QEMU_CAPS_NETDEV_VHOST_VDPA, /* -netdev vhost-vdpa*/
+    QEMU_CAPS_CPU_PHYS_BITS, /* -cpu phys-bits=42 or host-phys-bits=on */

     QEMU_CAPS_LAST /* this must always be the last item */
 } virQEMUCapsFlags;
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 7847706594..d58f80547e 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -6507,6 +6507,34 @@ qemuBuildCpuCommandLine(virCommandPtr cmd,
             virBufferAddLit(&buf, ",l3-cache=off");
     }

+    if (def->cpu && def->cpu->addr) {
+        virCPUMaxPhysAddrDefPtr addr = def->cpu->addr;
+
+        switch (addr->mode) {
+        case VIR_CPU_MAX_PHYS_ADDR_MODE_PASSTHROUGH:
+            if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_CPU_PHYS_BITS))
+                virBufferAddLit(&buf, ",host-phys-bits=on");
+            else
+                virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                               _("Setting host physical address bits is "
+                                 "not supported by this QEMU"));
+            break;
+
+        case VIR_CPU_MAX_PHYS_ADDR_MODE_EMULATE:
+            if (addr->bits != -1 &&
+                virQEMUCapsGet(qemuCaps, QEMU_CAPS_CPU_PHYS_BITS))
+                virBufferAsprintf(&buf, ",phys-bits=%d", addr->bits);
+            else
+                virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                               _("Physical address bits unspecified or "
+                                 "setting it not supported by this QEMU"));
+            break;
+
+        case VIR_CPU_MAX_PHYS_ADDR_MODE_LAST:
+            break;
+        }
+    }
+
     cpu = virBufferContentAndReset(&cpu_buf);
     cpu_flags = virBufferContentAndReset(&buf);

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index d7dbca487a..e9f20d82a1 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -4051,6 +4051,52 @@ qemuDomainDefCPUPostParse(virDomainDefPtr def,
         }
     }

+    if (def->cpu->addr) {
+        virCPUMaxPhysAddrDefPtr addr = def->cpu->addr;
+
+        if (!ARCH_IS_X86(def->os.arch)) {
+            virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                           _("CPU maximum physical address bits specification "
+                             "is not supported for '%s' architecture"),
+                           virArchToString(def->os.arch));
+            return -1;
+        }
+
+        switch (addr->mode) {
+        case VIR_CPU_MAX_PHYS_ADDR_MODE_PASSTHROUGH:
+            if (def->cpu->mode != VIR_CPU_MODE_HOST_PASSTHROUGH) {
+                virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                               _("CPU maximum physical address bits mode '%s' "
+                                 "can only be used with '%s' CPUs"),
+                               virCPUMaxPhysAddrModeTypeToString(addr->mode),
+                               virCPUModeTypeToString(VIR_CPU_MODE_HOST_PASSTHROUGH));
+                return -1;
+            }
+            if (addr->bits != -1) {
+                virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                               _("CPU maximum physical address bits number "
+                                 "specification cannot be used with "
+                                 "mode='%s'"),
+                               virCPUMaxPhysAddrModeTypeToString(VIR_CPU_MAX_PHYS_ADDR_MODE_PASSTHROUGH));
+                return -1;
+            }
+            break;
+
+        case VIR_CPU_MAX_PHYS_ADDR_MODE_EMULATE:
+            if (addr->bits == -1) {
+                virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                               _("if using CPU maximum physical address "
+                                 "mode='%s', bits= must be specified too"),
+                               virCPUMaxPhysAddrModeTypeToString(VIR_CPU_MAX_PHYS_ADDR_MODE_EMULATE));
+                return -1;
+            }
+            break;
+
+        case VIR_CPU_MAX_PHYS_ADDR_MODE_LAST:
+            break;
+        }
+    }
+
     for (i = 0; i < def->cpu->nfeatures; i++) {
         virCPUFeatureDefPtr feature = &def->cpu->features[i];

diff --git a/tests/qemuxml2argvdata/cpu-phys-bits-emulate.args b/tests/qemuxml2argvdata/cpu-phys-bits-emulate.args
new file mode 100644
index 0000000000..5627b41b25
--- /dev/null
+++ b/tests/qemuxml2argvdata/cpu-phys-bits-emulate.args
@@ -0,0 +1,29 @@
+LC_ALL=C \
+PATH=/bin \
+HOME=/tmp/lib/domain--1-foo \
+USER=test \
+LOGNAME=test \
+XDG_DATA_HOME=/tmp/lib/domain--1-foo/.local/share \
+XDG_CACHE_HOME=/tmp/lib/domain--1-foo/.cache \
+XDG_CONFIG_HOME=/tmp/lib/domain--1-foo/.config \
+QEMU_AUDIO_DRV=none \
+/usr/bin/qemu-system-x86_64 \
+-name foo \
+-S \
+-machine pc,accel=kvm,usb=off,dump-guest-core=off \
+-cpu host,phys-bits=42 \
+-m 214 \
+-realtime mlock=off \
+-smp 1,sockets=1,cores=1,threads=1 \
+-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \
+-display none \
+-no-user-config \
+-nodefaults \
+-chardev socket,id=charmonitor,path=/tmp/lib/domain--1-foo/monitor.sock,server,\
+nowait \
+-mon chardev=charmonitor,id=monitor,mode=control \
+-rtc base=utc \
+-no-shutdown \
+-no-acpi \
+-usb \
+-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3
diff --git a/tests/qemuxml2argvdata/cpu-phys-bits-emulate.xml b/tests/qemuxml2argvdata/cpu-phys-bits-emulate.xml
new file mode 100644
index 0000000000..f8bd63bc68
--- /dev/null
+++ b/tests/qemuxml2argvdata/cpu-phys-bits-emulate.xml
@@ -0,0 +1,20 @@
+<domain type='kvm'>
+  <name>foo</name>
+  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+  <memory unit='KiB'>219136</memory>
+  <currentMemory unit='KiB'>219136</currentMemory>
+  <vcpu placement='static'>1</vcpu>
+  <os>
+    <type arch='x86_64' machine='pc'>hvm</type>
+    <boot dev='hd'/>
+  </os>
+  <cpu mode='host-passthrough'>
+    <maxphysaddr mode='emulate' bits='42'/>
+  </cpu>
+  <clock offset='utc'/>
+  <on_poweroff>destroy</on_poweroff>
+  <on_reboot>restart</on_reboot>
+  <on_crash>destroy</on_crash>
+  <devices>
+  </devices>
+</domain>
diff --git a/tests/qemuxml2argvdata/cpu-phys-bits-emulate2.args b/tests/qemuxml2argvdata/cpu-phys-bits-emulate2.args
new file mode 100644
index 0000000000..f105f96f02
--- /dev/null
+++ b/tests/qemuxml2argvdata/cpu-phys-bits-emulate2.args
@@ -0,0 +1,30 @@
+LC_ALL=C \
+PATH=/bin \
+HOME=/tmp/lib/domain--1-foo \
+USER=test \
+LOGNAME=test \
+XDG_DATA_HOME=/tmp/lib/domain--1-foo/.local/share \
+XDG_CACHE_HOME=/tmp/lib/domain--1-foo/.cache \
+XDG_CONFIG_HOME=/tmp/lib/domain--1-foo/.config \
+QEMU_AUDIO_DRV=none \
+/usr/bin/qemu-system-x86_64 \
+-name foo \
+-S \
+-machine pc,accel=kvm,usb=off,dump-guest-core=off \
+-cpu core2duo,+ds,+acpi,+ss,+ht,+tm,+pbe,+ds_cpl,+vmx,+est,+tm2,+cx16,+xtpr,\
++lahf_lm,phys-bits=42 \
+-m 214 \
+-realtime mlock=off \
+-smp 1,sockets=1,cores=1,threads=1 \
+-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \
+-display none \
+-no-user-config \
+-nodefaults \
+-chardev socket,id=charmonitor,path=/tmp/lib/domain--1-foo/monitor.sock,server,\
+nowait \
+-mon chardev=charmonitor,id=monitor,mode=control \
+-rtc base=utc \
+-no-shutdown \
+-no-acpi \
+-usb \
+-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3
diff --git a/tests/qemuxml2argvdata/cpu-phys-bits-emulate2.xml b/tests/qemuxml2argvdata/cpu-phys-bits-emulate2.xml
new file mode 100644
index 0000000000..188b3066ed
--- /dev/null
+++ b/tests/qemuxml2argvdata/cpu-phys-bits-emulate2.xml
@@ -0,0 +1,20 @@
+<domain type='kvm'>
+  <name>foo</name>
+  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+  <memory unit='KiB'>219136</memory>
+  <currentMemory unit='KiB'>219136</currentMemory>
+  <vcpu placement='static'>1</vcpu>
+  <os>
+    <type arch='x86_64' machine='pc'>hvm</type>
+    <boot dev='hd'/>
+  </os>
+  <cpu mode='host-model'>
+    <maxphysaddr bits='42' mode='emulate'/>
+  </cpu>
+  <clock offset='utc'/>
+  <on_poweroff>destroy</on_poweroff>
+  <on_reboot>restart</on_reboot>
+  <on_crash>destroy</on_crash>
+  <devices>
+  </devices>
+</domain>
diff --git a/tests/qemuxml2argvdata/cpu-phys-bits-emulate3.err b/tests/qemuxml2argvdata/cpu-phys-bits-emulate3.err
new file mode 100644
index 0000000000..5e21998259
--- /dev/null
+++ b/tests/qemuxml2argvdata/cpu-phys-bits-emulate3.err
@@ -0,0 +1 @@
+unsupported configuration: if using CPU maximum physical address mode='emulate', bits= must be specified too
diff --git a/tests/qemuxml2argvdata/cpu-phys-bits-emulate3.xml b/tests/qemuxml2argvdata/cpu-phys-bits-emulate3.xml
new file mode 100644
index 0000000000..30a14894dd
--- /dev/null
+++ b/tests/qemuxml2argvdata/cpu-phys-bits-emulate3.xml
@@ -0,0 +1,20 @@
+<domain type='kvm'>
+  <name>foo</name>
+  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+  <memory unit='KiB'>219136</memory>
+  <currentMemory unit='KiB'>219136</currentMemory>
+  <vcpu placement='static'>1</vcpu>
+  <os>
+    <type arch='x86_64' machine='pc'>hvm</type>
+    <boot dev='hd'/>
+  </os>
+  <cpu mode='host-passthrough'>
+    <maxphysaddr mode='emulate'/>
+  </cpu>
+  <clock offset='utc'/>
+  <on_poweroff>destroy</on_poweroff>
+  <on_reboot>restart</on_reboot>
+  <on_crash>destroy</on_crash>
+  <devices>
+  </devices>
+</domain>
diff --git a/tests/qemuxml2argvdata/cpu-phys-bits-passthrough.args b/tests/qemuxml2argvdata/cpu-phys-bits-passthrough.args
new file mode 100644
index 0000000000..a4f3f55bb9
--- /dev/null
+++ b/tests/qemuxml2argvdata/cpu-phys-bits-passthrough.args
@@ -0,0 +1,29 @@
+LC_ALL=C \
+PATH=/bin \
+HOME=/tmp/lib/domain--1-foo \
+USER=test \
+LOGNAME=test \
+XDG_DATA_HOME=/tmp/lib/domain--1-foo/.local/share \
+XDG_CACHE_HOME=/tmp/lib/domain--1-foo/.cache \
+XDG_CONFIG_HOME=/tmp/lib/domain--1-foo/.config \
+QEMU_AUDIO_DRV=none \
+/usr/bin/qemu-system-x86_64 \
+-name foo \
+-S \
+-machine pc,accel=kvm,usb=off,dump-guest-core=off \
+-cpu host,host-phys-bits=on \
+-m 214 \
+-realtime mlock=off \
+-smp 1,sockets=1,cores=1,threads=1 \
+-uuid
c7a5fdbd-edaf-9455-926a-d65c16db1809 \ +-display none \ +-no-user-config \ +-nodefaults \ +-chardev socket,id=charmonitor,path=/tmp/lib/domain--1-foo/monitor.sock,server,\ +nowait \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=utc \ +-no-shutdown \ +-no-acpi \ +-usb \ +-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 diff --git a/tests/qemuxml2argvdata/cpu-phys-bits-passthrough.xml b/tests/qemuxml2argvdata/cpu-phys-bits-passthrough.xml new file mode 100644 index 0000000000..db570beb8d --- /dev/null +++ b/tests/qemuxml2argvdata/cpu-phys-bits-passthrough.xml @@ -0,0 +1,20 @@ +<domain type='kvm'> + <name>foo</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <cpu mode='host-passthrough'> + <maxphysaddr mode='passthrough'/> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + </devices> +</domain> diff --git a/tests/qemuxml2argvdata/cpu-phys-bits-passthrough2.err b/tests/qemuxml2argvdata/cpu-phys-bits-passthrough2.err new file mode 100644 index 0000000000..22009cc6e6 --- /dev/null +++ b/tests/qemuxml2argvdata/cpu-phys-bits-passthrough2.err @@ -0,0 +1 @@ +unsupported configuration: CPU maximum physical address bits mode 'passthrough' can only be used with 'host-passthrough' CPUs diff --git a/tests/qemuxml2argvdata/cpu-phys-bits-passthrough2.xml b/tests/qemuxml2argvdata/cpu-phys-bits-passthrough2.xml new file mode 100644 index 0000000000..511bbf9949 --- /dev/null +++ b/tests/qemuxml2argvdata/cpu-phys-bits-passthrough2.xml @@ -0,0 +1,20 @@ +<domain type='kvm'> + <name>foo</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu 
placement='static'>1</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <cpu mode='host-model'> + <maxphysaddr mode='passthrough'/> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + </devices> +</domain> diff --git a/tests/qemuxml2argvdata/cpu-phys-bits-passthrough3.err b/tests/qemuxml2argvdata/cpu-phys-bits-passthrough3.err new file mode 100644 index 0000000000..28f2e43432 --- /dev/null +++ b/tests/qemuxml2argvdata/cpu-phys-bits-passthrough3.err @@ -0,0 +1 @@ +unsupported configuration: CPU maximum physical address bits number specification cannot be used with mode='passthrough' diff --git a/tests/qemuxml2argvdata/cpu-phys-bits-passthrough3.xml b/tests/qemuxml2argvdata/cpu-phys-bits-passthrough3.xml new file mode 100644 index 0000000000..a94e567dcb --- /dev/null +++ b/tests/qemuxml2argvdata/cpu-phys-bits-passthrough3.xml @@ -0,0 +1,20 @@ +<domain type='kvm'> + <name>foo</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <cpu mode='host-passthrough'> + <maxphysaddr mode='passthrough' bits='42'/> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + </devices> +</domain> diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index c5a0095e0d..fd17fea744 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -3409,6 +3409,13 @@ mymain(void) DO_TEST_CAPS_LATEST("virtio-9p-multidevs"); + DO_TEST("cpu-phys-bits-passthrough", QEMU_CAPS_KVM, QEMU_CAPS_CPU_PHYS_BITS); + DO_TEST("cpu-phys-bits-emulate", QEMU_CAPS_KVM, QEMU_CAPS_CPU_PHYS_BITS); + DO_TEST("cpu-phys-bits-emulate2", QEMU_CAPS_KVM, 
QEMU_CAPS_CPU_PHYS_BITS); + DO_TEST_PARSE_ERROR("cpu-phys-bits-emulate3", QEMU_CAPS_KVM); + DO_TEST_PARSE_ERROR("cpu-phys-bits-passthrough2", QEMU_CAPS_KVM); + DO_TEST_PARSE_ERROR("cpu-phys-bits-passthrough3", QEMU_CAPS_KVM); + if (getenv("LIBVIRT_SKIP_CLEANUP") == NULL) virFileDeleteTree(fakerootdir);

On Thu, Oct 29, 2020 at 5:07 PM Dario Faggioli <dfaggioli@suse.com> wrote:
This patch maps /domain/cpu/maxphysaddr into -cpu parameters:
- <maxphysaddr mode='passthrough'/> becomes host-phys-bits=on
- <maxphysaddr mode='emulate' bits='42'/> becomes phys-bits=42
I can't thank you enough Dario for starting this, I have waited for this quite a while and never found the time for it myself :-/ Looking at my todo notes I wondered if while touching it we should right away also add host-phys-bits-limit in the same spot? See https://git.qemu.org/?p=qemu.git;a=commit;h=258fe08bd341d
Passthrough mode can only be used if the chosen CPU model is 'host-passthrough'.
The feature is available since QEMU 2.7.0.
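For readers skimming the thread, the mapping and the three rejected combinations described above can be sketched in standalone C. This is only an illustration and not the libvirt code: the function name, its signature, and the use of plain strings for the mode are assumptions made for the example; only the two output strings and the rejection rules come from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Illustrative only: maps a <maxphysaddr> setting to the suffix appended to
 * QEMU's -cpu argument. mode is the XML attribute value; bits is -1 when the
 * bits= attribute is absent. Returns 0 on success and -1 for combinations
 * the patch rejects. The function name and signature are hypothetical.
 */
int
maxphysaddr_to_cpu_suffix(const char *mode, int bits, int is_host_passthrough,
                          char *out, size_t outlen)
{
    int n;

    assert(mode != NULL && out != NULL && outlen > 0);

    if (strcmp(mode, "passthrough") == 0) {
        /* Only valid with host-passthrough CPUs and without bits=. */
        if (!is_host_passthrough || bits != -1)
            return -1;
        n = snprintf(out, outlen, ",host-phys-bits=on");
    } else if (strcmp(mode, "emulate") == 0) {
        /* An explicit width is mandatory in emulate mode. */
        if (bits == -1)
            return -1;
        n = snprintf(out, outlen, ",phys-bits=%d", bits);
    } else {
        return -1;
    }

    /* Treat truncation as failure. */
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

The three -1 paths correspond to the emulate3, passthrough2 and passthrough3 test cases in the patch.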
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
---
 src/qemu/qemu_capabilities.c | 2 +
 src/qemu/qemu_capabilities.h | 1
 src/qemu/qemu_command.c | 28 ++++++++++++
 src/qemu/qemu_domain.c | 46 ++++++++++++++++++++
 tests/qemuxml2argvdata/cpu-phys-bits-emulate.args | 29 +++++++++++++
 tests/qemuxml2argvdata/cpu-phys-bits-emulate.xml | 20 +++++++++
 tests/qemuxml2argvdata/cpu-phys-bits-emulate2.args | 30 +++++++++++++
 tests/qemuxml2argvdata/cpu-phys-bits-emulate2.xml | 20 +++++++++
 tests/qemuxml2argvdata/cpu-phys-bits-emulate3.err | 1
 tests/qemuxml2argvdata/cpu-phys-bits-emulate3.xml | 20 +++++++++
 .../cpu-phys-bits-passthrough.args | 29 +++++++++++++
 .../qemuxml2argvdata/cpu-phys-bits-passthrough.xml | 20 +++++++++
 .../cpu-phys-bits-passthrough2.err | 1
 .../cpu-phys-bits-passthrough2.xml | 20 +++++++++
 .../cpu-phys-bits-passthrough3.err | 1
 .../cpu-phys-bits-passthrough3.xml | 20 +++++++++
 tests/qemuxml2argvtest.c | 7 +++
 17 files changed, 295 insertions(+)
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-emulate.args
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-emulate.xml
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-emulate2.args
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-emulate2.xml
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-emulate3.err
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-emulate3.xml
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-passthrough.args
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-passthrough.xml
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-passthrough2.err
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-passthrough2.xml
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-passthrough3.err
 create mode 100644 tests/qemuxml2argvdata/cpu-phys-bits-passthrough3.xml

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index a67fb785b5..70adb423f1 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -603,6 +603,7 @@ VIR_ENUM_IMPL(virQEMUCaps,
               "virtio-balloon.free-page-reporting",
               "block-export-add",
               "netdev.vhost-vdpa",
+              "host-phys-bits",
     );

@@ -1679,6 +1680,7 @@ static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsMaxCPU[] = {
     { "unavailable-features", QEMU_CAPS_CPU_UNAVAILABLE_FEATURES },
     { "kvm-no-adjvtime", QEMU_CAPS_CPU_KVM_NO_ADJVTIME },
     { "migratable", QEMU_CAPS_CPU_MIGRATABLE },
+    { "host-phys-bits", QEMU_CAPS_CPU_PHYS_BITS },
 };

 static virQEMUCapsObjectTypeProps virQEMUCapsObjectProps[] = {
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index 047ba8a0ee..0fe97d2fd1 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -583,6 +583,7 @@ typedef enum { /* virQEMUCapsFlags grouping marker for syntax-check */
     QEMU_CAPS_VIRTIO_BALLOON_FREE_PAGE_REPORTING, /*virtio balloon free-page-reporting */
     QEMU_CAPS_BLOCK_EXPORT_ADD, /* 'block-export-add' command is supported */
     QEMU_CAPS_NETDEV_VHOST_VDPA, /* -netdev vhost-vdpa*/
+    QEMU_CAPS_CPU_PHYS_BITS, /* -cpu phys-bits=42 or host-phys-bits=on */

     QEMU_CAPS_LAST /* this must always be the last item */
 } virQEMUCapsFlags;
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 7847706594..d58f80547e 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -6507,6 +6507,34 @@ qemuBuildCpuCommandLine(virCommandPtr cmd,
             virBufferAddLit(&buf, ",l3-cache=off");
     }

+    if (def->cpu && def->cpu->addr) {
+        virCPUMaxPhysAddrDefPtr addr = def->cpu->addr;
+
+        switch (addr->mode) {
+        case VIR_CPU_MAX_PHYS_ADDR_MODE_PASSTHROUGH:
+            if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_CPU_PHYS_BITS))
+                virBufferAddLit(&buf, ",host-phys-bits=on");
+            else
+                virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                               _("Setting host physical address bits is "
+                                 "not supported by this QEMU"));
+            break;
+
+        case VIR_CPU_MAX_PHYS_ADDR_MODE_EMULATE:
+            if (addr->bits != -1 &&
+                virQEMUCapsGet(qemuCaps, QEMU_CAPS_CPU_PHYS_BITS))
+                virBufferAsprintf(&buf, ",phys-bits=%d", addr->bits);
+            else
+                virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                               _("Physical address bits unspecified or "
+                                 "setting it not supported by this QEMU"));
+            break;
+
+        case VIR_CPU_MAX_PHYS_ADDR_MODE_LAST:
+            break;
+        }
+    }
+
     cpu = virBufferContentAndReset(&cpu_buf);
     cpu_flags = virBufferContentAndReset(&buf);

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index d7dbca487a..e9f20d82a1 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -4051,6 +4051,52 @@ qemuDomainDefCPUPostParse(virDomainDefPtr def,
         }
     }

+    if (def->cpu->addr) {
+        virCPUMaxPhysAddrDefPtr addr = def->cpu->addr;
+
+        if (!ARCH_IS_X86(def->os.arch)) {
+            virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                           _("CPU maximum physical address bits specification "
+                             "is not supported for '%s' architecture"),
+                           virArchToString(def->os.arch));
+            return -1;
+        }
+
+        switch (addr->mode) {
+        case VIR_CPU_MAX_PHYS_ADDR_MODE_PASSTHROUGH:
+            if (def->cpu->mode != VIR_CPU_MODE_HOST_PASSTHROUGH) {
+                virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                               _("CPU maximum physical address bits mode '%s' "
+                                 "can only be used with '%s' CPUs"),
+                               virCPUMaxPhysAddrModeTypeToString(addr->mode),
+                               virCPUModeTypeToString(VIR_CPU_MODE_HOST_PASSTHROUGH));
+                return -1;
+            }
+            if (addr->bits != -1) {
+                virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                               _("CPU maximum physical address bits number "
+                                 "specification cannot be used with "
+                                 "mode='%s'"),
+                               virCPUMaxPhysAddrModeTypeToString(VIR_CPU_MAX_PHYS_ADDR_MODE_PASSTHROUGH));
+                return -1;
+            }
+            break;
+
+        case VIR_CPU_MAX_PHYS_ADDR_MODE_EMULATE:
+            if (addr->bits == -1) {
+                virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                               _("if using CPU maximum physical address "
+                                 "mode='%s', bits= must be specified too"),
+                               virCPUMaxPhysAddrModeTypeToString(VIR_CPU_MAX_PHYS_ADDR_MODE_EMULATE));
+                return -1;
+            }
+            break;
+
+        case VIR_CPU_MAX_PHYS_ADDR_MODE_LAST:
+            break;
+        }
+    }
+
     for (i = 0; i < def->cpu->nfeatures; i++) {
         virCPUFeatureDefPtr feature = &def->cpu->features[i];

[...]
diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c
index c5a0095e0d..fd17fea744 100644
--- a/tests/qemuxml2argvtest.c
+++ b/tests/qemuxml2argvtest.c
@@ -3409,6 +3409,13 @@ mymain(void)
     DO_TEST_CAPS_LATEST("virtio-9p-multidevs");

+    DO_TEST("cpu-phys-bits-passthrough", QEMU_CAPS_KVM, QEMU_CAPS_CPU_PHYS_BITS);
+    DO_TEST("cpu-phys-bits-emulate", QEMU_CAPS_KVM, QEMU_CAPS_CPU_PHYS_BITS);
+    DO_TEST("cpu-phys-bits-emulate2", QEMU_CAPS_KVM, QEMU_CAPS_CPU_PHYS_BITS);
+    DO_TEST_PARSE_ERROR("cpu-phys-bits-emulate3", QEMU_CAPS_KVM);
+    DO_TEST_PARSE_ERROR("cpu-phys-bits-passthrough2", QEMU_CAPS_KVM);
+    DO_TEST_PARSE_ERROR("cpu-phys-bits-passthrough3", QEMU_CAPS_KVM);
+
     if (getenv("LIBVIRT_SKIP_CLEANUP") == NULL)
         virFileDeleteTree(fakerootdir);
--
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd

On Mon, Nov 02, 2020 at 09:14:22AM +0100, Christian Ehrhardt wrote:
On Thu, Oct 29, 2020 at 5:07 PM Dario Faggioli <dfaggioli@suse.com> wrote:
This patch maps /domain/cpu/maxphysaddr into -cpu parameters:
- <maxphysaddr mode='passthrough'/> becomes host-phys-bits=on
- <maxphysaddr mode='emulate' bits='42'/> becomes phys-bits=42
I can't thank you enough Dario for starting this, I have waited for this quite a while and never found the time for it myself :-/
Looking at my todo notes I wondered if while touching it we should right away also add host-phys-bits-limit in the same spot? See https://git.qemu.org/?p=qemu.git;a=commit;h=258fe08bd341d
I'm not convinced we actually need it in libvirt. Assuming we report the host phys bits in the capabilities XML, apps can already achieve this functionality with the mode=emulate bits=NN settings by populating NN based on the capabilities XML.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Wed, 2020-11-04 at 13:11 +0000, Daniel P. Berrangé wrote:
On Mon, Nov 02, 2020 at 09:14:22AM +0100, Christian Ehrhardt wrote:
Looking at my todo notes I wondered if while touching it we should right away also add host-phys-bits-limit in the same spot? See https://git.qemu.org/?p=qemu.git;a=commit;h=258fe08bd341d
I'm not convinced we actually need it in libvirt. Assuming we report the host phys bits in the capabilities XML, apps can already achieve this functionality with the mode=emulate bits=NN settings by populating NN based on the capabilities XML.
Right. So, host-phys-bits-limit seems to exist because:

"Some downstream distributions of QEMU set host-phys-bits=on by default. [...] Now choosing a large phys-bits value for a VM has bigger impact: it will make KVM use 5-level EPT even when it's not really necessary."

And we want to be able to, from QEMU: "keep using the host phys-bits value, but only if it's smaller than 48."

So, I was thinking that, in a world where QEMU will have host-phys-bits=on by default, if using `-cpu host` you may want to be able to do the same from libvirt, i.e., "same as host, but not more than NN".

In my head (and drafts), that would happen with:

<maxphysaddr mode="passthrough" limit="NN"/> (1)

Which is very similar and may be identical to:

<maxphysaddr mode="emulate" bits="NN"/> (2)

Whether or not they're actually identical, e.g., if NN > MM (MM being the host's value), depends on what QEMU does. In that case, with (1) we set MM. With (2), if QEMU lets us do that, we set NN. (And when I say what QEMU does, I don't mean right now; I mean in the particular version in use, assuming that, at least in theory, it can change between different versions.)

So, maybe having (1) may be the only way of making sure that I get min(NN,MM), irrespective of QEMU version/behavior. Or am I missing something?

That's why I happen to think there could be value in having limit... Or does this all just not make sense? :-)

Thanks and Regards
--
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)

On Wed, Nov 04, 2020 at 07:00:16PM +0100, Dario Faggioli wrote:
On Wed, 2020-11-04 at 13:11 +0000, Daniel P. Berrangé wrote:
On Mon, Nov 02, 2020 at 09:14:22AM +0100, Christian Ehrhardt wrote:
Looking at my todo notes I wondered if while touching it we should right away also add host-phys-bits-limit in the same spot? See https://git.qemu.org/?p=qemu.git;a=commit;h=258fe08bd341d
I'm not convinced we actually need it in libvirt. Assuming we report the host phys bits in the capabilities XML, apps can already achieve this functionality with the mode=emulate bits=NN settings by populating NN based on the capabilities XML.
Right. So, host-phys-bits-limit seems to exist because: "Some downstream distributions of QEMU set host-phys-bits=on by default. [...] Now choosing a large phys-bits value for a VM has bigger impact: it will make KVM use 5-level EPT even when it's not really necessary."
And we want to be able to, from QEMU: "keep using the host phys-bits value, but only if it's smaller than 48."
Right, so it is basically a hack to work around a problem with historical defaults in QEMU. In the libvirt case we're not dealing with defaults; we have an explicit XML element to express what we want. So we can express it straight away and not need this hack.
So, I was thinking that, in a world where QEMU will have host-phys-bits=on by default, if using `-cpu host` you may want to be able to do the same from libvirt, i.e., "same as host, but not more than NN".
In my head (and drafts), that would happen with:
<maxphysaddr mode="passthrough" limit="NN"/> (1)
Which is very similar and may be identical to:
<maxphysaddr mode="emulate" bits="NN"/> (2)
Whether or not they're actually identical, e.g., if NN > MM (MM being the host's value), depends on what QEMU does. In that case, with (1) we set MM. With (2), if QEMU lets us do that, we set NN. (And when I say what QEMU does, I don't mean right now; I mean in the particular version in use, assuming that, at least in theory, it can change between different versions.)
So, maybe having (1) may be the only way of making sure that I get min(NN,MM), irrespective of QEMU version/behavior. Or am I missing something?
I don't see any functional difference between 1 & 2 from libvirt's side. In fact (2) is probably better because it can work on any version of QEMU, even before host-phys-bits-limit was introduced. The "downside" is that an app has to read the capabilities to see the current host limit and choose the "NN" value.
That's why I happen to think there could be value in having limit... Or does this all just not make sense? :-)
I think we can ignore host-phys-bits-limit. If it later turns out we really do need it, we can add it to libvirt, but until then pretend it doesn't exist.
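To make option (2) concrete, here is a minimal sketch of what a management application could do on its own: take the host width and a desired cap, and put the smaller of the two into bits=. The function name is hypothetical, and host_bits is assumed to come from the capabilities XML, which does not report that value as of this thread.

```c
#include <assert.h>

/*
 * Illustrative only: pick the value for bits= in
 * <maxphysaddr mode='emulate' bits='NN'/> so that the guest gets
 * min(host width, desired cap). host_bits is assumed to be read by the
 * application from the host capabilities; cap is the application's own
 * upper bound (for example 48, to stay below 5-level EPT).
 */
int
choose_emulate_bits(int host_bits, int cap)
{
    assert(host_bits > 0 && cap > 0);
    return host_bits < cap ? host_bits : cap;
}
```

With a helper like this, the result does not depend on whether the QEMU in use knows host-phys-bits-limit, which is the point made above in favour of (2).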
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Wed, 2020-11-04 at 18:04 +0000, Daniel P. Berrangé wrote:
On Wed, Nov 04, 2020 at 07:00:16PM +0100, Dario Faggioli wrote:
Right. So, host-phys-bits-limit seems to exist because: "Some downstream distributions of QEMU set host-phys-bits=on by default. [...] Now choosing a large phys-bits value for a VM has bigger impact: it will make KVM use 5-level EPT even when it's not really necessary."
And we want to be able to, from QEMU: "keep using the host phys-bits value, but only if it's smaller than 48."
Right, so it is basically a hack to work around a problem with historical defaults in QEMU. In the libvirt case we're not dealing with defaults; we have an explicit XML element to express what we want. So we can express it straight away and not need this hack.
Ok.
In my head (and drafts), that would happen with:
<maxphysaddr mode="passthrough" limit="NN"/> (1)
Which is very similar and may be identical to:
<maxphysaddr mode="emulate" bits="NN"/> (2)
[...]
So, having (1) may be the only way of making sure that I get min(NN,MM), irrespective of QEMU version/behavior. Or am I missing something?
I don't see any functional difference between 1 & 2 from libvirt's side. In fact (2) is probably better because it can work on any version of QEMU, even before host-phys-bits-limit was introduced.
That is indeed true.
The "downside" is that an app has to read the capabilities to see the current host limit and choose the "NN" value.
That's why I happen to think there could be value in having limit... Or does this all just not make sense? :-)
I think we can ignore host-phys-bits-limit. If it later turns out we really do need it, we can add it to libvirt, but until then pretend it doesn't exist.
Right. I think I've understood your line of reasoning and I agree. Especially on the fact that we can ignore it for now, and always add it later, if we see the need for it. And I certainly can look into adding the phys bits to the host capabilities.
Regards -- Dario Faggioli
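The difference between variants (1) and (2) discussed above can be sketched in a few lines. This is only an illustration of the semantics as described in this thread, not libvirt or QEMU code: the function names are hypothetical, `host_bits` stands for MM, and the `limit` attribute is the drafted idea that the thread decides to leave out for now.

```python
def effective_phys_bits(mode, host_bits, bits=None, limit=None):
    """Guest physical address width for a hypothetical <maxphysaddr> setting.

    host_bits is MM (what the host CPU supports); bits/limit play the role of NN.
    """
    if mode == "emulate":
        # Variant (2): the guest is given exactly NN, whatever the host has
        # (assuming QEMU accepts the value, which may depend on its version).
        if bits is None:
            raise ValueError("mode='emulate' needs bits in this sketch")
        return bits
    if mode == "passthrough":
        # Variant (1): host value, optionally capped by the drafted 'limit'.
        return host_bits if limit is None else min(host_bits, limit)
    raise ValueError(f"unknown mode: {mode!r}")


def choose_bits_for_emulate(host_bits, wanted):
    """What an app would do for (2): read MM from capabilities, pick min(NN, MM)."""
    return min(host_bits, wanted)
```

With (2), getting min(NN,MM) is the application's job: it reads the host value and passes `choose_bits_for_emulate(MM, NN)` as `bits`, which gives the same result as (1) with `limit=NN`.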

On Mon, 2020-11-02 at 09:14 +0100, Christian Ehrhardt wrote:
On Thu, Oct 29, 2020 at 5:07 PM Dario Faggioli <dfaggioli@suse.com> wrote:
This patch maps /domain/cpu/maxphysaddr into -cpu parameters:
- <maxphysaddr mode='passthrough'/> becomes host-phys-bits=on
- <maxphysaddr mode='emulate' bits='42'/> becomes phys-bits=42
I can't thank you enough Dario for starting this, I have waited for this quite a while and never found the time for it myself :-/
Hi! Yeah, it was sitting in my TODO list for a little while as well. Then I decided to mention that I'd do it in my talk at KVM Forum... Which is usually a pretty good push for actually doing it! ;-P
Looking at my todo notes, I wondered if, while touching it, we should right away also add host-phys-bits-limit in the same spot? See https://git.qemu.org/?p=qemu.git;a=commit;h=258fe08bd341d
Yes, I'm aware of it, and it looks easy to support. And I agree that we can do it at the same time. I even have it in draft patches, but I decided to fire these out as early as possible, to capture some early feedback.
Thanks and Regards -- Dario Faggioli

On Thu, Oct 29, 2020 at 03:55:42PM +0000, Dario Faggioli wrote:
This patch maps /domain/cpu/maxphysaddr into -cpu parameters:
- <maxphysaddr mode='passthrough'/> becomes host-phys-bits=on
- <maxphysaddr mode='emulate' bits='42'/> becomes phys-bits=42
Passthrough mode can only be used if the chosen CPU model is 'host-passthrough'.
The feature is available since QEMU 2.7.0.
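As a rough illustration of the mapping described in the patch text above, here is a small Python sketch (the patch itself is C code in src/qemu/qemu_command.c, which is not shown in this thread). The function name is hypothetical, and rejecting mode='emulate' without a bits attribute is an assumption of this sketch, not something the patch description states.

```python
import xml.etree.ElementTree as ET


def maxphysaddr_cpu_props(cpu_xml):
    """Translate a <cpu> element's <maxphysaddr> child into -cpu properties."""
    cpu = ET.fromstring(cpu_xml)
    elem = cpu.find("maxphysaddr")
    if elem is None:
        # Element is opt-in: nothing is added for existing guests.
        return []

    mode = elem.get("mode")
    if mode == "passthrough":
        # Passthrough is only allowed with the 'host-passthrough' CPU model.
        if cpu.get("mode") != "host-passthrough":
            raise ValueError("passthrough requires CPU mode 'host-passthrough'")
        return ["host-phys-bits=on"]
    if mode == "emulate":
        bits = elem.get("bits")
        if bits is None:
            # Assumption of this sketch only.
            raise ValueError("mode='emulate' without bits is rejected here")
        return [f"phys-bits={int(bits)}"]
    raise ValueError(f"unknown maxphysaddr mode: {mode!r}")
```

For example, `<cpu mode='host-passthrough'><maxphysaddr mode='passthrough'/></cpu>` yields `["host-phys-bits=on"]`, which would be appended to the `-cpu host,...` argument.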
I wonder if we need to care about doing any kind of host compatibility checking when these are specified. QEMU historically hasn't checked compatibility, meaning you could configure bits=48 on a host CPU that only supports bits=40. Things would seem to work unless you gave the guest a RAM size that needed that many bits. QEMU is reluctant to change things due to back compat issues with existing deployments. Libvirt wouldn't have that issue though, as we would only need to enforce it when users actually gave the <maxphysaddr> element, which is opt-in and won't be present in existing guests.
Regards, Daniel

On Wed, 2020-11-04 at 13:14 +0000, Daniel P. Berrangé wrote:
On Thu, Oct 29, 2020 at 03:55:42PM +0000, Dario Faggioli wrote:
This patch maps /domain/cpu/maxphysaddr into -cpu parameters:
- <maxphysaddr mode='passthrough'/> becomes host-phys-bits=on
- <maxphysaddr mode='emulate' bits='42'/> becomes phys-bits=42
Passthrough mode can only be used if the chosen CPU model is 'host-passthrough'.
The feature is available since QEMU 2.7.0.
I wonder if we need to care about doing any kind of host compatibility checking when these are specified.
Ah, yes, this is a good point.
QEMU is reluctant to change things due to back compat issues with existing deployments. Libvirt wouldn't have that issue though, as we would only need to enforce it when users actually gave the <maxphysaddr> element, which is opt-in and won't be present in existing guests.
Yes, and I'm already adding the logic for checking the host phys bits, for putting it in the host capabilities, so it should be easy enough to add a check here. I'll go for it.
Thanks and Regards -- Dario Faggioli
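A host compatibility check of the kind discussed here could look roughly like the sketch below. How libvirt would actually obtain the host value, and whether it would reject or only warn, is not settled in this thread; this sketch assumes a Linux x86 host where /proc/cpuinfo carries an "address sizes" line, and both function names are hypothetical.

```python
import re


def host_phys_bits(cpuinfo_text):
    """Extract the host physical address width from /proc/cpuinfo-style text."""
    match = re.search(r"^address sizes\s*:\s*(\d+) bits physical",
                      cpuinfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def check_requested_bits(requested, host):
    """Reject a guest width larger than what the host supports.

    Only meant to run when <maxphysaddr> is present, so existing guests
    without the element are unaffected.
    """
    if host is not None and requested > host:
        raise ValueError(
            f"requested {requested} physical address bits, host has {host}")
    return requested
```

On a real host the text would come from reading /proc/cpuinfo; with a line such as "address sizes : 40 bits physical, 48 bits virtual", a request for bits=48 would be refused while bits=40 would pass.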

On Thu, Oct 29, 2020 at 03:55:30PM +0000, Dario Faggioli wrote:
Hello everyone,
These patches let one specify how wide the physical addresses are, in bits.
Using current QEMU's default of 40 may limit the amount of RAM the guest can see (which is the reason why, in our packages, we bump that to 42; and as far as I've understood from reading some old mailing list threads on the subject, other downstreams do something similar).
In RHEL we provide custom machine types which set host-phys-bits=on by default, which is why making it configurable in libvirt hasn't been a high priority previously.
It also can cause other problems, such as the one described in:
https://bugzilla.redhat.com/show_bug.cgi?id=1578278#c5
Basically, the VM thinks, and reports to the user, that L1TF is unmitigated, because while its RAM may fit in MAX_PHYS_ADDR (e.g., with 40 or 42 bits), it does not fit in MAX_PHYS_ADDR/2, which is necessary for PTE inversion to be effective.
Yep, that's not nice.
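To put numbers on the MAX_PHYS_ADDR/2 condition mentioned above, here is a small arithmetic sketch. It treats guest RAM as one contiguous range starting at address 0 and ignores memory holes, so it is a simplification and not the kernel's actual check; the function names are made up for illustration.

```python
GiB = 1 << 30


def max_phys_addr(bits):
    """Size of the physical address space for a given width, in bytes."""
    return 1 << bits


def pte_inversion_limit(bits):
    """Half of the address space: RAM must stay below this for PTE inversion."""
    return 1 << (bits - 1)


def l1tf_pte_inversion_ok(ram_bytes, bits):
    """True if RAM (assumed contiguous from 0) fits in MAX_PHYS_ADDR/2."""
    return ram_bytes <= pte_inversion_limit(bits)
```

With 40 bits the address space is 1024 GiB and the limit is 512 GiB; with 42 bits the limit is 2048 GiB. A guest with, say, 600 GiB of RAM therefore fits in the 40-bit address space but not in its lower half, which is the situation described in the bug report.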
The series alleviates the problem by providing a user with an easy way to either specify an arbitrary number of physical address bits for the VM (with, e.g., <maxphysaddr mode='emulate' bits='42'/>) or to just use the same number of bits as the host (with <maxphysaddr mode='passthrough'/>).
This in theory is already possible, but only in a hack-ish way, such as adding:
  <qemu:commandline>
    <qemu:arg value='-cpu'/>
    <qemu:arg value='host,host-phys-bits=on'/>
  </qemu:commandline>
But this is super inconvenient. :-)
And explicitly unsupported in a production deployment, so not a viable solution except for experimentation.
I have not made host-phys-bits=on get added automatically when host-passthrough is used as the CPU model, as I think that actually belongs in QEMU.
Agreed, I think I raised that as a suggestion before too.
Regards, Daniel

On Wed, 2020-11-04 at 13:03 +0000, Daniel P. Berrangé wrote:
On Thu, Oct 29, 2020 at 03:55:30PM +0000, Dario Faggioli wrote:
Using current QEMU's default of 40 may limit the amount of RAM the guest can see (which is the reason why, in our packages, we bump that to 42; and as far as I've understood from reading some old mailing list threads on the subject, other downstreams do something similar).
In RHEL we provide custom machine types which set host-phys-bits=on by default, which is why making it configurable in libvirt hasn't been a high priority previously.
Yeah, and as I said, we bump that 40 to 42 in our QEMU package. I guess this is why not many people have bumped into this on our side either. Until now that I bumped into it... So, sad to admit, but apparently 42 was not the answer in this case. :-P
Thanks and Regards -- Dario Faggioli
participants (3): Christian Ehrhardt, Daniel P. Berrangé, Dario Faggioli