[PATCH v7 0/2] Dirty Ring support (Libvirt)

From: "Hyman Huang(黄勇)" <huangy81@chinatelecom.cn> v7 - rebase on master - modify the following points according to the advice given by Peter 1. skip the -accel switch and reuse the existing commit d20ebdda2 'qemu: Switch to -accel' 2. remove the post-parse function and do the parse work in virDomainFeaturesKVMDefParse once for all 3. throw an error if "size" not specified when kvm-dirty-ring feature enabled in xml 4. fix the memory leak when parsing xml 5. use macro VIR_ROUND_UP_POWER_OF_TWO to check power-of-two 6. put error messages in one line 7. squash the last 2 commit into 1 8. add test for kvm-dirty-ring feature Thanks for the careful reviews made by Peter. Please review, Thanks! Hyman Ping for this series. I still keep thinking the dirty ring feature is something good to have for libvirt. qemu-6.1 has supported dirty ring feature and followed up with the commit 0e21bf24 "support dirtyrate at the granualrity of vcpu", which is a typical usage scenario of dirty ring. another usage scenario may be the implementation of per-vcpu auto-converge during live migration which is already being reviewed. so we can make full use of dirty ring feature if libvirt supports. and any corrections and comments about this series would be very appreciated. Please review, Thanks! Hyman v6 - rebase on master v5,v4: blank, just make v6 be the the latest version. v3 - rebase master and fix the confilict when apply "conf: introduce dirty_ring_size in struct "_virDomainDef" to current master. v2 - split patchset into 4 patches - leave out the tcg case when building commandline. - handle the VIR_DOMAIN_KVM_DIRTY_RING case independently in , virDomainFeatureDefParse and virDomainDefFeaturesCheckABIStability, do not integrate it with other cases... - add dirty ring size check in virDomainDefFeaturesCheckABIStability - modify zero checks on integers of dirty ring size in a explicit way. - set the default value of dirty ring size in a post-parser callback. - check the absence of kvm_feature in a explicit way. - code clean of virTristateSwitchTypeToString function. this version's modification base on Peter's advices mostly, thanks a lot, please review ! v1 since qemu has introduced a dirty ring feature in 6.1.0, may be it's the right time to introduce dirty ring in libvirt meanwhile. this patch add feature named 'dirty-ring', which enable dirty ring feature when starting vm. to try this out, three things has done in this patchset: - introduce QEMU_CAPS_ACCEL so the the libvirt can use it to select the right option when specifying the accelerator type. - switch the option "-machine accel=xxx" to "-accel xxx" when specifying accelerator type once libvirt build QEMU command line, so that dirty-ring-size property can be passed to qemu when start vm. - introduce dirty_ring_size to hold the ring size configured by user and pass dirty_ring_size when building qemu commandline if dirty ring feature enabled. though dirty ring is per-cpu logically, the size of dirty ring is registered by 'struct kvm' in QEMU. so we would like to place the dirty_ring_size as a property of vm in Libvirt as the QEMU do. the dirty ring feature is disabled by default, and if enabled, the default value of ring size if 4096 if size not configured. for more details about dirty ring and "-accel" option, please refer to: https://lore.kernel.org/qemu-devel/20210108165050.406906-10-peterx@redhat.co... https://lore.kernel.org/qemu-devel/3aa73987-40e8-3619-0723-9f17f73850bd@redh... please review, Thanks! Best Regards ! Hyman Huang(黄勇) (2): qemu: support dirty ring feature tests: add test for kvm-dirty-ring feature docs/formatdomain.rst | 18 ++++--- docs/schemas/domaincommon.rng | 10 ++++ src/conf/domain_conf.c | 54 +++++++++++++++++++ src/conf/domain_conf.h | 4 ++ src/qemu/qemu_command.c | 12 +++++ tests/qemuxml2argvdata/kvm-features-off.xml | 1 + tests/qemuxml2argvdata/kvm-features.args | 2 +- tests/qemuxml2argvdata/kvm-features.xml | 1 + tests/qemuxml2xmloutdata/kvm-features-off.xml | 1 + tests/qemuxml2xmloutdata/kvm-features.xml | 1 + 10 files changed, 95 insertions(+), 9 deletions(-) -- 2.27.0

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn> Dirty ring feature was introduced in qemu-6.1.0, this patch add the corresponding feature named 'dirty-ring', which enable dirty ring feature when starting vm. To implement the dirty-ring feature, dirty_ring_size in struct "_virDomainDef" is introduced to hold the dirty ring size configured in xml, and it will be used as dirty-ring-size property of kvm accelerator when building qemu commandline, it is something like "-accel dirty-ring-size=xxx". To enable the feature, the following XML needs to be added to the guest's domain description: <features> <kvm> <dirty-ring state='on' size='xxx'> </kvm> </features> If property "state=on", property "size" must be specified, which should be power of 2 and range in [1024, 65526]. Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn> --- docs/formatdomain.rst | 18 ++++++------ docs/schemas/domaincommon.rng | 10 +++++++ src/conf/domain_conf.c | 54 +++++++++++++++++++++++++++++++++++ src/conf/domain_conf.h | 4 +++ src/qemu/qemu_command.c | 12 ++++++++ 5 files changed, 90 insertions(+), 8 deletions(-) diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index eb8c973cf1..ea69b61c70 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -1843,6 +1843,7 @@ Hypervisors may allow certain CPU / machine features to be toggled on/off. <hint-dedicated state='on'/> <poll-control state='on'/> <pv-ipi state='off'/> + <dirty-ring state='on' size='4096'/> </kvm> <xen> <e820_host state='on'/> @@ -1925,14 +1926,15 @@ are: ``kvm`` Various features to change the behavior of the KVM hypervisor. - ============== ============================================================================ ======= ============================ - Feature Description Value Since - ============== ============================================================================ ======= ============================ - hidden Hide the KVM hypervisor from standard MSR based discovery on, off :since:`1.2.8 (QEMU 2.1.0)` - hint-dedicated Allows a guest to enable optimizations when running on dedicated vCPUs on, off :since:`5.7.0 (QEMU 2.12.0)` - poll-control Decrease IO completion latency by introducing a grace period of busy waiting on, off :since:`6.10.0 (QEMU 4.2)` - pv-ipi Paravirtualized send IPIs on, off :since:`7.10.0 (QEMU 3.1)` - ============== ============================================================================ ======= ============================ + ============== ============================================================================ ====================================================== ============================ + Feature Description Value Since + ============== ============================================================================ ====================================================== ============================ + hidden Hide the KVM hypervisor from standard MSR based discovery on, off :since:`1.2.8 (QEMU 2.1.0)` + hint-dedicated Allows a guest to enable optimizations when running on dedicated vCPUs on, off :since:`5.7.0 (QEMU 2.12.0)` + poll-control Decrease IO completion latency by introducing a grace period of busy waiting on, off :since:`6.10.0 (QEMU 4.2)` + pv-ipi Paravirtualized send IPIs on, off :since:`7.10.0 (QEMU 3.1)` + dirty-ring Enable dirty ring feature on, off; size - must be power of 2, range [1024,65536] :since:`7.10.0 (QEMU 6.1)` + ============== ============================================================================ ====================================================== ============================ ``xen`` Various features to change the behavior of the Xen hypervisor. diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index f01b7a6470..5f9fe3cc58 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -7212,6 +7212,16 @@ <ref name="featurestate"/> </element> </optional> + <optional> + <element name="dirty-ring"> + <ref name="featurestate"/> + <optional> + <attribute name="size"> + <data type="unsignedInt"/> + </attribute> + </optional> + </element> + </optional> </interleave> </element> </define> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 552d43b845..6f3c925b55 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -205,6 +205,7 @@ VIR_ENUM_IMPL(virDomainKVM, "hint-dedicated", "poll-control", "pv-ipi", + "dirty-ring", ); VIR_ENUM_IMPL(virDomainXen, @@ -17589,6 +17590,25 @@ virDomainFeaturesKVMDefParse(virDomainDef *def, def->kvm_features[feature] = value; + /* dirty ring feature should parse size property */ + if (((virDomainKVM) feature == VIR_DOMAIN_KVM_DIRTY_RING) && + (value == VIR_TRISTATE_SWITCH_ON)) { + + if (virXMLPropUInt(node, "size", 0, VIR_XML_PROP_REQUIRED, + &def->dirty_ring_size) < 0) { + return -1; + } + + if ((def->dirty_ring_size != VIR_ROUND_UP_POWER_OF_TWO(def->dirty_ring_size)) || + def->dirty_ring_size < 1024 || + def->dirty_ring_size > 65536) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("dirty ring must be power of 2 and ranges [1024, 65536]")); + + return -1; + } + } + node = xmlNextElementSibling(node); } @@ -21824,7 +21844,27 @@ virDomainDefFeaturesCheckABIStability(virDomainDef *src, virTristateSwitchTypeToString(dst->kvm_features[i])); return false; } + break; + case VIR_DOMAIN_KVM_DIRTY_RING: + if (src->kvm_features[i] != dst->kvm_features[i]) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("State of KVM feature '%s' differs: " + "source: '%s', destination: '%s'"), + virDomainKVMTypeToString(i), + virTristateSwitchTypeToString(src->kvm_features[i]), + virTristateSwitchTypeToString(dst->kvm_features[i])); + return false; + } + + if (src->dirty_ring_size != dst->dirty_ring_size) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("dirty ring size of KVM feature '%s' differs: " + "source: '%d', destination: '%d'"), + virDomainKVMTypeToString(i), + src->dirty_ring_size, dst->dirty_ring_size); + return false; + } break; case VIR_DOMAIN_KVM_LAST: @@ -27872,6 +27912,20 @@ virDomainDefFormatFeatures(virBuffer *buf, def->kvm_features[j])); break; + case VIR_DOMAIN_KVM_DIRTY_RING: + if (def->kvm_features[j] != VIR_TRISTATE_SWITCH_ABSENT) { + virBufferAsprintf(&childBuf, "<%s state='%s'", + virDomainKVMTypeToString(j), + virTristateSwitchTypeToString(def->kvm_features[j])); + if (def->dirty_ring_size) { + virBufferAsprintf(&childBuf, " size='%d'/>\n", + def->dirty_ring_size); + } else { + virBufferAddLit(&childBuf, "/>\n"); + } + } + break; + case VIR_DOMAIN_KVM_LAST: break; } diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 8634960313..026edde88f 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2084,6 +2084,7 @@ typedef enum { VIR_DOMAIN_KVM_DEDICATED, VIR_DOMAIN_KVM_POLLCONTROL, VIR_DOMAIN_KVM_PVIPI, + VIR_DOMAIN_KVM_DIRTY_RING, VIR_DOMAIN_KVM_LAST } virDomainKVM; @@ -2933,6 +2934,9 @@ struct _virDomainDef { should be re-run before starting */ unsigned int scsiBusMaxUnit; + + /* size of dirty ring for each vcpu */ + unsigned int dirty_ring_size; }; diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 7a185061d8..18a72a79a8 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -6860,6 +6860,9 @@ qemuBuildCpuCommandLine(virCommand *cmd, virBufferAddLit(&buf, ",kvm-pv-ipi=off"); break; + case VIR_DOMAIN_KVM_DIRTY_RING: + break; + case VIR_DOMAIN_KVM_LAST: break; } @@ -7280,6 +7283,15 @@ qemuBuildAccelCommandLine(virCommand *cmd, case VIR_DOMAIN_VIRT_KVM: virBufferAddLit(&buf, "kvm"); + /* + * only handle the kvm case, tcg case use the legacy style + * not that either kvm or tcg can be specified by libvirt + * so do not worry about the conflict of specifying both + * */ + if (def->features[VIR_DOMAIN_FEATURE_KVM] == VIR_TRISTATE_SWITCH_ON && + def->kvm_features[VIR_DOMAIN_KVM_DIRTY_RING] == VIR_TRISTATE_SWITCH_ON) { + virBufferAsprintf(&buf, ",dirty-ring-size=%d", def->dirty_ring_size); + } break; case VIR_DOMAIN_VIRT_KQEMU: -- 2.27.0

On 11/23/21 15:36, huangy81@chinatelecom.cn wrote:
From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Dirty ring feature was introduced in qemu-6.1.0, this patch add the corresponding feature named 'dirty-ring', which enable dirty ring feature when starting vm.
To implement the dirty-ring feature, dirty_ring_size in struct "_virDomainDef" is introduced to hold the dirty ring size configured in xml, and it will be used as dirty-ring-size property of kvm accelerator when building qemu commandline, it is something like "-accel dirty-ring-size=xxx".
To enable the feature, the following XML needs to be added to the guest's domain description:
<features> <kvm> <dirty-ring state='on' size='xxx'> </kvm> </features>
If property "state=on", property "size" must be specified, which should be power of 2 and range in [1024, 65526].
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn> --- docs/formatdomain.rst | 18 ++++++------ docs/schemas/domaincommon.rng | 10 +++++++ src/conf/domain_conf.c | 54 +++++++++++++++++++++++++++++++++++ src/conf/domain_conf.h | 4 +++ src/qemu/qemu_command.c | 12 ++++++++ 5 files changed, 90 insertions(+), 8 deletions(-)
diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index eb8c973cf1..ea69b61c70 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -1843,6 +1843,7 @@ Hypervisors may allow certain CPU / machine features to be toggled on/off. <hint-dedicated state='on'/> <poll-control state='on'/> <pv-ipi state='off'/> + <dirty-ring state='on' size='4096'/>
I was confused at first, what units is @size in but looking into the qemu docs it's unit-less number: "[dirty-ring-size] it controls the size of the per-vCPU dirty page ring buffer (number of entries for each vCPU)." Therefore I'm okay with having it as a plain attribute. Otherwise for values with units (traditionally size units like KiB/MiB/...) I would advise to go with an extra element.
</kvm> <xen> <e820_host state='on'/> @@ -1925,14 +1926,15 @@ are: ``kvm`` Various features to change the behavior of the KVM hypervisor.
- ============== ============================================================================ ======= ============================ - Feature Description Value Since - ============== ============================================================================ ======= ============================ - hidden Hide the KVM hypervisor from standard MSR based discovery on, off :since:`1.2.8 (QEMU 2.1.0)` - hint-dedicated Allows a guest to enable optimizations when running on dedicated vCPUs on, off :since:`5.7.0 (QEMU 2.12.0)` - poll-control Decrease IO completion latency by introducing a grace period of busy waiting on, off :since:`6.10.0 (QEMU 4.2)` - pv-ipi Paravirtualized send IPIs on, off :since:`7.10.0 (QEMU 3.1)` - ============== ============================================================================ ======= ============================ + ============== ============================================================================ ====================================================== ============================ + Feature Description Value Since + ============== ============================================================================ ====================================================== ============================ + hidden Hide the KVM hypervisor from standard MSR based discovery on, off :since:`1.2.8 (QEMU 2.1.0)` + hint-dedicated Allows a guest to enable optimizations when running on dedicated vCPUs on, off :since:`5.7.0 (QEMU 2.12.0)` + poll-control Decrease IO completion latency by introducing a grace period of busy waiting on, off :since:`6.10.0 (QEMU 4.2)` + pv-ipi Paravirtualized send IPIs on, off :since:`7.10.0 (QEMU 3.1)` + dirty-ring Enable dirty ring feature on, off; size - must be power of 2, range [1024,65536] :since:`7.10.0 (QEMU 6.1)` + ============== ============================================================================ ====================================================== ============================
``xen`` Various features to change the behavior of the Xen hypervisor. diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index f01b7a6470..5f9fe3cc58 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -7212,6 +7212,16 @@ <ref name="featurestate"/> </element> </optional> + <optional> + <element name="dirty-ring"> + <ref name="featurestate"/> + <optional> + <attribute name="size"> + <data type="unsignedInt"/> + </attribute> + </optional> + </element> + </optional> </interleave> </element> </define> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 552d43b845..6f3c925b55 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -205,6 +205,7 @@ VIR_ENUM_IMPL(virDomainKVM, "hint-dedicated", "poll-control", "pv-ipi", + "dirty-ring", );
VIR_ENUM_IMPL(virDomainXen, @@ -17589,6 +17590,25 @@ virDomainFeaturesKVMDefParse(virDomainDef *def,
def->kvm_features[feature] = value;
+ /* dirty ring feature should parse size property */ + if (((virDomainKVM) feature == VIR_DOMAIN_KVM_DIRTY_RING) && + (value == VIR_TRISTATE_SWITCH_ON)) { + + if (virXMLPropUInt(node, "size", 0, VIR_XML_PROP_REQUIRED, + &def->dirty_ring_size) < 0) { + return -1; + } + + if ((def->dirty_ring_size != VIR_ROUND_UP_POWER_OF_TWO(def->dirty_ring_size)) ||
VIR_IS_POW2() which works even for other types than uint.
+ def->dirty_ring_size < 1024 || + def->dirty_ring_size > 65536) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("dirty ring must be power of 2 and ranges [1024, 65536]")); + + return -1; + } + } + node = xmlNextElementSibling(node); }
@@ -21824,7 +21844,27 @@ virDomainDefFeaturesCheckABIStability(virDomainDef *src, virTristateSwitchTypeToString(dst->kvm_features[i])); return false; } + break;
+ case VIR_DOMAIN_KVM_DIRTY_RING: + if (src->kvm_features[i] != dst->kvm_features[i]) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("State of KVM feature '%s' differs: " + "source: '%s', destination: '%s'"), + virDomainKVMTypeToString(i), + virTristateSwitchTypeToString(src->kvm_features[i]), + virTristateSwitchTypeToString(dst->kvm_features[i])); + return false; + } + + if (src->dirty_ring_size != dst->dirty_ring_size) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("dirty ring size of KVM feature '%s' differs: " + "source: '%d', destination: '%d'"), + virDomainKVMTypeToString(i), + src->dirty_ring_size, dst->dirty_ring_size); + return false; + } break;
case VIR_DOMAIN_KVM_LAST: @@ -27872,6 +27912,20 @@ virDomainDefFormatFeatures(virBuffer *buf, def->kvm_features[j])); break;
+ case VIR_DOMAIN_KVM_DIRTY_RING: + if (def->kvm_features[j] != VIR_TRISTATE_SWITCH_ABSENT) { + virBufferAsprintf(&childBuf, "<%s state='%s'", + virDomainKVMTypeToString(j), + virTristateSwitchTypeToString(def->kvm_features[j])); + if (def->dirty_ring_size) { + virBufferAsprintf(&childBuf, " size='%d'/>\n", + def->dirty_ring_size); + } else { + virBufferAddLit(&childBuf, "/>\n"); + } + } + break; + case VIR_DOMAIN_KVM_LAST: break; } diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 8634960313..026edde88f 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2084,6 +2084,7 @@ typedef enum { VIR_DOMAIN_KVM_DEDICATED, VIR_DOMAIN_KVM_POLLCONTROL, VIR_DOMAIN_KVM_PVIPI, + VIR_DOMAIN_KVM_DIRTY_RING,
VIR_DOMAIN_KVM_LAST } virDomainKVM; @@ -2933,6 +2934,9 @@ struct _virDomainDef { should be re-run before starting */
unsigned int scsiBusMaxUnit; + + /* size of dirty ring for each vcpu */ + unsigned int dirty_ring_size;
This feels weird. I haven't followed previous versions that closely, but what I did for TCG features is introducing a struct that lives close to other features. This could then be something like: typedef struct _virDomainFeatureKVM virDomainFeatureKVM; struct _virDomainFeatureKVM { int features[VIR_DOMAIN_KVM_LAST]; unsigned int dirty_ring_size; /* size of dirty ring for each vCPU, no units */ }; struct _virDomainDef { ... - int kvm_features[VIR_DOMAIN_KVM_LAST]; + virDomainFeatureKVM *kvm_features; ... }; So here's what I suggest doing - let me post a patch that changes 'int kvm_features' into a separate struct. I would squash it into yours but it turned out to be quite lengthy change. Then I'll do changes necessary for your patch (which will be trivial after that). Michal

On 12/14/21 17:22, Michal Prívozník wrote:
On 11/23/21 15:36, huangy81@chinatelecom.cn wrote:
From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Dirty ring feature was introduced in qemu-6.1.0, this patch add the corresponding feature named 'dirty-ring', which enable dirty ring feature when starting vm.
To implement the dirty-ring feature, dirty_ring_size in struct "_virDomainDef" is introduced to hold the dirty ring size configured in xml, and it will be used as dirty-ring-size property of kvm accelerator when building qemu commandline, it is something like "-accel dirty-ring-size=xxx".
To enable the feature, the following XML needs to be added to the guest's domain description:
<features> <kvm> <dirty-ring state='on' size='xxx'> </kvm> </features>
If property "state=on", property "size" must be specified, which should be power of 2 and range in [1024, 65526].
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn> --- docs/formatdomain.rst | 18 ++++++------ docs/schemas/domaincommon.rng | 10 +++++++ src/conf/domain_conf.c | 54 +++++++++++++++++++++++++++++++++++ src/conf/domain_conf.h | 4 +++ src/qemu/qemu_command.c | 12 ++++++++ 5 files changed, 90 insertions(+), 8 deletions(-)
diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index eb8c973cf1..ea69b61c70 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -1843,6 +1843,7 @@ Hypervisors may allow certain CPU / machine features to be toggled on/off. <hint-dedicated state='on'/> <poll-control state='on'/> <pv-ipi state='off'/> + <dirty-ring state='on' size='4096'/>
I was confused at first, what units is @size in but looking into the qemu docs it's unit-less number:
"[dirty-ring-size] it controls the size of the per-vCPU dirty page ring buffer (number of entries for each vCPU)."
Therefore I'm okay with having it as a plain attribute. Otherwise for values with units (traditionally size units like KiB/MiB/...) I would advise to go with an extra element.
</kvm> <xen> <e820_host state='on'/> @@ -1925,14 +1926,15 @@ are: ``kvm`` Various features to change the behavior of the KVM hypervisor.
- ============== ============================================================================ ======= ============================ - Feature Description Value Since - ============== ============================================================================ ======= ============================ - hidden Hide the KVM hypervisor from standard MSR based discovery on, off :since:`1.2.8 (QEMU 2.1.0)` - hint-dedicated Allows a guest to enable optimizations when running on dedicated vCPUs on, off :since:`5.7.0 (QEMU 2.12.0)` - poll-control Decrease IO completion latency by introducing a grace period of busy waiting on, off :since:`6.10.0 (QEMU 4.2)` - pv-ipi Paravirtualized send IPIs on, off :since:`7.10.0 (QEMU 3.1)` - ============== ============================================================================ ======= ============================ + ============== ============================================================================ ====================================================== ============================ + Feature Description Value Since + ============== ============================================================================ ====================================================== ============================ + hidden Hide the KVM hypervisor from standard MSR based discovery on, off :since:`1.2.8 (QEMU 2.1.0)` + hint-dedicated Allows a guest to enable optimizations when running on dedicated vCPUs on, off :since:`5.7.0 (QEMU 2.12.0)` + poll-control Decrease IO completion latency by introducing a grace period of busy waiting on, off :since:`6.10.0 (QEMU 4.2)` + pv-ipi Paravirtualized send IPIs on, off :since:`7.10.0 (QEMU 3.1)` + dirty-ring Enable dirty ring feature on, off; size - must be power of 2, range [1024,65536] :since:`7.10.0 (QEMU 6.1)` + ============== ============================================================================ ====================================================== ============================
``xen`` Various features to change the behavior of the Xen hypervisor. diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index f01b7a6470..5f9fe3cc58 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -7212,6 +7212,16 @@ <ref name="featurestate"/> </element> </optional> + <optional> + <element name="dirty-ring"> + <ref name="featurestate"/> + <optional> + <attribute name="size"> + <data type="unsignedInt"/> + </attribute> + </optional> + </element> + </optional> </interleave> </element> </define> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 552d43b845..6f3c925b55 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -205,6 +205,7 @@ VIR_ENUM_IMPL(virDomainKVM, "hint-dedicated", "poll-control", "pv-ipi", + "dirty-ring", );
VIR_ENUM_IMPL(virDomainXen, @@ -17589,6 +17590,25 @@ virDomainFeaturesKVMDefParse(virDomainDef *def,
def->kvm_features[feature] = value;
+ /* dirty ring feature should parse size property */ + if (((virDomainKVM) feature == VIR_DOMAIN_KVM_DIRTY_RING) && + (value == VIR_TRISTATE_SWITCH_ON)) { + + if (virXMLPropUInt(node, "size", 0, VIR_XML_PROP_REQUIRED, + &def->dirty_ring_size) < 0) { + return -1; + } + + if ((def->dirty_ring_size != VIR_ROUND_UP_POWER_OF_TWO(def->dirty_ring_size)) ||
VIR_IS_POW2() which works even for other types than uint.
+ def->dirty_ring_size < 1024 || + def->dirty_ring_size > 65536) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("dirty ring must be power of 2 and ranges [1024, 65536]")); + + return -1; + } + } + node = xmlNextElementSibling(node); }
@@ -21824,7 +21844,27 @@ virDomainDefFeaturesCheckABIStability(virDomainDef *src, virTristateSwitchTypeToString(dst->kvm_features[i])); return false; } + break;
+ case VIR_DOMAIN_KVM_DIRTY_RING: + if (src->kvm_features[i] != dst->kvm_features[i]) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("State of KVM feature '%s' differs: " + "source: '%s', destination: '%s'"), + virDomainKVMTypeToString(i), + virTristateSwitchTypeToString(src->kvm_features[i]), + virTristateSwitchTypeToString(dst->kvm_features[i])); + return false; + } + + if (src->dirty_ring_size != dst->dirty_ring_size) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("dirty ring size of KVM feature '%s' differs: " + "source: '%d', destination: '%d'"), + virDomainKVMTypeToString(i), + src->dirty_ring_size, dst->dirty_ring_size); + return false; + } break;
case VIR_DOMAIN_KVM_LAST: @@ -27872,6 +27912,20 @@ virDomainDefFormatFeatures(virBuffer *buf, def->kvm_features[j])); break;
+ case VIR_DOMAIN_KVM_DIRTY_RING: + if (def->kvm_features[j] != VIR_TRISTATE_SWITCH_ABSENT) { + virBufferAsprintf(&childBuf, "<%s state='%s'", + virDomainKVMTypeToString(j), + virTristateSwitchTypeToString(def->kvm_features[j])); + if (def->dirty_ring_size) { + virBufferAsprintf(&childBuf, " size='%d'/>\n", + def->dirty_ring_size); + } else { + virBufferAddLit(&childBuf, "/>\n"); + } + } + break; + case VIR_DOMAIN_KVM_LAST: break; } diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 8634960313..026edde88f 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2084,6 +2084,7 @@ typedef enum { VIR_DOMAIN_KVM_DEDICATED, VIR_DOMAIN_KVM_POLLCONTROL, VIR_DOMAIN_KVM_PVIPI, + VIR_DOMAIN_KVM_DIRTY_RING,
VIR_DOMAIN_KVM_LAST } virDomainKVM; @@ -2933,6 +2934,9 @@ struct _virDomainDef { should be re-run before starting */
unsigned int scsiBusMaxUnit; + + /* size of dirty ring for each vcpu */ + unsigned int dirty_ring_size;
This feels weird. I haven't followed previous versions that closely, but what I did for TCG features is introducing a struct that lives close to other features. This could then be something like:
typedef struct _virDomainFeatureKVM virDomainFeatureKVM; struct _virDomainFeatureKVM { int features[VIR_DOMAIN_KVM_LAST];
unsigned int dirty_ring_size; /* size of dirty ring for each vCPU, no units */ };
struct _virDomainDef { ... - int kvm_features[VIR_DOMAIN_KVM_LAST]; + virDomainFeatureKVM *kvm_features; ... };
So here's what I suggest doing - let me post a patch that changes 'int kvm_features' into a separate struct. I would squash it into yours but it turned out to be quite lengthy change. Then I'll do changes necessary for your patch (which will be trivial after that).
Michal
Ok, i'll rebase the master once the changes get merged and test if the dirty ring still works. -- Best Regards Hyman Huang(黄勇)

On 12/14/21 12:20, Hyman Huang wrote:
Ok, i'll rebase the master once the changes get merged and test if the dirty ring still works.
You can find all the patches applied in my branch: https://gitlab.com/MichalPrivoznik/libvirt/-/commits/review/ Michal

On 12/14/21 10:22, Michal Prívozník wrote:
On 11/23/21 15:36, huangy81@chinatelecom.cn wrote:
From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Dirty ring feature was introduced in qemu-6.1.0, this patch add the corresponding feature named 'dirty-ring', which enable dirty ring feature when starting vm.
To implement the dirty-ring feature, dirty_ring_size in struct "_virDomainDef" is introduced to hold the dirty ring size configured in xml, and it will be used as dirty-ring-size property of kvm accelerator when building qemu commandline, it is something like "-accel dirty-ring-size=xxx".
To enable the feature, the following XML needs to be added to the guest's domain description:
<features> <kvm> <dirty-ring state='on' size='xxx'> </kvm> </features>
If property "state=on", property "size" must be specified, which should be power of 2 and range in [1024, 65526].
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn> --- docs/formatdomain.rst | 18 ++++++------ docs/schemas/domaincommon.rng | 10 +++++++ src/conf/domain_conf.c | 54 +++++++++++++++++++++++++++++++++++ src/conf/domain_conf.h | 4 +++ src/qemu/qemu_command.c | 12 ++++++++ 5 files changed, 90 insertions(+), 8 deletions(-)
So here's what I suggest doing - let me post a patch that changes 'int kvm_features' into a separate struct. I would squash it into yours but it turned out to be quite lengthy change. Then I'll do changes necessary for your patch (which will be trivial after that).
Merged now. Congratulations on your first libvirt contribution! Michal

On 12/14/21 20:21, Michal Prívozník wrote:
On 12/14/21 10:22, Michal Prívozník wrote:
On 11/23/21 15:36, huangy81@chinatelecom.cn wrote:
From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Dirty ring feature was introduced in qemu-6.1.0, this patch add the corresponding feature named 'dirty-ring', which enable dirty ring feature when starting vm.
To implement the dirty-ring feature, dirty_ring_size in struct "_virDomainDef" is introduced to hold the dirty ring size configured in xml, and it will be used as dirty-ring-size property of kvm accelerator when building qemu commandline, it is something like "-accel dirty-ring-size=xxx".
To enable the feature, the following XML needs to be added to the guest's domain description:
<features> <kvm> <dirty-ring state='on' size='xxx'> </kvm> </features>
If property "state=on", property "size" must be specified, which should be power of 2 and range in [1024, 65526].
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn> --- docs/formatdomain.rst | 18 ++++++------ docs/schemas/domaincommon.rng | 10 +++++++ src/conf/domain_conf.c | 54 +++++++++++++++++++++++++++++++++++ src/conf/domain_conf.h | 4 +++ src/qemu/qemu_command.c | 12 ++++++++ 5 files changed, 90 insertions(+), 8 deletions(-)
So here's what I suggest doing - let me post a patch that changes 'int kvm_features' into a separate struct. I would squash it into yours but it turned out to be quite lengthy change. Then I'll do changes necessary for your patch (which will be trivial after that).
Merged now. Congratulations on your first libvirt contribution!
Michal
Thanks :) -- Best Regards Hyman Huang(黄勇)

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn> Update the KVM feature tests for dirty ring. Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn> --- tests/qemuxml2argvdata/kvm-features-off.xml | 1 + tests/qemuxml2argvdata/kvm-features.args | 2 +- tests/qemuxml2argvdata/kvm-features.xml | 1 + tests/qemuxml2xmloutdata/kvm-features-off.xml | 1 + tests/qemuxml2xmloutdata/kvm-features.xml | 1 + 5 files changed, 5 insertions(+), 1 deletion(-) diff --git a/tests/qemuxml2argvdata/kvm-features-off.xml b/tests/qemuxml2argvdata/kvm-features-off.xml index a1004a206b..fb7cbaf061 100644 --- a/tests/qemuxml2argvdata/kvm-features-off.xml +++ b/tests/qemuxml2argvdata/kvm-features-off.xml @@ -15,6 +15,7 @@ <hint-dedicated state='off'/> <poll-control state='off'/> <pv-ipi state='off'/> + <dirty-ring state='off'/> </kvm> </features> <cpu mode='host-passthrough' check='none'/> diff --git a/tests/qemuxml2argvdata/kvm-features.args b/tests/qemuxml2argvdata/kvm-features.args index 371c382b47..1789363736 100644 --- a/tests/qemuxml2argvdata/kvm-features.args +++ b/tests/qemuxml2argvdata/kvm-features.args @@ -12,7 +12,7 @@ QEMU_AUDIO_DRV=none \ -S \ -object secret,id=masterKey0,format=raw,file=/tmp/lib/domain--1-QEMUGuest1/master-key.aes \ -machine pc,usb=off,dump-guest-core=off \ --accel kvm \ +-accel kvm,dirty-ring-size=4096 \ -cpu host,kvm=off,kvm-hint-dedicated=on,kvm-poll-control=on \ -m 214 \ -realtime mlock=off \ diff --git a/tests/qemuxml2argvdata/kvm-features.xml b/tests/qemuxml2argvdata/kvm-features.xml index 51229a6c37..900431c4ff 100644 --- a/tests/qemuxml2argvdata/kvm-features.xml +++ b/tests/qemuxml2argvdata/kvm-features.xml @@ -15,6 +15,7 @@ <hint-dedicated state='on'/> <poll-control state='on'/> <pv-ipi state='on'/> + <dirty-ring state='on' size='4096'/> </kvm> </features> <cpu mode='host-passthrough' check='none'/> diff --git a/tests/qemuxml2xmloutdata/kvm-features-off.xml b/tests/qemuxml2xmloutdata/kvm-features-off.xml index 52a0ef0065..7ee6525cd9 100644 --- a/tests/qemuxml2xmloutdata/kvm-features-off.xml +++ b/tests/qemuxml2xmloutdata/kvm-features-off.xml @@ -15,6 +15,7 @@ <hint-dedicated state='off'/> <poll-control state='off'/> <pv-ipi state='off'/> + <dirty-ring state='off'/> </kvm> </features> <cpu mode='host-passthrough' check='none' migratable='off'/> diff --git a/tests/qemuxml2xmloutdata/kvm-features.xml b/tests/qemuxml2xmloutdata/kvm-features.xml index 72e66fcbf5..8ce3a2b987 100644 --- a/tests/qemuxml2xmloutdata/kvm-features.xml +++ b/tests/qemuxml2xmloutdata/kvm-features.xml @@ -15,6 +15,7 @@ <hint-dedicated state='on'/> <poll-control state='on'/> <pv-ipi state='on'/> + <dirty-ring state='on' size='4096'/> </kvm> </features> <cpu mode='host-passthrough' check='none' migratable='off'/> -- 2.27.0

Ping Any corrections and suggetions are welcome. Please review, thanks! 在 2021/11/23 22:36, huangy81@chinatelecom.cn 写道:
From: "Hyman Huang(黄勇)" <huangy81@chinatelecom.cn>
v7 - rebase on master - modify the following points according to the advice given by Peter 1. skip the -accel switch and reuse the existing commit d20ebdda2 'qemu: Switch to -accel' 2. remove the post-parse function and do the parse work in virDomainFeaturesKVMDefParse once for all 3. throw an error if "size" not specified when kvm-dirty-ring feature enabled in xml 4. fix the memory leak when parsing xml 5. use macro VIR_ROUND_UP_POWER_OF_TWO to check power-of-two 6. put error messages in one line 7. squash the last 2 commit into 1 8. add test for kvm-dirty-ring feature
Thanks for the careful reviews made by Peter.
Please review, Thanks!
Hyman
Ping for this series.
I still keep thinking the dirty ring feature is something good to have for libvirt.
qemu-6.1 has supported dirty ring feature and followed up with the commit 0e21bf24 "support dirtyrate at the granualrity of vcpu", which is a typical usage scenario of dirty ring. another usage scenario may be the implementation of per-vcpu auto-converge during live migration which is already being reviewed. so we can make full use of dirty ring feature if libvirt supports. and any corrections and comments about this series would be very appreciated.
Please review, Thanks!
Hyman
v6 - rebase on master
v5,v4: blank, just make v6 be the the latest version.
v3 - rebase master and fix the confilict when apply "conf: introduce dirty_ring_size in struct "_virDomainDef" to current master.
v2 - split patchset into 4 patches
- leave out the tcg case when building commandline.
- handle the VIR_DOMAIN_KVM_DIRTY_RING case independently in , virDomainFeatureDefParse and virDomainDefFeaturesCheckABIStability, do not integrate it with other cases...
- add dirty ring size check in virDomainDefFeaturesCheckABIStability
- modify zero checks on integers of dirty ring size in a explicit way.
- set the default value of dirty ring size in a post-parser callback.
- check the absence of kvm_feature in a explicit way.
- code clean of virTristateSwitchTypeToString function.
this version's modification base on Peter's advices mostly, thanks a lot, please review !
v1 since qemu has introduced a dirty ring feature in 6.1.0, may be it's the right time to introduce dirty ring in libvirt meanwhile.
this patch add feature named 'dirty-ring', which enable dirty ring feature when starting vm. to try this out, three things has done in this patchset:
- introduce QEMU_CAPS_ACCEL so the the libvirt can use it to select the right option when specifying the accelerator type.
- switch the option "-machine accel=xxx" to "-accel xxx" when specifying accelerator type once libvirt build QEMU command line, so that dirty-ring-size property can be passed to qemu when start vm.
- introduce dirty_ring_size to hold the ring size configured by user and pass dirty_ring_size when building qemu commandline if dirty ring feature enabled.
though dirty ring is per-cpu logically, the size of dirty ring is registered by 'struct kvm' in QEMU. so we would like to place the dirty_ring_size as a property of vm in Libvirt as the QEMU do.
the dirty ring feature is disabled by default, and if enabled, the default value of ring size if 4096 if size not configured.
for more details about dirty ring and "-accel" option, please refer to: https://lore.kernel.org/qemu-devel/20210108165050.406906-10-peterx@redhat.co... https://lore.kernel.org/qemu-devel/3aa73987-40e8-3619-0723-9f17f73850bd@redh...
please review, Thanks!
Best Regards !
Hyman Huang(黄勇) (2): qemu: support dirty ring feature tests: add test for kvm-dirty-ring feature
docs/formatdomain.rst | 18 ++++--- docs/schemas/domaincommon.rng | 10 ++++ src/conf/domain_conf.c | 54 +++++++++++++++++++ src/conf/domain_conf.h | 4 ++ src/qemu/qemu_command.c | 12 +++++ tests/qemuxml2argvdata/kvm-features-off.xml | 1 + tests/qemuxml2argvdata/kvm-features.args | 2 +- tests/qemuxml2argvdata/kvm-features.xml | 1 + tests/qemuxml2xmloutdata/kvm-features-off.xml | 1 + tests/qemuxml2xmloutdata/kvm-features.xml | 1 + 10 files changed, 95 insertions(+), 9 deletions(-)
participants (4)
-
huangy81@chinatelecom.cn
-
Hyman
-
Hyman Huang
-
Michal Prívozník