June 2024 - Devel - Libvirt List Archives

[PATCH rfcv4 00/13] LIBVIRT: X86: TDX support

by Zhenzhong Duan

Hi,

This series brings x86 TDX support to libvirt.

* What's TDX?

TDX stands for Trust Domain Extensions, which isolates VMs from the virtual-machine manager (VMM)/hypervisor and any other software on the platform.

To support TDX, multiple software components need to be updated: not only KVM, but also QEMU, guest Linux and the virtual BIOS. For more details, please check link [1]. This patchset extends one more software component, libvirt, to support TDX, so that one can start a TDX guest from a high level rather than by running qemu directly.

* Misc

QEMU resets a guest through software emulation, which a TDX guest does not support for security reasons. We therefore simulate reboot for a TDX guest by killing it and creating a new one in the FakeReboot framework.

Complete code can be found at [2]; matching qemu code can be found at [3].

There is a 'debug' property for the tdx-guest object which isn't in the matching qemu [3] yet. I keep it intentionally, as it will be implemented in qemu as an extension series of [3].

* Test

start/stop/reboot with virsh
stop/reboot trigger in guest
stop with on_poweroff=destroy/restart
reboot with on_reboot=destroy/restart

* Patch organization

- patch 1-4: Support query of TDX capabilities.
- patch 5-8: Add TDX type to launchsecurity framework.
- patch 9-11: Add reboot support to TDX guest
- patch 12-13: Add test and docs

TODO:
- update QEMU capabilities data in tests, depending on qemu TDX merged beforehand
- add reconnect logic in virsh command

[1] https://lore.kernel.org/kvm/cover.1708933498.git.isaku.yamahata@intel.com
[2] https://github.com/intel/libvirt-tdx/commits/tdx_for_upstream_rfcv4
[3] https://github.com/intel/qemu-tdx/tree/tdx-qemu-upstream-v5

Thanks
Zhenzhong

Changelog:
rfcv4:
- add a check to tools/virt-host-validate-qemu.c (Daniel)
- remove check of q35 (Daniel)
- model 'SocketAddress' QAPI in xml schema (Daniel)
- s/Quote-Generation-Service/quoteGenerationService/ (Daniel)
- define bits in tdx->policy and add validating logic (Daniel)
- presume QEMU chooses split kernel irqchip for TDX guest by default (Daniel)
- utilize existing FakeReboot framework to do reboot for TDX guest (Daniel)
- drop patch11 'conf: Add support to keep same domid for hard reboot' (Daniel)
- add test in tests/ to validate parsing and formatting logic (Daniel)
- add doc in docs/formatdomain.rst (Daniel)
- add R-B

rfcv3:
- Change to generate qemu cmdline with -bios
- drop firmware auto match as -bios is used
- add a hard reboot method to reboot TDX guest

rfcv3: https://www.mail-archive.com/devel@lists.libvirt.org/msg00385.html

rfcv2:
- give up using qmp cmd and check TDX directly on host for TDX capabilities.
- use launchsecurity framework to support TDX
- use <os>.<loader> for general loader
- add auto firmware match feature for TDX

An example TDVF firmware description file 70-edk2-x86_64-tdx.json:

    {
        "description": "UEFI firmware for x86_64, supporting Intel TDX",
        "interface-types": [
            "uefi"
        ],
        "mapping": {
            "device": "generic",
            "filename": "/usr/share/OVMF/OVMF_CODE-tdx.fd"
        },
        "targets": [
            {
                "architecture": "x86_64",
                "machines": [
                    "pc-q35-*"
                ]
            }
        ],
        "features": [
            "intel-tdx",
            "verbose-dynamic"
        ],
        "tags": [

        ]
    }

rfcv2: https://www.mail-archive.com/libvir-list@redhat.com/msg219378.html

Zhenzhong Duan (13):
  tools: Secure guest check for Intel in virt-host-validate
  qemu: Check if INTEL Trust Domain Extention support is enabled
  qemu: Add TDX capability
  conf: expose TDX feature in domain capabilities
  conf: add tdx as launch security type
  qemu: Add command line and validation for TDX type
  qemu: force special parameters enabled for TDX guest
  Add Intel TDX Quote Generation Service(QGS) support
  qemu: add FakeReboot support for TDX guest
  qemu: Support reboot command in guest
  qemu: Avoid duplicate FakeReboot for secure guest
  Add test cases for Intel TDX
  docs: domain: Add documentation for Intel TDX guest

 docs/formatdomain.rst | 68 ++++
 docs/formatdomaincaps.rst | 1 +
 src/conf/domain_capabilities.c | 1 +
 src/conf/domain_capabilities.h | 1 +
 src/conf/domain_conf.c | 312 ++++++++++++++++++
 src/conf/domain_conf.h | 75 +++++
 src/conf/schemas/domaincaps.rng | 9 +
 src/conf/schemas/domaincommon.rng | 135 ++++++++
 src/conf/virconftypes.h | 2 +
 src/qemu/qemu_capabilities.c | 36 +-
 src/qemu/qemu_capabilities.h | 1 +
 src/qemu/qemu_command.c | 139 ++++++++
 src/qemu/qemu_firmware.c | 1 +
 src/qemu/qemu_monitor.c | 28 +-
 src/qemu/qemu_monitor.h | 2 +-
 src/qemu/qemu_monitor_json.c | 6 +-
 src/qemu/qemu_namespace.c | 1 +
 src/qemu/qemu_process.c | 75 +++++
 src/qemu/qemu_validate.c | 44 +++
 ...unch-security-tdx-qgs-fd.x86_64-latest.xml | 77 +++++
 .../launch-security-tdx-qgs-fd.xml | 30 ++
 ...ch-security-tdx-qgs-inet.x86_64-latest.xml | 77 +++++
 .../launch-security-tdx-qgs-inet.xml | 30 ++
 ...ch-security-tdx-qgs-unix.x86_64-latest.xml | 77 +++++
 .../launch-security-tdx-qgs-unix.xml | 30 ++
 ...h-security-tdx-qgs-vsock.x86_64-latest.xml | 77 +++++
 .../launch-security-tdx-qgs-vsock.xml | 30 ++
 tests/qemuxmlconftest.c | 24 ++
 tools/virt-host-validate-common.c | 22 +-
 tools/virt-host-validate-common.h | 1 +
 30 files changed, 1407 insertions(+), 5 deletions(-)
 create mode 100644 tests/qemuxmlconfdata/launch-security-tdx-qgs-fd.x86_64-latest.xml
 create mode 100644 tests/qemuxmlconfdata/launch-security-tdx-qgs-fd.xml
 create mode 100644 tests/qemuxmlconfdata/launch-security-tdx-qgs-inet.x86_64-latest.xml
 create mode 100644 tests/qemuxmlconfdata/launch-security-tdx-qgs-inet.xml
 create mode 100644 tests/qemuxmlconfdata/launch-security-tdx-qgs-unix.x86_64-latest.xml
 create mode 100644 tests/qemuxmlconfdata/launch-security-tdx-qgs-unix.xml
 create mode 100644 tests/qemuxmlconfdata/launch-security-tdx-qgs-vsock.x86_64-latest.xml
 create mode 100644 tests/qemuxmlconfdata/launch-security-tdx-qgs-vsock.xml

--
2.34.1
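To make the rfcv2 "auto firmware match" idea concrete, here is a minimal Python sketch that checks whether a firmware descriptor like the one quoted in the changelog advertises TDX for a given architecture and machine type. It is an illustration only: the function name supports_tdx and the shortened descriptor are assumptions for this example, it is not the matching logic in libvirt's qemu_firmware.c, and rfcv3 dropped auto matching in favour of -bios anyway.

```python
import fnmatch
import json

# Trimmed copy of the example descriptor from the cover letter.
DESCRIPTOR = """
{
    "description": "UEFI firmware for x86_64, supporting Intel TDX",
    "interface-types": ["uefi"],
    "mapping": {"device": "generic",
                "filename": "/usr/share/OVMF/OVMF_CODE-tdx.fd"},
    "targets": [{"architecture": "x86_64", "machines": ["pc-q35-*"]}],
    "features": ["intel-tdx", "verbose-dynamic"],
    "tags": []
}
"""


def supports_tdx(descriptor, arch, machine):
    """Hypothetical helper: True if the descriptor lists the intel-tdx
    feature and has a target matching the architecture and machine type."""
    if "intel-tdx" not in descriptor.get("features", []):
        return False
    for target in descriptor.get("targets", []):
        if target.get("architecture") != arch:
            continue
        # Machine entries are glob patterns such as "pc-q35-*".
        patterns = target.get("machines", [])
        if any(fnmatch.fnmatchcase(machine, pat) for pat in patterns):
            return True
    return False


fw = json.loads(DESCRIPTOR)
print(supports_tdx(fw, "x86_64", "pc-q35-9.0"))
print(supports_tdx(fw, "x86_64", "pc-i440fx-9.0"))
```

The glob match mirrors the "pc-q35-*" pattern in the descriptor; a real implementation would also have to weigh the other features and the interface type.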

3 months, 2 weeks

6
45
0 / 0

[PATCH 0/4] hw/s390x: Alias @dump-skeys -> @dump-s390-skey and deprecate

by Philippe Mathieu-Daudé

We are trying to unify all qemu-system-FOO binaries into a single binary. In order to do that we need to remove QAPI target-specific code.

@dump-skeys is only available on qemu-system-s390x. This series renames it to @dump-s390-skey, making it available on other binaries. We take care of backward compatibility via deprecation.

Philippe Mathieu-Daudé (4):
  hw/s390x: Introduce the @dump-s390-skeys QMP command
  hw/s390x: Introduce the 'dump_s390_skeys' HMP command
  hw/s390x: Deprecate the HMP 'dump_skeys' command
  hw/s390x: Deprecate the QMP @dump-skeys command

 docs/about/deprecated.rst | 5 +++++
 qapi/misc-target.json | 5 +++++
 qapi/misc.json | 18 ++++++++++++++++++
 include/monitor/hmp.h | 1 +
 hw/s390x/s390-skeys-stub.c | 24 ++++++++++++++++++++++++
 hw/s390x/s390-skeys.c | 19 +++++++++++++++++--
 hmp-commands.hx | 17 +++++++++++++++--
 hw/s390x/meson.build | 5 +++++
 8 files changed, 90 insertions(+), 4 deletions(-)
 create mode 100644 hw/s390x/s390-skeys-stub.c

--
2.41.0

4 months, 1 week

7
24
0 / 0

[libvirt] [PATCH] Fix python error reporting for some storage operations

by Cole Robinson

In the python bindings, all vir* classes expect to be passed a virConnect object when instantiated. Before the storage stuff, these classes were only instantiated in virConnect methods, so the generator is hardcoded to pass 'self' as the connection instance to these classes.

The problem is that there are some methods that return pool or vol instances which aren't called from virConnect: you can look up a storage volume's associated pool, and can look up volumes from a pool. In these cases passing 'self' doesn't give the vir* instance a connection, so when it comes time to raise an exception crap hits the fan.

Rather than rework the generator to accommodate this edge case, I just fixed the init functions for virStorage* to pull the associated connection out of the passed value if it's not a virConnect instance.

Thanks,
Cole

diff --git a/python/generator.py b/python/generator.py
index 01a17da..c706b19 100755
--- a/python/generator.py
+++ b/python/generator.py
@@ -962,8 +962,12 @@ def buildWrappers():
                 list = reference_keepers[classname]
                 for ref in list:
                     classes.write("        self.%s = None\n" % ref[1])
-            if classname in [ "virDomain", "virNetwork", "virStoragePool", "virStorageVol" ]:
+            if classname in [ "virDomain", "virNetwork" ]:
                 classes.write("        self._conn = conn\n")
+            elif classname in [ "virStorageVol", "virStoragePool" ]:
+                classes.write("        self._conn = conn\n" + \
+                              "        if not isinstance(conn, virConnect):\n" + \
+                              "            self._conn = conn._conn\n")
             classes.write("        if _obj != None:self._o = _obj;return\n")
             classes.write("        self._o = None\n\n");
             destruct=None
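The effect of the generated __init__ change can be shown with a small stand-alone sketch. The classes below are stand-ins invented for this illustration (Connect, StoragePool, StorageVol and the pool() method are assumptions, not the real generated bindings); they only show how a wrapper created from another wrapper still ends up holding the real connection.

```python
class Connect:
    """Stand-in for virConnect; no libvirt import is needed here."""


class StoragePool:
    def __init__(self, conn):
        # 'conn' is either a Connect or another wrapper holding one in _conn.
        self._conn = conn
        if not isinstance(conn, Connect):
            self._conn = conn._conn


class StorageVol:
    def __init__(self, conn):
        self._conn = conn
        if not isinstance(conn, Connect):
            self._conn = conn._conn

    def pool(self):
        # Passes 'self', the way the hardcoded generator output does.
        return StoragePool(self)


conn = Connect()
vol = StorageVol(conn)
pool = vol.pool()
print(pool._conn is conn)
```

Without the isinstance check, pool._conn would be the StorageVol object, and any error path that expects a connection would fail.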

4 months, 1 week

4
3
0 / 0

[RFC 0/4] meson: Enable -Wundef

by Andrea Bolognani

A few days ago I posted a patch[1] that addresses an issue introduced when a meson check was dropped but some uses of the corresponding WITH_ macro were not removed at the same time.

That got me thinking about what we can do to prevent such scenarios from happening again in the future. I have come up with something that I think would be effective, but since applying the approach throughout the entire codebase would require a non-trivial amount of work, I figured I'd ask for feedback before embarking on it.

The idea is that there are two types of macros we can use for conditional compilation: external ones, coming from the OS or other libraries, and internal ones, which are the result of meson tests.

The external ones (e.g. SIOCSIFFLAGS, __APPLE__) are usually only defined if they apply, so it is correct to check for their presence with #ifdef. Using #if will also work, as undefined macros evaluate to zero, but it's not good practice to use them that way. If -Wundef has been passed to the compiler, those incorrect uses will be reported (only on platforms where they are not defined, of course).

The internal ones (e.g. WITH_QEMU, WITH_STRUCT_IFREQ) are similar, but in this case we control their definition. This means that using [...] means that the feature is not available on the machine we're building on, but it could also mean that we've removed the meson check and forgot to update all users of the macro. In this case, -Wundef would work 100% reliably to detect the issue: if the meson check doesn't exist, neither will the macro, regardless of what platform we're building on.

So the approach I'm suggesting is to use a syntax-check rule to ensure that internal macros are only ever checked with #if instead of #ifdef.

Of course this requires a full sweep to fix all cases in which we're not already doing things according to the proposal. Should be fairly easy, if annoying. A couple of examples are included here for demonstration purposes.

The bigger impact is going to be on the build system. Right now we generally only define WITH_ macros if the check passed, but that will have to change and the result is going to be quite a bit of additional meson code I'm afraid.

Thoughts?

[1] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/S...

Andrea Bolognani (4):
  configmake: Check for WIN32 correctly
  meson: Always define WITH_*_DECL macros
  syntax-check: Ensure WITH_ macros are used correctly
  meson: Enable -Wundef

 build-aux/syntax-check.mk | 5 +++++
 configmake.h.in | 2 +-
 meson.build | 3 +++
 tests/virmockstathelpers.c | 28 ++++++++++++++--------------
 4 files changed, 23 insertions(+), 15 deletions(-)

--
2.43.2
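As an illustration of what such a syntax-check rule has to catch, here is a minimal Python sketch that flags WITH_ macros tested for definedness rather than for their value. This is an assumption-laden stand-in: the real rule in the series lives in build-aux/syntax-check.mk, and the regular expression and the function name find_bad_with_checks are made up for this example.

```python
import re

# Flags "#ifdef WITH_FOO", "#ifndef WITH_FOO" and "defined(WITH_FOO)".
# A plain "#if WITH_FOO" is the form the proposal wants, so it is not matched.
BAD_CHECK = re.compile(
    r"^\s*#\s*(?:ifdef|ifndef)\s+WITH_\w+"
    r"|\bdefined\s*\(?\s*WITH_\w+"
)


def find_bad_with_checks(source):
    """Return (line_number, text) for each line that tests a WITH_ macro
    for being defined instead of testing its value."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if BAD_CHECK.search(line):
            hits.append((number, line.strip()))
    return hits


SAMPLE = """\
#ifdef __APPLE__
#if WITH_QEMU
#ifdef WITH_STRUCT_IFREQ
#if defined(WITH_QEMU) && WITH_LXC
"""

for number, text in find_bad_with_checks(SAMPLE):
    print(f"{number}: {text}")
```

External macros such as __APPLE__ are left alone, which matches the distinction drawn above: #ifdef stays correct for them, while internal WITH_ macros are expected to always be defined to 0 or 1.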

4 months, 3 weeks

4
8
0 / 0

Re: [PATCH] Check snapshot disk is not NULL when searching it in the VM config

by Peter Krempa

On Mon, May 20, 2024 at 14:48:47 +0000, Efim Shevrin via Devel wrote:
> Hello,
>
> > If vmdisk is NULL, shouldn't this function (qemuSnapshotDeleteValidate()) return an error?
>
> I think this qemuSnapshotDeleteValidate should not return an error.
>
> It seems to me that when vmdisk is NULL, this does not invalidate
> the snapshot itself, but indicates that the config has changed since
> the snapshot was done. And if the VM config has changed, this adds evidence that the snapshot should be deleted,
> because the snapshot does not reflect the real vm config.
>
> Since we do not have an analogue of the --force option for deleting a snapshot, in the case when qemuSnapshotDeleteValidate returns
> an error when vmdisk is NULL, we will never delete a snapshot which has invalid disk.

Snapshot deletion does have something that can be considered force and that is the '--metadata' option that removes just the snapshot definition (metadata) and doesn't touch the disk images.

> > Similarly, disk can be NULL too
> Thank you for the comment regarding the disk variable. I`ve reworked patch.
>
> When creating a snapshot of a VM with multiple hard disks,
> the snapshot takes into account the presence of all disks
> in the system. If, over time, one of the disks is deleted,
> the snapshot will continue to store knowledge of the deleted disk.
> This results in the fact that at the moment of deleting the snapshot,
> at the validation stage, a disk from the snapshot will be searched which
> is not in the VM configuration. As a result, vmdisk variable will
> be equal to NULL. Dereferencing a null pointer at the time of calling
> virStorageSourceIsSameLocation(vmdisk->src, disk->src)
> will result in SIGSEGV.

Crashing is obviously not okay ...

> Also, the disk variable can also be equal to NULL and this
> requires to check that disk != NULL before calling the
> virStorageSourceIsSameLocation function to avoid SIGSEGV.

.. but going ahead with the snapshot deletion isn't always okay either.

The disk isn't referenced by the VM so the disk state can't be merged, while the state would be merged for any other disk.

When reverting back to a previous snapshot, which is still referencing the older state of the disk which was removed from the VM, the VM would see that the image state of disks that were present at deletion would contain the merged state, but only a partial state for the disk which was later removed.
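The two requirements from this exchange (never dereference a missing disk, and do not silently proceed either) can be sketched in a few lines of Python. This is only an illustration under assumed names: Disk, find_disk and validate_delete are hypothetical and do not correspond to libvirt's C data structures or to qemuSnapshotDeleteValidate() itself.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    name: str  # target name, e.g. "vda"
    src: str   # image path


def find_disk(disks, name):
    """Return the disk called 'name', or None if it is not present."""
    return next((d for d in disks if d.name == name), None)


def validate_delete(snapshot_disks, vm_disks):
    """Return a list of problems; an empty list means deletion may proceed."""
    problems = []
    for snap_disk in snapshot_disks:
        vm_disk = find_disk(vm_disks, snap_disk.name)
        if vm_disk is None:
            # Checked before any attribute access, so nothing is
            # dereferenced, and the caller gets an error, not a crash.
            problems.append(
                f"disk '{snap_disk.name}' is no longer in the VM config"
            )
    return problems


snap = [Disk("vda", "/img/a.qcow2"), Disk("vdb", "/img/b.qcow2")]
vm = [Disk("vda", "/img/a.qcow2")]
print(validate_delete(snap, vm))
```

A caller that gets a non-empty list would refuse the merge-style deletion; as noted in the reply, removing only the snapshot definition with '--metadata' remains available for that case.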

5 months, 2 weeks

3
3
0 / 0

[libvirt PATCH] qemu_snapshot: allow reverting to external disk only snapshot

by Pavel Hrdina

When a snapshot is created with the disk-only flag it is always an external snapshot without memory state. Historically, when there was no support for reverting external snapshots, this produced an error message:

  error: Failed to revert snapshot s1
  error: internal error: Invalid target domain state 'disk-snapshot'. Refusing snapshot reversion

Now we can simply consider this as reverting to an offline snapshot, as the possible damage to the file system is already done at the point of snapshot creation.

Resolves: https://issues.redhat.com/browse/RHEL-21549
Signed-off-by: Pavel Hrdina <phrdina(a)redhat.com>
---
 src/qemu/qemu_snapshot.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c
index 0cac0c4146..7964f70553 100644
--- a/src/qemu/qemu_snapshot.c
+++ b/src/qemu/qemu_snapshot.c
@@ -2606,6 +2606,7 @@ qemuSnapshotRevert(virDomainObj *vm,
     case VIR_DOMAIN_SNAPSHOT_SHUTDOWN:
     case VIR_DOMAIN_SNAPSHOT_SHUTOFF:
     case VIR_DOMAIN_SNAPSHOT_CRASHED:
+    case VIR_DOMAIN_SNAPSHOT_DISK_SNAPSHOT:
         ret = qemuSnapshotRevertInactive(vm, snapshot, snap,
                                          driver, cfg,
                                          &inactiveConfig,
@@ -2617,8 +2618,6 @@ qemuSnapshotRevert(virDomainObj *vm,
                        _("qemu doesn't support reversion of snapshot taken in PMSUSPENDED state"));
         goto endjob;
 
-    case VIR_DOMAIN_SNAPSHOT_DISK_SNAPSHOT:
-        /* Rejected earlier as an external snapshot */
     case VIR_DOMAIN_SNAPSHOT_NOSTATE:
     case VIR_DOMAIN_SNAPSHOT_BLOCKED:
     case VIR_DOMAIN_SNAPSHOT_LAST:
--
2.43.0
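The behavioural change in the switch statement can be summarised with a small Python sketch. It models only the states visible in the hunk; the lower-case state strings, the function name revert_action and the final catch-all error are assumptions for illustration and say nothing about how libvirt handles running or paused snapshots.

```python
# States that the patched switch routes to the inactive revert path.
INACTIVE_STATES = {"shutdown", "shutoff", "crashed", "disk-snapshot"}


def revert_action(state):
    """Return the revert path for a snapshot state, or raise if refused."""
    if state in INACTIVE_STATES:
        return "revert-inactive"
    if state == "pmsuspended":
        raise ValueError(
            "qemu doesn't support reversion of snapshot taken in "
            "PMSUSPENDED state"
        )
    # Assumption: states not shown in the hunk are simply rejected here.
    raise ValueError(f"unhandled snapshot state '{state}'")


print(revert_action("disk-snapshot"))
```

Before the patch, "disk-snapshot" would have fallen into the rejected group; after it, a disk-only snapshot is treated like an offline one.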

5 months, 2 weeks

2
1
0 / 0

[PATCH 00/12] Introduce SEV-SNP support

by Michal Privoznik

SEV-SNP support just landed in QEMU. Here is the first round of patches to incorporate support into libvirt.

TODOs (aka problems of future me):
- Teach tools/virt-qemu-sev-validate how to deal with SEV-SNP
- Try to find a SEV-SNP machine and test these patches in the real world
- Write a kbase article on attestation with SEV-SNP

Michal Prívozník (12):
  qemu_monitor_json: Report error in error paths in SEV related code
  conf: Move some members of virDomainSEVDef into virDomainSEVCommonDef
  conf: Separate SEV formatting into a function
  Drop needless typecast to virDomainLaunchSecurity
  src: Convert some _virDomainSecDef::sectype checks to switch()
  qemu_monitor: Allow querying SEV-SNP state in 'query-sev'
  qemu: Report snp-policy in virDomainGetLaunchSecurityInfo()
  qemu_capabilities: Introduce QEMU_CAPS_SEV_SNP_GUEST
  conf: Introduce SEV-SNP support
  qemu: Build cmd line for SEV-SNP
  qemu: Allow setting launch security for SEV-SNP
  qemu_firmware: Pick the right firmware for SEV-SNP guests

 docs/formatdomain.rst | 108 ++++++++++++
 include/libvirt/libvirt-domain.h | 10 ++
 src/conf/domain_conf.c | 156 ++++++++++++++----
 src/conf/domain_conf.h | 28 +++-
 src/conf/domain_validate.c | 44 +++++
 src/conf/schemas/domaincommon.rng | 73 ++++++--
 src/conf/virconftypes.h | 4 +
 src/qemu/qemu_capabilities.c | 4 +
 src/qemu/qemu_capabilities.h | 3 +
 src/qemu/qemu_cgroup.c | 19 ++-
 src/qemu/qemu_command.c | 56 ++++++-
 src/qemu/qemu_driver.c | 60 +++++--
 src/qemu/qemu_firmware.c | 20 ++-
 src/qemu/qemu_monitor.c | 7 +-
 src/qemu/qemu_monitor.h | 41 ++++-
 src/qemu/qemu_monitor_json.c | 67 ++++++--
 src/qemu/qemu_monitor_json.h | 8 +-
 src/qemu/qemu_namespace.c | 3 +-
 src/qemu/qemu_process.c | 34 ++--
 src/qemu/qemu_validate.c | 13 +-
 src/security/security_dac.c | 34 +++-
 .../caps_9.1.0_x86_64.xml | 1 +
 .../firmware/60-edk2-ovmf-x64-amdsev.json | 1 +
 tests/qemumonitorjsontest.c | 65 +++++++-
 ...launch-security-sev-snp.x86_64-latest.args | 35 ++++
 .../launch-security-sev-snp.x86_64-latest.xml | 1 +
 .../launch-security-sev-snp.xml | 47 ++++++
 tests/qemuxmlconftest.c | 2 +
 28 files changed, 817 insertions(+), 127 deletions(-)
 create mode 100644 tests/qemuxmlconfdata/launch-security-sev-snp.x86_64-latest.args
 create mode 120000 tests/qemuxmlconfdata/launch-security-sev-snp.x86_64-latest.xml
 create mode 100644 tests/qemuxmlconfdata/launch-security-sev-snp.xml

--
2.44.2

5 months, 3 weeks

5
31
0 / 0

[PATCH v2 0/4] multiple memory backend support for CPR Live Updates

by mgalaxy＠akamai.com

From: Michael Galaxy <mgalaxy(a)akamai.com>

CPR-based support for whole-hypervisor kexec-based live updates is now finally merged into QEMU. In support of this, we need NUMA to be supported in these kinds of environments. To do this we use a technology called PMEM (persistent memory) in Linux, which underpins the ability for CPR Live Updates to work so that QEMU memory can remain in RAM and be recovered after a kexec operation has completed. Our systems are highly NUMA-aware, and so this patch series enables NUMA awareness for live updates.

Further, we make a small change that allows live migrations to work between *non* PMEM-based systems and PMEM-based systems (and vice-versa). This allows for seamless upgrades from non-live-compatible systems to live-update-compatible systems without any downtime.

Michael Galaxy (4):
  qemu.conf changes to support multiple memory backend
  Support live migration between file-backed memory and anonymous memory.
  Update unit test to support multiple memory backends
  Update documentation to reflect memory_backing_dir change in qemu.conf

 NEWS.rst | 7 ++
 docs/kbase/virtiofs.rst | 2 +
 src/qemu/qemu.conf.in | 2 +
 src/qemu/qemu_command.c | 8 ++-
 src/qemu/qemu_conf.c | 141 +++++++++++++++++++++++++++++++++++-----
 src/qemu/qemu_conf.h | 14 ++--
 src/qemu/qemu_domain.c | 24 +++++--
 src/qemu/qemu_driver.c | 29 +++++----
 src/qemu/qemu_hotplug.c | 6 +-
 src/qemu/qemu_process.c | 44 +++++++------
 src/qemu/qemu_process.h | 7 +-
 tests/testutilsqemu.c | 5 +-
 12 files changed, 221 insertions(+), 68 deletions(-)

--
2.34.1

8 months, 4 weeks

3
26
0 / 0

[PATCH RFC v3 00/16] Support throttle block filters

by wucf＠linux.ibm.com

From: Chun Feng Wu <wucf(a)linux.ibm.com>

Hi,

I am thinking of leveraging the "throttle block filter" in QEMU to support more flexible I/O limits (e.g. tiered I/O groups). One sample provided by the QEMU doc is:
https://github.com/qemu/qemu/blob/master/docs/throttle.txt
"For example, let's say that we have three different drives and we want to set I/O limits for each one of them and an additional set of limits for the combined I/O of all three drives."

The implementation idea is to
- Define throttle groups(limit) in domain
- Define throttle filter to reference throttle group within disk
- Within domain disk, throttle filters reference multiple throttle groups to form a filter chain to apply multiple limits in QEMU like the above sample
- Add new virsh cmds for throttle group management:
    throttlegroupset     Add or update a throttling group.
    throttlegroupdel     Delete a throttling group.
    throttlegroupinfo    Get a throttling group.
    throttlegrouplist    list all domain throttlegroups
- Update "attach-disk" to add one more option "--throttle-groups" to apply throttle filters, e.g. "virsh attach-disk $VM_ID ${DISK_PATH}/vm1_disk_2.qcow2 vdd --driver qemu --subdriver qcow2 --targetbus virtio --throttle-groups limit2,limit012"
- I chose the above semantics as I felt they're appropriate; if there are better ones please kindly suggest.

Note, this implementation requires flag "QEMU_CAPS_OBJECT_JSON".

From the QMP perspective, the sample flow works this way:
- Throttle group creation:
  virsh qemu-monitor-command 1 '{"execute":"object-add", "arguments":{"qom-type":"throttle-group","id":"limit0","limits":{"iops-total":200,"iops-read":0,"iops-total-max":200,"iops-total-max-length":1}}}'
  virsh qemu-monitor-command 1 '{"execute":"object-add", "arguments":{"qom-type":"throttle-group","id":"limit1","limits":{"iops-total":250,"iops-read":0,"iops-total-max":250,"iops-total-max-length":1}}}'
  virsh qemu-monitor-command 1 '{"execute":"object-add", "arguments":{"qom-type":"throttle-group","id":"limit2","limits":{"iops-total":300,"iops-read":0,"iops-total-max":300,"iops-total-max-length":1}}}'
  virsh qemu-monitor-command 1 '{"execute":"object-add", "arguments":{"qom-type":"throttle-group","id":"limit012","limits":{"iops-total":400,"iops-read":0,"iops-total-max":400,"iops-total-max-length":1}}}'
- Chain up filters during attaching disk to apply two filters (limit0 and limit012):
  virsh qemu-monitor-command 1 '{"execute":"blockdev-add", "arguments": {"driver":"file","filename":"/virt/disks/vm1_disk_1.qcow2","node-name":"test-3-storage","auto-read-only":true,"discard":"unmap"}}'
  virsh qemu-monitor-command 1 '{"execute":"blockdev-add", "arguments":{"node-name":"test-4-format","read-only":false,"driver":"qcow2","file":"test-3-storage","backing":null}}'
  virsh qemu-monitor-command 1 '{"execute":"blockdev-add", "arguments":{"driver":"throttle","node-name":"libvirt-5-filter","throttle-group": "limit0","file":"test-4-format"}}'
  virsh qemu-monitor-command 1 '{"execute":"blockdev-add", "arguments": {"driver":"throttle","node-name":"libvirt-6-filter","throttle-group":"limit012","file":"libvirt-5-filter"}}'
  virsh qemu-monitor-command 1 '{"execute": "device_add", "arguments": {"driver":"virtio-blk-pci","scsi":false,"bus":"pci.0","addr":"0x5","drive":"libvirt-6-filter","id":"virtio-disk1"}}'

This patchset includes:
- Throttle group XML schema definition in patch 1
- Throttle filter XML schema definition in patch 2
- Throttle group struct definition, parsing and formatting in patch 3
- Throttle filter struct definition, parsing and formatting in patch 4
- New QMP processing to update and get throttle group in patch 5&6
- New API definition and implementation in patch 7
- QEMU driver implementation in patch 8
- Hotplug processing for throttle filters in patch 9
- Extract common iotune validation in patch 10
- qemuProcessLaunch flow implementation for throttle group in patch 11
- qemuProcessLaunch flow implementation for throttle filter in patch 12
- Domain XML test for processing throttle groups and filters in patch 13
- Test new implemented driver in patch 14
- New virsh cmd implementation for group in patch 15
- Update Virsh cmd "attach_disk" to include throttle filters in patch 16

v3 changes:
- re-org commits by splitting changes containing throttle group and filters
- update commits msgs
- move schema commits to be the first ones
- refactor "diskIoTune" to extract common schema "iotune"
- add new tests for throttle groups and filters in qemuxmlconftest
- check flag "QEMU_CAPS_OBJECT_JSON" when preparing "-object" (qemu: command: Support throttle groups during qemuProcessLaunch) or creating throttle group (qemu: Implement qemu driver for throttle API)
- when creating throttle group through "object-add" (qemu: Implement qemu driver for throttle API), reuse "qemuMonitorAddObject" to check if "objectAddNoWrap"("props") is required
- remove "virObject parent;" in "_virDomainThrottleFilterDef" in domain_conf.h
- remove "virDomainThrottleGroupIndexByName" in both domain_conf.h and domain_conf.c
- remove "virDomainThrottleFilterDefNew" in domain_conf.c
- update "virDomainThrottleFilterDefFree" to use "g_free" rather than "VIR_FREE" in domain_conf.c
- update "virDomainDiskThrottleFilterDefParse" to remove "xmlXPathContextPtr ctx" parameter and check NULL against "filter->group_name" in domain_conf.c
- use "virBufferEscapeString" instead of "virBufferAsprintf" in "virDomainDiskDefFormatThrottleFilterChain"
- use "group->val > 0" instead of "if (group->val)" in FORMAT_THROTTLE_GROUP
- remove NULL check for "group->group_name" since virBufferEscapeString checked NULL already in "virDomainThrottleGroupFormat"
- I haven't added a new conf module (src/conf/virdomainthrottle.c/h) because "virDomainThrottleGroupDef" is an alias of "_virDomainBlockIoTuneInfo"; this tries to avoid a circular dependency
- remove "NULLSTR" in qemuMonitorUpdateThrottleGroup and qemuMonitorGetThrottleGroup in qemu_monitor.c
- refactor "qemuMonitorMakeThrottleGroupLimits" to use virJSONValueObjectAdd in qemu_monitor_json.c
- use "g_strdup_printf" to avoid static buffers in "qemuMonitorJSONGetThrottleGroup"
- remove virReportError after qemuMonitorJSONGetReply in "qemuMonitorJSONGetThrottleGroup" to avoid overriding error
- remove "VIR_DOMAIN_THROTTLE_GROUP" in libvirt/include/libvirt/libvirt-domain.h
- update "virDomainGetThrottleGroup" to not first query the number of parameters,
- update "remote_domain_get_throttle_group_args" and "remote_domain_get_throttle_group_ret" to remove "nparams" in src/remote_protocol-structs, also updated "remote_domain_get_throttle_group_args" and "remote_domain_get_throttle_group_ret" in src/remote/remote_protocol.x
- update parameter "virTypedParameterPtr params" to be "virTypedParameterPtr *params" in "virDrvDomainGetThrottleGroup" in driver-hypervisor.h
- update "qemuDomainSetThrottleGroup" to not query number of parameters first
- remove wrapper "qemuDomainThrottleGroupByName" and "qemuDomainSetThrottleGroupDefaults"
- refactor "qemuDomainSetThrottleGroup" and "qemuDomainSetBlockIoTune" to use common logic
- update "qemuDomainDelThrottleGroup" to use VIR_JOB_MODIFY by referencing "qemuDomainHotplugDelIOThread"
- check if group is still being used by some filter (qemuDomainCheckThrottleGroupRef) during deletion
- replace "ThrottleFilterChain" with "ThrottleFilters"
- update "qemuDomainDiskGetTopNodename" to take top throttle node name if disk has throttles, and reuse "qemuDomainDiskGetBackendAlias" in "qemuBuildDiskDeviceProps" to get top node name as "drive"
- after enabling throttle filters, if a disk has throttle filters, then during blockcommit the top node name is not "libvirt-x-format" anymore; instead, the top node name references the top filter, like "libvirt-x-filter"
- add check "cdrom device with throttle filters isn't supported"
- delete "filternodenameindex" and reuse "nodenameindex" to generate index for throttle nodes
- refactor detaching filters by adding "qemuBuildThrottleFiltersDetachPrepareBlockdev" to just build parameters for "blockdev-del"
- refactor "testDomainSetBlockIoTune" and "testDomainSetThrottleGroup" to use common logic

Any comments/suggestions will be appreciated!

Chun Feng Wu (16):
  schema: Add new domain elements to support multiple throttle groups
  schema: Add new domain elements to support multiple throttle filters
  config: Introduce ThrottleGroup and corresponding XML parsing
  config: Introduce ThrottleFilter and corresponding XML parsing
  qemu: monitor: Add support for ThrottleGroup operations
  tests: Test qemuMonitorJSONGetThrottleGroup and qemuMonitorJSONUpdateThrottleGroup
  remote: New APIs for ThrottleGroup lifecycle management
  qemu: Implement qemu driver for throttle API
  qemu: hotplug: Support hot attach and detach block disk along with throttle filters
  config: validate: Refactor disk iotune validation for reuse
  qemu: command: Support throttle groups during qemuProcessLaunch
  qemu: command: Support throttle filters during qemuProcessLaunch
  qemuxmlconftest: Add 'throttlefilter' tests
  test_driver: Test throttle group lifecycle APIs
  virsh: Add support for throttle group operations
  virsh: Add option "throttle-groups" to "attach_disk"

 docs/formatdomain.rst | 48 ++
 include/libvirt/libvirt-domain.h | 21 +
 src/conf/domain_conf.c | 376 +++++++++++
 src/conf/domain_conf.h | 51 ++
 src/conf/domain_validate.c | 118 +++-
 src/conf/schemas/domaincommon.rng | 293 +++++----
 src/conf/virconftypes.h | 4 +
 src/driver-hypervisor.h | 22 +
 src/libvirt-domain.c | 196 ++++++
 src/libvirt_private.syms | 9 +
 src/libvirt_public.syms | 7 +
 src/qemu/qemu_block.c | 131 ++++
 src/qemu/qemu_block.h | 53 ++
 src/qemu/qemu_command.c | 182 ++++++
 src/qemu/qemu_command.h | 12 +
 src/qemu/qemu_domain.c | 39 +-
 src/qemu/qemu_domain.h | 8 +
 src/qemu/qemu_driver.c | 617 +++++++++++++++---
 src/qemu/qemu_hotplug.c | 33 +
 src/qemu/qemu_monitor.c | 34 +
 src/qemu/qemu_monitor.h | 14 +
 src/qemu/qemu_monitor_json.c | 150 +++++
 src/qemu/qemu_monitor_json.h | 14 +
 src/remote/remote_daemon_dispatch.c | 44 ++
 src/remote/remote_driver.c | 40 ++
 src/remote/remote_protocol.x | 48 +-
 src/remote_protocol-structs | 28 +
 src/test/test_driver.c | 452 +++++++++----
 tests/qemumonitorjsontest.c | 86 +++
 .../throttlefilter.x86_64-latest.args | 43 ++
 .../throttlefilter.x86_64-latest.xml | 65 ++
 tests/qemuxmlconfdata/throttlefilter.xml | 55 ++
 tests/qemuxmlconftest.c | 1 +
 tools/virsh-completer-domain.c | 64 ++
 tools/virsh-completer-domain.h | 5 +
 tools/virsh-domain.c | 453 ++++++++++++-
 36 files changed, 3447 insertions(+), 369 deletions(-)
 create mode 100644 tests/qemuxmlconfdata/throttlefilter.x86_64-latest.args
 create mode 100644 tests/qemuxmlconfdata/throttlefilter.x86_64-latest.xml
 create mode 100644 tests/qemuxmlconfdata/throttlefilter.xml

--
2.34.1
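The filter chaining shown in the QMP sample flow of this cover letter can be sketched as a small Python helper that generates the blockdev-add commands for a list of throttle groups. This is a rough illustration, not the series' C implementation: the function name throttle_chain and its parameters are assumptions, and only the "throttle" driver fields that appear in the sample commands are used.

```python
import json


def throttle_chain(top_node, groups, first_index):
    """Build one blockdev-add command per throttle group, each filter
    stacked on the previous node. Returns (commands, new_top_node)."""
    commands = []
    child = top_node
    for offset, group in enumerate(groups):
        node = f"libvirt-{first_index + offset}-filter"
        commands.append({
            "execute": "blockdev-add",
            "arguments": {
                "driver": "throttle",
                "node-name": node,
                "throttle-group": group,
                "file": child,
            },
        })
        # The next filter (or the device) sits on top of this one.
        child = node
    return commands, child


cmds, top = throttle_chain("test-4-format", ["limit0", "limit012"], 5)
for cmd in cmds:
    print(json.dumps(cmd))
print("drive for device_add:", top)
```

With the sample inputs, the returned top node is the last filter, which is the node the device_add command has to reference as "drive" instead of the format node.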

10 months, 4 weeks

3
53
0 / 0

[PATCH RFC 0/9] qemu: Support mapped-ram migration capability

by Jim Fehlig

This series is an RFC for support of QEMU's mapped-ram migration capability [1] for saving and restoring VMs. It implements the first part of the design approach we discussed for supporting parallel save/restore [2]. In summary, the approach is

1. Add mapped-ram migration capability
2. Steal an element from save header 'unused' for a 'features' variable and bump save version to 3.
3. Add /etc/libvirt/qemu.conf knob for the save format version, defaulting to latest v3
4. Use v3 (aka mapped-ram) by default
5. Use mapped-ram with BYPASS_CACHE for v3, old approach for v2
6. include: Define constants for parallel save/restore
7. qemu: Add support for parallel save. Implies mapped-ram, reject if v2
8. qemu: Add support for parallel restore. Implies mapped-ram. Reject if v2
9. tools: add parallel parameter to virsh save command
10. tools: add parallel parameter to virsh restore command

This series implements 1-5, with the BYPASS_CACHE support in patches 8 and 9 being quite hacky. They are included to discuss approaches to make them less hacky. See the patches for details.

The QEMU mapped-ram capability currently does not support directio. Fabiano is working on that now [3]. This complicates merging support in libvirt. I don't think it's reasonable to enable mapped-ram by default when BYPASS_CACHE cannot be supported. Should we wait until the mapped-ram directio support is merged in QEMU before supporting mapped-ram in libvirt?

For the moment, compression is ignored in the new save version. Currently, libvirt connects the output of QEMU's save stream to the specified compression program via a pipe. This approach is incompatible with mapped-ram since the fd provided to QEMU must be seekable. One option is to reopen and compress the saved image after the actual save operation has completed. This has the downside of requiring the iohelper to handle BYPASS_CACHE, which would preclude us from removing it sometime in the future. Other suggestions much welcomed.

Note the logical file size of mapped-ram saved images is slightly larger than guest RAM size, so the files are often much larger than the files produced by the existing, sequential format. However, the number of blocks actually written to disk is often lower with mapped-ram saved images. E.g. a saved image from a 30G, freshly booted, idle guest results in the following 'Size' and 'Blocks' values reported by stat(1)

                    Size     Blocks
  sequential   998595770    1950392
  mapped-ram 34368584225    1800456

With the same guest running a workload that dirties memory

                    Size     Blocks
  sequential 33173330615   64791672
  mapped-ram 34368578210   64706944

Thanks for any comments on this RFC!

[1] https://gitlab.com/qemu-project/qemu/-/blob/master/docs/devel/migration/m...
[2] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/K...
[3] https://mail.gnu.org/archive/html/qemu-devel/2024-05/msg04432.html

Jim Fehlig (9):
  qemu: Enable mapped-ram migration capability
  qemu_fd: Add function to retrieve fdset ID
  qemu: Add function to get migration params for save
  qemu: Add a 'features' element to save image header and bump version
  qemu: conf: Add setting for save image version
  qemu: Add support for mapped-ram on save
  qemu: Enable mapped-ram on restore
  qemu: Support O_DIRECT with mapped-ram on save
  qemu: Support O_DIRECT with mapped-ram on restore

 src/qemu/libvirtd_qemu.aug | 1 +
 src/qemu/qemu.conf.in | 6 +
 src/qemu/qemu_conf.c | 8 ++
 src/qemu/qemu_conf.h | 1 +
 src/qemu/qemu_driver.c | 25 ++--
 src/qemu/qemu_fd.c | 18 +++
 src/qemu/qemu_fd.h | 3 +
 src/qemu/qemu_migration.c | 99 ++++++++++++++-
 src/qemu/qemu_migration.h | 11 +-
 src/qemu/qemu_migration_params.c | 20 +++
 src/qemu/qemu_migration_params.h | 4 +
 src/qemu/qemu_monitor.c | 40 ++++++
 src/qemu/qemu_monitor.h | 5 +
 src/qemu/qemu_process.c | 63 +++++++---
 src/qemu/qemu_process.h | 16 ++-
 src/qemu/qemu_saveimage.c | 187 +++++++++++++++++++++++------
 src/qemu/qemu_saveimage.h | 20 ++-
 src/qemu/qemu_snapshot.c | 12 +-
 src/qemu/test_libvirtd_qemu.aug.in | 1 +
 19 files changed, 455 insertions(+), 85 deletions(-)

--
2.44.0
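To relate the 'Size' and 'Blocks' columns quoted in this cover letter, the following Python sketch converts block counts into allocated bytes. It assumes the 512-byte unit that stat(1) and st_blocks use on Linux; the helper names allocated_bytes and describe are made up for this illustration.

```python
import os

BLOCK_UNIT = 512  # assumption: st_blocks counts 512-byte units (Linux)


def allocated_bytes(blocks):
    """Bytes actually allocated on disk for a given block count."""
    return blocks * BLOCK_UNIT


def describe(path):
    """Return (logical size, allocated bytes) for a file on disk."""
    st = os.stat(path)
    return st.st_size, allocated_bytes(st.st_blocks)


# Figures quoted in the cover letter for the freshly booted, idle 30G guest.
SAMPLES = [
    ("sequential", 998595770, 1950392),
    ("mapped-ram", 34368584225, 1800456),
]

for name, size, blocks in SAMPLES:
    print(f"{name}: logical {size} bytes, allocated {allocated_bytes(blocks)} bytes")
```

For the idle guest this works out to roughly 0.92 GB allocated for the mapped-ram image against about 1.0 GB for the sequential one, even though the mapped-ram file's logical size is over 34 GB, which is consistent with the file being sparse.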

11 months

3
22
0 / 0
