[PATCH rfcv4 00/13] LIBVIRT: X86: TDX support
by Zhenzhong Duan
Hi,
This series adds x86 TDX support to libvirt.
* What's TDX?
TDX stands for Trust Domain Extensions, which isolate VMs from
the virtual-machine manager (VMM)/hypervisor and any other software on
the platform.
To support TDX, multiple software components, not only KVM but also QEMU,
guest Linux and the virtual BIOS, need to be updated. For more details,
please see [1].
This patchset extends one more software component, libvirt, to support TDX,
so that a TDX guest can be started from a high level rather than by running
QEMU directly.
* Misc
QEMU resets a guest through software emulation, which TDX guests do not
support for security reasons. We therefore simulate reboot for a TDX guest by
killing it and creating a new one within the FakeReboot framework.
The complete code can be found at [2]; the matching QEMU code can be found at [3].
There is a 'debug' property for the tdx-guest object which isn't in the matching
QEMU [3] yet. I keep it intentionally, as it will be implemented in QEMU as an
extension series of [3].
* Test
start/stop/reboot with virsh
stop/reboot trigger in guest
stop with on_poweroff=destroy/restart
reboot with on_reboot=destroy/restart
* Patch organization
- patch 1-4: Support query of TDX capabilities.
- patch 5-8: Add TDX type to launchsecurity framework.
- patch 9-11: Add reboot support to TDX guest
- patch 12-13: Add test and docs
TODO:
- update QEMU capabilities data in tests, which depends on QEMU TDX support being merged first
- add reconnect logic in virsh command
[1] https://lore.kernel.org/kvm/cover.1708933498.git.isaku.yamahata@intel.com
[2] https://github.com/intel/libvirt-tdx/commits/tdx_for_upstream_rfcv4
[3] https://github.com/intel/qemu-tdx/tree/tdx-qemu-upstream-v5
Thanks
Zhenzhong
Changelog:
rfcv4:
- add a check to tools/virt-host-validate-qemu.c (Daniel)
- remove check of q35 (Daniel)
- model 'SocketAddress' QAPI in xml schema (Daniel)
- s/Quote-Generation-Service/quoteGenerationService/ (Daniel)
- define bits in tdx->policy and add validating logic (Daniel)
- presume QEMU chooses split kernel irqchip for TDX guest by default (Daniel)
- utilize existing FakeReboot framework to do reboot for TDX guest (Daniel)
- drop patch11 'conf: Add support to keep same domid for hard reboot' (Daniel)
- add test in tests/ to validate parsing and formatting logic (Daniel)
- add doc in docs/formatdomain.rst (Daniel)
- add R-B
rfcv3:
- Change to generate qemu cmdline with -bios
- drop firmware auto match as -bios is used
- add a hard reboot method to reboot TDX guest
rfcv3: https://www.mail-archive.com/devel@lists.libvirt.org/msg00385.html
rfcv2:
- give up using qmp cmd and check TDX directly on host for TDX capabilities.
- use launchsecurity framework to support TDX
- use <os>.<loader> for general loader
- add auto firmware match feature for TDX
An example TDVF firmware description file 70-edk2-x86_64-tdx.json:
{
    "description": "UEFI firmware for x86_64, supporting Intel TDX",
    "interface-types": [
        "uefi"
    ],
    "mapping": {
        "device": "generic",
        "filename": "/usr/share/OVMF/OVMF_CODE-tdx.fd"
    },
    "targets": [
        {
            "architecture": "x86_64",
            "machines": [
                "pc-q35-*"
            ]
        }
    ],
    "features": [
        "intel-tdx",
        "verbose-dynamic"
    ],
    "tags": [
    ]
}
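To make the rfcv2 "auto firmware match" idea concrete, here is a rough sketch of how a descriptor like the one above could be matched against a guest. It is only an illustration: pick_firmware() is a hypothetical helper, not libvirt's actual matching code, and the machine type strings passed to it are made up for the example.

```python
import json
from fnmatch import fnmatchcase

# Descriptor with the same fields as the example file above.
DESCRIPTOR = json.loads("""
{
    "description": "UEFI firmware for x86_64, supporting Intel TDX",
    "interface-types": ["uefi"],
    "mapping": {
        "device": "generic",
        "filename": "/usr/share/OVMF/OVMF_CODE-tdx.fd"
    },
    "targets": [
        {"architecture": "x86_64", "machines": ["pc-q35-*"]}
    ],
    "features": ["intel-tdx", "verbose-dynamic"],
    "tags": []
}
""")


def pick_firmware(descriptors, arch, machine, feature):
    """Hypothetical helper: return the firmware filename of the first
    descriptor that advertises `feature` and targets `arch`/`machine`,
    or None if nothing matches."""
    for desc in descriptors:
        if feature not in desc.get("features", []):
            continue
        for target in desc.get("targets", []):
            if target["architecture"] != arch:
                continue
            # Machine entries are glob patterns such as "pc-q35-*".
            if any(fnmatchcase(machine, pat) for pat in target["machines"]):
                return desc["mapping"]["filename"]
    return None


if __name__ == "__main__":
    print(pick_firmware([DESCRIPTOR], "x86_64", "pc-q35-8.2", "intel-tdx"))
```

A non-q35 machine type or a feature the descriptor does not list yields None in this sketch.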
rfcv2: https://www.mail-archive.com/libvir-list@redhat.com/msg219378.html
Zhenzhong Duan (13):
tools: Secure guest check for Intel in virt-host-validate
qemu: Check if INTEL Trust Domain Extention support is enabled
qemu: Add TDX capability
conf: expose TDX feature in domain capabilities
conf: add tdx as launch security type
qemu: Add command line and validation for TDX type
qemu: force special parameters enabled for TDX guest
Add Intel TDX Quote Generation Service(QGS) support
qemu: add FakeReboot support for TDX guest
qemu: Support reboot command in guest
qemu: Avoid duplicate FakeReboot for secure guest
Add test cases for Intel TDX
docs: domain: Add documentation for Intel TDX guest
docs/formatdomain.rst | 68 ++++
docs/formatdomaincaps.rst | 1 +
src/conf/domain_capabilities.c | 1 +
src/conf/domain_capabilities.h | 1 +
src/conf/domain_conf.c | 312 ++++++++++++++++++
src/conf/domain_conf.h | 75 +++++
src/conf/schemas/domaincaps.rng | 9 +
src/conf/schemas/domaincommon.rng | 135 ++++++++
src/conf/virconftypes.h | 2 +
src/qemu/qemu_capabilities.c | 36 +-
src/qemu/qemu_capabilities.h | 1 +
src/qemu/qemu_command.c | 139 ++++++++
src/qemu/qemu_firmware.c | 1 +
src/qemu/qemu_monitor.c | 28 +-
src/qemu/qemu_monitor.h | 2 +-
src/qemu/qemu_monitor_json.c | 6 +-
src/qemu/qemu_namespace.c | 1 +
src/qemu/qemu_process.c | 75 +++++
src/qemu/qemu_validate.c | 44 +++
...unch-security-tdx-qgs-fd.x86_64-latest.xml | 77 +++++
.../launch-security-tdx-qgs-fd.xml | 30 ++
...ch-security-tdx-qgs-inet.x86_64-latest.xml | 77 +++++
.../launch-security-tdx-qgs-inet.xml | 30 ++
...ch-security-tdx-qgs-unix.x86_64-latest.xml | 77 +++++
.../launch-security-tdx-qgs-unix.xml | 30 ++
...h-security-tdx-qgs-vsock.x86_64-latest.xml | 77 +++++
.../launch-security-tdx-qgs-vsock.xml | 30 ++
tests/qemuxmlconftest.c | 24 ++
tools/virt-host-validate-common.c | 22 +-
tools/virt-host-validate-common.h | 1 +
30 files changed, 1407 insertions(+), 5 deletions(-)
create mode 100644 tests/qemuxmlconfdata/launch-security-tdx-qgs-fd.x86_64-latest.xml
create mode 100644 tests/qemuxmlconfdata/launch-security-tdx-qgs-fd.xml
create mode 100644 tests/qemuxmlconfdata/launch-security-tdx-qgs-inet.x86_64-latest.xml
create mode 100644 tests/qemuxmlconfdata/launch-security-tdx-qgs-inet.xml
create mode 100644 tests/qemuxmlconfdata/launch-security-tdx-qgs-unix.x86_64-latest.xml
create mode 100644 tests/qemuxmlconfdata/launch-security-tdx-qgs-unix.xml
create mode 100644 tests/qemuxmlconfdata/launch-security-tdx-qgs-vsock.x86_64-latest.xml
create mode 100644 tests/qemuxmlconfdata/launch-security-tdx-qgs-vsock.xml
--
2.34.1
4 days, 18 hours
[PATCH 0/4] hw/s390x: Alias @dump-skeys -> @dump-s390-skey and deprecate
by Philippe Mathieu-Daudé
We are trying to unify all qemu-system-FOO to a single binary.
In order to do that we need to remove QAPI target specific code.
@dump-skeys is only available on qemu-system-s390x. This series
renames it to @dump-s390-skeys, making it available on other
binaries. We take care of backward compatibility via deprecation.
Philippe Mathieu-Daudé (4):
hw/s390x: Introduce the @dump-s390-skeys QMP command
hw/s390x: Introduce the 'dump_s390_skeys' HMP command
hw/s390x: Deprecate the HMP 'dump_skeys' command
hw/s390x: Deprecate the QMP @dump-skeys command
docs/about/deprecated.rst | 5 +++++
qapi/misc-target.json | 5 +++++
qapi/misc.json | 18 ++++++++++++++++++
include/monitor/hmp.h | 1 +
hw/s390x/s390-skeys-stub.c | 24 ++++++++++++++++++++++++
hw/s390x/s390-skeys.c | 19 +++++++++++++++++--
hmp-commands.hx | 17 +++++++++++++++--
hw/s390x/meson.build | 5 +++++
8 files changed, 90 insertions(+), 4 deletions(-)
create mode 100644 hw/s390x/s390-skeys-stub.c
--
2.41.0
3 weeks, 6 days
[libvirt] [PATCH] Fix python error reporting for some storage operations
by Cole Robinson
In the python bindings, all vir* classes expect to be
passed a virConnect object when instantiated. Before
the storage stuff, these classes were only instantiated
in virConnect methods, so the generator is hardcoded to
pass 'self' as the connection instance to these classes.
Problem is there are some methods that return pool or vol
instances which aren't called from virConnect: you can
lookup a storage volume's associated pool, and can lookup
volumes from a pool. In these cases passing 'self' doesn't
give the vir* instance a connection, so when it comes time
to raise an exception crap hits the fan.
Rather than rework the generator to accommodate this edge
case, I just fixed the init functions for virStorage* to
pull the associated connection out of the passed value
if it's not a virConnect instance.
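For illustration, this is roughly what the generated init code boils down to after the change. The classes below are bare stand-ins written for this example, not the real generated bindings; only the _conn handling mirrors the patch.

```python
class virConnect:
    """Stand-in for the real connection class."""


class virStoragePool:
    def __init__(self, conn, _obj=None):
        # 'conn' is either a virConnect or another vir* object that
        # already carries a connection in its _conn attribute.
        self._conn = conn
        if not isinstance(conn, virConnect):
            self._conn = conn._conn
        self._o = _obj


class virStorageVol:
    def __init__(self, conn, _obj=None):
        self._conn = conn
        if not isinstance(conn, virConnect):
            self._conn = conn._conn
        self._o = _obj


if __name__ == "__main__":
    conn = virConnect()
    pool = virStoragePool(conn)   # instantiated from a virConnect method
    vol = virStorageVol(pool)     # instantiated from a pool method
    back = virStoragePool(vol)    # pool looked up from a volume
    print(vol._conn is conn, back._conn is conn)
```

In all three cases _conn ends up being the virConnect instance, so error reporting has a real connection to work with.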
Thanks,
Cole
diff --git a/python/generator.py b/python/generator.py
index 01a17da..c706b19 100755
--- a/python/generator.py
+++ b/python/generator.py
@@ -962,8 +962,12 @@ def buildWrappers():
list = reference_keepers[classname]
for ref in list:
classes.write(" self.%s = None\n" % ref[1])
- if classname in [ "virDomain", "virNetwork", "virStoragePool", "virStorageVol" ]:
+ if classname in [ "virDomain", "virNetwork" ]:
classes.write(" self._conn = conn\n")
+ elif classname in [ "virStorageVol", "virStoragePool" ]:
+ classes.write(" self._conn = conn\n" + \
+ " if not isinstance(conn, virConnect):\n" + \
+ " self._conn = conn._conn\n")
classes.write(" if _obj != None:self._o = _obj;return\n")
classes.write(" self._o = None\n\n");
destruct=None
1 month
[RFC 0/4] meson: Enable -Wundef
by Andrea Bolognani
A few days ago I posted a patch [1] that addresses an issue
introduced when a meson check was dropped but some uses of the
corresponding WITH_ macro were not removed at the same time.
That got me thinking about what we can do to prevent such scenarios
from happening again in the future. I have come up with something
that I think would be effective, but since applying the approach
throughout the entire codebase would require a non-trivial amount of
work, I figured I'd ask for feedback before embarking on it.
The idea is that there are two types of macros we can use for
conditional compilation: external ones, coming from the OS or other
libraries, and internal ones, which are the result of meson tests.
The external ones (e.g. SIOCSIFFLAGS, __APPLE__) are usually only
defined if they apply, so it is correct to check for their presence
with #ifdef. Using #if will also work, as undefined macros evaluate
to zero, but it's not good practice to use them that way. If -Wundef
has been passed to the compiler, those incorrect uses will be
reported (only on platforms where they are not defined, of course).
The internal ones (e.g. WITH_QEMU, WITH_STRUCT_IFREQ) are similar,
but in this case we control their definition. This means that using
#ifdef is ambiguous: the macro being undefined usually
means that the feature is not available on the machine we're building
on, but it could also mean that we've removed the meson check and
forgot to update all users of the macro. In this case, -Wundef would
work 100% reliably to detect the issue: if the meson check doesn't
exist, neither will the macro, regardless of what platform we're
building on.
So the approach I'm suggesting is to use a syntax-check rule to
ensure that internal macros are only ever checked with #if instead of
#ifdef.
Of course this requires a full sweep to fix all cases in which we're
not already doing things according to the proposal. Should be fairly
easy, if annoying. A couple of examples are included here for
demonstration purposes.
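To give an idea of what such a rule looks for, here is a small stand-alone sketch. It is written in Python purely for illustration; the rule in the series lives in build-aux/syntax-check.mk, and the regular expression and the find_bad_checks() name below are my own assumptions, not what the patch uses.

```python
import re

# Presence-style checks on internal macros: "#ifdef WITH_FOO",
# "#ifndef WITH_FOO" and "#if defined(WITH_FOO)". Value-style checks
# such as "#if WITH_FOO" are what the proposal wants instead.
BAD_CHECK = re.compile(
    r"^\s*#\s*(?:ifdef|ifndef)\s+WITH_\w+"
    r"|^\s*#\s*(?:if|elif)\b.*\bdefined\s*\(?\s*WITH_\w+"
)


def find_bad_checks(source):
    """Return (line number, stripped line) for every presence-style
    check on a WITH_ macro found in the given C source text."""
    hits = []
    for number, line in enumerate(source.splitlines(), 1):
        if BAD_CHECK.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    sample = (
        "#ifdef WITH_QEMU\n"
        "#endif\n"
        "#if WITH_QEMU\n"
        "#endif\n"
        "#ifdef __APPLE__\n"
        "#endif\n"
        "# if defined(WITH_STRUCT_IFREQ)\n"
        "#endif\n"
    )
    for number, line in find_bad_checks(sample):
        print(f"{number}: {line}")
```

External macros like __APPLE__ are deliberately left alone, since #ifdef is the right check for those.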
The bigger impact is going to be on the build system. Right now we
generally only define WITH_ macros if the check passed, but that will
have to change and the result is going to be quite a bit of
additional meson code I'm afraid.
Thoughts?
[1] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/S...
Andrea Bolognani (4):
configmake: Check for WIN32 correctly
meson: Always define WITH_*_DECL macros
syntax-check: Ensure WITH_ macros are used correctly
meson: Enable -Wundef
build-aux/syntax-check.mk | 5 +++++
configmake.h.in | 2 +-
meson.build | 3 +++
tests/virmockstathelpers.c | 28 ++++++++++++++--------------
4 files changed, 23 insertions(+), 15 deletions(-)
--
2.43.2
1 month, 1 week
Re: [PATCH] Сheck snapshot disk is not NULL when searching it in the VM config
by Peter Krempa
On Mon, May 20, 2024 at 14:48:47 +0000, Efim Shevrin via Devel wrote:
> Hello,
>
> > If vmdisk is NULL, shouldn't this function (qemuSnapshotDeleteValidate()) return an error?
>
> I think this qemuSnapshotDeleteValidate should not return an error.
>
> It seems to me that when vmdisk is NULL, this does not invalidate
> the snapshot itself, but indicates that the config has changed since
> the snapshot was done. And if the VM config has changed, this adds evidence that the snapshot should be deleted,
> because the snapshot does not reflect the real vm config.
>
> Since we do not have an analogue of the --force option for deleting a snapshot, in the case when qemuSnapshotDeleteValidate returns
> an error when vmdisk is NULL, we will never delete a snapshot which has invalid disk.
Snapshot deletion does have something that can be considered force and
that is the '--metadata' option that removes just the snapshot
definition (metadata) and doesn't touch the disk images.
> > Similarly, disk can be NULL too
> Thank you for the comment regarding the disk variable. I`ve reworked patch.
>
> When creating a snapshot of a VM with multiple hard disks,
> the snapshot takes into account the presence of all disks
> in the system. If, over time, one of the disks is deleted,
> the snapshot will continue to store knowledge of the deleted disk.
> This results in the fact that at the moment of deleting the snapshot,
> at the validation stage, a disk from the snapshot will be searched which
> is not in the VM configuration. As a result, vmdisk variable will
> be equal to NULL. Dereferencing a null pointer at the time of calling
> virStorageSourceIsSameLocation(vmdisk->src, disk->src)
> will result in SIGSEGV.
Crashing is obviously not okay ...
> Also, the disk variable can also be equal to NULL and this
> requires to check that disk != NULL before calling the
> virStorageSourceIsSameLocation function to avoid SIGSEGV.
... but going ahead with the snapshot deletion isn't always okay either.
The disk is no longer referenced by the VM, so its state can't be merged,
while the state would be merged for every other disk.
If you then revert to a previous snapshot that still references the
older state of the removed disk, the VM would see the merged state on
the disks that were present at deletion, but only a partial state on
the disk that was later removed.
2 months, 1 week
[libvirt PATCH] qemu_snapshot: allow reverting to external disk only snapshot
by Pavel Hrdina
When a snapshot is created with the disk-only flag, it is always an external
snapshot without memory state. Historically, when there was no support
for reverting external snapshots, this produced an error message:
error: Failed to revert snapshot s1
error: internal error: Invalid target domain state 'disk-snapshot'. Refusing snapshot reversion
Now we can simply treat this as reverting to an offline snapshot, as any
possible damage to the file system was already done at the point of
snapshot creation.
Resolves: https://issues.redhat.com/browse/RHEL-21549
Signed-off-by: Pavel Hrdina <phrdina(a)redhat.com>
---
src/qemu/qemu_snapshot.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c
index 0cac0c4146..7964f70553 100644
--- a/src/qemu/qemu_snapshot.c
+++ b/src/qemu/qemu_snapshot.c
@@ -2606,6 +2606,7 @@ qemuSnapshotRevert(virDomainObj *vm,
case VIR_DOMAIN_SNAPSHOT_SHUTDOWN:
case VIR_DOMAIN_SNAPSHOT_SHUTOFF:
case VIR_DOMAIN_SNAPSHOT_CRASHED:
+ case VIR_DOMAIN_SNAPSHOT_DISK_SNAPSHOT:
ret = qemuSnapshotRevertInactive(vm, snapshot, snap,
driver, cfg,
&inactiveConfig,
@@ -2617,8 +2618,6 @@ qemuSnapshotRevert(virDomainObj *vm,
_("qemu doesn't support reversion of snapshot taken in PMSUSPENDED state"));
goto endjob;
- case VIR_DOMAIN_SNAPSHOT_DISK_SNAPSHOT:
- /* Rejected earlier as an external snapshot */
case VIR_DOMAIN_SNAPSHOT_NOSTATE:
case VIR_DOMAIN_SNAPSHOT_BLOCKED:
case VIR_DOMAIN_SNAPSHOT_LAST:
--
2.43.0
2 months, 1 week
[PATCH 00/12] Introduce SEV-SNP support
by Michal Privoznik
SEV-SNP support just landed in QEMU. Here is the first round of patches
to incorporate support into libvirt.
TODOs (aka problems of future me):
- Teach tools/virt-qemu-sev-validate how to deal with SEV-SNP
- Try to find a SEV-SNP machine and test these patches in the real world
- Write a kbase article on attestation with SEV-SNP
Michal Prívozník (12):
qemu_monitor_json: Report error in error paths in SEV related code
conf: Move some members of virDomainSEVDef into virDomainSEVCommonDef
conf: Separate SEV formatting into a function
Drop needless typecast to virDomainLaunchSecurity
src: Convert some _virDomainSecDef::sectype checks to switch()
qemu_monitor: Allow querying SEV-SNP state in 'query-sev'
qemu: Report snp-policy in virDomainGetLaunchSecurityInfo()
qemu_capabilities: Introduce QEMU_CAPS_SEV_SNP_GUEST
conf: Introduce SEV-SNP support
qemu: Build cmd line for SEV-SNP
qemu: Allow setting launch security for SEV-SNP
qemu_firmware: Pick the right firmware for SEV-SNP guests
docs/formatdomain.rst | 108 ++++++++++++
include/libvirt/libvirt-domain.h | 10 ++
src/conf/domain_conf.c | 156 ++++++++++++++----
src/conf/domain_conf.h | 28 +++-
src/conf/domain_validate.c | 44 +++++
src/conf/schemas/domaincommon.rng | 73 ++++++--
src/conf/virconftypes.h | 4 +
src/qemu/qemu_capabilities.c | 4 +
src/qemu/qemu_capabilities.h | 3 +
src/qemu/qemu_cgroup.c | 19 ++-
src/qemu/qemu_command.c | 56 ++++++-
src/qemu/qemu_driver.c | 60 +++++--
src/qemu/qemu_firmware.c | 20 ++-
src/qemu/qemu_monitor.c | 7 +-
src/qemu/qemu_monitor.h | 41 ++++-
src/qemu/qemu_monitor_json.c | 67 ++++++--
src/qemu/qemu_monitor_json.h | 8 +-
src/qemu/qemu_namespace.c | 3 +-
src/qemu/qemu_process.c | 34 ++--
src/qemu/qemu_validate.c | 13 +-
src/security/security_dac.c | 34 +++-
.../caps_9.1.0_x86_64.xml | 1 +
.../firmware/60-edk2-ovmf-x64-amdsev.json | 1 +
tests/qemumonitorjsontest.c | 65 +++++++-
...launch-security-sev-snp.x86_64-latest.args | 35 ++++
.../launch-security-sev-snp.x86_64-latest.xml | 1 +
.../launch-security-sev-snp.xml | 47 ++++++
tests/qemuxmlconftest.c | 2 +
28 files changed, 817 insertions(+), 127 deletions(-)
create mode 100644 tests/qemuxmlconfdata/launch-security-sev-snp.x86_64-latest.args
create mode 120000 tests/qemuxmlconfdata/launch-security-sev-snp.x86_64-latest.xml
create mode 100644 tests/qemuxmlconfdata/launch-security-sev-snp.xml
--
2.44.2
2 months, 2 weeks
[PATCH v2 0/4] multiple memory backend support for CPR Live Updates
by mgalaxy@akamai.com
From: Michael Galaxy <mgalaxy(a)akamai.com>
CPR-based support for whole-hypervisor kexec-based live updates is
now finally merged into QEMU. In support of this, we need NUMA to be
supported in these kinds of environments. To do this we use a technology
called PMEM (persistent memory) in Linux, which underpins the ability for
CPR Live Updates to work so that QEMU memory can remain in RAM and
be recovered after a kexec operation has completed. Our systems are highly
NUMA-aware, and so this patch series enables NUMA awareness for live updates.
Further, we make a small change that allows live migrations to work
between *non* PMEM-based systems and PMEM-based systems (and
vice-versa). This allows for seamless upgrades from non-live-compatible
systems to live-update-compatible systems without any downtime.
Michael Galaxy (4):
qemu.conf changes to support multiple memory backend
Support live migration between file-backed memory and anonymous
memory.
Update unit test to support multiple memory backends
Update documentation to reflect memory_backing_dir change in qemu.conf
NEWS.rst | 7 ++
docs/kbase/virtiofs.rst | 2 +
src/qemu/qemu.conf.in | 2 +
src/qemu/qemu_command.c | 8 ++-
src/qemu/qemu_conf.c | 141 +++++++++++++++++++++++++++++++++++-----
src/qemu/qemu_conf.h | 14 ++--
src/qemu/qemu_domain.c | 24 +++++--
src/qemu/qemu_driver.c | 29 +++++----
src/qemu/qemu_hotplug.c | 6 +-
src/qemu/qemu_process.c | 44 +++++++------
src/qemu/qemu_process.h | 7 +-
tests/testutilsqemu.c | 5 +-
12 files changed, 221 insertions(+), 68 deletions(-)
--
2.34.1
5 months, 2 weeks
[PATCH RFC v3 00/16] Support throttle block filters
by wucf@linux.ibm.com
From: Chun Feng Wu <wucf(a)linux.ibm.com>
Hi,
I am thinking of leveraging the "throttle block filter" in QEMU to support more flexible I/O limits (e.g. tiered I/O groups). One sample provided by the QEMU docs is:
https://github.com/qemu/qemu/blob/master/docs/throttle.txt
"For example, let's say that we have three different drives and we want to set I/O limits for
each one of them and an additional set of limits for the combined I/O of all three drives."
The implementation idea is to
- Define throttle groups(limit) in domain
- Define throttle filter to reference throttle group within disk
- Within a domain disk, throttle filters reference multiple throttle groups to form a filter chain that applies multiple limits in QEMU, as in the sample above
- Add new virsh cmds for throttle group management:
throttlegroupset Add or update a throttling group.
throttlegroupdel Delete a throttling group.
throttlegroupinfo Get a throttling group.
throttlegrouplist list all domain throttlegroups
- Update "attach-disk" to add one more option "--throttle-groups" to apply throttle filters e.g. "virsh attach-disk $VM_ID ${DISK_PATH}/vm1_disk_2.qcow2 vdd --driver qemu --subdriver qcow2 --targetbus virtio --throttle-groups limit2,limit012"
- I chose the above semantics because they seemed appropriate; if there are better ones, please suggest them.
Note, this implementation requires flag "QEMU_CAPS_OBJECT_JSON".
From QMP perspective, the sample flow works this way:
- Throttle group creation:
virsh qemu-monitor-command 1 '{"execute":"object-add", "arguments":{"qom-type":"throttle-group","id":"limit0","limits":{"iops-total":200,"iops-read":0,"iops-total-max":200,"iops-total-max-length":1}}}'
virsh qemu-monitor-command 1 '{"execute":"object-add", "arguments":{"qom-type":"throttle-group","id":"limit1","limits":{"iops-total":250,"iops-read":0,"iops-total-max":250,"iops-total-max-length":1}}}'
virsh qemu-monitor-command 1 '{"execute":"object-add", "arguments":{"qom-type":"throttle-group","id":"limit2","limits":{"iops-total":300,"iops-read":0,"iops-total-max":300,"iops-total-max-length":1}}}'
virsh qemu-monitor-command 1 '{"execute":"object-add", "arguments":{"qom-type":"throttle-group","id":"limit012","limits":{"iops-total":400,"iops-read":0,"iops-total-max":400,"iops-total-max-length":1}}}'
- Chain up filters during attaching disk to apply two filters(limit0 and limit012):
virsh qemu-monitor-command 1 '{"execute":"blockdev-add", "arguments": {"driver":"file","filename":"/virt/disks/vm1_disk_1.qcow2","node-name":"test-3-storage","auto-read-only":true,"discard":"unmap"}}'
virsh qemu-monitor-command 1 '{"execute":"blockdev-add", "arguments":{"node-name":"test-4-format","read-only":false,"driver":"qcow2","file":"test-3-storage","backing":null}}'
virsh qemu-monitor-command 1 '{"execute":"blockdev-add", "arguments":{"driver":"throttle","node-name":"libvirt-5-filter","throttle-group": "limit0","file":"test-4-format"}}'
virsh qemu-monitor-command 1 '{"execute":"blockdev-add", "arguments": {"driver":"throttle","node-name":"libvirt-6-filter","throttle-group":"limit012","file":"libvirt-5-filter"}}'
virsh qemu-monitor-command 1 '{"execute": "device_add", "arguments": {"driver":"virtio-blk-pci","scsi":false,"bus":"pci.0","addr":"0x5","drive":"libvirt-6-filter","id":"virtio-disk1"}}'
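The chaining in the sample flow above can be sketched as follows. This is only an illustration of how each filter's "file" points at the node below it; throttle_group() and throttle_chain() are hypothetical helper names for this example, not functions from the series, and the node names simply follow the sample.

```python
import json


def throttle_group(group_id, iops_total):
    """Build an object-add command for one throttle group, using the
    same limit fields as the sample flow."""
    return {
        "execute": "object-add",
        "arguments": {
            "qom-type": "throttle-group",
            "id": group_id,
            "limits": {
                "iops-total": iops_total,
                "iops-read": 0,
                "iops-total-max": iops_total,
                "iops-total-max-length": 1,
            },
        },
    }


def throttle_chain(top_node, groups, first_index):
    """Stack one throttle filter per group on top of top_node.

    Returns the blockdev-add commands and the name of the resulting
    top node, which is what the device's "drive" should reference."""
    commands = []
    for offset, group in enumerate(groups):
        node = f"libvirt-{first_index + offset}-filter"
        commands.append({
            "execute": "blockdev-add",
            "arguments": {
                "driver": "throttle",
                "node-name": node,
                "throttle-group": group,
                "file": top_node,
            },
        })
        top_node = node
    return commands, top_node


if __name__ == "__main__":
    print(json.dumps(throttle_group("limit0", 200)))
    commands, top = throttle_chain("test-4-format", ["limit0", "limit012"], 5)
    for command in commands:
        print(json.dumps(command))
    print("drive:", top)
```

Each printed command could be passed to "virsh qemu-monitor-command" as in the sample; the last node in the chain is the one the disk device attaches to.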
This patchset includes:
- Throttle group XML schema definition in patch 1
- Throttle filter XML schema definition in patch 2
- Throttle group struct definition, parsing and formatting in patch 3
- Throttle filter struct definition, parsing and formatting in patch 4
- New QMP processing to update and get throttle group in patch 5&6
- New API definition and implementation in patch 7
- QEMU driver implementation in patch 8
- Hotplug processing for throttle filters in patch 9
- Extract common iotune validation in patch 10
- qemuProcessLaunch flow implementation for throttle group in patch 11
- qemuProcessLaunch flow implementation for throttle filter in patch 12
- Domain XML test for processing throttle groups and filters in patch 13
- Test new implemented driver in patch 14
- New virsh cmd implementation for group in patch 15
- Update Virsh cmd "attach_disk" to include throttle filters in patch 16
v3 changes:
- re-org commits by splitting changes containing throttle group and filters
- update commits msgs
- move schema commits to be the first ones
- refactor "diskIoTune" to extract common schema "iotune"
- add new tests for throttle groups and filters in qemuxmlconftest
- check flag "QEMU_CAPS_OBJECT_JSON" when preparing "-object"(qemu: command: Support throttle groups during qemuProcessLaunch ) or creating throttle group (qemu: Implement qemu driver for throttle API)
- when creating throttle group through "object-add" (qemu: Implement qemu driver for throttle API), reuse "qemuMonitorAddObject" to check if "objectAddNoWrap"("props") is required
- remove "virObject parent;" in "_virDomainThrottleFilterDef" in domain_conf.h
- remove "virDomainThrottleGroupIndexByName" in both domain_conf.h and domain_conf.c
- remove "virDomainThrottleFilterDefNew" in domain_conf.c
- update "virDomainThrottleFilterDefFree" to use "g_free" rather than "VIR_FREE" in domain_conf.c
- update "virDomainDiskThrottleFilterDefParse" to remove "xmlXPathContextPtr ctx" parameter and check NULL against "filter->group_name" in domain_conf.c
- use "virBufferEscapeString" instead of "virBufferAsprintf" in "virDomainDiskDefFormatThrottleFilterChain"
- use "group->val > 0" instead of "if (group->val)" in FORMAT_THROTTLE_GROUP
- remove NULL check for "group->group_name" since virBufferEscapeString checked NULL already in "virDomainThrottleGroupFormat"
- I haven't added new conf module (src/conf/virdomainthrottle.c/h) because "virDomainThrottleGroupDef" is alias of "_virDomainBlockIoTuneInfo", try to avoid circular dependency
- remove "NULLSTR" in qemuMonitorUpdateThrottleGroup and qemuMonitorGetThrottleGroup in qemu_monitor.c
- refactor "qemuMonitorMakeThrottleGroupLimits" to use virJSONValueObjectAdd in qemu_monitor_json.c
- use "g_strdup_printf" to avoid static buffers in "qemuMonitorJSONGetThrottleGroup"
- remove virReportError after qemuMonitorJSONGetReply in "qemuMonitorJSONGetThrottleGroup" to avoid overriding error
- remove "VIR_DOMAIN_THROTTLE_GROUP" in libvirt/include/libvirt/libvirt-domain.h
- update "virDomainGetThrottleGroup" to not first query the number of parameters
- update "remote_domain_get_throttle_group_args" and "remote_domain_get_throttle_group_ret" to remove "nparams" in src/remote_protocol-structs, also updated "remote_domain_get_throttle_group_args" and "remote_domain_get_throttle_group_ret" in src/remote/remote_protocol.x
- update parameter "virTypedParameterPtr params" to be "virTypedParameterPtr *params" in "virDrvDomainGetThrottleGroup" in driver-hypervisor.h
- update "qemuDomainSetThrottleGroup" to not query number of parameters first
- remove wrapper "qemuDomainThrottleGroupByName" and "qemuDomainSetThrottleGroupDefaults"
- refactor "qemuDomainSetThrottleGroup" and "qemuDomainSetBlockIoTune" to use common logic
- update "qemuDomainDelThrottleGroup" to use VIR_JOB_MODIFY by referencing "qemuDomainHotplugDelIOThread"
- check if group is still being used by some filter(qemuDomainCheckThrottleGroupRef) during deletion
- replace "ThrottleFilterChain" with "ThrottleFilters"
- update "qemuDomainDiskGetTopNodename" to take top throttle node name if disk has throttles, and reuse "qemuDomainDiskGetBackendAlias" in "qemuBuildDiskDeviceProps" to get top node name as "drive"
- once throttle filters are enabled and a disk has them, the top node name during blockcommit is no longer "libvirt-x-format"; instead it refers to the top filter, e.g. "libvirt-x-filter"
- add check "cdrom device with throttle filters isn't supported"
- delete "filternodenameindex" and reuse "nodenameindex" to generate index for throttle nodes
- refactor detaching filters by adding "qemuBuildThrottleFiltersDetachPrepareBlockdev" to just build parameters for "blockdev-del"
- refactor "testDomainSetBlockIoTune" and "testDomainSetThrottleGroup" to use common logic
Any comments/suggestions will be appreciated!
Chun Feng Wu (16):
schema: Add new domain elements to support multiple throttle groups
schema: Add new domain elements to support multiple throttle filters
config: Introduce ThrottleGroup and corresponding XML parsing
config: Introduce ThrottleFilter and corresponding XML parsing
qemu: monitor: Add support for ThrottleGroup operations
tests: Test qemuMonitorJSONGetThrottleGroup and
qemuMonitorJSONUpdateThrottleGroup
remote: New APIs for ThrottleGroup lifecycle management
qemu: Implement qemu driver for throttle API
qemu: hotplug: Support hot attach and detach block disk along with
throttle filters
config: validate: Refactor disk iotune validation for reuse
qemu: command: Support throttle groups during qemuProcessLaunch
qemu: command: Support throttle filters during qemuProcessLaunch
qemuxmlconftest: Add 'throttlefilter' tests
test_driver: Test throttle group lifecycle APIs
virsh: Add support for throttle group operations
virsh: Add option "throttle-groups" to "attach_disk"
docs/formatdomain.rst | 48 ++
include/libvirt/libvirt-domain.h | 21 +
src/conf/domain_conf.c | 376 +++++++++++
src/conf/domain_conf.h | 51 ++
src/conf/domain_validate.c | 118 +++-
src/conf/schemas/domaincommon.rng | 293 +++++----
src/conf/virconftypes.h | 4 +
src/driver-hypervisor.h | 22 +
src/libvirt-domain.c | 196 ++++++
src/libvirt_private.syms | 9 +
src/libvirt_public.syms | 7 +
src/qemu/qemu_block.c | 131 ++++
src/qemu/qemu_block.h | 53 ++
src/qemu/qemu_command.c | 182 ++++++
src/qemu/qemu_command.h | 12 +
src/qemu/qemu_domain.c | 39 +-
src/qemu/qemu_domain.h | 8 +
src/qemu/qemu_driver.c | 617 +++++++++++++++---
src/qemu/qemu_hotplug.c | 33 +
src/qemu/qemu_monitor.c | 34 +
src/qemu/qemu_monitor.h | 14 +
src/qemu/qemu_monitor_json.c | 150 +++++
src/qemu/qemu_monitor_json.h | 14 +
src/remote/remote_daemon_dispatch.c | 44 ++
src/remote/remote_driver.c | 40 ++
src/remote/remote_protocol.x | 48 +-
src/remote_protocol-structs | 28 +
src/test/test_driver.c | 452 +++++++++----
tests/qemumonitorjsontest.c | 86 +++
.../throttlefilter.x86_64-latest.args | 43 ++
.../throttlefilter.x86_64-latest.xml | 65 ++
tests/qemuxmlconfdata/throttlefilter.xml | 55 ++
tests/qemuxmlconftest.c | 1 +
tools/virsh-completer-domain.c | 64 ++
tools/virsh-completer-domain.h | 5 +
tools/virsh-domain.c | 453 ++++++++++++-
36 files changed, 3447 insertions(+), 369 deletions(-)
create mode 100644 tests/qemuxmlconfdata/throttlefilter.x86_64-latest.args
create mode 100644 tests/qemuxmlconfdata/throttlefilter.x86_64-latest.xml
create mode 100644 tests/qemuxmlconfdata/throttlefilter.xml
--
2.34.1
7 months, 2 weeks
[PATCH RFC 0/9] qemu: Support mapped-ram migration capability
by Jim Fehlig
This series is an RFC for support of QEMU's mapped-ram migration
capability [1] for saving and restoring VMs. It implements the first
part of the design approach we discussed for supporting parallel
save/restore [2]. In summary, the approach is
1. Add mapped-ram migration capability
2. Steal an element from save header 'unused' for a 'features' variable
and bump save version to 3.
3. Add /etc/libvirt/qemu.conf knob for the save format version,
defaulting to latest v3
4. Use v3 (aka mapped-ram) by default
5. Use mapped-ram with BYPASS_CACHE for v3, old approach for v2
6. include: Define constants for parallel save/restore
7. qemu: Add support for parallel save. Implies mapped-ram, reject if v2
8. qemu: Add support for parallel restore. Implies mapped-ram.
Reject if v2
9. tools: add parallel parameter to virsh save command
10. tools: add parallel parameter to virsh restore command
This series implements 1-5, with the BYPASS_CACHE support in patches 8
and 9 being quite hacky. They are included to discuss approaches to make
them less hacky. See the patches for details.
The QEMU mapped-ram capability currently does not support directio.
Fabiano is working on that now [3]. This complicates merging support
in libvirt. I don't think it's reasonable to enable mapped-ram by
default when BYPASS_CACHE cannot be supported. Should we wait until
the mapped-ram directio support is merged in QEMU before supporting
mapped-ram in libvirt?
For the moment, compression is ignored in the new save version.
Currently, libvirt connects the output of QEMU's save stream to the
specified compression program via a pipe. This approach is incompatible
with mapped-ram since the fd provided to QEMU must be seekable. One
option is to reopen and compress the saved image after the actual save
operation has completed. This has the downside of requiring the iohelper
to handle BYPASS_CACHE, which would preclude us from removing it
sometime in the future. Other suggestions much welcomed.
Note the logical file size of mapped-ram saved images is slightly
larger than guest RAM size, so the files are often much larger than the
files produced by the existing, sequential format. However, actual blocks
actually written to disk are often fewer with mapped-ram saved images. E.g. a saved
image from a 30G, freshly booted, idle guest results in the following
'Size' and 'Blocks' values reported by stat(1)
             Size          Blocks
sequential   998595770     1950392
mapped-ram   34368584225   1800456
With the same guest running a workload that dirties memory
             Size          Blocks
sequential   33173330615   64791672
mapped-ram   34368578210   64706944
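To compare the two columns directly, the 'Blocks' values can be converted to bytes. The sketch below assumes stat(1) reports blocks in 512-byte units, which is the usual default but is worth checking with "stat -c %B" on the system in question; the helper name and the dictionary labels are only illustrative.

```python
# Assumption: stat(1) 'Blocks' are counted in 512-byte units.
STAT_BLOCK_BYTES = 512

# (logical size in bytes, blocks) as reported above.
IMAGES = {
    "idle guest": {
        "sequential": (998595770, 1950392),
        "mapped-ram": (34368584225, 1800456),
    },
    "dirtying workload": {
        "sequential": (33173330615, 64791672),
        "mapped-ram": (34368578210, 64706944),
    },
}


def allocated_bytes(blocks):
    """Bytes actually allocated on disk for a given block count."""
    return blocks * STAT_BLOCK_BYTES


if __name__ == "__main__":
    for scenario, formats in IMAGES.items():
        print(scenario)
        for name, (size, blocks) in formats.items():
            on_disk = allocated_bytes(blocks)
            # A ratio well below 1 indicates a sparse file.
            print(f"  {name:<11} size={size} on-disk={on_disk} "
                  f"ratio={on_disk / size:.3f}")
```

With these numbers the mapped-ram image of the idle guest is mostly holes, while the sequential image is allocated close to its logical size.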
Thanks for any comments on this RFC!
[1] https://gitlab.com/qemu-project/qemu/-/blob/master/docs/devel/migration/m...
[2] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/K...
[3] https://mail.gnu.org/archive/html/qemu-devel/2024-05/msg04432.html
Jim Fehlig (9):
qemu: Enable mapped-ram migration capability
qemu_fd: Add function to retrieve fdset ID
qemu: Add function to get migration params for save
qemu: Add a 'features' element to save image header and bump version
qemu: conf: Add setting for save image version
qemu: Add support for mapped-ram on save
qemu: Enable mapped-ram on restore
qemu: Support O_DIRECT with mapped-ram on save
qemu: Support O_DIRECT with mapped-ram on restore
src/qemu/libvirtd_qemu.aug | 1 +
src/qemu/qemu.conf.in | 6 +
src/qemu/qemu_conf.c | 8 ++
src/qemu/qemu_conf.h | 1 +
src/qemu/qemu_driver.c | 25 ++--
src/qemu/qemu_fd.c | 18 +++
src/qemu/qemu_fd.h | 3 +
src/qemu/qemu_migration.c | 99 ++++++++++++++-
src/qemu/qemu_migration.h | 11 +-
src/qemu/qemu_migration_params.c | 20 +++
src/qemu/qemu_migration_params.h | 4 +
src/qemu/qemu_monitor.c | 40 ++++++
src/qemu/qemu_monitor.h | 5 +
src/qemu/qemu_process.c | 63 +++++++---
src/qemu/qemu_process.h | 16 ++-
src/qemu/qemu_saveimage.c | 187 +++++++++++++++++++++++------
src/qemu/qemu_saveimage.h | 20 ++-
src/qemu/qemu_snapshot.c | 12 +-
src/qemu/test_libvirtd_qemu.aug.in | 1 +
19 files changed, 455 insertions(+), 85 deletions(-)
--
2.44.0
7 months, 3 weeks