[PATCH] qemu: Report error from both sides of migration
by Jiri Denemark
When migration fails in Perform phase, we call Finish on the destination
host with cancelled=1 and get the error from there and report it to the
user. This works well if the error on the destination caused the
migration to fail. But in other cases the main error may be reported by the
source and the destination would just be complaining about broken
migration stream.
In other words, we don't really know which error caused the migration to
fail and we have no way of detecting that. So instead of choosing one
error, this patch will combine the error messages from both sides of
migration into a single message and report it to the user. The result
would be, for example:
operation failed: migration failed. Message from the source host:
operation failed: job 'migration out' failed: Certificate does not
match the hostname ble.bla. Message from the destination host:
operation failed: job 'migration in' failed: load of migration
failed: Invalid argument
And yes, this is ugly, but I wasn't able to come up with a better way of
fixing this issue.
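A minimal standalone sketch of the combined message format, for illustration only. The helper name combine_migration_errors and the "(no message)" fallback are assumptions and not part of libvirt; the patch itself uses virReportError() with the format string shown in the diff.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not a libvirt API: joins the source and destination
 * error messages into one string using the wording from the example above.
 * Returns what snprintf() returns, i.e. the length the full message would
 * have, so the caller can detect truncation by comparing it with buflen. */
static int
combine_migration_errors(char *buf, size_t buflen,
                         const char *src_msg, const char *dst_msg)
{
    assert(buf != NULL && buflen > 0);

    return snprintf(buf, buflen,
                    "migration failed. Message from the source host: %s. "
                    "Message from the destination host: %s",
                    src_msg ? src_msg : "(no message)",
                    dst_msg ? dst_msg : "(no message)");
}
```

A caller would pass the message saved from Perform as src_msg and the message left by Finish as dst_msg, and treat a return value >= buflen as a truncated result.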
https://issues.redhat.com/browse/RHEL-58933
Signed-off-by: Jiri Denemark <jdenemar(a)redhat.com>
---
src/libvirt-domain.c | 26 +++++++++++++-------------
src/qemu/qemu_migration.c | 26 +++++++++++++-------------
2 files changed, 26 insertions(+), 26 deletions(-)
diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c
index e8e5379672..efccafc4d2 100644
--- a/src/libvirt-domain.c
+++ b/src/libvirt-domain.c
@@ -3430,26 +3430,26 @@ virDomainMigrateVersion3Full(virDomainPtr domain,
if (ddomain) {
VIR_ERROR(_("finish step ignored that migration was cancelled"));
} else {
- /* If Finish reported a useful error, use it instead of the
- * original "migration unexpectedly failed" error.
+ virErrorPtr err = virGetLastError();
+ /* When both Confirm and Finish reported an error in QEMU driver,
+ * we don't really know which error is the root cause. Let's report
+ * both errors to the user.
*
* This is ugly but we can't do better with the APIs we have. We
* only replace the error if Finish was called with cancelled == 1
* and reported a real error (old libvirt would report an error
- * from RPC instead of MIGRATE_FINISH_OK), which only happens when
- * the domain died on destination. To further reduce a possibility
- * of false positives we also check that Perform returned
- * VIR_ERR_OPERATION_FAILED.
+ * from RPC instead of MIGRATE_FINISH_OK).
*/
if (orig_err &&
orig_err->domain == VIR_FROM_QEMU &&
- orig_err->code == VIR_ERR_OPERATION_FAILED) {
- virErrorPtr err = virGetLastError();
- if (err &&
- err->domain == VIR_FROM_QEMU &&
- err->code != VIR_ERR_MIGRATE_FINISH_OK) {
- g_clear_pointer(&orig_err, virFreeError);
- }
+ orig_err->code == VIR_ERR_OPERATION_FAILED &&
+ err &&
+ err->domain == VIR_FROM_QEMU &&
+ err->code != VIR_ERR_MIGRATE_FINISH_OK) {
+ virReportError(VIR_ERR_OPERATION_FAILED,
+ _("migration failed. Message from the source host: %1$s. Message from the destination host: %2$s"),
+ orig_err->message, err->message);
+ g_clear_pointer(&orig_err, virFreeError);
}
}
}
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 1582a738a3..8c0e522828 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -5904,26 +5904,26 @@ qemuMigrationSrcPerformPeer2Peer3(virQEMUDriver *driver,
if (ddomain) {
VIR_ERROR(_("finish step ignored that migration was cancelled"));
} else {
- /* If Finish reported a useful error, use it instead of the
- * original "migration unexpectedly failed" error.
+ virErrorPtr err = virGetLastError();
+ /* When both Confirm and Finish reported an error in QEMU driver,
+ * we don't really know which error is the root cause. Let's report
+ * both errors to the user.
*
* This is ugly but we can't do better with the APIs we have. We
* only replace the error if Finish was called with cancelled == 1
* and reported a real error (old libvirt would report an error
- * from RPC instead of MIGRATE_FINISH_OK), which only happens when
- * the domain died on destination. To further reduce a possibility
- * of false positives we also check that Perform returned
- * VIR_ERR_OPERATION_FAILED.
+ * from RPC instead of MIGRATE_FINISH_OK).
*/
if (orig_err &&
orig_err->domain == VIR_FROM_QEMU &&
- orig_err->code == VIR_ERR_OPERATION_FAILED) {
- virErrorPtr err = virGetLastError();
- if (err &&
- err->domain == VIR_FROM_QEMU &&
- err->code != VIR_ERR_MIGRATE_FINISH_OK) {
- g_clear_pointer(&orig_err, virFreeError);
- }
+ orig_err->code == VIR_ERR_OPERATION_FAILED &&
+ err &&
+ err->domain == VIR_FROM_QEMU &&
+ err->code != VIR_ERR_MIGRATE_FINISH_OK) {
+ virReportError(VIR_ERR_OPERATION_FAILED,
+ _("migration failed. Message from the source host: %1$s. Message from the destination host: %2$s"),
+ orig_err->message, err->message);
+ g_clear_pointer(&orig_err, virFreeError);
}
}
}
--
2.47.1
3 weeks, 4 days
[PATCH 0/2] tests: Add qemu-9.2 dev cycle caps on s390x
by Michal Privoznik
NB, the capabilities were captured in a VM running on an actual s390x
machine so there might be some CPU features (incorrectly) missing. But
it's the best I can do.
Michal Prívozník (2):
qemuxmlconftest: Switch s390-default-cpu-...-virtio-2.7 to virtio-2.9
tests: qemucapabilities: Add test data for the qemu-9.2 dev cycle on
s390x
tests/domaincapsdata/qemu_9.3.0.s390x.xml | 436 +
.../caps_9.3.0_s390x.replies | 38101 ++++++++++++++++
.../qemucapabilitiesdata/caps_9.3.0_s390x.xml | 4268 ++
...-deprecated-features-off.s390x-latest.args | 2 +-
...default-video-type-s390x.s390x-latest.args | 2 +-
...vfio-zpci-ccw-memballoon.s390x-latest.args | 2 +-
.../launch-security-s390-pv.s390x-latest.args | 2 +-
...-cpu-kvm-ccw-virtio-2.9.s390x-latest.args} | 4 +-
...t-cpu-kvm-ccw-virtio-2.9.s390x-latest.xml} | 4 +-
...> s390-default-cpu-kvm-ccw-virtio-2.9.xml} | 2 +-
...t-cpu-kvm-ccw-virtio-4.2.s390x-latest.args | 2 +-
...-cpu-tcg-ccw-virtio-2.9.s390x-latest.args} | 2 +-
...t-cpu-tcg-ccw-virtio-2.9.s390x-latest.xml} | 2 +-
...> s390-default-cpu-tcg-ccw-virtio-2.9.xml} | 2 +-
.../s390-defaultconsole.s390x-latest.args | 2 +-
.../s390-panic.s390x-latest.args | 2 +-
tests/qemuxmlconftest.c | 4 +-
17 files changed, 42822 insertions(+), 17 deletions(-)
create mode 100644 tests/domaincapsdata/qemu_9.3.0.s390x.xml
create mode 100644 tests/qemucapabilitiesdata/caps_9.3.0_s390x.replies
create mode 100644 tests/qemucapabilitiesdata/caps_9.3.0_s390x.xml
rename tests/qemuxmlconfdata/{s390-default-cpu-kvm-ccw-virtio-2.7.s390x-latest.args => s390-default-cpu-kvm-ccw-virtio-2.9.s390x-latest.args} (70%)
rename tests/qemuxmlconfdata/{s390-default-cpu-kvm-ccw-virtio-2.7.s390x-latest.xml => s390-default-cpu-kvm-ccw-virtio-2.9.s390x-latest.xml} (86%)
rename tests/qemuxmlconfdata/{s390-default-cpu-kvm-ccw-virtio-2.7.xml => s390-default-cpu-kvm-ccw-virtio-2.9.xml} (85%)
rename tests/qemuxmlconfdata/{s390-default-cpu-tcg-ccw-virtio-2.7.s390x-latest.args => s390-default-cpu-tcg-ccw-virtio-2.9.s390x-latest.args} (94%)
rename tests/qemuxmlconfdata/{s390-default-cpu-tcg-ccw-virtio-2.7.s390x-latest.xml => s390-default-cpu-tcg-ccw-virtio-2.9.s390x-latest.xml} (92%)
rename tests/qemuxmlconfdata/{s390-default-cpu-tcg-ccw-virtio-2.7.xml => s390-default-cpu-tcg-ccw-virtio-2.9.xml} (85%)
--
2.45.2
3 weeks, 4 days
Release of libvirt-11.0.0
by Jiri Denemark
The 11.0.0 release of both libvirt and libvirt-python is tagged and
signed tarballs are available at
https://download.libvirt.org/
https://download.libvirt.org/python/
Thanks everybody who helped with this release by sending patches,
reviewing, testing, or providing feedback. Your work is greatly
appreciated.
* New features
* network/qemu/lxc: support vlans on standard Linux host bridges
The network, qemu, and lxc drivers now support (using the
``<vlan>`` subelement) vlan tagging and trunking on network
interfaces connected to a standard Linux host bridge.
* qemu: Add support for direct and extended tlbflush features
Domains can now utilise more tlbflush hyperv features.
* Improvements
* ch: Enable user aliases
Users can now specify custom aliases for devices in domain XML.
* qemu: Grab a QUERY job when formatting domain XML
Under some specific conditions it might have happened that domain XML did
not contain runtime information or returned an XML that's in process of
changing (e.g. by a thread that's hotplugging a device). Formatting domain
XML now serializes properly with other threads.
* virtiofs: Allow read only mode
The ``<filesystem/>`` with `virtiofsd` backend can now use ``<readonly/>``
tag to export underlying filesystem in read only mode.
* qemu: allow migration of vGPU from mdev device <-> SRIOV VF device
Some GPU vendors are switching from vGPUs created using
mdev and identified by a UUID, to vGPUs created as SRIOV VFs and
identified by their PCI address, and want to support live
migration from a host using one type of vGPU to the other
type. This is now possible.
* Bug fixes
* qemu: tpm: do not update profile name for transient domains
Fix a possible crash when starting a transient domain which was
introduced in the previous release.
* qemu: Fix snapshot to not delete disk image with internal snapshot
When a VM has an internal snapshot that is the parent of an external snapshot,
and the user reverts to the internal snapshot and deletes the external snapshot,
libvirt would delete the disk image containing the internal snapshot. This would
result in data loss.
* qemu: Do not format invalid XML with hyperv features in passthrough mode
When hyperv features were specified together with ``mode="passthrough"``
libvirt parsed and formatted such features in the domain XML even though
they were not used at all, resulting in XML that is not valid based on our
schema. This is now fixed by not parsing any specified features when the
passthrough mode is used.
* qemu: Fix a crash when starting a domain with ovs bridge and QOS
* cpu: Add missing -v1 variants for CPU models
Some CPU models (mostly old ones) were missed when versioned CPU model
names were introduced in the previous release.
* qemu: Fix false error when recovering failed post-copy migration
In some cases libvirt would report a failure to recover post-copy migration
even though the recovery started just fine and migration would eventually
successfully finish.
Enjoy.
Jirka
3 weeks, 5 days
[PATCH 00/20] qemu: support mapped-ram+directio+multifd
by Jim Fehlig
This series is essentially V1 of a prior RFC [1] to support QEMU's
mapped-ram stream format [2] and migration capability. Along with
supporting mapped-ram, it implements a design approach we discussed
for supporting parallel save/restore [3]. In summary, the approach is:
1. Add mapped-ram migration capability
2. Steal an element from save header 'unused' for a 'features' variable
and bump save version to 3.
3. Add /etc/libvirt/qemu.conf knob for the save format version,
defaulting to latest v3
4. Use v3 (aka mapped-ram) by default
5. Use mapped-ram with BYPASS_CACHE for v3, old approach for v2
6. include: Define constants for parallel save/restore
7. qemu: Add support for parallel save. Implies mapped-ram, reject if v2
8. qemu: Add support for parallel restore. Implies mapped-ram.
Reject if v2
9. tools: add parallel parameter to virsh save command
10. tools: add parallel parameter to virsh restore command
With this series, saving and restoring using mapped-ram is enabled by
default if the underlying QEMU advertises the mapped-ram migration
capability. It can be disabled by changing the 'save_image_version'
setting in qemu.conf.
To use mapped-ram with QEMU:
- The 'mapped-ram' migration capability must be set to true
- The 'multifd' migration capability must be set to true and
the 'multifd-channels' migration parameter must be set to a
value >= 1
- QEMU must be provided an fdset containing the migration fd(s)
- The 'migrate' qmp command is invoked with a URI referencing the fdset
and an offset where to start reading or writing the data stream, e.g.
{"execute":"migrate",
"arguments":{"detach":true,"resume":false,
"uri":"file:/dev/fdset/0,offset=0x11921"}}
The mapped-ram stream, in conjunction with direct IO and multifd, can
significantly improve the time required to save VM memory state. The
following tables compare mapped-ram with the existing, sequential save
stream. In all cases, the save and restore operations are to/from a
block device comprised of two NVMe disks in RAID0 configuration with
xfs (~8600MiB/s). The values in the 'save time' and 'restore time'
columns were scraped from the 'real' time reported by time(1). The
'Size' and 'Blocks' columns were provided by the corresponding
outputs of stat(1).
VM: 32G RAM, 1 vcpu, idle (shortly after boot)
| save | restore |
| time | time | Size | Blocks
-----------------------+---------+---------+--------------+--------
legacy | 6.193s | 4.399s | 985744812 | 1925288
-----------------------+---------+---------+--------------+--------
mapped-ram | 5.109s | 1.176s | 34368554354 | 1774472
-----------------------+---------+---------+--------------+--------
legacy + direct IO | 5.725s | 4.512s | 985765251 | 1925328
-----------------------+---------+---------+--------------+--------
mapped-ram + direct IO | 4.627s | 1.490s | 34368554354 | 1774304
-----------------------+---------+---------+--------------+--------
mapped-ram + direct IO | | | |
+ multifd-channels=8 | 4.421s | 0.845s | 34368554318 | 1774312
-------------------------------------------------------------------
VM: 32G RAM, 30G dirty, 1 vcpu in tight loop dirtying memory
| save | restore |
| time | time | Size | Blocks
-----------------------+---------+---------+--------------+---------
legacy | 25.800s | 14.332s | 33154309983 | 64754512
-----------------------+---------+---------+--------------+---------
mapped-ram | 18.742s | 15.027s | 34368559228 | 64617160
-----------------------+---------+---------+--------------+---------
legacy + direct IO | 13.115s | 18.050s | 33154310496 | 64754520
-----------------------+---------+---------+--------------+---------
mapped-ram + direct IO | 13.623s | 15.959s | 34368557392 | 64662040
-----------------------+---------+---------+--------------+---------
mapped-ram + direct IO | | | |
+ multifd-channels=8 | 6.994s | 6.470s | 34368554980 | 64665776
--------------------------------------------------------------------
As can be seen from the tables, one caveat of mapped-ram is the logical file
size of a saved image is basically equivalent to the VM memory size. Note
however that mapped-ram typically uses fewer blocks on disk.
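The difference between logical size and allocated space can be checked with stat(2). The sketch below assumes Linux, where st_blocks (and the 'Blocks' value printed by stat(1)) is counted in 512-byte units; the helper names are made up for illustration. Under that assumption the idle mapped-ram image above has a logical size of 34368554354 bytes but only 1774472 * 512 = 908529664 bytes allocated.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Converts a block count as reported in st_blocks to bytes.
 * Assumes 512-byte units, which is what Linux uses. */
static long long
blocks_to_bytes(long long blocks)
{
    assert(blocks >= 0);

    return blocks * 512LL;
}

/* Fills in the logical size (st_size) and the allocated size
 * (st_blocks * 512) of a file. Returns 0 on success, -1 if stat() fails. */
static int
file_sizes(const char *path, long long *logical, long long *allocated)
{
    struct stat st;

    if (stat(path, &st) < 0)
        return -1;

    *logical = (long long)st.st_size;
    *allocated = blocks_to_bytes((long long)st.st_blocks);
    return 0;
}
```

For a sparse file such as a mapped-ram image of a mostly idle guest, the allocated value is expected to be much smaller than the logical one.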
Support for mapped-ram+direct-io only recently landed in upstream QEMU
and will first appear in the 9.1 release, which may complicate merging
support in libvirt. Specifically, I'm not sure how to detect if the
combination is supported by QEMU. Suggestions welcomed.
Similar to the RFC, V1 ignores compression. libvirt currently supports
compression by connecting the output of QEMU's save stream to the specified
compression program via a pipe. This approach is incompatible with mapped-ram
since the fd provided to QEMU must be seekable. In general, we can consider
mapped-ram and compression incompatible and document they cannot be used
together.
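The seekability requirement is easy to demonstrate outside of libvirt. A small sketch, assuming POSIX lseek() semantics where seeking on a pipe, FIFO, or socket fails with ESPIPE; the function name fd_is_seekable is hypothetical and not taken from this series.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the fd supports seeking, 0 if it is a pipe, FIFO, or
 * socket (lseek() fails with ESPIPE), and -1 on any other error such
 * as an invalid descriptor. The file offset is left unchanged. */
static int
fd_is_seekable(int fd)
{
    if (lseek(fd, 0, SEEK_CUR) != (off_t)-1)
        return 1;

    return errno == ESPIPE ? 0 : -1;
}
```

An fd connected to a compression program through a pipe would return 0 here, which is why such an fd cannot be handed to QEMU for a mapped-ram stream.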
[1] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/E...
[2] https://gitlab.com/qemu-project/qemu/-/blob/master/docs/devel/migration/m...
[3] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/K...
Claudio Fontana (2):
include: Define constants for parallel save/restore
tools: add parallel parameter to virsh restore command
Jim Fehlig (17):
lib: virDomainSaveParams: Ensure absolute save path
qemu_fd: Add function to retrieve fdset ID
qemu: Add function to check capability in migration params
qemu: Add function to get bool value from migration params
qemu: Add mapped-ram migration capability
qemu: Add function to get migration params for save
qemu: QEMU_SAVE_VERSION: Bump to version 3
qemu: conf: Add setting for save image version
qemu: Add helper function for creating save image fd
qemu: Add support for mapped-ram on save
qemu: Decompose qemuSaveImageOpen
qemu: Move creation of qemuProcessIncomingDef struct
qemu: Apply migration parameters in qemuMigrationDstRun
qemu: Add support for mapped-ram on restore
qemu: Support O_DIRECT with mapped-ram on save
qemu: Support O_DIRECT with mapped-ram on restore
qemu: Add support for parallel save and restore
Li Zhang (1):
tools: add parallel parameter to virsh save command
docs/manpages/virsh.rst | 9 +-
include/libvirt/libvirt-domain.h | 13 ++
src/libvirt-domain.c | 52 +++++--
src/qemu/libvirtd_qemu.aug | 1 +
src/qemu/qemu.conf.in | 6 +
src/qemu/qemu_conf.c | 16 +++
src/qemu/qemu_conf.h | 5 +
src/qemu/qemu_driver.c | 104 +++++++++-----
src/qemu/qemu_fd.c | 18 +++
src/qemu/qemu_fd.h | 3 +
src/qemu/qemu_migration.c | 192 +++++++++++++++++--------
src/qemu/qemu_migration.h | 9 +-
src/qemu/qemu_migration_params.c | 86 ++++++++++++
src/qemu/qemu_migration_params.h | 17 +++
src/qemu/qemu_monitor.c | 39 ++++++
src/qemu/qemu_monitor.h | 5 +
src/qemu/qemu_process.c | 120 +++++++++++-----
src/qemu/qemu_process.h | 19 ++-
src/qemu/qemu_saveimage.c | 216 ++++++++++++++++++++---------
src/qemu/qemu_saveimage.h | 35 +++--
src/qemu/qemu_snapshot.c | 26 ++--
src/qemu/test_libvirtd_qemu.aug.in | 1 +
tools/virsh-domain.c | 79 +++++++++--
23 files changed, 827 insertions(+), 244 deletions(-)
--
2.35.3
3 weeks, 6 days
[PATCH] NEWS: Document some of my fixes in this release
by Jiri Denemark
Signed-off-by: Jiri Denemark <jdenemar(a)redhat.com>
---
NEWS.rst | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/NEWS.rst b/NEWS.rst
index 15042595ed..1d3e3c3cff 100644
--- a/NEWS.rst
+++ b/NEWS.rst
@@ -61,6 +61,19 @@ v11.0.0 (unreleased)
schema. This is now fixed by not parsing any specified features when the
passthrough mode is used.
+ * qemu: Fix a crash when starting a domain with ovs bridge and QOS
+
+ * cpu: Add missing -v1 variants for CPU models
+
+ Some CPU models (mostly old ones) were missed when versioned CPU model
+ names were introduced in the previous release.
+
+ * qemu: Fix false error when recovering failed post-copy migration
+
+ In some cases libvirt would report a failure to recover post-copy migration
+ even though the recovery started just fine and migration would eventually
+ successfully finish.
+
v10.10.0 (2024-12-02)
=====================
--
2.47.1
3 weeks, 6 days
[PATCH] NEWS: Add few things I changed this release
by Martin Kletzander
Signed-off-by: Martin Kletzander <mkletzan(a)redhat.com>
---
NEWS.rst | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/NEWS.rst b/NEWS.rst
index 9f0804ec67ff..02679602cc6a 100644
--- a/NEWS.rst
+++ b/NEWS.rst
@@ -17,6 +17,10 @@ v11.0.0 (unreleased)
* **New features**
+ * qemu: Add support for direct and extended tlbflush features
+
+ Domains can now utilise more tlbflush hyperv features.
+
* **Improvements**
* ch: Enable user aliases
@@ -42,6 +46,14 @@ v11.0.0 (unreleased)
Fix a possible crash when starting a transient domain which was
introduced in the previous release.
+ * qemu: Do not format invalid XML with hyperv features in passthrough mode
+
+ When hyperv features were specified together with ``mode="passthrough"``
+ libvirt parsed and formatted such features in the domain XML even though
+ they were not used at all, resulting in XML that is not valid based on our
+ schema. This is now fixed by not parsing any specified features when the
+ passthrough mode is used.
+
v10.10.0 (2024-12-02)
=====================
--
2.48.0
3 weeks, 6 days
[PATCH] NEWS: Document features/improvements/bug fixes I've participated in
by Michal Privoznik
There are some features/improvements/bug fixes I've either
contributed or reviewed/merged. Document them for upcoming
release.
Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com>
---
NEWS.rst | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/NEWS.rst b/NEWS.rst
index eedb6008bd..9f0804ec67 100644
--- a/NEWS.rst
+++ b/NEWS.rst
@@ -19,6 +19,22 @@ v11.0.0 (unreleased)
* **Improvements**
+ * ch: Enable user aliases
+
+ User can now specify custom aliases for devices in domain XML
+
+ * qemu: Grab a QUERY job when formatting domain XML
+
+ Under some specific conditions it might have happened that domain XML did
+ not contain runtime information or returned an XML that's in process of
+ changing (e.g. by a thread that's hotplugging a device). Formatting domain
+ XML now serializes properly with other threads.
+
+ * virtiofs: Allow read only mode
+
+ The ``<filesystem/>`` with `virtiofsd` backend can now use ``<readonly/>``
+ tag to export underlying filesystem in read only mode.
+
* **Bug fixes**
* qemu: tpm: do not update profile name for transient domains
--
2.45.2
3 weeks, 6 days
[libvirt PATCH] NEWS: document bug fix for snapshots
by Pavel Hrdina
Signed-off-by: Pavel Hrdina <phrdina(a)redhat.com>
---
NEWS.rst | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/NEWS.rst b/NEWS.rst
index 08d1d24d48..98f489dfb0 100644
--- a/NEWS.rst
+++ b/NEWS.rst
@@ -21,6 +21,13 @@ v11.0.0 (unreleased)
* **Bug fixes**
+ * qemu: Fix snapshot to not delete disk image with internal snapshot
+
+ When a VM has internal snapshot that is parent to external snapshot and user
+ reverts to the internal snapshot and deletes the external snapshot Libvirt
+ would delete the disk image containing the internal snapshot. This would result
+ in data loss.
+
v10.10.0 (2024-12-02)
=====================
--
2.47.1
3 weeks, 6 days
[libvirt PATCH] NEWS: document fix for starting transient domains
by Ján Tomko
Signed-off-by: Ján Tomko <jtomko(a)redhat.com>
---
NEWS.rst | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/NEWS.rst b/NEWS.rst
index 08d1d24d48..eedb6008bd 100644
--- a/NEWS.rst
+++ b/NEWS.rst
@@ -21,6 +21,11 @@ v11.0.0 (unreleased)
* **Bug fixes**
+ * qemu: tpm: do not update profile name for transient domains
+
+ Fix a possible crash when starting a transient domain which was
+ introduced in the previous release.
+
v10.10.0 (2024-12-02)
=====================
--
2.47.0
3 weeks, 6 days