Hi
Thanks for following up, and sorry for the delay in getting back to you.
You're right to suspect the issue might be related to device changes. Here’s how the
crash can be triggered:
The VM initially uses a SATA controller, with a disk defined as:
xml
复制编辑
<controller type="scsi" index="0"
model="lsilogic"></controller> <disk type='file'
device='disk'> <driver name='qemu' type='qcow2'/>
<source file='/var/lib/libvirt/images/Testguest.qcow2'/> <target
dev='sda' bus='sata'/> </disk>
A snapshot is created at this point — which records the disk as sda.
Later, the VM is reconfigured to use a virtio controller, and the disk is now assigned as
vda.
When the VM is running and the snapshot is deleted, the snapshot code still expects to
find a disk named sda in the current VM definition.
Because of this mismatch, qemuDomainDiskByName() returns NULL, and the crash occurs when
the result is used without a null check.
This can easily happen during controller or disk bus reconfiguration between snapshot and
deletion. The patch adds sanity checks to ensure we don’t dereference a null pointer in
this situation.
Let me know if you’d like me to adjust the wording in the error messages or add a
reproducer for automated testing.
Best regards,
kaihuan
Martin Kletzander <mkletzan(a)redhat.com> 于2025年1月22日周三 21:05写道:
On Fri, Nov 29, 2024 at 10:56:45PM +0800, kaihuan wrote:
qemuDomainDiskByName() can return a NULL pointer on failure.
But this returned value in qemuSnapshotDeleteValidate is not checked.It will make libvirtd
crash.
Hi, looking through old unreviewed patches I found this one. Sorry for
the wait. Can you explain whether you found out how to trigger that?
This looks like there is some other issue that causes a snapshot having
a disk that is not in the domain. Does that happen when you want to
delete a snapshot after unplugging a disk?
Signed-off-by: kaihuan <jungleman759(a)gmail.com>
---
src/qemu/qemu_snapshot.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c
index 18b2e478f6..bcbd913073 100644
--- a/src/qemu/qemu_snapshot.c
+++ b/src/qemu/qemu_snapshot.c
@@ -4242,8 +4242,19 @@ qemuSnapshotDeleteValidate(virDomainObj *vm,
virDomainDiskDef *vmdisk = NULL;
virDomainDiskDef *disk = NULL;
- vmdisk = qemuDomainDiskByName(vm->def, snapDisk->name);
- disk = qemuDomainDiskByName(snapdef->parent.dom, snapDisk->name);
The
function calls already report an error when the disk is not found,
so there is not need to rewrite that error in my opinion.
+ if (!(vmdisk = qemuDomainDiskByName(vm->def, snapDisk->name)))
{
+ virReportError(VIR_ERR_OPERATION_FAILED,
+ _("disk '%1$s' referenced by snapshot '%2$s' not found in the
current definition"),
+ snapDisk->name, snap->def->name);
+ return -1;
+ }
+
+ if (!(disk = qemuDomainDiskByName(snapdef->parent.dom, snapDisk->name))) {
+ virReportError(VIR_ERR_OPERATION_FAILED,
+ _("disk '%1$s' referenced by snapshot '%2$s' not found in the VM
definition of the deleted snapshot"),
+ snapDisk->name, snap->def->name);
+ return -1;
+ }
if (!virStorageSourceIsSameLocation(vmdisk->src, disk->src)) {
virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
--
2.33.1.windows.1