Posted to
https://bugzilla.redhat.com/show_bug.cgi?id=1217185
I just stumbled on another bug while snapshotting. I think it's related
to bugs 1210903 and 1197592, since it looks like some sort of race
condition: it depends on what logging is in place and doesn't happen
every time.
Here are the details:
I wrote this test script to snapshot and commit over and over:
#!/bin/sh
# Snapshot the domain, commit the overlay back into the base image,
# and clean up, over and over.
while true; do
    echo "Starting snapshot test $(date)"
    virsh snapshot-create-as test 20150429 20150429-backup --disk-only --atomic
    virsh domblklist test
    virsh blockcommit test vda --active --pivot --verbose
    virsh snapshot-delete test 20150429 --metadata
    virsh domblklist test
    rm /glustervol1/vm/test/test.20150429
    echo "Ending snapshot test $(date)"
    echo
    echo
    sleep 2
done
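Because the loop keeps going after a step fails, the broken state only
shows up mixed into the next iteration's output. A minimal sketch of a
wrapper that reports the failing command, so the loop can stop right
there, might look like this. The name run_step is hypothetical and not
part of the original script; it assumes virsh exits non-zero when it
prints an error.

```sh
#!/bin/sh
# Hypothetical helper (not in the original script): run one command,
# report it on stderr if it fails, and pass its exit status through.
run_step() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "FAILED (exit $status): $*" >&2
    fi
    return "$status"
}

# Possible use inside the loop (assumption: stop at the first failure so
# the domain can be inspected in exactly that state):
#   run_step virsh blockcommit test vda --active --pivot --verbose || break
```

Stopping at the first failure keeps the domain in the state that
triggered the error instead of running further virsh commands against it.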
If I run libvirtd in the foreground with debug set to 1, I can't get it
to fail; it does what it's supposed to do, snapshotting and committing
over and over.
If I run libvirtd in the foreground with debug set to 3, I always
eventually get this:
Starting snapshot test Wed Apr 29 09:34:34 AKDT 2015
Domain snapshot 20150429 created
Target     Source
------------------------------------------------
vda        /glustervol1/vm/test/test.20150429
hdc        /dev/sr0
Block Commit: [100 %]
Successfully pivoted
Domain snapshot 20150429 deleted
Target     Source
------------------------------------------------
vda        /glustervol1/vm/test/test.20150429
hdc        /dev/sr0
Ending snapshot test Wed Apr 29 09:34:35 AKDT 2015
Starting snapshot test Wed Apr 29 09:34:37 AKDT 2015
error: unsupported configuration: source for disk 'vda' is not a regular file; refusing to generate external snapshot name
Target     Source
------------------------------------------------
vda        /glustervol1/vm/test/test.20150429
hdc        /dev/sr0
error: internal error: qemu block name '/glustervol1/vm/test/test.qcow2' doesn't match expected '/glustervol1/vm/test/test.20150429'
error: Domain snapshot not found: no domain snapshot with matching name '20150429'
Target     Source
------------------------------------------------
vda        /glustervol1/vm/test/test.20150429
hdc        /dev/sr0
rm: can't remove '/glustervol1/vm/test/test.20150429': No such file or directory
Ending snapshot test Wed Apr 29 09:34:37 AKDT 2015
At this point libvirt is confused about which file backs the disk: the
first run did pivot after blockcommit, but libvirt didn't update its
record of the disk source, which still points at the snapshot file.
From the logs:
2015-04-29 17:33:41.052+0000: 25192: warning : qemuDomainObjTaint:1972 : Domain id=2 name='test' uuid=4b9cc25b-68d1-4ce8-8a65-2a378e255e36 is tainted: high-privileges
2015-04-29 17:34:37.322+0000: 25191: error : virDomainSnapshotAlignDisks:609 : unsupported configuration: source for disk 'vda' is not a regular file; refusing to generate external snapshot name
2015-04-29 17:34:37.352+0000: 25194: error : qemuMonitorJSONDiskNameLookup:3977 : internal error: unable to find backing name for device drive-virtio-disk0
2015-04-29 17:34:37.354+0000: 25194: error : qemuMonitorJSONDiskNameLookupOne:3914 : internal error: qemu block name '/glustervol1/vm/test/test.qcow2' doesn't match expected '/glustervol1/vm/test/test.20150429'
So libvirt insists that the block file is:
root@wasvirt2:/glustervol1/vm/waspbx# virsh domblklist test
Target     Source
------------------------------------------------
vda        /glustervol1/vm/test/test.20150429
hdc        /dev/sr0
But qemu doesn't have that file open; it is using test.qcow2:
root@wasvirt2:/glustervol1/vm/waspbx# lsof | grep test
25300 /usr/bin/qemu-system-x86_64 /var/log/libvirt/qemu/test.log
25300 /usr/bin/qemu-system-x86_64 /var/log/libvirt/qemu/test.log
25300 /usr/bin/qemu-system-x86_64 /glustervol1/vm/test/test.qcow2
The only way I have found to straighten this out is to destroy the
domain and start it again.
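
The mismatch between what libvirt reports and what qemu has open can
also be checked without reading the output by eye. This is only a rough
sketch: the function names domblk_source and path_is_open are
hypothetical, and it assumes paths without whitespace and the column
layout shown in the output above.

```sh
#!/bin/sh
# Hypothetical helpers (not from the original post).

# Read `virsh domblklist` output on stdin and print the Source column
# for the target given as $1 (for example vda).
domblk_source() {
    awk -v target="$1" '$1 == target { print $2; exit }'
}

# Read lsof-style output on stdin; succeed only if the path given as $1
# appears as a whole whitespace-separated field on some line.
path_is_open() {
    awk -v path="$1" '
        { for (i = 1; i <= NF; i++) if ($i == path) found = 1 }
        END { exit !found }
    '
}

# Possible use (assumption: lsof output is filtered for the qemu process):
#   src=$(virsh domblklist test | domblk_source vda)
#   lsof | grep qemu | path_is_open "$src" || echo "libvirt and qemu disagree on vda: $src"
```

If the check reports a disagreement, the recovery described above applies:
virsh destroy test followed by virsh start test.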