Hi everyone,
we are suddenly having a problem executing our backup jobs. For a
long time, we have used a shell script containing the following code
to back up all our virtual machines:

    for domain in Testserver Faktura Fileserver Gitolite Jenkins \
                  Nexus SimpleHelp VpnGateway Wiki; do
        echo -n "$(date +"%Y-%m-%d %H:%M:%S") starting backup for vm ${domain} ... " >> ${vmlog}
        virsh dumpxml --security-info ${domain} > ${vmdir}/${domain}.xml
        virsh undefine ${domain} >> ${vmlog}
        virsh blockcopy ${domain} \
            /var/lib/libvirt/images/${domain}.img ${vmdir}/${domain}.img \
            --wait --finish >> ${vmlog}
        virsh define ${vmdir}/${domain}.xml >> ${vmlog}
    done

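For reference, the per-domain steps of the loop above can also be wrapped
in a function that checks each exit status and always attempts the final
define, so that a failed blockcopy is at least reported and does not go
unnoticed. This is only a sketch: the function name backup_domain and its
three arguments are made up for illustration and are not part of our
actual script.

```shell
#!/bin/sh
# Hypothetical helper (not in the original script): back up one domain
# and always try to re-define it afterwards.
# Arguments: domain name, backup directory, log file.
backup_domain() {
    domain=$1
    vmdir=$2
    vmlog=$3
    img="/var/lib/libvirt/images/${domain}.img"
    status=0

    # Save the persistent definition first; without it the domain
    # cannot be re-defined later.
    virsh dumpxml --security-info "${domain}" > "${vmdir}/${domain}.xml" || return 1

    # As in the original script, the domain is undefined before the copy.
    virsh undefine "${domain}" >> "${vmlog}" 2>&1 || return 1

    # Record a failure but keep going so the define below still runs.
    virsh blockcopy "${domain}" "${img}" "${vmdir}/${domain}.img" \
        --wait --finish >> "${vmlog}" 2>&1 || status=1

    # Re-define regardless of whether the copy succeeded.
    virsh define "${vmdir}/${domain}.xml" >> "${vmlog}" 2>&1 || status=1

    return "${status}"
}
```

Inside the loop this would be called as
backup_domain "${domain}" "${vmdir}" "${vmlog}", with a non-zero return
value meaning that at least one of the copy or define steps failed.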
This has worked great for us, but all of a sudden (possibly triggered
by an update, since of course we apply regular security/package
updates on this machine) we are having problems. For some virtual
machines it still works perfectly, but for others virsh tells us that
a block job is still active, and therefore the backup fails. Which
machines are affected seems to be random. However, when we then try to
query the active block job, virsh tells us that no block job is active.
Consider the following log from the shell:

    root@gfii-host:~# virsh undefine Gitolite
    error: Failed to undefine domain Gitolite
    error: Requested operation is not valid: cannot undefine transient domain

    root@gfii-host:~# virsh blockcopy Gitolite /var/lib/libvirt/images/Gitolite.img /tmp/test-blockcopy-gitolite.img --wait --finish
    error: block copy still active: disk 'vda' already in active block job

    root@gfii-host:~# virsh blockjob Gitolite /var/lib/libvirt/images/Gitolite.img
    No current block job for /var/lib/libvirt/images/Gitolite.img

    root@gfii-host:~# virsh define /var/local/backup/vms/2016-06-22T013001/Gitolite.xml
    error: Failed to define domain from /var/local/backup/vms/2016-06-22T013001/Gitolite.xml
    error: block copy still active: domain has active block job

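Since the blockcopy error names the disk by its target ('vda') while the
query in the log uses the image path, the same check can also be run per
disk target. The following is only a sketch for reproducing that query:
the function name show_block_jobs is made up, and it assumes the usual
two-line header in the output of virsh domblklist.

```shell
#!/bin/sh
# Hypothetical helper: print block job info for every disk target
# of a domain, using the target name (e.g. vda) instead of the path.
show_block_jobs() {
    dom=$1
    # Skip the two header lines of domblklist and keep the target column.
    virsh domblklist "${dom}" | awk 'NR > 2 && NF >= 2 { print $1 }' |
    while read -r target; do
        echo "== ${dom} ${target}"
        # Redirect stdin so the command cannot consume the target list.
        virsh blockjob "${dom}" "${target}" --info < /dev/null
    done
}
```

Calling show_block_jobs Gitolite prints one header line per disk target,
followed by whatever virsh blockjob reports for that target.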
Of course we tried stopping and starting the virtual machines, rebooted
the whole host multiple times, etc., but the problem comes back every
night. The host is a Debian Wheezy machine with current updates. We are
using the qemu-kvm package from wheezy-backports to enable blockcopy
support.
Best regards
Markus