[libvirt-users] domain has active block job

Hi there,
I receive this error when I run nova image-create <VM name> <VM snapshot name>:
Exception during message handling: block copy still active: domain has active block job
In the libvirt log file I can see:
error : qemuDomainDefineXML:6312 : block copy still active: domain has active block job
The libvirt version is 1.2.7 and the system is Debian Wheezy.
Please, what does it mean?
Regards,
Fiorenza
--
Spazio Web S.r.l. V. Dante, 10 13900 Biella Tel.: +39 015 2431982 Fax.: +39 015 2522600
Numero d'Iscrizione al Registro Imprese presso CCIAA Biella, Cod.Fisc.e P.Iva: 02414430021 Iscriz. REA: BI - 188936 Cap. Soc.: €. 30.000 i.v.

On Tue, Jan 13, 2015 at 10:10:53AM +0100, Fiorenza Meini wrote:
Hi there, I receive this error when I run nova image-create <VM name> <VM snapshot name>:
Okay, you're talking in the context of OpenStack. You can also check the Nova compute.log for more contextual details of why the operation failed.
Exception during message handling: block copy still active: domain has active block job
In libvirt log file I can see: error : qemuDomainDefineXML:6312 : block copy still active: domain has active block job
The libvirt version is 1.2.7 and the system is Debian Wheezy.
Please, what does it mean?
It means that, according to libvirt, there's an unfinished `blockcopy` (copy a disk backing image chain to a destination) operation.
If you can find the associated libvirt guest name, you can try the steps below.
For the Nova libvirt guest that you're trying to create a snapshot of, find the location of its current block device:
$ virsh domblklist instance-YYYYYYYYY
Then check if there are active block operations for that disk:
$ virsh blockjob instance-YYYYYYYYY /path/to/libvirt/disk/ --info
If there is an operation in progress, and if this is a test environment, you can abort it:
$ virsh blockjob instance-YYYYYYYYY /path/to/libvirt/disk/ --abort
Please see the man page of `virsh` for more details.
-- /kashyap
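The two inspection steps above can be combined into a small helper. This is only a sketch: the function names `list_disk_targets` and `show_block_jobs` are made up for illustration, and the parsing assumes `virsh domblklist` prints a header line and a dashed separator before the disk rows. It passes the disk target (for example `vda`) to `virsh blockjob` rather than the source path.

```shell
#!/bin/sh
# Sketch: report block jobs for every disk of one libvirt guest.

# Read `virsh domblklist` output on stdin and print one disk target per line.
# Assumes the usual layout: a header line, a dashed separator, then the rows.
list_disk_targets() {
    awk 'NR > 2 && NF > 0 { print $1 }'
}

# Print block job information for each disk of the guest named in $1.
show_block_jobs() {
    domain=$1
    virsh domblklist "$domain" | list_disk_targets | while read -r target; do
        echo "== $target =="
        # Keep virsh from consuming the loop's input.
        virsh blockjob "$domain" "$target" --info < /dev/null
    done
}

# Usage (placeholder guest name, as in the message above):
#   show_block_jobs instance-YYYYYYYYY
```

Nothing runs until `show_block_jobs` is called, so the file can be sourced safely. The helper only reads job state; aborting a job is left as a deliberate manual step.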

On 13/01/2015 10:51, Kashyap Chamarthy wrote:
[. . .]
Thanks for your response, I'm quite new to libvirt...
1) virsh blockjob ... --info gave me an empty list
2) virsh blockjob ... --abort gave me this error:
error: Requested operation is not valid: another job on disk 'vda' is still being ended
...but which job?
I'm looking at the man page of virsh to find a useful option...
Regards,
Fiorenza

On Tue, Jan 13, 2015 at 03:07:07PM +0100, Fiorenza Meini wrote:
On 13/01/2015 10:51, Kashyap Chamarthy wrote:
[. . .]
Thanks for your response, I'm quite new to libvirt... 1) virsh blockjob ... --info gave me an empty list 2) virsh blockjob ... --abort gave me this error: error: Requested operation is not valid: another job on disk 'vda' is still being ended
...but which job?
It seems you're hitting an old bug[1] where 'blockcopy' (or 'blockcommit') failed to execute a cleanup routine that destroys the reference to the active block operation, which results in the error you saw when you tried to abort the block operation manually.
This bug is fixed in libvirt-1.2.8 and above. I see you're using libvirt-1.2.7; if you can update libvirt in your environment, that should fix your issue.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1135169 -- blockcopy job was cancel by "CTRL+C" while it show there still be one block job in background
-- /kashyap
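To check an installed libvirt against the version named above, a version comparison along these lines may help. It is a sketch under assumptions: `version_ge` and `fixed_in` are illustrative names, the 1.2.8 threshold is simply the one given in this message, and a distribution may ship an older version number with the fix backported (or a newer one without it). It relies on `sort -V` from GNU coreutils.

```shell
#!/bin/sh
# version_ge A B: succeed when version A is greater than or equal to version B.
# Relies on `sort -V` (GNU coreutils version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Threshold taken from the message above; adjust it if your vendor backports fixes.
fixed_in=1.2.8

# Usage:
#   installed=$(virsh --version)
#   if version_ge "$installed" "$fixed_in"; then
#       echo "libvirt $installed should already contain the fix"
#   else
#       echo "libvirt $installed predates $fixed_in; consider upgrading"
#   fi
```

Version sort is used instead of a plain string comparison so that a release such as 1.2.11 is correctly treated as newer than 1.2.8.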

On 13/01/2015 15:21, Kashyap Chamarthy wrote:
[. . .]
Ok, thank you. I'll update libvirt as soon as possible; I can see there is a libvirt update available on my system.
Regards,
Fiorenza Meini

On 01/13/2015 07:21 AM, Kashyap Chamarthy wrote:
[. . .]
It seems you're hitting an old bug[1] where 'blockcopy' (or 'blockcommit') failed to execute a cleanup routine that destroys the reference to the active block operation, which results in the error you saw when you tried to abort the block operation manually.
This bug is fixed in libvirt-1.2.8 and above. I see you're using libvirt-1.2.7; if you can update libvirt in your environment, that should fix your issue.
Are you using a pre-built distro libvirt? If so, which one? We should figure out how to get that vendor to backport the right fix for this issue.
Also, I just now committed another related fix; so even the latest 1.2.11 release has an issue where libvirt can get into weird states if parallel block job attempts are made. See commit e1125ce.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1135169 -- blockcopy job was cancel by "CTRL+C" while it show there still be one block job in background
That was against RHEL 7; but I don't know if any Fedora releases suffer from the same issue.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org

On Tue, Jan 13, 2015 at 08:49:53AM -0700, Eric Blake wrote:
On 01/13/2015 07:21 AM, Kashyap Chamarthy wrote:
[. . .]
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1135169 -- blockcopy job was cancel by "CTRL+C" while it show there still be one block job in background
That was against RHEL 7; but I don't know if any Fedora releases suffer from the same issue.
Right, I looked for a Fedora bug before referring to it. Even the libvirt master git history refers to this bug in the commit that fixed it:
commit 8e23e0e977fbcc4a7880e187a63c509d6e6879c6
Author: Erik Skultety <eskultet@redhat.com>
Date: Thu Nov 27 13:29:42 2014 +0100
qemu: fix block{commit,copy} abort handling
When a block{commit,copy} job was aborted on a domain, block job handler did not process it correctly, leaving a phantom job in the background. Any further calls to any blockjob causes "block <jobtype> still active" error. This patch fixes the blockjob handler so that it checks not only for VIR_DOMAIN_BLOCK_JOB_FAILED status, but VIR_DOMAIN_BLOCK_JOB_CANCELED status as well, followed by our existing cleanup routine.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1135169
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
-- /kashyap

On 13/01/2015 16:56, Kashyap Chamarthy wrote:
On Tue, Jan 13, 2015 at 08:49:53AM -0700, Eric Blake wrote:
On 01/13/2015 07:21 AM, Kashyap Chamarthy wrote:
[. . .]
Are you using a pre-built distro libvirt? If so, which one? We should figure out how to get that vendor to backport the right fix for this issue.
[. . .]
I'm working on Debian Wheezy. After upgrading libvirt to version 1.2.9, the libvirtd service doesn't start any more, so at the moment I can't upgrade...
Regards,
Fiorenza Meini
participants (3)
- Eric Blake
- Fiorenza Meini
- Kashyap Chamarthy