On Wed, Dec 14, 2011 at 10:05:03 -0700, Eric Blake wrote:
On 12/14/2011 03:29 AM, Jiri Denemark wrote:
> If an async job run on a domain will stop the domain at the end of the
> job, a concurrently run query job can hang in qemu monitor nothing can
> be done with that domain from this point on. An attempt to start such
> domain results in "Timed out during operation: cannot acquire state
> change lock" error.
...
> src/qemu/qemu_monitor.c | 14 ++++++++++++++
> 1 files changed, 14 insertions(+), 0 deletions(-)
>
> diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
> index 4141fb7..35648ae 100644
> --- a/src/qemu/qemu_monitor.c
> +++ b/src/qemu/qemu_monitor.c
> @@ -750,6 +750,20 @@ void qemuMonitorClose(qemuMonitorPtr mon)
> VIR_FORCE_CLOSE(mon->fd);
> }
>
> + /* In case another thread is waiting for its monitor command to be
> + * processed, we need to wake it up with appropriate error set.
> + */
> + if (mon->msg) {
> + if (mon->lastError.code == VIR_ERR_OK) {
> + qemuReportError(VIR_ERR_OPERATION_FAILED,
> + _("Qemu monitor was closed"));
> + virCopyLastError(&mon->lastError);
> + virResetLastError();
> + }
> + mon->msg->finished = 1;
> + virCondSignal(&mon->notify);
> + }
ACK.
Thanks, although I figured that qemuMonitorClose could likely be called in
error paths and this code could thus end up resetting any error we already
had. So I squashed the following patch in:
diff --git i/src/qemu/qemu_monitor.c w/src/qemu/qemu_monitor.c
index 35648ae..ad7e2a5 100644
--- i/src/qemu/qemu_monitor.c
+++ w/src/qemu/qemu_monitor.c
@@ -755,10 +755,17 @@ void qemuMonitorClose(qemuMonitorPtr mon)
*/
if (mon->msg) {
if (mon->lastError.code == VIR_ERR_OK) {
+ virErrorPtr err = virSaveLastError();
+
qemuReportError(VIR_ERR_OPERATION_FAILED,
_("Qemu monitor was closed"));
virCopyLastError(&mon->lastError);
- virResetLastError();
+ if (err) {
+ virSetError(err);
+ virFreeError(err);
+ } else {
+ virResetLastError();
+ }
}
mon->msg->finished = 1;
virCondSignal(&mon->notify);
And pushed.
Jirka