[libvirt] [PATCH 0/2] More graceful handling of monitor failures

Currently, when libvirt hits a serious error doing I/O on, or parsing of, the QEMU monitor, it kills off the guest. Application developers have expressed a desire for more graceful handling of this scenario; in particular, to allow the guest OS to continue to run without any further monitor interactions, and then kill/restart it at a time that is convenient to the guest admin/apps. This is a proof of concept of that.

This introduces a new domain event

  VIR_DOMAIN_EVENT_ID_VMM_ERROR

with a callback

  typedef void (*virConnectDomainEventVMMErrorCallback)(virConnectPtr conn,
                                                        virDomainPtr dom,
                                                        int type,
                                                        void *opaque);

This event is intended to be emitted when there is a failure in some part of the domain virtualization system. Whether the domain continues to run/exist after the failure is an implementation detail specific to the hypervisor.

The idea is that with some types of failure, hypervisors may prefer to leave the domain running in a "degraded" mode of operation. For example, if something goes wrong with the QEMU monitor, it is possible to leave the guest OS running quite happily. The mgmt app will simply lose the ability to do various tasks. The mgmt app can then choose how/when to deal with the failure that occurred.

Currently the event has one 'type' defined

  VIR_DOMAIN_EVENT_VMM_ERROR_CONTROL

which indicates that the primary VM control capabilities have failed (ie the QEMU monitor). Other failure types may perhaps include failures of SPICE/VNC remote desktop, host audio attachment, network attachment, etc.
* daemon/remote.c: Dispatch of new event
* examples/domain-events/events-c/event-test.c: Demo catch of event
* include/libvirt/libvirt.h.in: Define event ID and callback
* src/conf/domain_event.c, src/conf/domain_event.h: Internal event handling
* src/remote/remote_driver.c: Receipt of new event from daemon
* src/remote/remote_protocol.x: Wire protocol for new event
---
 daemon/remote.c                              |   31 +++++++++++++++++++++
 examples/domain-events/events-c/event-test.c |   20 ++++++++++++++
 include/libvirt/libvirt.h.in                 |   26 ++++++++++++++++++
 src/conf/domain_event.c                      |   37 ++++++++++++++++++++++++++
 src/conf/domain_event.h                      |    4 +++
 src/libvirt_private.syms                     |    2 +
 src/remote/remote_driver.c                   |   32 ++++++++++++++++++++++
 src/remote/remote_protocol.x                 |    9 +++++-
 8 files changed, 160 insertions(+), 1 deletions(-)

diff --git a/daemon/remote.c b/daemon/remote.c
index 2220655..b4cf2fe 100644
--- a/daemon/remote.c
+++ b/daemon/remote.c
@@ -379,6 +379,36 @@ static int remoteRelayDomainEventGraphics(virConnectPtr conn ATTRIBUTE_UNUSED,
 }


+static int remoteRelayDomainEventVMMError(virConnectPtr conn ATTRIBUTE_UNUSED,
+                                          virDomainPtr dom,
+                                          int type,
+                                          void *opaque)
+{
+    struct qemud_client *client = opaque;
+    remote_domain_event_vmm_error_msg data;
+
+    if (!client)
+        return -1;
+
+    VIR_DEBUG("Relaying domain VMM error %s %d %d", dom->name, dom->id, type);
+
+    virMutexLock(&client->lock);
+
+    /* build return data */
+    memset(&data, 0, sizeof data);
+    make_nonnull_domain(&data.dom, dom);
+    data.type = type;
+
+    remoteDispatchDomainEventSend(client,
+                                  REMOTE_PROC_DOMAIN_EVENT_VMM_ERROR,
+                                  (xdrproc_t)xdr_remote_domain_event_vmm_error_msg, &data);
+
+    virMutexUnlock(&client->lock);
+
+    return 0;
+}
+
+
 static virConnectDomainEventGenericCallback domainEventCallbacks[] = {
     VIR_DOMAIN_EVENT_CALLBACK(remoteRelayDomainEventLifecycle),
     VIR_DOMAIN_EVENT_CALLBACK(remoteRelayDomainEventReboot),
@@ -387,6 +417,7 @@ static virConnectDomainEventGenericCallback domainEventCallbacks[] = {
     VIR_DOMAIN_EVENT_CALLBACK(remoteRelayDomainEventIOError),
     VIR_DOMAIN_EVENT_CALLBACK(remoteRelayDomainEventGraphics),
     VIR_DOMAIN_EVENT_CALLBACK(remoteRelayDomainEventIOErrorReason),
+    VIR_DOMAIN_EVENT_CALLBACK(remoteRelayDomainEventVMMError),
 };

 verify(ARRAY_CARDINALITY(domainEventCallbacks) == VIR_DOMAIN_EVENT_ID_LAST);
diff --git a/examples/domain-events/events-c/event-test.c b/examples/domain-events/events-c/event-test.c
index 1f46d42..678e510 100644
--- a/examples/domain-events/events-c/event-test.c
+++ b/examples/domain-events/events-c/event-test.c
@@ -248,6 +248,18 @@ static int myDomainEventGraphicsCallback(virConnectPtr conn ATTRIBUTE_UNUSED,
     return 0;
 }

+static int myDomainEventVMMErrorCallback(virConnectPtr conn ATTRIBUTE_UNUSED,
+                                         virDomainPtr dom,
+                                         int type,
+                                         void *opaque ATTRIBUTE_UNUSED)
+{
+    printf("%s EVENT: Domain %s(%d) vmm error type=%d\n", __func__, virDomainGetName(dom),
+           virDomainGetID(dom), type);
+
+    return 0;
+}
+
+
 static void myFreeFunc(void *opaque)
 {
     char *str = opaque;
@@ -281,6 +293,7 @@ int main(int argc, char **argv)
     int callback5ret = -1;
     int callback6ret = -1;
     int callback7ret = -1;
+    int callback8ret = -1;

     struct sigaction action_stop;
     memset(&action_stop, 0, sizeof action_stop);
@@ -339,6 +352,11 @@ int main(int argc, char **argv)
                                                     VIR_DOMAIN_EVENT_ID_GRAPHICS,
                                                     VIR_DOMAIN_EVENT_CALLBACK(myDomainEventGraphicsCallback),
                                                     strdup("callback graphics"), myFreeFunc);
+    callback8ret = virConnectDomainEventRegisterAny(dconn,
+                                                    NULL,
+                                                    VIR_DOMAIN_EVENT_ID_VMM_ERROR,
+                                                    VIR_DOMAIN_EVENT_CALLBACK(myDomainEventVMMErrorCallback),
+                                                    strdup("callback VMM error"), myFreeFunc);

     if ((callback1ret != -1) &&
         (callback2ret != -1) &&
@@ -363,6 +381,8 @@ int main(int argc, char **argv)
         virConnectDomainEventDeregisterAny(dconn, callback5ret);
         virConnectDomainEventDeregisterAny(dconn, callback6ret);
         virConnectDomainEventDeregisterAny(dconn, callback7ret);
+        if (callback8ret != -1)
+            virConnectDomainEventDeregisterAny(dconn, callback8ret);
     }

     VIR_DEBUG0("Closing connection");
diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in
index 0e1e27a..7caf828 100644
--- a/include/libvirt/libvirt.h.in
+++ b/include/libvirt/libvirt.h.in
@@ -2413,6 +2413,31 @@ typedef void (*virConnectDomainEventGraphicsCallback)(virConnectPtr conn,
                                                       void *opaque);

 /**
+ * virDomainEventVMMErrorType:
+ *
+ * The reason for the VMM error
+ */
+typedef enum {
+    VIR_DOMAIN_EVENT_VMM_ERROR_CONTROL = 0, /* Control channel is broken */
+} virDomainEventVMMErrorReason;
+
+
+/**
+ * virConnectDomainEventVMMErrorCallback:
+ * @conn: connection object
+ * @dom: domain on which the event occurred
+ * @type: the type of the error
+ * @opaque: application specified data
+ *
+ * The callback signature to use when registering for an event of type
+ * VIR_DOMAIN_EVENT_ID_VMM_ERROR with virConnectDomainEventRegisterAny()
+ */
+typedef void (*virConnectDomainEventVMMErrorCallback)(virConnectPtr conn,
+                                                      virDomainPtr dom,
+                                                      int type,
+                                                      void *opaque);
+
+/**
  * VIR_DOMAIN_EVENT_CALLBACK:
  *
  * Used to cast the event specific callback into the generic one
@@ -2429,6 +2454,7 @@ typedef enum {
     VIR_DOMAIN_EVENT_ID_IO_ERROR = 4,        /* virConnectDomainEventIOErrorCallback */
     VIR_DOMAIN_EVENT_ID_GRAPHICS = 5,        /* virConnectDomainEventGraphicsCallback */
     VIR_DOMAIN_EVENT_ID_IO_ERROR_REASON = 6, /* virConnectDomainEventIOErrorReasonCallback */
+    VIR_DOMAIN_EVENT_ID_VMM_ERROR = 7,       /* virConnectDomainEventVMMErrorCallback */

     /*
      * NB: this enum value will increase over time as new events are
diff --git a/src/conf/domain_event.c b/src/conf/domain_event.c
index 688bf6c..fdbe630 100644
--- a/src/conf/domain_event.c
+++ b/src/conf/domain_event.c
@@ -83,6 +83,9 @@ struct _virDomainEvent {
             char *authScheme;
             virDomainEventGraphicsSubjectPtr subject;
         } graphics;
+        struct {
+            int type;
+        } vmmError;
     } data;
 };

@@ -783,6 +786,34 @@ virDomainEventPtr virDomainEventGraphicsNewFromObj(virDomainObjPtr obj,
 }


+virDomainEventPtr virDomainEventVMMErrorNewFromDom(virDomainPtr dom,
+                                                   int type)
+{
+    virDomainEventPtr ev =
+        virDomainEventNewInternal(VIR_DOMAIN_EVENT_ID_VMM_ERROR,
+                                  dom->id, dom->name, dom->uuid);
+
+    if (ev)
+        ev->data.vmmError.type = type;
+
+    return ev;
+}
+
+
+virDomainEventPtr virDomainEventVMMErrorNewFromObj(virDomainObjPtr obj,
+                                                   int type)
+{
+    virDomainEventPtr ev =
+        virDomainEventNewInternal(VIR_DOMAIN_EVENT_ID_VMM_ERROR,
+                                  obj->def->id, obj->def->name, obj->def->uuid);
+
+    if (ev)
+        ev->data.vmmError.type = type;
+
+    return ev;
+}
+
+
 /**
  * virDomainEventQueueFree:
  * @queue: pointer to the queue
@@ -932,6 +963,12 @@ void virDomainEventDispatchDefaultFunc(virConnectPtr conn,
                                                              cbopaque);
         break;

+    case VIR_DOMAIN_EVENT_ID_VMM_ERROR:
+        ((virConnectDomainEventVMMErrorCallback)cb)(conn, dom,
+                                                    event->data.vmmError.type,
+                                                    cbopaque);
+        break;
+
     default:
         VIR_WARN("Unexpected event ID %d", event->eventID);
         break;
diff --git a/src/conf/domain_event.h b/src/conf/domain_event.h
index c03a159..c1ff871 100644
--- a/src/conf/domain_event.h
+++ b/src/conf/domain_event.h
@@ -153,6 +153,10 @@ virDomainEventPtr virDomainEventGraphicsNewFromObj(virDomainObjPtr obj,
                                                    virDomainEventGraphicsAddressPtr remote,
                                                    const char *authScheme,
                                                    virDomainEventGraphicsSubjectPtr subject);
+virDomainEventPtr virDomainEventVMMErrorNewFromDom(virDomainPtr dom,
+                                                   int reason);
+virDomainEventPtr virDomainEventVMMErrorNewFromObj(virDomainObjPtr obj,
+                                                   int reason);



diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 7e5b1d7..e3e4a34 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -389,6 +389,8 @@ virDomainEventRTCChangeNewFromObj;
 virDomainEventRebootNew;
 virDomainEventRebootNewFromDom;
 virDomainEventRebootNewFromObj;
+virDomainEventVMMErrorNewFromDom;
+virDomainEventVMMErrorNewFromObj;
 virDomainEventWatchdogNewFromDom;
 virDomainEventWatchdogNewFromObj;

diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c
index 37940f3..11ab095 100644
--- a/src/remote/remote_driver.c
+++ b/src/remote/remote_driver.c
@@ -4121,6 +4121,34 @@ no_memory:
 }


+static virDomainEventPtr
+remoteDomainReadEventVMMError(virConnectPtr conn, XDR *xdr)
+{
+    remote_domain_event_vmm_error_msg msg;
+    virDomainPtr dom;
+    virDomainEventPtr event = NULL;
+    memset (&msg, 0, sizeof msg);
+
+    /* unmarshall parameters, and process it*/
+    if (! xdr_remote_domain_event_vmm_error_msg(xdr, &msg) ) {
+        remoteError(VIR_ERR_RPC, "%s",
+                    _("unable to demarshall reboot event"));
+        return NULL;
+    }
+
+    dom = get_nonnull_domain(conn,msg.dom);
+    if (!dom)
+        return NULL;
+
+    event = virDomainEventVMMErrorNewFromDom(dom,
+                                             msg.type);
+    xdr_free ((xdrproc_t) &xdr_remote_domain_event_vmm_error_msg, (char *) &msg);
+
+    virDomainFree(dom);
+    return event;
+}
+
+
 static virDrvOpenStatus ATTRIBUTE_NONNULL (1)
 remoteSecretOpen(virConnectPtr conn, virConnectAuthPtr auth, int flags)
 {
@@ -5544,6 +5572,10 @@ processCallDispatchMessage(virConnectPtr conn, struct private_data *priv,
         event = remoteDomainReadEventGraphics(conn, xdr);
         break;

+    case REMOTE_PROC_DOMAIN_EVENT_VMM_ERROR:
+        event = remoteDomainReadEventVMMError(conn, xdr);
+        break;
+
     default:
         VIR_DEBUG("Unexpected event proc %d", hdr->proc);
         break;
diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x
index 2cf6022..d12438a 100644
--- a/src/remote/remote_protocol.x
+++ b/src/remote/remote_protocol.x
@@ -1945,6 +1945,11 @@ struct remote_storage_vol_download_args {
     unsigned int flags;
 };

+struct remote_domain_event_vmm_error_msg {
+    remote_nonnull_domain dom;
+    int type;
+};
+
 /*----- Protocol. -----*/


@@ -2182,7 +2187,9 @@ enum remote_procedure {
     REMOTE_PROC_DOMAIN_MIGRATE_SET_MAX_SPEED = 207,
     REMOTE_PROC_STORAGE_VOL_UPLOAD = 208,
     REMOTE_PROC_STORAGE_VOL_DOWNLOAD = 209,
-    REMOTE_PROC_DOMAIN_INJECT_NMI = 210
+    REMOTE_PROC_DOMAIN_INJECT_NMI = 210,
+
+    REMOTE_PROC_DOMAIN_EVENT_VMM_ERROR = 211

     /*
      * Notice how the entries are grouped in sets of 10 ?
-- 
1.7.4.4
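The commit message above leaves it to the mgmt app to decide how and when to react to the event. A minimal sketch of that deferral follows, in plain C with no libvirt dependency. Every name in it (sketch_guest, sketch_on_vmm_error, sketch_take_restart, SKETCH_VMM_ERROR_CONTROL) is hypothetical; the constant only mirrors the value proposed for VIR_DOMAIN_EVENT_VMM_ERROR_CONTROL.

```c
#include <stdbool.h>

/* Hypothetical mirror of the proposed VIR_DOMAIN_EVENT_VMM_ERROR_CONTROL. */
enum { SKETCH_VMM_ERROR_CONTROL = 0 };

/* Hypothetical per-guest bookkeeping kept by a management application. */
struct sketch_guest {
    bool control_lost;     /* monitor unusable, guest OS still running */
    bool restart_pending;  /* restart once downtime is acceptable */
};

/* Shaped like the body of the proposed callback, minus the libvirt types:
 * record the failure, but leave the running guest alone. */
static void sketch_on_vmm_error(struct sketch_guest *g, int type)
{
    if (type == SKETCH_VMM_ERROR_CONTROL) {
        g->control_lost = true;
        g->restart_pending = true;
    }
}

/* Called when the admin allows downtime. Returns true if the caller
 * should now kill/restart the guest to regain control. */
static bool sketch_take_restart(struct sketch_guest *g)
{
    if (!g->restart_pending)
        return false;
    g->restart_pending = false;
    g->control_lost = false;
    return true;
}
```

In a real application the first function would be called from a callback registered for the new event ID, and unknown 'type' values would probably deserve at least a log message rather than being silently ignored as they are here.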

Currently whenever there is any failure with parsing the monitor, this is treated in the same way as end-of-file (ie QEMU quit). The domain is terminated, if not already dead.

With this change, failures in parsing the monitor stream do not result in the death of QEMU. The guest continues running unchanged, but all further use of the monitor will be disabled.

The VMM_ERROR event will be emitted, and the mgmt application can decide when to kill/restart the guest to regain control.

* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Run a different callback for monitor EOF vs error conditions.
* src/qemu/qemu_process.c: Emit VMM_ERROR event when monitor fails
---
 src/qemu/qemu_monitor.c |   45 +++++++++++++++++++++++++++------------------
 src/qemu/qemu_monitor.h |    7 ++++---
 src/qemu/qemu_process.c |   46 +++++++++++++++++++++++++++++++++++++---------
 3 files changed, 68 insertions(+), 30 deletions(-)

diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index 9f0f20d..6e7e3d6 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -517,7 +517,8 @@ static void qemuMonitorUpdateWatch(qemuMonitorPtr mon)
 static void
 qemuMonitorIO(int watch, int fd, int events, void *opaque) {
     qemuMonitorPtr mon = opaque;
-    int quit = 0, failed = 0;
+    bool error = false;
+    bool eof = false;

     /* lock access to the monitor and protect fd */
     qemuMonitorLock(mon);
@@ -528,27 +529,27 @@ qemuMonitorIO(int watch, int fd, int events, void *opaque) {

     if (mon->fd != fd || mon->watch != watch) {
         VIR_ERROR(_("event from unexpected fd %d!=%d / watch %d!=%d"), mon->fd, fd, mon->watch, watch);
-        failed = 1;
+        error = true;
     } else {
         if (!mon->lastErrno &&
             events & VIR_EVENT_HANDLE_WRITABLE) {
             int done = qemuMonitorIOWrite(mon);
             if (done < 0)
-                failed = 1;
+                error = 1;
             events &= ~VIR_EVENT_HANDLE_WRITABLE;
         }
         if (!mon->lastErrno &&
             events & VIR_EVENT_HANDLE_READABLE) {
             int got = qemuMonitorIORead(mon);
             if (got < 0)
-                failed = 1;
+                error = true;
             /* Ignore hangup/error events if we read some data, to
              * give time for that data to be consumed */
             if (got > 0) {
                 events = 0;
                 if (qemuMonitorIOProcess(mon) < 0)
-                    failed = 1;
+                    error = true;
             } else
                 events &= ~VIR_EVENT_HANDLE_READABLE;
         }
@@ -572,36 +573,44 @@ qemuMonitorIO(int watch, int fd, int events, void *opaque) {
                 mon->msg->lastErrno = EIO;
                 virCondSignal(&mon->notify);
             }
-            quit = 1;
+            eof = 1;
         } else if (events) {
             VIR_ERROR(_("unhandled fd event %d for monitor fd %d"),
                       events, mon->fd);
-            failed = 1;
+            error = 1;
         }
     }

+    if (eof || error)
+        mon->lastErrno = EIO;
+
+    qemuMonitorUpdateWatch(mon);
+
     /* We have to unlock to avoid deadlock against command thread,
      * but is this safe ? I think it is, because the callback
      * will try to acquire the virDomainObjPtr mutex next */
-    if (failed || quit) {
-        void (*eofNotify)(qemuMonitorPtr, virDomainObjPtr, int)
+    if (eof) {
+        void (*eofNotify)(qemuMonitorPtr, virDomainObjPtr)
             = mon->cb->eofNotify;
         virDomainObjPtr vm = mon->vm;

-        /* If qemu quited unexpectedly, and we may try to send monitor
-         * command later. But we have no chance to wake up it. So set
-         * mon->lastErrno to EIO, and check it before sending monitor
-         * command.
-         */
-        if (!mon->lastErrno)
-            mon->lastErrno = EIO;
+        /* Make sure anyone waiting wakes up now */
+        virCondSignal(&mon->notify);
+        if (qemuMonitorUnref(mon) > 0)
+            qemuMonitorUnlock(mon);
+        VIR_DEBUG0("Triggering EOF callback");
+        (eofNotify)(mon, vm);
+    } else if (error) {
+        void (*errorNotify)(qemuMonitorPtr, virDomainObjPtr)
+            = mon->cb->errorNotify;
+        virDomainObjPtr vm = mon->vm;

         /* Make sure anyone waiting wakes up now */
         virCondSignal(&mon->notify);
         if (qemuMonitorUnref(mon) > 0)
             qemuMonitorUnlock(mon);
-        VIR_DEBUG("Triggering EOF callback error? %d", failed);
-        (eofNotify)(mon, vm, failed);
+        VIR_DEBUG0("Triggering error callback");
+        (errorNotify)(mon, vm);
     } else {
         if (qemuMonitorUnref(mon) > 0)
             qemuMonitorUnlock(mon);
diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h
index b84e230..47e0f4f 100644
--- a/src/qemu/qemu_monitor.h
+++ b/src/qemu/qemu_monitor.h
@@ -67,9 +67,10 @@ struct _qemuMonitorCallbacks {
                     virDomainObjPtr vm);

     void (*eofNotify)(qemuMonitorPtr mon,
-                      virDomainObjPtr vm,
-                      int withError);
-    /* XXX we'd really like to avoid virCOnnectPtr here
+                      virDomainObjPtr vm);
+    void (*errorNotify)(qemuMonitorPtr mon,
+                        virDomainObjPtr vm);
+    /* XXX we'd really like to avoid virConnectPtr here
      * It is required so the callback can find the active
      * secret driver. Need to change this to work like the
      * security drivers do, to avoid this
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index de728a2..4293f59 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -100,12 +100,13 @@ extern struct qemud_driver *qemu_driver;
  */
 static void
 qemuProcessHandleMonitorEOF(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
-                            virDomainObjPtr vm,
-                            int hasError)
+                            virDomainObjPtr vm)
 {
     struct qemud_driver *driver = qemu_driver;
     virDomainEventPtr event = NULL;
     qemuDomainObjPrivatePtr priv;
+    int eventReason = VIR_DOMAIN_EVENT_STOPPED_SHUTDOWN;
+    const char *auditReason = "shutdown";

     VIR_DEBUG("Received EOF on %p '%s'", vm, vm->def->name);

@@ -120,20 +121,18 @@ qemuProcessHandleMonitorEOF(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
     }

     priv = vm->privateData;
-    if (!hasError && priv->monJSON && !priv->gotShutdown) {
+    if (priv->monJSON && !priv->gotShutdown) {
         VIR_DEBUG("Monitor connection to '%s' closed without SHUTDOWN event; "
                   "assuming the domain crashed", vm->def->name);
-        hasError = 1;
+        eventReason = VIR_DOMAIN_EVENT_STOPPED_FAILED;
+        auditReason = "failed";
     }

     event = virDomainEventNewFromObj(vm,
-                                     VIR_DOMAIN_EVENT_STOPPED,
-                                     hasError ?
-                                     VIR_DOMAIN_EVENT_STOPPED_FAILED :
-                                     VIR_DOMAIN_EVENT_STOPPED_SHUTDOWN);
+                                     VIR_DOMAIN_EVENT_STOPPED, eventReason);

     qemuProcessStop(driver, vm, 0);
-    qemuAuditDomainStop(vm, hasError ? "failed" : "shutdown");
+    qemuAuditDomainStop(vm, auditReason);

     if (!vm->persistent)
         virDomainRemoveInactive(&driver->domains, vm);
@@ -147,6 +146,34 @@ qemuProcessHandleMonitorEOF(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
 }


+/*
+ * This is invoked when there is some kind of error
+ * parsing data to/from the monitor. The VM can continue
+ * to run, but no further monitor commands will be
+ * allowed
+ */
+static void
+qemuProcessHandleMonitorError(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
+                              virDomainObjPtr vm)
+{
+    struct qemud_driver *driver = qemu_driver;
+    virDomainEventPtr event = NULL;
+
+    VIR_DEBUG("Received error on %p '%s'", vm, vm->def->name);
+
+    qemuDriverLock(driver);
+    virDomainObjLock(vm);
+
+    event = virDomainEventVMMErrorNewFromObj(vm,
+                                             VIR_DOMAIN_EVENT_VMM_ERROR_CONTROL);
+    if (event)
+        qemuDomainEventQueue(driver, event);
+
+    virDomainObjUnlock(vm);
+    qemuDriverUnlock(driver);
+}
+
+
 static virDomainDiskDefPtr
 qemuProcessFindDomainDiskByPath(virDomainObjPtr vm,
                                 const char *path)
@@ -623,6 +650,7 @@ static void qemuProcessHandleMonitorDestroy(qemuMonitorPtr mon,
 static qemuMonitorCallbacks monitorCallbacks = {
     .destroy = qemuProcessHandleMonitorDestroy,
     .eofNotify = qemuProcessHandleMonitorEOF,
+    .errorNotify = qemuProcessHandleMonitorError,
     .diskSecretLookup = qemuProcessFindVolumeQcowPassphrase,
     .domainShutdown = qemuProcessHandleShutdown,
     .domainStop = qemuProcessHandleStop,
-- 
1.7.4.4
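The core of this patch is the split of one combined callback into separate EOF and error paths. The decision can be sketched in isolation roughly as follows. This is a cut-down illustration with no locking or reference counting, and all sketch_* names are hypothetical stand-ins, not the libvirt structures themselves.

```c
#include <errno.h>
#include <stdbool.h>

struct sketch_monitor;

/* Hypothetical, cut-down analogue of qemuMonitorCallbacks. */
struct sketch_callbacks {
    void (*eof_notify)(struct sketch_monitor *mon);
    void (*error_notify)(struct sketch_monitor *mon);
};

struct sketch_monitor {
    const struct sketch_callbacks *cb;
    int last_errno;      /* non-zero: refuse any further monitor commands */
    bool guest_running;  /* cleared only on EOF */
    bool degraded;       /* set when the monitor broke but the guest lives */
};

/* EOF means the hypervisor process went away: the guest is gone. */
static void sketch_on_eof(struct sketch_monitor *mon)
{
    mon->guest_running = false;
}

/* An I/O or parse error only costs us the control channel. */
static void sketch_on_error(struct sketch_monitor *mon)
{
    mon->degraded = true;
}

static const struct sketch_callbacks sketch_cbs = {
    .eof_notify = sketch_on_eof,
    .error_notify = sketch_on_error,
};

/* Tail of the I/O handler: disable the monitor in both cases, then
 * run exactly one callback, with EOF taking priority over error. */
static void sketch_finish_io(struct sketch_monitor *mon, bool eof, bool error)
{
    if (eof || error)
        mon->last_errno = EIO;

    if (eof)
        mon->cb->eof_notify(mon);
    else if (error)
        mon->cb->error_notify(mon);
}
```

Giving EOF priority matches the `if (eof) ... else if (error)` ordering in the patch: if both flags are set, the guest is treated as gone rather than degraded.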

On Wed, May 11, 2011 at 12:59:07PM +0100, Daniel P. Berrange wrote:
+/*
+ * This is invoked when there is some kind of error
+ * parsing data to/from the monitor. The VM can continue
+ * to run, but no further monitor commands will be
+ * allowed
+ */
+static void
+qemuProcessHandleMonitorError(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
+                              virDomainObjPtr vm)
I'm all for being graceful and polite, so this sounds good.

However, events are bound to be lost. The solution would be more robust if there was a way to query the state of the monitor (though I'm not sure it is worth the hassle).

PS, I wonder what VMM stands for in this context. Not virtual memory manager, I suppose.

Since I've only read the comments, not the code, I wonder if the 'quit' command is still passed to the monitor even after an error occurred. Without it, we'd have to resort to SIGTERM/SIGKILL, which is less graceful.

On Wed, May 11, 2011 at 16:15:42 +0300, Dan Kenigsberg wrote:
On Wed, May 11, 2011 at 12:59:07PM +0100, Daniel P. Berrange wrote:
+/*
+ * This is invoked when there is some kind of error
+ * parsing data to/from the monitor. The VM can continue
+ * to run, but no further monitor commands will be
+ * allowed
+ */
+static void
+qemuProcessHandleMonitorError(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
+                              virDomainObjPtr vm)
I'm all for being graceful and polite, so this sounds good.
However, events are bound to be lost. The solution would be more robust if there was a way to query the state of the monitor (though I'm not sure it is worth the hassle).
I plan to add an API for querying monitor status, so that an app can detect whether libvirt is currently waiting for a reply from qemu, and for how long. Extending that to report that the qemu monitor is broken will be quite straightforward.
PS, I wonder what VMM stands for in this context. Not virtual memory manager, I suppose.
VMM stands for virtual machine manager, aka hypervisor.

Jirka

On Wed, May 11, 2011 at 04:15:42PM +0300, Dan Kenigsberg wrote:
On Wed, May 11, 2011 at 12:59:07PM +0100, Daniel P. Berrange wrote:
+/*
+ * This is invoked when there is some kind of error
+ * parsing data to/from the monitor. The VM can continue
+ * to run, but no further monitor commands will be
+ * allowed
+ */
+static void
+qemuProcessHandleMonitorError(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
+                              virDomainObjPtr vm)
I'm all for being graceful and polite, so this sounds good.
However, events are bound to be lost. The solution would be more robust if there was a way to query the state of the monitor (though I'm not sure it is worth the hassle).
That's certainly possible.
PS, I wonder what VMM stands for in this context. Not virtual memory manager, I suppose.
As Jiri says, it's "Virtual Machine Manager", aka the hypervisor emulation support infrastructure for a single VM. If someone has a better suggestion, I'm open to changing it. I could just remove the 'VMM' part and say 'qemuDomainEventErrorCallback'. Or I could kill the 'type' field and call it 'qemuDomainEventControlErrorCallback' and, if we have other types of errors in the future, introduce further events for them.
Since I've only read the comments, not the code, I wonder if the 'quit' command is still passed to the monitor even after an error occurred. Without it, we'd have to resort to SIGTERM/SIGKILL, which is less graceful.
QEMU treats 'SIGTERM' in the same way as the 'quit' monitor command. The key thing is to give QEMU time to process SIGTERM before resorting to SIGKILL. NB, we don't actually use 'quit' at all ... yet :-)

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
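The "give it time to process SIGTERM before resorting to SIGKILL" pattern can be sketched with plain POSIX calls. This is only an illustration, not how libvirt itself stops QEMU: it assumes the target is a direct child of the caller (so that waitpid() works), and the function name, the grace period parameter and the 10 ms polling interval are all hypothetical choices.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Ask a child process to exit with SIGTERM, poll for up to grace_ms
 * milliseconds, then fall back to SIGKILL.
 * Returns 0 if the child was reaped within the grace period,
 * 1 if SIGKILL had to be sent, -1 on error. */
static int sketch_stop_child(pid_t pid, int grace_ms)
{
    const struct timespec tick = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    int status;
    int waited;

    if (kill(pid, SIGTERM) < 0)
        return -1;

    for (waited = 0; waited < grace_ms; waited += 10) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return 0;           /* exited on its own after SIGTERM */
        if (r < 0)
            return -1;
        nanosleep(&tick, NULL); /* not exited yet, wait a little */
    }

    /* Grace period expired: no more politeness. */
    if (kill(pid, SIGKILL) < 0)
        return -1;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return 1;
}
```

A daemonized process that is not a child of the caller cannot be reaped with waitpid(); in that case the polling step would need another liveness check, such as kill(pid, 0).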