[libvirt] [PATCH 0/2] Support of auto-dump on watchdog event in libvirtd

This patch series adds support for auto-dump on a watchdog event in libvirtd. If a qemu guest is configured with a watchdog device and libvirtd receives a watchdog event from the guest, it can act on the guest; currently a `dump' action is defined.

This patch series adds a new watchdog action `managed'. It has a subaction which indicates the action libvirtd will take on a watchdog event. The meaning of subaction_arg depends on subaction. If subaction is `dump', then subaction_arg defines a directory path into which libvirtd saves the dump file.

An example:

  <watchdog model='i6300esb' action='managed' subaction='dump' subaction_arg='/mnt/data/dumps'>

In order to make this work, the guest must have a watchdog device added and must run a watchdog daemon, for example started via `/etc/init.d/watchdog start' or auto-started on boot.

Suggestions are welcome!

Hu Tao (2):
  Add a thread pool implementation
  Add a watchdog action `managed'.

 src/Makefile.am        |    3 +-
 src/conf/domain_conf.c |   27 ++++++++-
 src/conf/domain_conf.h |   11 ++++-
 src/qemu/qemu_conf.c   |    2 +-
 src/qemu/qemu_conf.h   |    6 ++
 src/qemu/qemu_driver.c |  138 ++++++++++++++++++++++++++++++++++++++++++-----
 src/util/threadpool.c  |  119 +++++++++++++++++++++++++++++++++++++++++
 src/util/threadpool.h  |   34 ++++++++++++
 8 files changed, 319 insertions(+), 21 deletions(-)
 create mode 100644 src/util/threadpool.c
 create mode 100644 src/util/threadpool.h

-- 
1.7.3

-- 
Thanks,
Hu Tao

[libvirt] [PATCH 1/2] Add a thread pool implementation

---
 src/Makefile.am       |    1 +
 src/util/threadpool.c |  119 +++++++++++++++++++++++++++++++++++++++++++++++++
 src/util/threadpool.h |   34 ++++++++++++++
 3 files changed, 154 insertions(+), 0 deletions(-)
 create mode 100644 src/util/threadpool.c
 create mode 100644 src/util/threadpool.h

diff --git a/src/Makefile.am b/src/Makefile.am
index 20c0c9f..27bda63 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -84,6 +84,7 @@ UTIL_SOURCES = \
 		util/uuid.c util/uuid.h \
 		util/util.c util/util.h \
 		util/xml.c util/xml.h \
+		util/threadpool.c util/threadpool.h \
 		util/virtaudit.c util/virtaudit.h \
 		util/virterror.c util/virterror_internal.h

diff --git a/src/util/threadpool.c b/src/util/threadpool.c
new file mode 100644
index 0000000..7f64a82
--- /dev/null
+++ b/src/util/threadpool.c
@@ -0,0 +1,119 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "threadpool.h"
+
+static void *workerHandleJob(void *data)
+{
+    struct virData *localData = NULL;
+    struct virWorkerPool *pool = data;
+
+    pthread_mutex_lock(&pool->mutex);
+
+    while (1) {
+        while (!pool->quit && !pool->dataList) {
+            pool->nFreeWorker++;
+            pthread_cond_signal(&pool->worker_cond);
+            pthread_cond_wait(&pool->cond, &pool->mutex);
+            pool->nFreeWorker--;
+        }
+
+        while ((localData = pool->dataList) != NULL) {
+            pool->dataList = pool->dataList->next;
+            localData->next = NULL;
+
+            pthread_mutex_unlock(&pool->mutex);
+
+            (pool->func)(localData->data);
+            free(localData);
+
+            pthread_mutex_lock(&pool->mutex);
+        }
+
+        if (pool->quit)
+            break;
+    }
+
+    pool->nWorker--;
+    if (pool->nWorker == 0)
+        pthread_cond_signal(&pool->quit_cond);
+    pthread_mutex_unlock(&pool->mutex);
+
+    return NULL;
+}
+
+struct virWorkerPool *virWorkerPoolNew(int nWorker, int maxWorker, virWorkerFunc func)
+{
+    struct virWorkerPool *pool;
+    pthread_t pid;
+    int i;
+
+    if (nWorker > maxWorker)
+        return NULL;
+
+    pool = malloc(sizeof(*pool));
+    if (!pool)
+        return NULL;
+
+    memset(pool, 0, sizeof(*pool));
+    pool->func = func;
+    pthread_mutex_init(&pool->mutex, NULL);
+    pthread_cond_init(&pool->cond, NULL);
+    pthread_cond_init(&pool->worker_cond, NULL);
+    pthread_cond_init(&pool->quit_cond, NULL);
+
+    for (i = 0; i < nWorker; i++) {
+        pthread_create(&pid, NULL, workerHandleJob, pool);
+    }
+
+    pool->nFreeWorker = 0;
+    pool->nWorker = nWorker;
+    pool->nMaxWorker = maxWorker;
+
+    return pool;
+}
+
+void virWorkerPoolFree(struct virWorkerPool *pool)
+{
+    pthread_mutex_lock(&pool->mutex);
+    pool->quit = 1;
+    if (pool->nWorker > 0) {
+        pthread_cond_broadcast(&pool->cond);
+        pthread_cond_wait(&pool->quit_cond, &pool->mutex);
+    }
+    pthread_mutex_unlock(&pool->mutex);
+    free(pool);
+}
+
+int virWorkerPoolSendJob(struct virWorkerPool *pool, void *data)
+{
+    pthread_t pid;
+    struct virData *localData;
+
+    localData = malloc(sizeof(*localData));
+    if (!localData)
+        return -1;
+
+    localData->data = data;
+
+    pthread_mutex_lock(&pool->mutex);
+    if (pool->quit) {
+        pthread_mutex_unlock(&pool->mutex);
+        return -1;
+    }
+
+    localData->next = pool->dataList;
+    pool->dataList = localData;
+
+    if (pool->nFreeWorker == 0 && pool->nWorker < pool->nMaxWorker) {
+        pthread_create(&pid, NULL, workerHandleJob, pool);
+        pool->nWorker++;
+    }
+
+    pthread_cond_signal(&pool->cond);
+
+    pthread_mutex_unlock(&pool->mutex);
+
+    return 0;
+}
diff --git a/src/util/threadpool.h b/src/util/threadpool.h
new file mode 100644
index 0000000..e39b2da
--- /dev/null
+++ b/src/util/threadpool.h
@@ -0,0 +1,34 @@
+#ifndef __THREADPOOL_H__
+#define __THREADPOOL_H__
+
+#include <pthread.h>
+
+typedef void (*virWorkerFunc)(void *);
+
+struct virData {
+    struct virData *next;
+
+    void *data;
+};
+
+struct virWorkerPool {
+    int nWorker;
+    int nMaxWorker;
+    int nFreeWorker;
+
+    int quit;
+
+    virWorkerFunc func;
+    struct virData *dataList;
+
+    pthread_mutex_t mutex;
+    pthread_cond_t cond;
+    pthread_cond_t worker_cond;
+    pthread_cond_t quit_cond;
+};
+
+struct virWorkerPool *virWorkerPoolNew(int nWorker, int nMaxWorker, virWorkerFunc func);
+void virWorkerPoolFree(struct virWorkerPool *pool);
+int virWorkerPoolSendJob(struct virWorkerPool *wp, void *data);
+
+#endif
-- 
1.7.3

-- 
Thanks,
Hu Tao
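
For reference, a minimal usage sketch of the proposed thread-pool API, assuming the patch above is applied; the job function and job values below are made up for illustration and are not part of the patch:

    #include <stdio.h>
    #include "threadpool.h"

    /* job function run by a pool worker; receives whatever pointer was
     * handed to virWorkerPoolSendJob() */
    static void printJob(void *data)
    {
        printf("processing job %d\n", *(int *)data);
    }

    int main(void)
    {
        int jobs[] = { 1, 2, 3, 4 };
        int i;

        /* start with two workers, allow the pool to grow to four */
        struct virWorkerPool *pool = virWorkerPoolNew(2, 4, printJob);
        if (!pool)
            return 1;

        for (i = 0; i < 4; i++)
            virWorkerPoolSendJob(pool, &jobs[i]);

        /* lets the workers drain any queued jobs, then waits for them
         * to exit before freeing the pool */
        virWorkerPoolFree(pool);
        return 0;
    }

Note that virWorkerPoolSendJob() pushes jobs onto the head of the list, so jobs are not guaranteed to run in submission order.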

[libvirt] [PATCH 2/2] Add a watchdog action `managed'.

The `managed' watchdog action lets libvirtd decide what to do when a guest watchdog fires. It has a subaction argument which is the action libvirtd will take. Currently only the `dump' subaction is defined; it tells libvirtd to dump the guest on a watchdog event.

The `managed' watchdog action is mapped to the `none' qemu watchdog action, thus qemu will not get in the way of libvirtd's handling of watchdog events.

Currently only qemu is supported.
---
 src/Makefile.am        |    2 +-
 src/conf/domain_conf.c |   27 ++++++++-
 src/conf/domain_conf.h |   11 ++++-
 src/qemu/qemu_conf.c   |    2 +-
 src/qemu/qemu_conf.h   |    6 ++
 src/qemu/qemu_driver.c |  138 ++++++++++++++++++++++++++++++++++++++++++-----
 6 files changed, 165 insertions(+), 21 deletions(-)

diff --git a/src/Makefile.am b/src/Makefile.am
index 27bda63..c495e34 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -1097,7 +1097,7 @@ libvirt_test_la_LIBADD = $(libvirt_la_LIBADD)
 libvirt_test_la_LDFLAGS = $(test_LDFLAGS) $(AM_LDFLAGS)
 libvirt_test_la_CFLAGS = $(AM_CFLAGS)
 
-libvirt_qemu_la_SOURCES = libvirt-qemu.c
+libvirt_qemu_la_SOURCES = libvirt-qemu.c util/threadpool.c
 libvirt_qemu_la_LDFLAGS = $(VERSION_SCRIPT_FLAGS)$(LIBVIRT_QEMU_SYMBOL_FILE) \
 			-version-info $(LIBVIRT_VERSION_INFO) \
 			$(CYGWIN_EXTRA_LDFLAGS) $(MINGW_EXTRA_LDFLAGS) \
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 30c27db..795cd54 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -245,7 +245,11 @@ VIR_ENUM_IMPL(virDomainWatchdogAction, VIR_DOMAIN_WATCHDOG_ACTION_LAST,
               "shutdown",
               "poweroff",
               "pause",
-              "none")
+              "none",
+              "managed")
+
+VIR_ENUM_IMPL(virDomainWatchdogSubaction, VIR_DOMAIN_WATCHDOG_SUBACTION_LAST,
+              "dump")
 
 VIR_ENUM_IMPL(virDomainVideo, VIR_DOMAIN_VIDEO_TYPE_LAST,
               "vga",
@@ -3423,6 +3427,7 @@ virDomainWatchdogDefParseXML(const xmlNodePtr node,
 
     char *model = NULL;
     char *action = NULL;
+    char *subaction = NULL;
     virDomainWatchdogDefPtr def;
 
     if (VIR_ALLOC (def) < 0) {
@@ -3453,6 +3458,13 @@ virDomainWatchdogDefParseXML(const xmlNodePtr node,
                                  _("unknown watchdog action '%s'"), action);
             goto error;
         }
+        if (def->action == VIR_DOMAIN_WATCHDOG_ACTION_MANAGED) {
+            subaction = virXMLPropString(node, "subaction");
+            if (subaction) {
+                def->subaction = virDomainWatchdogSubactionTypeFromString(subaction);
+                def->subactionArg = virXMLPropString(node, "subaction_arg");
+            }
+        }
     }
 
     if (virDomainDeviceInfoParseXML(node, &def->info, flags) < 0)
@@ -3460,6 +3472,7 @@ virDomainWatchdogDefParseXML(const xmlNodePtr node,
 
 cleanup:
     VIR_FREE (action);
+    VIR_FREE (subaction);
     VIR_FREE (model);
 
     return def;
@@ -6398,6 +6411,7 @@ virDomainWatchdogDefFormat(virBufferPtr buf,
 {
     const char *model = virDomainWatchdogModelTypeToString (def->model);
     const char *action = virDomainWatchdogActionTypeToString (def->action);
+    const char *subaction = virDomainWatchdogSubactionTypeToString (def->subaction);
 
     if (!model) {
         virDomainReportError(VIR_ERR_INTERNAL_ERROR,
@@ -6411,8 +6425,15 @@ virDomainWatchdogDefFormat(virBufferPtr buf,
         return -1;
     }
 
-    virBufferVSprintf(buf, "    <watchdog model='%s' action='%s'",
-                      model, action);
+    if (subaction && !def->subactionArg)
+        virBufferVSprintf(buf, "    <watchdog model='%s' action='%s' subaction='%s'",
+                          model, action, subaction);
+    else if (subaction && def->subactionArg)
+        virBufferVSprintf(buf, "    <watchdog model='%s' action='%s' subaction='%s' subaction_arg='%s'",
+                          model, action, subaction, (char *)def->subactionArg);
+    else
+        virBufferVSprintf(buf, "    <watchdog model='%s' action='%s'",
+                          model, action);
 
     if (virDomainDeviceInfoIsSet(&def->info)) {
         virBufferAddLit(buf, ">\n");
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index 7d2d6f5..2cc6893 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -463,19 +463,27 @@ enum virDomainWatchdogAction {
     VIR_DOMAIN_WATCHDOG_ACTION_POWEROFF,
     VIR_DOMAIN_WATCHDOG_ACTION_PAUSE,
     VIR_DOMAIN_WATCHDOG_ACTION_NONE,
+    VIR_DOMAIN_WATCHDOG_ACTION_MANAGED,
 
     VIR_DOMAIN_WATCHDOG_ACTION_LAST
 };
 
+enum virDomainWatchdogSubAction {
+    VIR_DOMAIN_WATCHDOG_SUBACTION_DUMP,
+
+    VIR_DOMAIN_WATCHDOG_SUBACTION_LAST
+};
+
 typedef struct _virDomainWatchdogDef virDomainWatchdogDef;
 typedef virDomainWatchdogDef *virDomainWatchdogDefPtr;
 struct _virDomainWatchdogDef {
     int model;
     int action;
+    int subaction;
+    void *subactionArg;
     virDomainDeviceInfo info;
 };
 
-
 enum virDomainVideoType {
     VIR_DOMAIN_VIDEO_TYPE_VGA,
     VIR_DOMAIN_VIDEO_TYPE_CIRRUS,
@@ -1247,6 +1255,7 @@ VIR_ENUM_DECL(virDomainSysinfo)
 VIR_ENUM_DECL(virDomainSmbiosMode)
 VIR_ENUM_DECL(virDomainWatchdogModel)
 VIR_ENUM_DECL(virDomainWatchdogAction)
+VIR_ENUM_DECL(virDomainWatchdogSubaction)
 VIR_ENUM_DECL(virDomainVideo)
 VIR_ENUM_DECL(virDomainHostdevMode)
 VIR_ENUM_DECL(virDomainHostdevSubsys)
diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c
index 83a117a..f09a72d 100644
--- a/src/qemu/qemu_conf.c
+++ b/src/qemu/qemu_conf.c
@@ -5367,7 +5367,7 @@ int qemudBuildCommandLine(virConnectPtr conn,
         }
         ADD_ARG(optstr);
 
-        const char *action = virDomainWatchdogActionTypeToString(watchdog->action);
+        const char *action = virDomainWatchdogActionTypeToString(watchdog->action == VIR_DOMAIN_WATCHDOG_ACTION_MANAGED ? VIR_DOMAIN_WATCHDOG_ACTION_NONE : watchdog->action);
         if (!action) {
             qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                             _("invalid watchdog action"));
diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h
index 790ce98..49584c8 100644
--- a/src/qemu/qemu_conf.h
+++ b/src/qemu/qemu_conf.h
@@ -106,6 +106,12 @@ enum qemud_cmd_flags {
 struct qemud_driver {
     virMutex lock;
 
+    virMutex workerPoolLock;
+    int poolRef;
+    struct virWorkerPool *workerPool;
+    int watchdogEventCallbackID;
+    virConnectPtr conn;
+
     int privileged;
 
     uid_t user;
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 449534a..349cb0c 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -84,6 +84,7 @@
 #include "virtaudit.h"
 #include "files.h"
 #include "fdstream.h"
+#include "threadpool.h"
 
 #define VIR_FROM_THIS VIR_FROM_QEMU
 
@@ -858,18 +859,13 @@ qemudAutostartConfigs(struct qemud_driver *driver) {
      * to lookup the bridge associated with a virtual
      * network
      */
-    virConnectPtr conn = virConnectOpen(driver->privileged ?
-                                        "qemu:///system" :
-                                        "qemu:///session");
     /* Ignoring NULL conn which is mostly harmless here */
-    struct qemuAutostartData data = { driver, conn };
+    struct qemuAutostartData data = { driver, driver->conn };
 
     qemuDriverLock(driver);
     virHashForEach(driver->domains.objs, qemuAutostartDomain, &data);
-    qemuDriverUnlock(driver);
 
-    if (conn)
-        virConnectClose(conn);
+    qemuDriverUnlock(driver);
 }
 
@@ -1722,7 +1718,6 @@ qemudStartup(int privileged) {
     char *base = NULL;
     char driverConf[PATH_MAX];
     int rc;
-    virConnectPtr conn = NULL;
 
     if (VIR_ALLOC(qemu_driver) < 0)
         return -1;
@@ -1732,6 +1727,11 @@ qemudStartup(int privileged) {
         VIR_FREE(qemu_driver);
         return -1;
     }
+    if (virMutexInit(&qemu_driver->workerPoolLock)) {
+        VIR_ERROR0(_("cannot initialize mutex"));
+        VIR_FREE(qemu_driver);
+        return -1;
+    }
     qemuDriverLock(qemu_driver);
     qemu_driver->privileged = privileged;
 
@@ -1948,11 +1948,11 @@ qemudStartup(int privileged) {
                                 1, NULL, NULL) < 0)
         goto error;
 
-    conn = virConnectOpen(qemu_driver->privileged ?
+    qemu_driver->conn = virConnectOpen(qemu_driver->privileged ?
                           "qemu:///system" :
                           "qemu:///session");
 
-    qemuReconnectDomains(conn, qemu_driver);
+    qemuReconnectDomains(qemu_driver->conn, qemu_driver);
 
     /* Then inactive persistent configs */
     if (virDomainLoadAllConfigs(qemu_driver->caps,
@@ -1970,9 +1970,6 @@ qemudStartup(int privileged) {
 
     qemudAutostartConfigs(qemu_driver);
 
-    if (conn)
-        virConnectClose(conn);
-
     return 0;
 
 out_of_memory:
@@ -1980,8 +1977,6 @@ out_of_memory:
 error:
     if (qemu_driver)
         qemuDriverUnlock(qemu_driver);
-    if (conn)
-        virConnectClose(conn);
     VIR_FREE(base);
     qemudShutdown();
     return -1;
@@ -2059,7 +2054,11 @@ qemudShutdown(void) {
     if (!qemu_driver)
         return -1;
 
+    if (qemu_driver->conn)
+        virConnectClose(qemu_driver->conn);
+
     qemuDriverLock(qemu_driver);
+
     pciDeviceListFree(qemu_driver->activePciHostdevs);
     virCapabilitiesFree(qemu_driver->caps);
 
@@ -4425,6 +4424,82 @@ retry:
     }
 }
 
+struct watchdogEvent
+{
+    virDomain dom;
+    int action;
+    void *actionArg;
+};
+
+static void processWatchdogEvent(void *data)
+{
+    struct watchdogEvent *wdEvent = data;
+
+    switch (wdEvent->action) {
+    case VIR_DOMAIN_WATCHDOG_SUBACTION_DUMP:
+        {
+            char *path = wdEvent->actionArg;
+            char dumpfilePath[4096];
+            int i;
+
+            i = snprintf(dumpfilePath, 4096, "%s/%s-%u", path, wdEvent->dom.name, (unsigned int)time(NULL));
+            dumpfilePath[i] = '\0';
+            virDomainCoreDump(&wdEvent->dom, dumpfilePath, 0);
+        }
+        break;
+    }
+
+    free(wdEvent->dom.name);
+    free(wdEvent);
+}
+
+static int onWatchdogEvent(virConnectPtr conn,
+                           virDomainPtr dom,
+                           int action,
+                           void *opaque ATTRIBUTE_UNUSED)
+{
+    struct qemud_driver *driver = conn->privateData;
+    struct watchdogEvent *wdEvent;
+    virDomainObjPtr vm;
+
+    wdEvent = malloc(sizeof(*wdEvent));
+    if (!wdEvent)
+        return -1;
+
+    vm = virDomainFindByID(&driver->domains, dom->id);
+    if (!vm || !vm->def->watchdog) {
+        free(wdEvent);
+        return -1;
+    }
+
+    if (vm->def->watchdog->action != VIR_DOMAIN_WATCHDOG_ACTION_MANAGED) {
+        virDomainObjUnlock(vm);
+        free(wdEvent);
+        return 0;
+    }
+
+    if (action != 0)
+        return 0;
+
+    wdEvent->dom = *dom;
+    wdEvent->dom.name = malloc(strlen(dom->name));
+    if (!wdEvent->dom.name) {
+        virDomainObjUnlock(vm);
+        free(wdEvent);
+        return -1;
+    }
+    strcpy(wdEvent->dom.name, dom->name);
+
+    wdEvent->action = vm->def->watchdog->subaction;
+    wdEvent->actionArg = vm->def->watchdog->subactionArg;
+
+    virWorkerPoolSendJob(driver->workerPool, wdEvent);
+    virDomainObjUnlock(vm);
+
+    return 0;
+}
+
+
 static virDrvOpenStatus qemudOpen(virConnectPtr conn,
                                   virConnectAuthPtr auth ATTRIBUTE_UNUSED,
                                   int flags ATTRIBUTE_UNUSED) {
@@ -4481,6 +4556,30 @@
             }
         }
     }
+
+    virMutexLock(&qemu_driver->workerPoolLock);
+    if (!qemu_driver->workerPool) {
+        qemu_driver->workerPool = virWorkerPoolNew(0, 1, processWatchdogEvent);
+        if (qemu_driver->workerPool) {
+            qemu_driver->watchdogEventCallbackID = virDomainEventCallbackListAddID(conn,
+                                                       qemu_driver->domainEventCallbacks,
+                                                       NULL,
+                                                       VIR_DOMAIN_EVENT_ID_WATCHDOG,
+                                                       VIR_DOMAIN_EVENT_CALLBACK(onWatchdogEvent),
+                                                       NULL,
+                                                       NULL);
+            if (qemu_driver->watchdogEventCallbackID == -1) {
+                virWorkerPoolFree(qemu_driver->workerPool);
+                qemu_driver->workerPool = NULL;
+            } else {
+                qemu_driver->poolRef = 1;
+                conn->refs--;
+            }
+        }
+    } else
+        qemu_driver->poolRef++;
+    virMutexUnlock(&qemu_driver->workerPoolLock);
+
     conn->privateData = qemu_driver;
 
     return VIR_DRV_OPEN_SUCCESS;
@@ -4489,6 +4588,15 @@ static virDrvOpenStatus qemudOpen(virConnectPtr conn,
 static int qemudClose(virConnectPtr conn) {
     struct qemud_driver *driver = conn->privateData;
 
+    virMutexLock(&driver->workerPoolLock);
+    if (--driver->poolRef == 0) {
+        conn->refs--;
+        virDomainEventCallbackListRemoveID(conn, driver->domainEventCallbacks, driver->watchdogEventCallbackID);
+        virWorkerPoolFree(driver->workerPool);
+        driver->workerPool = NULL;
+    }
+    virMutexUnlock(&driver->workerPoolLock);
+
     /* Get rid of callbacks registered for this conn */
     qemuDriverLock(driver);
     virDomainEventCallbackListRemoveConn(conn, driver->domainEventCallbacks);
-- 
1.7.3

-- 
Thanks,
Hu Tao
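
Concretely, with the mapping in qemudBuildCommandLine() above, a guest configured with action='managed' is expected to end up with qemu arguments along these lines (illustrative; the exact form depends on the qemu version and on how libvirt emits the watchdog device):

    -watchdog i6300esb -watchdog-action none

i.e. the guest still sees the watchdog device, but qemu itself takes no action when it fires, leaving the event for libvirtd to handle.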

On Wed, Nov 17, 2010 at 03:45:28PM +0800, Hu Tao wrote:
> The `managed' watchdog action lets libvirtd decide what to do when a guest watchdog fires. It has a subaction argument which is the action libvirtd will take. Currently only the `dump' subaction is defined; it tells libvirtd to dump the guest on a watchdog event.
> 
> The `managed' watchdog action is mapped to the `none' qemu watchdog action, thus qemu will not get in the way of libvirtd's handling of watchdog events.
Won't that mean that the guest will continue to execute ? I think we'd want to map it to 'pause', so the guest stops executing while we dump.
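
If so, the hunk in qemu_conf.c would just map to the pause action instead of none; a sketch of what that line might become (not a tested change):

        const char *action = virDomainWatchdogActionTypeToString(
            watchdog->action == VIR_DOMAIN_WATCHDOG_ACTION_MANAGED
                ? VIR_DOMAIN_WATCHDOG_ACTION_PAUSE
                : watchdog->action);

The dump handler would then also need to resume the guest once the dump has completed.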
> diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h
> index 790ce98..49584c8 100644
> --- a/src/qemu/qemu_conf.h
> +++ b/src/qemu/qemu_conf.h
> @@ -106,6 +106,12 @@ enum qemud_cmd_flags {
>  struct qemud_driver {
>      virMutex lock;
> 
> +    virMutex workerPoolLock;
> +    int poolRef;
> +    struct virWorkerPool *workerPool;
> +    int watchdogEventCallbackID;
> +    virConnectPtr conn;
We shouldn't be storing a virConnectPtr in here, since that object is for the public APIs - we should use the internal APIs directly.
> @@ -4425,6 +4424,82 @@ retry:
>      }
>  }
> 
> +struct watchdogEvent
> +{
> +    virDomain dom;
> +    int action;
> +    void *actionArg;
> +};
> +
> +static void processWatchdogEvent(void *data)
> +{
> +    struct watchdogEvent *wdEvent = data;
> +
> +    switch (wdEvent->action) {
> +    case VIR_DOMAIN_WATCHDOG_SUBACTION_DUMP:
> +        {
> +            char *path = wdEvent->actionArg;
> +            char dumpfilePath[4096];
> +            int i;
> +
> +            i = snprintf(dumpfilePath, 4096, "%s/%s-%u", path, wdEvent->dom.name, (unsigned int)time(NULL));
> +            dumpfilePath[i] = '\0';
It is preferable to use 'virAsprintf()' here instead of a fixed-size buffer.
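
A sketch of what that part of processWatchdogEvent() could look like with virAsprintf(), error handling abbreviated (dumpfilePath becomes heap-allocated here):

            char *dumpfilePath = NULL;

            /* build the path in a dynamically sized buffer;
             * virAsprintf() returns -1 on allocation failure */
            if (virAsprintf(&dumpfilePath, "%s/%s-%u",
                            path, wdEvent->dom.name,
                            (unsigned int)time(NULL)) < 0)
                break;

with a matching VIR_FREE(dumpfilePath) once the dump has been written.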
> +            virDomainCoreDump(&wdEvent->dom, dumpfilePath, 0);
Calling out to public APIs isn't allowed from the internal code. The functionality you want is mostly in the method qemudDomainCoreDump(), but it takes a virDomainPtr and you instead need a variant that works on a virDomainObjPtr. So I'd pull the code out of qemudDomainCoreDump() into a static function you can call from both here and qemudDomainCoreDump().
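
A hypothetical shape of that refactoring (the helper name doCoreDump() and its parameter list are illustrative, not existing code):

    /* shared core-dump logic operating on a locked virDomainObjPtr;
     * qemudDomainCoreDump() would look up the object from its
     * virDomainPtr argument and then delegate here, and the watchdog
     * handler could call it directly with the object it already holds */
    static int doCoreDump(struct qemud_driver *driver,
                          virDomainObjPtr vm,
                          const char *path,
                          unsigned int flags);

processWatchdogEvent() would then call doCoreDump() instead of going through virDomainCoreDump().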
> +        }
> +        break;
> +    }
> +
> +    free(wdEvent->dom.name);
> +    free(wdEvent);
> +}
VIR_FREE() is required here.
> +
> +static int onWatchdogEvent(virConnectPtr conn,
> +                           virDomainPtr dom,
> +                           int action,
> +                           void *opaque ATTRIBUTE_UNUSED)
> +{
> +    struct qemud_driver *driver = conn->privateData;
> +    struct watchdogEvent *wdEvent;
> +    virDomainObjPtr vm;
> +
> +    wdEvent = malloc(sizeof(*wdEvent));
> +    if (!wdEvent)
> +        return -1;
malloc() isn't allowed in libvirt code - use VIR_ALLOC() and check out the 'HACKING' file for guidelines. 'make syntax-check' should warn you about coding style issues like this.
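
For reference, the allocation would then look something like this (VIR_ALLOC() zero-fills the object and evaluates to 0 on success, -1 on failure; the caller reports the OOM error):

    struct watchdogEvent *wdEvent;

    if (VIR_ALLOC(wdEvent) < 0) {
        virReportOOMError();
        return -1;
    }

    /* ... use wdEvent ... */

    VIR_FREE(wdEvent);   /* frees and sets the pointer back to NULL */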
> +
> +    vm = virDomainFindByID(&driver->domains, dom->id);
> +    if (!vm || !vm->def->watchdog) {
> +        free(wdEvent);
> +        return -1;
> +    }
> +
> +    if (vm->def->watchdog->action != VIR_DOMAIN_WATCHDOG_ACTION_MANAGED) {
> +        virDomainObjUnlock(vm);
> +        free(wdEvent);
> +        return 0;
> +    }
> +
> +    if (action != 0)
> +        return 0;
> +
> +    wdEvent->dom = *dom;
> +    wdEvent->dom.name = malloc(strlen(dom->name));
> +    if (!wdEvent->dom.name) {
> +        virDomainObjUnlock(vm);
> +        free(wdEvent);
> +        return -1;
> +    }
> +    strcpy(wdEvent->dom.name, dom->name);
> +
> +    wdEvent->action = vm->def->watchdog->subaction;
> +    wdEvent->actionArg = vm->def->watchdog->subactionArg;
> +
> +    virWorkerPoolSendJob(driver->workerPool, wdEvent);
> +    virDomainObjUnlock(vm);
> +
> +    return 0;
> +}
> +
> +
>  static virDrvOpenStatus qemudOpen(virConnectPtr conn,
>                                    virConnectAuthPtr auth ATTRIBUTE_UNUSED,
>                                    int flags ATTRIBUTE_UNUSED) {
> @@ -4481,6 +4556,30 @@
>              }
>          }
>      }
> +
> +    virMutexLock(&qemu_driver->workerPoolLock);
> +    if (!qemu_driver->workerPool) {
> +        qemu_driver->workerPool = virWorkerPoolNew(0, 1, processWatchdogEvent);
> +        if (qemu_driver->workerPool) {
> +            qemu_driver->watchdogEventCallbackID = virDomainEventCallbackListAddID(conn,
> +                                                       qemu_driver->domainEventCallbacks,
> +                                                       NULL,
> +                                                       VIR_DOMAIN_EVENT_ID_WATCHDOG,
> +                                                       VIR_DOMAIN_EVENT_CALLBACK(onWatchdogEvent),
> +                                                       NULL,
> +                                                       NULL);
> +            if (qemu_driver->watchdogEventCallbackID == -1) {
> +                virWorkerPoolFree(qemu_driver->workerPool);
> +                qemu_driver->workerPool = NULL;
> +            } else {
> +                qemu_driver->poolRef = 1;
> +                conn->refs--;
> +            }
> +        }
> +    } else
> +        qemu_driver->poolRef++;
> +    virMutexUnlock(&qemu_driver->workerPoolLock);
This is where most of the problems arise. 'qemudOpen' is only run when a client app connects to libvirt. We obviously want automatic core dumps to run any time a guest crashes, not only when a client app is connected to libvirt. Instead of using the virDomainEvent... APIs, you want to directly add code to the internal method qemuHandleDomainWatchdog() which is where the watchdog events originate. The worker pool can be created right at the start of QEMU driver startup in qemudStartup() avoiding the need for any ref counting on it.
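
A rough sketch of that arrangement, keeping the existing struct watchdogEvent and processWatchdogEvent() from the patch (the exact hook point inside qemuHandleDomainWatchdog() and the 'vm' field are illustrative assumptions):

    /* in qemudStartup(): create the pool once, no ref counting needed */
    qemu_driver->workerPool = virWorkerPoolNew(0, 1, processWatchdogEvent);
    if (!qemu_driver->workerPool)
        goto error;

    /* in qemuHandleDomainWatchdog(), where the event originates and the
     * virDomainObjPtr is already at hand: */
    if (vm->def->watchdog->action == VIR_DOMAIN_WATCHDOG_ACTION_MANAGED) {
        struct watchdogEvent *wdEvent;

        if (VIR_ALLOC(wdEvent) == 0) {
            /* 'vm' is a hypothetical replacement for the copied virDomain
             * in struct watchdogEvent; a reference would need to be held
             * until processWatchdogEvent() is done with it */
            wdEvent->vm = vm;
            wdEvent->action = vm->def->watchdog->subaction;
            wdEvent->actionArg = vm->def->watchdog->subactionArg;
            virWorkerPoolSendJob(driver->workerPool, wdEvent);
        }
    }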
> +
>      conn->privateData = qemu_driver;
> 
>      return VIR_DRV_OPEN_SUCCESS;
> @@ -4489,6 +4588,15 @@ static virDrvOpenStatus qemudOpen(virConnectPtr conn,
>  static int qemudClose(virConnectPtr conn) {
>      struct qemud_driver *driver = conn->privateData;
> +    virMutexLock(&driver->workerPoolLock);
> +    if (--driver->poolRef == 0) {
> +        conn->refs--;
> +        virDomainEventCallbackListRemoveID(conn, driver->domainEventCallbacks, driver->watchdogEventCallbackID);
> +        virWorkerPoolFree(driver->workerPool);
> +        driver->workerPool = NULL;
> +    }
> +    virMutexUnlock(&driver->workerPoolLock);
This chunk won't be needed at all if everything is done from the qemuHandleDomainWatchdog() method. Also, a separate 'workerPoolLock' is probably overkill - just protect it with the regular qemu driver lock.

Regards,
Daniel
-- 
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

Daniel,

Thanks for your suggestion! I'll do a v2 of this patch.

-- 
Thanks,
Hu Tao

On Wed, Nov 17, 2010 at 03:45:05PM +0800, Hu Tao wrote:
> This patch series adds support for auto-dump on a watchdog event in libvirtd. If a qemu guest is configured with a watchdog device and libvirtd receives a watchdog event from the guest, it can act on the guest; currently a `dump' action is defined.
> 
> This patch series adds a new watchdog action `managed'. It has a subaction which indicates the action libvirtd will take on a watchdog event.
> 
> The meaning of subaction_arg depends on subaction. If subaction is `dump', then subaction_arg defines a directory path into which libvirtd saves the dump file.
> 
> An example:
> 
> <watchdog model='i6300esb' action='managed' subaction='dump' subaction_arg='/mnt/data/dumps'>
Is there any need to have both 'action' and 'subaction' here? I would think we could just have action='dump' on its own.

One interesting side note: from the Xen driver we already have a bit of XML syntax

  <on_crash>destroy|restart|rename|preserve|coredump-destroy|coredump-restart</on_crash>

which we never use with the QEMU driver so far, since we can't detect whether a guest kernel has crashed or not. I'm wondering if we shouldn't let one of the watchdog actions be 'crash' and then take whatever action is specified in <on_crash>.

For the dump location, I think it would suffice to just have a parameter set globally in /etc/libvirt/qemu.conf (cf XenD, which just always dumps core to /var/lib/xen/dump/$NAME).

Regards,
Daniel
-- 
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
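
To make the two suggestions concrete, the configuration could end up looking something like this (both the 'crash' action value and the auto_dump_path parameter name are illustrative, not settings that exist at this point):

    <!-- in the domain XML: let the watchdog report a crash and have
         <on_crash> decide what happens next -->
    <on_crash>coredump-restart</on_crash>
    ...
    <watchdog model='i6300esb' action='crash'/>

    # in /etc/libvirt/qemu.conf: global location for automatic dumps
    auto_dump_path = "/var/lib/libvirt/qemu/dump"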

On Thu, Nov 18, 2010 at 10:56:47AM +0000, Daniel P. Berrange wrote:
> On Wed, Nov 17, 2010 at 03:45:05PM +0800, Hu Tao wrote:
> > This patch series adds support for auto-dump on a watchdog event in libvirtd. If a qemu guest is configured with a watchdog device and libvirtd receives a watchdog event from the guest, it can act on the guest; currently a `dump' action is defined.
> > 
> > This patch series adds a new watchdog action `managed'. It has a subaction which indicates the action libvirtd will take on a watchdog event.
> > 
> > The meaning of subaction_arg depends on subaction. If subaction is `dump', then subaction_arg defines a directory path into which libvirtd saves the dump file.
> > 
> > An example:
> > 
> > <watchdog model='i6300esb' action='managed' subaction='dump' subaction_arg='/mnt/data/dumps'>
> Is there any need to have both 'action' and 'subaction' here? I would think we could just have action='dump' on its own.
Yes, it's better than having an extra subaction. I'll change this.
> One interesting side note: from the Xen driver we already have a bit of XML syntax
> 
>   <on_crash>destroy|restart|rename|preserve|coredump-destroy|coredump-restart</on_crash>
> 
> which we never use with the QEMU driver so far, since we can't detect whether a guest kernel has crashed or not. I'm wondering if we shouldn't let one of the watchdog actions be 'crash' and then take whatever action is specified in <on_crash>.
For qemu we can only tell whether a guest has crashed by configuring a watchdog device (or can someone tell me a better way to do this?); that is the reason I defined a separate watchdog action. If we only specify an action in <on_crash> (dump in this example) but the watchdog action is destroy, then we have no chance to dump the guest on a watchdog event, because qemu will have already destroyed it.
> For the dump location, I think it would suffice to just have a parameter set globally in /etc/libvirt/qemu.conf (cf XenD, which just always dumps core to /var/lib/xen/dump/$NAME).
-- 
Thanks,
Hu Tao