On Fri, Sep 07, 2012 at 02:44:03PM +0200, Christophe Fergeau wrote:
On Fri, Sep 07, 2012 at 01:24:35PM +0100, Daniel P. Berrange wrote:
> A nice long detailed explanation. I agree that this scenario you
> outline is plausible as an explanation for why Boxes sometimes
> stops getting events from libvirtd.
I've run more tests in the meantime without this patch applied, but
with the one below to add some debugging:
diff --git a/src/conf/domain_event.c b/src/conf/domain_event.c
index 43ecdcf..33d90fb 100644
--- a/src/conf/domain_event.c
+++ b/src/conf/domain_event.c
@@ -1501,7 +1501,13 @@ virDomainEventStateRegisterID(virConnectPtr conn,
int ret = -1;
virDomainEventStateLock(state);
+ VIR_WARN("RegisterID");
+ if ((state->callbacks->count == 0) && (state->timer == -1)) {
+ if (state->queue->count != 0) {
+ VIR_WARN("REG: queue's not empty: %d", state->queue->count);
+ }
+ }
if ((state->callbacks->count == 0) &&
(state->timer == -1) &&
(state->timer = virEventAddTimeout(-1,
@@ -1584,6 +1590,7 @@ virDomainEventStateDeregisterID(virConnectPtr conn,
{
int ret;
+ VIR_WARN("DeregisterID");
virDomainEventStateLock(state);
if (state->isDispatching)
ret = virDomainEventCallbackListMarkDeleteID(conn,
@@ -1596,6 +1603,9 @@ virDomainEventStateDeregisterID(virConnectPtr conn,
state->timer != -1) {
virEventRemoveTimeout(state->timer);
state->timer = -1;
+ if (state->queue->count != 0) {
+ VIR_WARN("DEREG: queue's not empty: %d", state->queue->count);
+ }
}
virDomainEventStateUnlock(state);
I've hit the lost-events issue once, and right when it started happening,
the log was:
2012-09-06 11:37:06.094+0000: 30498: warning : virDomainEventStateDeregisterID:1593 : DeregisterID
2012-09-06 11:37:06.094+0000: 30498: warning : virDomainEventStateDeregisterID:1607 : DEREG: queue's not empty: 1
2012-09-06 11:45:42.363+0000: 30502: warning : virDomainEventStateRegisterID:1504 : RegisterID
2012-09-06 11:45:42.363+0000: 30502: warning : virDomainEventStateRegisterID:1508 : REG: queue's not empty: 1
After that, no events were delivered, and these warnings kept appearing
with an increasing number of queued events, which is consistent with the
hypothesis I made in this patch.
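To make the sequence in that log easier to follow, here is a rough toy
model of it in plain C. It is only a sketch and not libvirt code: the
struct and every toy_* function are made up for illustration. It also
relies on one assumption that is not visible in the diff above, namely
that queueing an event only arms the timer when the queue goes from empty
to non-empty. The parts taken from the diff are that the timer is created
disabled (virEventAddTimeout(-1, ...)) on the first registration and
removed on the last deregistration without touching the queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the event state. Hypothetical, not the libvirt struct. */
struct toy_state {
    int callbacks;     /* number of registered callbacks */
    int timer;         /* -1 when no timer exists */
    bool timer_armed;  /* true when the timer is due to fire */
    int queued;        /* events waiting to be dispatched */
    int delivered;     /* events handed to callbacks so far */
};

/* First registration creates the timer, disabled, as in the diff. */
static void toy_register(struct toy_state *s)
{
    if (s->callbacks == 0 && s->timer == -1) {
        s->timer = 1;
        s->timer_armed = false;
    }
    s->callbacks++;
}

/* Last deregistration removes the timer but leaves the queue untouched. */
static void toy_deregister(struct toy_state *s)
{
    assert(s->callbacks > 0);
    s->callbacks--;
    if (s->callbacks == 0 && s->timer != -1) {
        s->timer = -1;
        s->timer_armed = false;
    }
}

/* ASSUMPTION: the timer is armed only on the empty -> non-empty transition. */
static void toy_queue_event(struct toy_state *s)
{
    if (s->timer == -1)
        return;                 /* nobody is listening, drop the event */
    s->queued++;
    if (s->queued == 1)
        s->timer_armed = true;
}

/* One event loop iteration: flush the queue if the timer is armed. */
static void toy_run_timer(struct toy_state *s)
{
    if (s->timer == -1 || !s->timer_armed)
        return;
    s->delivered += s->queued;
    s->queued = 0;
    s->timer_armed = false;
}

/* Replays the logged sequence, then queues extra_events more events. */
static struct toy_state toy_lost_event_scenario(int extra_events)
{
    struct toy_state s = { 0, -1, false, 0, 0 };

    toy_register(&s);
    toy_queue_event(&s);   /* queue: 1, timer armed */
    toy_deregister(&s);    /* "DEREG: queue's not empty: 1" */
    toy_register(&s);      /* "REG: queue's not empty: 1", new timer disabled */

    for (int i = 0; i < extra_events; i++) {
        toy_queue_event(&s);   /* queue was not empty, so timer is not armed */
        toy_run_timer(&s);     /* does nothing */
    }
    return s;
}
```

Under that assumption, toy_lost_event_scenario(3) ends with four events
queued and none delivered, which mirrors the growing queue count in the
warnings.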
Great, that's pretty encouraging then.
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|