
On Tue, Nov 25, 2008 at 11:39:57AM +0000, Daniel P. Berrange wrote:
On Fri, Nov 21, 2008 at 11:13:04PM +0100, Guido Günther wrote:
Hi, I just ran across these oddities when using a bit more libvirt+xen:
1.) virsh setmaxmem:
On a running domain, "# virsh setmaxmem domain 256000" completes, but virsh dumpxml as well as the config.sxp still show the old amount of memory. It looks as if the set_maxmem hypercall simply gets ignored. xm mem-max works as expected. Smells like a bug in the ioctl?
The setmaxmem API is not performance critical, so it sounds like we should first try setting it via XenD, and use the Hypervisor as the fallback instead.
I tried that and it worked as you suggested. However, when I re-checked the "old" method of going through the HV, it all of a sudden works too. I have no idea why it reliably failed the last time and doesn't do so now (the machine has been rebooted in the meantime, though). I'll keep the patched package around in case this pops up again.
2.) virsh list:
Sometimes (I haven't found a pattern yet) when shutting down a running domain and restarting it, I'm seeing:
 Id Name                 State
----------------------------------
  0 Domain-0             running
  2 foo                  idle
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
  7 bar                  idle
Note that the number of errors corresponds to the number of shutdowns. virXen_getdomaininfolist returns 7 in the above case. virDomainLookupByID later on fails for these "additional" domains.
This is basically a XenD bug. What's happening is that the domain has been shut down, and got most of the way through cleanup, as far as the hypervisor is concerned. But something is still hanging around keeping the domain from being completely terminated. In this case XenD takes the dubious approach of just pretending the domain does not exist. So libvirt sees it exists in the hypervisor, but when asking XenD for more data, it gets that error. This really really sucks.
There's not really much we can do about it when XenD is just plain lying about what exists. We explicitly don't ask XenD for the list of domain IDs because it is incredibly slow, hence we use the HV.
The only idea I can think of is to ask XenStore for the list of domain IDs. This is still dramatically faster than asking XenD, but not quite as fast as the Hypervisor.
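As a rough illustration of the XenStore idea: XenStore keeps one entry per domain under /local/domain, named by the numeric domain ID, so listing that directory yields the IDs. The sketch below assumes the directory listing has already been fetched (for example with xs_directory() from libxenstore) and only shows the parsing step. parse_domain_ids is a hypothetical helper made up for this example, not a libvirt function.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not part of libvirt): convert XenStore directory
 * entry names such as "0", "2", "7" into integer domain IDs.
 * Entries that do not parse completely as a non-negative decimal number
 * are skipped. Returns the number of IDs stored in ids (at most maxids). */
int parse_domain_ids(const char *const *entries, unsigned int nentries,
                     int *ids, int maxids)
{
    int count = 0;

    assert(entries != NULL && ids != NULL);

    for (unsigned int i = 0; i < nentries && count < maxids; i++) {
        char *end = NULL;
        long id;

        errno = 0;
        id = strtol(entries[i], &end, 10);

        /* Reject overflow, empty strings, trailing junk and out-of-range IDs. */
        if (errno != 0 || end == entries[i] || *end != '\0' ||
            id < 0 || id > INT_MAX)
            continue;

        ids[count++] = (int)id;
    }
    return count;
}
```

Because the IDs come from a different source than XenD, a domain that XenD refuses to describe may or may not appear here; the sketch makes no claim about that.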
3.) virsh list: Duplicate domains:
 Id Name                 State
----------------------------------
  0 Domain-0             running
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
 14 bar                  no state
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
 16 bar                  idle
Domain 14 can't be shut down (xm list only lists domain 16).
Could be a similar problem to the above.
Yeah, this is almost certainly just another example of XenD not properly cleaning up / destroying domains. If you still have a machine which shows this behaviour, then I'd recommend trying this change to our Xen impl:
In xen_unified.c, find the method xenUnifiedListDomains and make it first call xenStoreListDomains() and then fall back to trying the HV & XenD drivers. If we're lucky this will help....
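The ordering suggested here amounts to a fallback chain: try the XenStore listing first and only consult the hypervisor and XenD drivers if it fails. A minimal sketch of that pattern follows. The function-pointer table, the list_domains_fn signature and the stub backends are all made up for illustration and do not match libvirt's real driver signatures in xen_unified.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend signature: fill ids with up to maxids domain IDs and
 * return how many were stored, or -1 if this backend cannot answer. */
typedef int (*list_domains_fn)(int *ids, int maxids);

/* Try each backend in order and return the first successful result.
 * Returns -1 if every backend fails. */
int list_domains_with_fallback(const list_domains_fn *backends,
                               size_t nbackends, int *ids, int maxids)
{
    assert(backends != NULL && ids != NULL);

    for (size_t i = 0; i < nbackends; i++) {
        int n = backends[i](ids, maxids);
        if (n >= 0)
            return n;
    }
    return -1;
}

/* Stub standing in for a working backend (e.g. the XenStore listing). */
int stub_xenstore_list(int *ids, int maxids)
{
    static const int known[] = { 0, 2, 7 };
    int n = 0;

    for (size_t i = 0; i < sizeof(known) / sizeof(known[0]) && n < maxids; i++)
        ids[n++] = known[i];
    return n;
}

/* Stub standing in for a backend that is unavailable. */
int stub_failing_list(int *ids, int maxids)
{
    (void)ids;
    (void)maxids;
    return -1;
}
```

With the order proposed above, the table would list the XenStore backend first, followed by the hypervisor and XenD backends, so the slower or less reliable sources are only asked when the first one cannot answer.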
Yes, this helps indeed, thanks a lot. Possible patch attached.
 -- Guido