
On Thu, Jul 19, 2007 at 06:03:29PM +0100, Richard W.M. Jones wrote:
Some observations about Xen migration and error handling.
The Xen migration protocol isn't stable between releases. It changed between 3.0.3 and 3.1.0. There doesn't seem to be any versioning, and incompatible versions of Xen seem happy to attempt migrations between them, even though these will certainly fail.
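To illustrate the kind of versioning that appears to be missing, here is a minimal sketch of a stream header that lets the receiving side refuse an incompatible sender before any domain state is transferred. This is not Xen's wire format: the magic bytes, the header layout, and the names `encode_hello` and `check_hello` are all hypothetical.

```python
import struct

# Hypothetical header: 4 magic bytes followed by a 32-bit protocol version.
# Neither the magic value nor the layout comes from Xen.
MAGIC = b"MIGR"
HEADER = struct.Struct("!4sI")


def encode_hello(version):
    """Build the header the sending side would write first."""
    return HEADER.pack(MAGIC, version)


def check_hello(data, supported_version):
    """Validate a received header; raise ValueError on any mismatch."""
    if len(data) != HEADER.size:
        raise ValueError("short or oversized header")
    magic, version = HEADER.unpack(data)
    if magic != MAGIC:
        raise ValueError("peer is not speaking the migration protocol")
    if version != supported_version:
        raise ValueError(
            "protocol version %d not supported (expected %d)"
            % (version, supported_version)
        )
    return version
```

With something like this in place, a mismatched pair of hosts would fail at the start of the connection rather than partway through a transfer.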
The source host's xend forks an xc_save process. It appears to me that xc_save will happily write to _anything_ listening on port 8002, even if that thing closes the socket prematurely. (Try running 'xm migrate' on one host and at the same time 'nc -l 8002 > /dev/null' on the target host. The 'xm migrate' will happily complete without error. Meanwhile the domain you "migrated" just gets deleted.)
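A sender can only tell a real receiver from an 'nc' sink if it waits for some explicit confirmation before treating the transfer as complete. The sketch below shows that idea in isolation. It is not how xc_save works; the `ACK` token and the function `send_with_ack` are assumptions made up for the example.

```python
import socket

# Hypothetical confirmation token the receiver sends once it has
# accepted the whole stream. Not part of any real Xen protocol.
ACK = b"OK"


def send_with_ack(sock, payload, timeout=5.0):
    """Send payload, then wait for the receiver's confirmation.

    Returns True only if the peer answers with ACK. A peer that closes
    early, stays silent, or just discards the data yields False.
    """
    sock.settimeout(timeout)
    try:
        sock.sendall(payload)
        # Signal end-of-stream so the receiver knows we are done writing.
        sock.shutdown(socket.SHUT_WR)
        reply = b""
        while len(reply) < len(ACK):
            chunk = sock.recv(len(ACK) - len(reply))
            if not chunk:  # peer closed without confirming
                break
            reply += chunk
    except OSError:  # includes timeouts, broken pipe, connection reset
        return False
    return reply == ACK
```

The source side would destroy its copy of the domain only when this returns True. A listener that reads everything and then closes, as in the 'nc' experiment, produces False.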
Partly because of the lack of error reporting, and partly because the xend -> xc_save fork will make error reporting difficult to add, libvirt has a hard time displaying errors in this situation. It is quite possible to call virDomainMigrate and get a domain back which shortly afterwards "disappears", all without any indication of error.
We need to be careful about where we draw the line here. We can jump through all sorts of hoops in libvirt, but at the end of the day there is some majorly broken stuff in Xen that really just needs fixing rather than working around. I'd rather submit fixes to upstream Xen where needed to make migration more reliable than put too much complexity into libvirt, even if it means we have more limited error reporting for current XenD.

The mailing list archive link eludes me right now, but upstream Xen was reasonably receptive to the idea of bringing xc_save/restore back into the XenD process, which would resolve a huge class of error reporting problems. The original motivation for making them separate processes was that the code was fragile and crashed XenD a fair bit, but that's likely no longer a problem.

Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|