
On Tue, Sep 15, 2009 at 02:35:02PM +0200, Chris Lalancette wrote:
Daniel P. Berrange wrote:
The immediate use case for this data stream code is Chris' QEMU migration patchset.
The next use case is to allow serial console access to be tunnelled over libvirtd, e.g. to make 'virsh console GUEST' work remotely. This use case is why I included the support for non-blocking data streams and event loop integration (not required for Chris' migration use case).
Anyway, assuming Chris confirms that I've not broken his code, then patches 1-6 are targeted for this next release.
I'm sorry for the very long delay in getting back to this. I've been playing around with my tunnelled migration patches on top of this code, and I just can't seem to make the new nonblocking stuff work properly. I'm getting a couple of behaviors that are highly undesirable:
1) Immediately after starting the stream, I get a virStreamRecv() callback on the destination side. The problem is that this is wrong for migration; there's no data that I can read *from* the destination qemu process which makes any sense. While I could implement the method and just throw away the data, that doesn't seem right to me. This leads to...
I realize this is due to the remoteAddClientStream() method in qemud/stream.c. It unconditionally sets 'stream->tx' to 1. I didn't notice the problem myself, since the test driver is using pipes, which are unidirectional, but your UNIX domain socket is bi-directional. We could add a flag to remoteAddClientStream() to indicate whether the stream should allow read, write, or both. Or you might be able to call shutdown(sockfd, SHUT_RD) on your UNIX socket to indicate that it's only going to be used for write, effectively making it unidirectional. A 3rd option is to define more flags for virStreamNew(), one for READ, one for WRITE, and have the remote daemon pay attention to these.
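To make the shutdown(sockfd, SHUT_RD) option concrete, here is a minimal standalone sketch, not libvirt code. It uses a socketpair() in place of the real QEMU migration socket, and the helper names make_write_only() and demo_write_only_socket() are made up for illustration. It assumes Linux behaviour, where a read on the shut-down side reports EOF immediately while writes keep working. Whether the daemon's event loop then stops delivering read events for such a socket is not something this sketch shows.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of libvirt): disable the read half of a
 * connected socket so it is only usable for writing. Returns 0 or -1. */
static int make_write_only(int sockfd)
{
    return shutdown(sockfd, SHUT_RD);
}

/* Exercise the helper on a UNIX socketpair standing in for the
 * migration socket. Returns 0 if the socket behaves as write-only. */
static int demo_write_only_socket(void)
{
    int sv[2];
    char buf[16];
    ssize_t n;
    int ret = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    if (make_write_only(sv[0]) < 0)
        goto cleanup;

    /* With nothing queued, a read on the shut-down side returns
     * 0 (EOF) straight away instead of blocking (assumed: Linux). */
    n = read(sv[0], buf, sizeof(buf));
    if (n != 0)
        goto cleanup;

    /* The write direction is unaffected: data still reaches the peer. */
    if (write(sv[0], "data", 4) != 4)
        goto cleanup;
    n = read(sv[1], buf, sizeof(buf));
    if (n != 4 || memcmp(buf, "data", 4) != 0)
        goto cleanup;

    ret = 0;

cleanup:
    close(sv[0]);
    close(sv[1]);
    return ret;
}
```

In the migration case the equivalent of make_write_only() would be called once on the destination's UNIX socket right after it is connected. The flag-based options avoid relying on socket semantics at all, which may be the more robust choice.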
2) A crash in libvirtd on the source side of the destination. It doesn't happen every single time, but when it has happened I've traced it down to the fact that src/remote_internal.c:remoteDomainEventFired() can get called *after* conn->privateData has been set to NULL, leading to a SEGV on NULL pointer dereference. I can provide a core-dump for this if needed.
I don't have any explanation for this - it's a little weird and we ought to try and figure it out if possible.
3) (minor) The python bindings refuse to build with these patches in place.
Yeah, I completely forgot to add rules for virSecret APIs there.

Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|