[Libvir] PATCH: Process QEMU monitor at startup

With the current QEMU driver we simply spawn the QEMU process and return to the caller immediately. If the QEMU process dies immediately for some reason the user will never no - their domain will simply never appear. To address this, the attached patch will spawn the QEMU process and then immeditely process stderr to look for the monitor PTY - or EOF indicating failure. If it finds the monitor it will also try to connect to it - if this fails in any way then the domain is terminated with extreme prejudice - we absolutely need the monitor connection for other operations to work. This should at least cause a 'Domain startup failed' message to be returned to the user although more work is needed to try and extract a useful error message from the stuff written to stderr. Also in this patch I have reversed the order in which file descriptors are added to the poll array. Currently they are added server, then client, then VMs. The trouble with this is that when processing the server event, if a new client is added to the daemon then processing of subsequent FDs is out of sync. Likewise if a client message causes a new VM to startup then processing of subsquent VM FDs is out of sync. Reversing the order trivially solves it, because VM FDs s are now processed before client FDs, and thus any new VMs started by a client don't mess things up. Likewise client FDs are processed before server FDs. A more robust long term fix would be to carry around some extra metadata instead of relying on ordering when dispatching events. Finally this patch also fixes the 'resume' operation, and allows for the VM state to be reported correctly - ie we can now report 'paused' as well as shutdown/running. Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

Hi Dan, On Fri, 2007-03-02 at 14:35 +0000, Daniel P. Berrange wrote:
+static int qemudOpenMonitor(struct qemud_vm *vm, const char *monitor) {
#define MONITOR_POLL_TIMEOUT 5
+ int monfd; + char buffer[1024]; + int got = 0; + + if (!(monfd = open(monitor, O_RDWR))) { + return -1; + } + if (qemudSetCloseExec(monfd) < 0) + goto error; + if (qemudSetNonBlock(monfd) < 0) + goto error; + + /* Consume & discard the initial greeting */ + for(;;) { + int ret; + + ret = read(monfd, buffer+got, sizeof(buffer)-got-1);
What happens if the buffer fills up ?
+ if (ret == 0) + goto error; + if (ret < 0) { + struct pollfd fd = { .fd = monfd, .events = POLLIN }; + if (errno != EAGAIN && + errno != EINTR) + goto error; + + ret = poll(&fd, 1, 5);
ret = poll(&fd, 1, MONITOR_POLL_TIMEOUT); Giving up after 5 milliseconds? Are we that afraid of commitment?
+static int qemudWaitForMonitor(struct qemud_vm *vm) {
We don't set an error in here, so how about returning the appropriate errno from the function?
+ char buffer[1024]; /* Plenty of space to get startup greeting */ + int got = 0; + + for (;;) { + int ret; + + ret = read(vm->stderr, buffer+got, sizeof(buffer)-got-1); + if (ret == 0) { + return -1; + } + if (ret < 0) { + struct pollfd fd = { .fd = vm->stderr, .events = POLLIN }; + if (errno != EAGAIN && + errno != EINTR) { + return -1; + } + + ret = poll(&fd, 1, 5000);
Again, a #define for the timeout would be nice
+ if (ret == 0) { + return -1; + } else if (ret < 0) { + if (errno != EINTR) + return -1; + } else if (fd.revents != POLLIN) { + return -1; + } + } else { + char monitor[100]; + char *tmp = buffer; + got += ret; + buffer[got] = '\0'; + while (tmp && *tmp) { + if (sscanf(tmp, "char device redirected to %19s", monitor) == 1) {
Path length of 100, maximum field with of 19? That's all rather voodooish ... hope about strstr() to find it, buffer length of PATH_MAX and then just strncpy()? Also, it'd be nice to split that out into e.g. qemudMonitorPathFromStr(buffer, monitorPath, PATH_MAX)
+ if (qemudOpenMonitor(vm, monitor) < 0) + return -1; + return 0; + } + tmp = index(tmp, '\n');
index() is a bit odd, why not strstr() ?
+static int qemudNextFreeVNCPort(struct qemud_server *server ATTRIBUTE_UNUSED) {
I don't really know the context, but it'd be much nicer if we could just QEMU find a free port itself and then we query the port - e.g. the obvious race condition.
+ int i; + + for (i = 5900 ; i < 6000 ; i++) { + int fd; + struct sockaddr_in addr; + addr.sin_family = AF_INET; + addr.sin_port = htons(i); + addr.sin_addr.s_addr = htonl(INADDR_ANY); + fd = socket(PF_INET, SOCK_STREAM, 0); + if (fd < 0) + return -1; + + if (connect(fd, (struct sockaddr*)&addr, sizeof(addr)) == 0)
Um, why not bind() (with SO_REUSEADDR)?
ret = 0; + + if (qemudWaitForMonitor(vm) < 0) {
Set an error based on the returned errno ...
struct qemud_vm *next = vm->next; @@ -1356,6 +1426,7 @@ static int qemudDispatchPoll(struct qemu if (stderrfd != -1) { if (!failed) { if (fds[fd].revents) { + printf("%d\n", fds[fd].revents);
Kill this Cheers, Mark.

On Fri, Mar 02, 2007 at 03:06:56PM +0000, Mark McLoughlin wrote:
Hi Dan,
On Fri, 2007-03-02 at 14:35 +0000, Daniel P. Berrange wrote:
+static int qemudOpenMonitor(struct qemud_vm *vm, const char *monitor) {
#define MONITOR_POLL_TIMEOUT 5
+ int monfd; + char buffer[1024]; + int got = 0; + + if (!(monfd = open(monitor, O_RDWR))) { + return -1; + } + if (qemudSetCloseExec(monfd) < 0) + goto error; + if (qemudSetNonBlock(monfd) < 0) + goto error; + + /* Consume & discard the initial greeting */ + for(;;) { + int ret; + + ret = read(monfd, buffer+got, sizeof(buffer)-got-1);
What happens if the buffer fills up ?
Sorry, wrong version of the patch - the for(;;) should have been while (got < (sizeof(buffer)-1)) { I'll repost the correct patch when I've done of the other fixes.
+ if (ret == 0) + goto error; + if (ret < 0) { + struct pollfd fd = { .fd = monfd, .events = POLLIN }; + if (errno != EAGAIN && + errno != EINTR) + goto error; + + ret = poll(&fd, 1, 5);
ret = poll(&fd, 1, MONITOR_POLL_TIMEOUT);
Giving up after 5 milliseconds? Are we that afraid of commitment?
Again, wrong patch - it should have been 5 seconds.
+static int qemudWaitForMonitor(struct qemud_vm *vm) {
We don't set an error in here, so how about returning the appropriate errno from the function?
I'll add in some error reporting.
+ if (sscanf(tmp, "char device redirected to %19s", monitor) == 1) {
Path length of 100, maximum field with of 19? That's all rather voodooish ... hope about strstr() to find it, buffer length of PATH_MAX and then just strncpy()? Also, it'd be nice to split that out into e.g. qemudMonitorPathFromStr(buffer, monitorPath, PATH_MAX)
I'll have a go at splitting it out.
+ if (qemudOpenMonitor(vm, monitor) < 0) + return -1; + return 0; + } + tmp = index(tmp, '\n');
index() is a bit odd, why not strstr() ?
Well we're only looking for a single character match - index() is for single characters, strstr() is for strings.
+static int qemudNextFreeVNCPort(struct qemud_server *server ATTRIBUTE_UNUSED) {
I don't really know the context, but it'd be much nicer if we could just QEMU find a free port itself and then we query the port - e.g. the obvious race condition.
Currently QEMU can't auto-allocate itself a patch - I'm planning to port the Xen auto-allocation patch to mainline QEMU. Until then, this code is better than the the current situation, even though there is an obvious race.
+ if (connect(fd, (struct sockaddr*)&addr, sizeof(addr)) == 0)
Um, why not bind() (with SO_REUSEADDR)?
No particular reason - either will work just fine. Regards, Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

On Fri, 2007-03-02 at 15:34 +0000, Daniel P. Berrange wrote:
On Fri, Mar 02, 2007 at 03:06:56PM +0000, Mark McLoughlin wrote:
index() is a bit odd, why not strstr() ?
Well we're only looking for a single character match - index() is for single characters, strstr() is for strings.
Sorry, strchr() (which is in the C standard itself, whereas index() is marked legacy in POSIX ... apparently)
+ if (connect(fd, (struct sockaddr*)&addr, sizeof(addr)) == 0)
Um, why not bind() (with SO_REUSEADDR)?
No particular reason - either will work just fine.
I've never noticed it done with connect() anywhere else, I guess because you'd have the same obvious race ... I dunno, bind() will fail quickly when we probe a port, whereas connect() will succeed more slowly and cause another context switch etc. Not a big deal, but ... Cheers, Mark.

On Fri, Mar 02, 2007 at 03:34:33PM +0000, Daniel P. Berrange wrote:
On Fri, Mar 02, 2007 at 03:06:56PM +0000, Mark McLoughlin wrote:
+static int qemudWaitForMonitor(struct qemud_vm *vm) {
We don't set an error in here, so how about returning the appropriate errno from the function?
I'll add in some error reporting.
We now get errors back like: $ ./virsh --connect qemu:///session start Fedora libvir: QEMU error : internal error Domain shutdown before sending monitor PTY path error: Failed to start domain Fedora
+ if (sscanf(tmp, "char device redirected to %19s", monitor) == 1) {
Path length of 100, maximum field with of 19? That's all rather voodooish ... hope about strstr() to find it, buffer length of PATH_MAX and then just strncpy()? Also, it'd be nice to split that out into e.g. qemudMonitorPathFromStr(buffer, monitorPath, PATH_MAX)
I'll have a go at splitting it out.
This is now done.
index() is a bit odd, why not strstr() ?
Well we're only looking for a single character match - index() is for single characters, strstr() is for strings.
Changed to use strchr() - although there's several other bits in existing code which already using 'index()' which also need updating.
+ if (connect(fd, (struct sockaddr*)&addr, sizeof(addr)) == 0)
Um, why not bind() (with SO_REUSEADDR)?
No particular reason - either will work just fine.
Actually one case where bind() is better is if the server is not accepting new connections - connect() will block indefinitely unless the server actually promptly accepts them. So I've changed to bind() + REUSEADDR. Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

Hi Dan, Looks fine to commit, but how about re-factoring out the read/poll/check loop? Suggested, but untested, patch attached. Cheers, Mark.

On Mon, Mar 05, 2007 at 09:10:03AM +0000, Mark McLoughlin wrote:
Hi Dan, Looks fine to commit, but how about re-factoring out the read/poll/check loop? Suggested, but untested, patch attached.
I commited the patch with your suggested re-factoring applied too - I made one tiny change to allow us to distinguish errors reading the STDERR from errors reading the monitor PTY data. Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

On Fri, 2007-03-02 at 16:47 +0000, Daniel P. Berrange wrote:
@@ -432,7 +441,7 @@ int qemudDomainDumpXML(struct qemud_serv strncpy(xml, vmxml, xmllen); xml[xmllen-1] = '\0';
- free(xml); + free(vmxml);
I hit this myself, so I committed the fix along with the same fix for qemudNetworkDumpXML() Cheers, Mark.

Hi Dan, On the topic of error reporting, I just noticed that without QEMU installed you just get: libvirt.libvirtError: virDomainCreateLinux() failed (Haven't looked into it myself) Cheers, Mark.

On Mon, Mar 05, 2007 at 02:18:20PM +0000, Mark McLoughlin wrote:
Hi Dan, On the topic of error reporting, I just noticed that without QEMU installed you just get:
libvirt.libvirtError: virDomainCreateLinux() failed
(Haven't looked into it myself)
There is an 'XXX' in the method which turns the QEMU command name into the full pathname to exec. It needs to actually look for it in $PATH instead of using /usr, and it once its doing that, the method can actually return NULL and thus we can provide a useful message that QEMU is not installed. Shouldn't be much work to fix it - I'll look at it later until you beat me to it Regards, Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
participants (2)
-
Daniel P. Berrange
-
Mark McLoughlin