
On Sat, Feb 07, 2009 at 06:05:51PM +0100, Guido G?nther wrote:
On Fri, Feb 06, 2009 at 04:48:09PM +0000, Daniel P. Berrange wrote:
That will cause libvirtd to spin 100% cpu forever, if a guest fails to start up. eg disk to mis-configured disk
The core problem here, is that ret == 0 has 2 possible implications
- QEMU has exited, and no more data will be written - QEMU is still starting up, and we have read all the data written so far, but more may arrive soon.
The current code there is correct for the first scenario, but even removing it, is not entirely correct for the 2nd scenario. If we hit ret == 0, and QEMU is still running, we shouldn't spin 100% CPU in read - we should poll() to wait for more data.
As a quick fix though, we could probably detect whether QEMU has exited by doing 'kill(vm->pid, 0)' and check for errno == ESRCH - indicates that the process no longer exists.
eg,
} else if (ret == 0) { if (kill(vm->pid, 0) == -1) { if (errno == ERSCH) return 0; else return -1; } else { continue; /* should really go into poll() at this point */ } } else { The problem is that we're using this function for two purposes: To read from a logfile where poll()'ing on POLLIN always returns "data readable" which leaves us spinning (and EOF might get in our way) and also using it on the monitor PTY itself where it works as expected. Why not simply introduce qemudReadLogOutput? This at least allows us to usleep while waiting for the log file to fill up with data. I couldn't make libvirtd spin with this patch anymore. O.k. to apply?
Ah of course, I forgot poll() isn't particularly useful on regular files. For a non-spinning version we'd need to integrate with inotify to monitor changes. ACK to your proposed patch for now. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|