[Libvir] [QEMU 3/3] the qemu driver & daemon

Attached is the patch for the qemu driver & daemon process Since there is no scalable way to determine active standalone QEMU instances, or access their monitor connection, the backend implements a management daemon - called 'qemud' - which takes care of managing the complete lifecycle of QEMU virtual machines. The libvirt driver is a thin shim which talks to qemud via a UNIX domain socket - the wire protocol is modelled after the protocol used for libvirt_proxy, but obviously contains alot more functions since its full read/write, not just read-only. There is intended to be one qemud instance per UNIX user - the so called 'session' instance, and optionally one system global instance runing as root (or a system daemon account?) - the so called 'system' instance. When using libvirt, the URIs for each are 'qemu:///session' and 'qemu:///system' respectively. The UNIX socket for session instances is $HOME/.qemud and is chmod 0700 - no support is provided for users connecting to each other's instances. Although we could trivially extend to create a read-only socket $HOME/.qemud-ro if desired. The system instance is currently not finished but intended to run in /var/run/qemud or some such. The backend code is in src/qemu_internal.h, src/qemu_internal.c, the daemon code is in qemud/* with qemud/protocol.h defining the network wire protocol. This latter header file is also included by src/qemu_internal.c - qemud.c - the main daemon management code - handles all network I/O and forking / cleanup of QEMU processes. The implementation is poll() based & completely non-blocking. All requests/replies from a single client are, however, serialized. - dispatch.c - takes care of serializing/de-serializing requests/replies from/to libvirt - called from qemud.c when a complete message is available. - driver.c - the implementation of each request from libvirt - called from dispatch.c - config.c - manages reading & writing of XML files for inactive domains in the $HOME/.qemud.d/*.xml. Also takes care of generating a suitable command line for execing qemu. In terms of hardware, there is support for - harddisk - floppy, cdrom, harddisk - graphics - VNC - memory - virtual CPUs In particular no support for - networking - sound - non-native CPU types - boot order But its enough to demonstrate the practice. Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

On Mon, Aug 28, 2006 at 12:18:52AM +0100, Daniel P. Berrange wrote:
Attached is the patch for the qemu driver & daemon process
Cool pathes! On first look (I hope I'm not too pedantic:-):
+static int qemudParseUUID(const char *uuid, + unsigned char *rawuuid) {
We have virParseUUID() in xml.c, I think we should maintain more than one version of same code.
+ if (!(*argv = malloc(sizeof(char *) * *argc))) + return -1;
Please, virQemuError(VIR_ERR_NO_MEMORY ....) (same or calloc())
+ printf("Load VM %s\n", file); + if (!(xml = xmlReadDoc(BAD_CAST doc, file ? file : "domain.xml", NULL, + XML_PARSE_NOENT | XML_PARSE_NONET | + XML_PARSE_NOERROR | XML_PARSE_NOWARNING))) { + printf("malformed\n"); + return NULL; + }
printf()... it seems you forgot there your debug messages ;-)
+static void qemudLoadConfig(struct qemud_server *server, + const char *file) { + FILE *fh; + struct stat st; + struct qemud_vm *vm; + char xml[QEMUD_MAX_XML_LEN]; + int ret; + + if (!(fh = fopen(file, "r"))) { + return; + }
No error message?
+static +int qemudBufferAdd(struct qemudBuffer *buf, const char *str) { + int need = strlen(str); + +static +int qemudBufferPrintf(struct qemudBuffer *buf, + const char *format, ...) {
Duplicate code?
+ if (qemudBufferPrintf(&buf, " <uuid>%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-%02x%02x%02x%02x%02x%02x</uuid>\n", + uuid[0], uuid[1], uuid[2], uuid[3], + uuid[4], uuid[5], uuid[6], uuid[7], + uuid[8], uuid[9], uuid[10], uuid[11], + uuid[12], uuid[13], uuid[14], uuid[15]) < 0)
$ grep "<uuid>" * test.c: " <uuid>%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x</uuid>\n", xend_internal.c: virBufferVSprintf(&buf, " <uuid>%s</uuid>\n", compact); xml.c:" <uuid>%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x</uuid>\n", UUID.. to be with '-' or not to be with '-'.. that's the question. It also seems there is again duplicate code do in test.c and xml.c.
+ qemudBufferPrintf(&buf, " <graphics type='vnc'/>\n");
no format args --> BufferAdd() Karel -- Karel Zak <kzak@redhat.com>

On Mon, Aug 28, 2006 at 10:06:03AM +0200, Karel Zak wrote:
On Mon, Aug 28, 2006 at 12:18:52AM +0100, Daniel P. Berrange wrote:
Attached is the patch for the qemu driver & daemon process
Cool pathes!
On first look (I hope I'm not too pedantic:-):
+static int qemudParseUUID(const char *uuid, + unsigned char *rawuuid) {
We have virParseUUID() in xml.c, I think we should maintain more than one version of same code.
Yes, be nice to re-factor helper methods into a dir which can be shared between the daemon & libvirt
+ if (!(*argv = malloc(sizeof(char *) * *argc))) + return -1;
Please, virQemuError(VIR_ERR_NO_MEMORY ....) (same or calloc())
I forgot to mention in my posting - I simply return a generic -1, internal error everywhere in the code. I'm planning to go through it again to add in explicit, more useful error codes.
+ printf("Load VM %s\n", file); + if (!(xml = xmlReadDoc(BAD_CAST doc, file ? file : "domain.xml", NULL, + XML_PARSE_NOENT | XML_PARSE_NONET | + XML_PARSE_NOERROR | XML_PARSE_NOWARNING))) { + printf("malformed\n"); + return NULL; + }
printf()... it seems you forgot there your debug messages ;-)
yes, quite a few of those need taking out - to be done at same time as error codes are added
+static void qemudLoadConfig(struct qemud_server *server, + const char *file) { + FILE *fh; + struct stat st; + struct qemud_vm *vm; + char xml[QEMUD_MAX_XML_LEN]; + int ret; + + if (!(fh = fopen(file, "r"))) { + return; + }
No error message?
This is run during initialization when it scans all config files in user's $HOME/.qemu.d directory. Since the daemon is already running in the background STDOUT & STDERR are already pointing to /dev/null. Of course during testing I run it in foreground (hence the printf()'s, but ordinarily we wouldn't see this. I think we need some very basically logging code to output issues like this to $HOME/.qemud.d/daemon.log
+static +int qemudBufferAdd(struct qemudBuffer *buf, const char *str) { + int need = strlen(str); + +static +int qemudBufferPrintf(struct qemudBuffer *buf, + const char *format, ...) {
Duplicate code?
Yes - needs re-factoring.
+ if (qemudBufferPrintf(&buf, " <uuid>%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-%02x%02x%02x%02x%02x%02x</uuid>\n", + uuid[0], uuid[1], uuid[2], uuid[3], + uuid[4], uuid[5], uuid[6], uuid[7], + uuid[8], uuid[9], uuid[10], uuid[11], + uuid[12], uuid[13], uuid[14], uuid[15]) < 0)
It also seems there is again duplicate code do in test.c and xml.c.
Yes - needs re-factoring.
+ qemudBufferPrintf(&buf, " <graphics type='vnc'/>\n");
no format args --> BufferAdd()
Opps, misssed that! Dan -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

On Mon, Aug 28, 2006 at 12:18:52AM +0100, Daniel P. Berrange wrote:
Attached is the patch for the qemu driver & daemon process
Since there is no scalable way to determine active standalone QEMU instances, or access their monitor connection, the backend implements a management daemon - called 'qemud' - which takes care of managing the complete lifecycle of QEMU virtual machines. The libvirt driver is a thin shim which talks to qemud via a UNIX domain socket - the wire protocol is modelled after the protocol used for libvirt_proxy, but obviously contains alot more functions since its full read/write, not just read-only.
Okay, it would have been nicer if QEmu opened a socket upon launching a new instance, but currently that sounds the only solution.
There is intended to be one qemud instance per UNIX user - the so called 'session' instance, and optionally one system global instance runing as root (or a system daemon account?) - the so called 'system' instance. When using libvirt, the URIs for each are 'qemu:///session' and 'qemu:///system' respectively.
I'm wondering what would be the role of the system instance. Aggregate the informations from the set of users running qemu when running as root ?
The UNIX socket for session instances is $HOME/.qemud and is chmod 0700 - no support is provided for users connecting to each other's instances. Although we could trivially extend to create a read-only socket $HOME/.qemud-ro if desired. The system instance is currently not finished but intended to run in /var/run/qemud or some such.
hum .... I'm not too fond of putting the socket directly in the user home directory, I would feel better with a .qemud directory, protected 700 by default and then have the socket mapped in that directory, I think it is more secure, a bit cleaner to not have socket in home dir, and gives a place to store informations if needed. Actually you already create .qemud.d for storage, can we just make it .qemud and store the socket there ? Or we can use a socket not mapped in the filesystem at all, based on the user name, which usually avoid the race when checking and then opening.
The backend code is in src/qemu_internal.h, src/qemu_internal.c, the daemon code is in qemud/* with qemud/protocol.h defining the network wire protocol. This latter header file is also included by src/qemu_internal.c
- qemud.c - the main daemon management code - handles all network I/O and forking / cleanup of QEMU processes. The implementation is poll() based & completely non-blocking. All requests/replies from a single client are, however, serialized. - dispatch.c - takes care of serializing/de-serializing requests/replies from/to libvirt - called from qemud.c when a complete message is available. - driver.c - the implementation of each request from libvirt - called from dispatch.c - config.c - manages reading & writing of XML files for inactive domains in the $HOME/.qemud.d/*.xml. Also takes care of generating a suitable command line for execing qemu.
In terms of hardware, there is support for
- harddisk - floppy, cdrom, harddisk - graphics - VNC - memory - virtual CPUs
In particular no support for
- networking - sound - non-native CPU types - boot order
But its enough to demonstrate the practice.
Well that looks excellent :-) , of course networking will be needed to really make use of it but this is fantastic progress !
+static int qemudParseUUID(const char *uuid, + unsigned char *rawuuid) {
Should we make a lib subdir to avoid duplicating code like this ? Alone that doesn't sound worth it, but if there is more routines duplicated could be useful.
+ //virXMLError(VIR_ERR_NO_SOURCE, (const char *) target, 0); + //virXMLError(VIR_ERR_NO_TARGET, (const char *) source, 0); + printf("Bogus floppy\n"); + printf("Bogus cdomr\n"); + printf("alse %s %s\n", device, target);
Need to unify and make the usual wrappers for error code. Even if not linked in the libvirt lib, so that error messages could be carried back at the protocol level, if we put much logic in the qemud we will need some logging facilities, or transport errors, right ?
+ obj = xmlXPathEval(BAD_CAST "/domain/devices/graphics", ctxt); + if ((obj == NULL) || (obj->type != XPATH_NODESET) || + (obj->nodesetval == NULL) || (obj->nodesetval->nodeNr == 0)) { + vm->def.graphicsType = QEMUD_GRAPHICS_NONE; + } else { + prop = xmlGetProp(obj->nodesetval->nodeTab[0], BAD_CAST "type"); + if (!strcmp((char *)prop, "vnc")) { + vm->def.graphicsType = QEMUD_GRAPHICS_VNC; + prop = xmlGetProp(obj->nodesetval->nodeTab[0], BAD_CAST "port"); + if (prop) { + conv = NULL; + vm->def.vncPort = strtoll((const char*)prop, &conv, 10); + } else { + vm->def.vncPort = 0; + } + } + }
Hum ... if we have no graphic device what's going on ? the QEmu process is forked by a daemon, so there is no way to get back to the console... The console access is something we didn't tried to address so far in libvirt. At least some console logs would be useful, otherwise we have the problem of hitting asynchronous interface is we really want to give interactive access... Well we can probably live without :-)
+static int qemudSaveConfig(struct qemud_server *server ATTRIBUTE_UNUSED, + struct qemud_vm *vm) { + char *xml; + int fd, ret = -1; + int towrite; + struct stat sb; + + if (!(xml = qemudGenerateXML(vm))) { + return -1; + } + + printf("Save VM %s\n", vm->configFile); + + if (stat(server->configDir, &sb) < 0) { + if (errno == ENOENT) { + if (mkdir(server->configDir, 0700) < 0) + return -1; + } else { + return -1; + } + } else if (!S_ISDIR(sb.st_mode)) { + return -1; + }
Shouldn't we check the mode and owner if the directory already exists ? that should be in the stat informations.
+ if (qemudFindVMByUUID(server, vm->def.uuid)) { + qemudFreeVM(vm); + printf("Duplicate domains, discarding %s\n", file); + return NULL; + }
UUID clashes due to users manually copying files may be more frequent than expected. Cloning is gonna be fun in general ...
+ if (qemudFindVMByName(server, vm->def.name)) { + qemudFreeVM(vm); + printf("Duplicate domains, discarding %s\n", file); + return NULL; + } [...] +int qemudScanConfigs(struct qemud_server *server) {
2 things here: 1/ caching if the modification timestamps of the directory and file didn't changed. I don't know how often the scanning will be done but it's not lightweight since it reparses everything 2/ name and uuid clashes could be detected here or when adding some. I scanned quickly the protocol part but I need to get my head around it :-) That's a lot of code ! One think I would like is to try to get all commenting at the same level, i.e. having every function carrying a comment header, That's something I will help with, maybe as a side effect of reviewing the code. Tell me how we could do that ? Maybe I could start with the simplest code from qemud.
-#define MAX_DRIVERS 5 +#define MAX_DRIVERS 10
I didn't expect that initially :-) Fantastic but that's a big chunk of code ! Daniel -- Red Hat Virtualization group http://redhat.com/virtualization/ Daniel Veillard | virtualization library http://libvirt.org/ veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

On Mon, Aug 28, 2006 at 04:55:09AM -0400, Daniel Veillard wrote:
On Mon, Aug 28, 2006 at 12:18:52AM +0100, Daniel P. Berrange wrote:
There is intended to be one qemud instance per UNIX user - the so called 'session' instance, and optionally one system global instance runing as root (or a system daemon account?) - the so called 'system' instance. When using libvirt, the URIs for each are 'qemu:///session' and 'qemu:///system' respectively.
I'm wondering what would be the role of the system instance. Aggregate the informations from the set of users running qemu when running as root ?
Well, the session instance is intended to fill the desktop user use-case, while the system instance is more aimed at a data-center use case where an admin might set up a bunch of persistently running QEMU instances on a server.
The UNIX socket for session instances is $HOME/.qemud and is chmod 0700 - no support is provided for users connecting to each other's instances. Although we could trivially extend to create a read-only socket $HOME/.qemud-ro if desired. The system instance is currently not finished but intended to run in /var/run/qemud or some such.
hum .... I'm not too fond of putting the socket directly in the user home directory, I would feel better with a .qemud directory, protected 700 by default and then have the socket mapped in that directory, I think it is more secure, a bit cleaner to not have socket in home dir, and gives a place to store informations if needed. Actually you already create .qemud.d for storage, can we just make it .qemud and store the socket there ?
Or we can use a socket not mapped in the filesystem at all, based on the user name, which usually avoid the race when checking and then opening.
Actually, the socket is already in the abstract namespace - i use the abstract path '\0$HOME/.qemud' - could easily put it in a dir below though
+static int qemudParseUUID(const char *uuid, + unsigned char *rawuuid) {
Should we make a lib subdir to avoid duplicating code like this ? Alone that doesn't sound worth it, but if there is more routines duplicated could be useful.
Ues, as karel pointed out, there are a number of utility functions from libvirt that I duplicated in the qemud/ directory primarily because I wanted to avoid the circular depedancy back onto libvirt from the daemon. I think it would be worthwhile having the various helper routines in some lib that we can then pull into both libvirt & the daemon as needed.
+ //virXMLError(VIR_ERR_NO_SOURCE, (const char *) target, 0); + //virXMLError(VIR_ERR_NO_TARGET, (const char *) source, 0); + printf("Bogus floppy\n"); + printf("Bogus cdomr\n"); + printf("alse %s %s\n", device, target);
Need to unify and make the usual wrappers for error code. Even if not linked in the libvirt lib, so that error messages could be carried back at the protocol level, if we put much logic in the qemud we will need some logging facilities, or transport errors, right ?
Yes, I need to rip out the debug code & formalize the set of error codes which can be returned to libvirt. Currently I catch every error and just return a generic -1, internal error. Obviously this needs to be better before the patch is merged.
+ obj = xmlXPathEval(BAD_CAST "/domain/devices/graphics", ctxt); + if ((obj == NULL) || (obj->type != XPATH_NODESET) || + (obj->nodesetval == NULL) || (obj->nodesetval->nodeNr == 0)) { + vm->def.graphicsType = QEMUD_GRAPHICS_NONE; + } else { + prop = xmlGetProp(obj->nodesetval->nodeTab[0], BAD_CAST "type"); + if (!strcmp((char *)prop, "vnc")) { + vm->def.graphicsType = QEMUD_GRAPHICS_VNC; + prop = xmlGetProp(obj->nodesetval->nodeTab[0], BAD_CAST "port"); + if (prop) { + conv = NULL; + vm->def.vncPort = strtoll((const char*)prop, &conv, 10); + } else { + vm->def.vncPort = 0; + } + } + }
Hum ... if we have no graphic device what's going on ? the QEmu process is forked by a daemon, so there is no way to get back to the console... The console access is something we didn't tried to address so far in libvirt. At least some console logs would be useful, otherwise we have the problem of hitting asynchronous interface is we really want to give interactive access... Well we can probably live without :-)
There's a couple of different ways this can work - the VNC console, or we can hook up a serial device to the guest (and assume user puts appropriate kernl boot command line args in 'console=ttyS0'. I also expect to save the stderr and stdout from QEMU to a log file so user can see if stuff goes wrong. I expect VNC is the primary console though.
+ if (stat(server->configDir, &sb) < 0) { + if (errno == ENOENT) { + if (mkdir(server->configDir, 0700) < 0) + return -1; + } else { + return -1; + } + } else if (!S_ISDIR(sb.st_mode)) { + return -1; + }
Shouldn't we check the mode and owner if the directory already exists ? that should be in the stat informations.
Yes, worthwhile doing.
+ if (qemudFindVMByUUID(server, vm->def.uuid)) { + qemudFreeVM(vm); + printf("Duplicate domains, discarding %s\n", file); + return NULL; + }
UUID clashes due to users manually copying files may be more frequent than expected. Cloning is gonna be fun in general ...
Yes, I'm not entirely sure what we can do there - either document - "do not do that", or automatically re-generate the UUID if they have a duplicate (re-saving the cofig file to disk with the new UUID). I believe VMWare automatically re-generates the UUID if you manually copy a file, so its a reasonable approach to take. Could do the same with the Name if neccessary - just append '-1', '-2', or something on to the end of it.
+ if (qemudFindVMByName(server, vm->def.name)) { + qemudFreeVM(vm); + printf("Duplicate domains, discarding %s\n", file); + return NULL; + } [...] +int qemudScanConfigs(struct qemud_server *server) {
2 things here: 1/ caching if the modification timestamps of the directory and file didn't changed. I don't know how often the scanning will be done but it's not lightweight since it reparses everything 2/ name and uuid clashes could be detected here or when adding some.
If we wanted to be really clever we could use INotify to automatically re-scan. A simple approach though would just re-scan the directory comparing timestamps every minute or so & re-loading configs which have changed.
I scanned quickly the protocol part but I need to get my head around it :-)
There's a couple of important assumptions I apply in the network code which I should document further to help explain the way it works.
That's a lot of code ! One think I would like is to try to get all commenting at the same level, i.e. having every function carrying a comment header, That's something I will help with, maybe as a side effect of reviewing the code. Tell me how we could do that ? Maybe I could start with the simplest code from qemud.
-#define MAX_DRIVERS 5 +#define MAX_DRIVERS 10
I didn't expect that initially :-)
Hehe - took me a little while to track down that issue too - I just could not work out why nothing was working at first :-) Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

On Mon, Aug 28, 2006 at 04:55:09AM -0400, Daniel Veillard wrote:
I scanned quickly the protocol part but I need to get my head around it :-) That's a lot of code ! One think I would like is to try to get all commenting at the same level, i.e. having every function carrying a comment header, That's something I will help with, maybe as a side effect of reviewing the code. Tell me how we could do that ? Maybe I could start with the simplest code from qemud.
I'm attaching an updated patch which has many many more comments in it, and also has much better error reporting - nearly all errors are propagated back to libvirt with formal codes & messages. Other changes previously suggested (re-factoring shared code, etc) are yet to be done. Dan -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
participants (3)
-
Daniel P. Berrange
-
Daniel Veillard
-
Karel Zak