[libvirt] My beefs with the libvirt XML configuration format

Hello,

In a recent post on my blog [1] I ranted about libvirt, and in particular I complained that the configuration files look like what I call “almost XML”. There are multiple reasons why I say that; let me try to explain some.

In the configuration files, at least those created by virt-manager, there is no specification of what the file should be (no document type, no namespace, and, IMHO, a too generic root element name). Given that some kind of distinction is needed for software like Emacs's nxml-mode to know how to deal with the file, I think that's pretty bad for interaction between different applications. While libvirt knows perfectly well what it's dealing with, other packages might not. It might not sound like a major issue, but it starts tickling my senses when this happens.

The configuration seems somewhat contrived in places like the disk configuration: if the disk is file-backed it requires the file attribute on the <source> element, while it needs the dev attribute if it's a block device. Given that it's a path in both cases, it would have been easier on the user if a single path attribute was used. But this is debatable.

The third problem I called out in the blog is the lack of a schema for the files; Daniel corrected me, pointing out that the schemas are distributed with the sources and installed. Sure thing, I was wrong. On the other hand, I maintain that there are problems with those schemas. The first is that both the version distributed with 0.7.4 and the git version as of today suffer from bug #546254 [2] (secret.rng not being well formed), which means nobody has tested them lately. Then there is the fact that they are never referenced by the human-readable documentation [3], which is why I didn't find them the first time around. Add to that some contrived syntax in those schemas that causes trang to produce a non-valid rnc file out of them (nxml-mode uses rnc rather than rng).

But I guess the one big problem with the schemas is that they don't seem to properly encode what the human-readable documentation says, or what virt-manager does. For instance (please follow me with selector-like syntax), virt-manager creates /domain/os/type[@machine='pc-0.11'] in the created XML; the same attribute seems to be documented: “There are also two optional attributes, arch specifying the CPU architecture to virtualization, and machine referring to the machine type”. The schema does not seem to accept that attribute though (“element type: Relax-NG validity error : Invalid attribute machine for element type” with xmllint; just to make sure that it's not a bug in any other piece of software, this is Daniel's libxml2).

Now, after voicing my opinions here, as Daniel dared me to do, I'd like to explain for a second why I didn't post this on the list in the first place: of what I wrote here, my beefs for calling this aXML, the only things that can be solved easily are the schemas; schemas that, at the time I wrote the blog post, I was unable to find. The syntax, and the lack of a “safe” identification of the files as libvirt's, are the kind of legacy problems one has to live with to avoid wasting users' time with migrations and corrections, so I don't really think they should be addressed unless a redesign of the configuration is intended.

Just my two cents, you're free to take them as you wish. I cannot boast a curriculum like Daniel's, but I don't think I'm stepping out of place to point out these things.

[1] http://blog.flameeyes.eu/2009/12/07/i-know-you-missed-them-virtualisation-ra...
[2] https://bugzilla.redhat.com/show_bug.cgi?id=546254
[3] http://libvirt.org/format.html

-- Diego Elio Pettenò - “Flameeyes” http://blog.flameeyes.eu/ If you found a .asc file in this mail and know not what it is, it's a GnuPG digital signature: http://www.gnupg.org/

On Thu, Dec 10, 2009 at 04:51:54PM +0100, Diego Elio “Flameeyes” Pettenò wrote:
Hello,
In a recent post on my blog [1] I ranted on about libvirt and in particular I complained that the configuration files look like what I call “almost XML”. The reasons why I say that are multiple, let me try to explain some.
In the configuration files, at least those created by virt-manager there is no specification of what the file should be (no document type, no namespace, and, IMHO, a too generic root element name); given that some
Okay, I have been in the standardization space for XML for a decade now, so I have an opinion on those issues :-)
Document type: DTDs are obsolete, and the level of checking they provide is really not useful.
Namespace: in this case it would be cumbersome. Namespaces are useful when you want to mix vocabularies coming from different places. In our use case I don't see any need to mix; the format is driven by an application. The major drawback, had we used namespaces, even a default one, is that all the XPath expressions and queries we do in the code would then need a namespace registration and a prefix. That is painful for no gain in our case.
Generic element name: well, we used what was the most sensible based on the vocabulary used in the goal documentation, http://libvirt.org/goals.html
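The cost of a default namespace on path queries can be sketched as follows. This is only an analogy using Python's standard xml.etree.ElementTree, not libvirt's own C code, and the namespace URI is made up for illustration (libvirt domain XML declares none).

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, invented for this sketch only.
NS = "http://example.org/hypothetical-libvirt-domain"

PLAIN = "<domain type='kvm'><os><type arch='x86_64'>hvm</type></os></domain>"
NAMESPACED = (
    f"<domain xmlns='{NS}' type='kvm'>"
    "<os><type arch='x86_64'>hvm</type></os></domain>"
)


def os_type_plain(xml_text):
    """Query without any namespace handling, as is possible today."""
    node = ET.fromstring(xml_text).find("./os/type")
    return None if node is None else node.text


def os_type_namespaced(xml_text):
    """Same query once a default namespace exists: a prefix must be
    registered and repeated on every step of the path."""
    node = ET.fromstring(xml_text).find("./d:os/d:type", {"d": NS})
    return None if node is None else node.text


print(os_type_plain(PLAIN))            # unprefixed path matches
print(os_type_plain(NAMESPACED))       # unprefixed path no longer matches
print(os_type_namespaced(NAMESPACED))  # prefixed path matches again
```

Every existing query would need the same treatment as os_type_namespaced, which is the kind of churn described above.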
kind of distinction is needed for software like Emacs's nxml-mode to know how to deal with the file, I think that's pretty bad for interaction between different applications. While libvirt knows
Your viewpoint is that users should edit the XML. My viewpoint is that software should do this, basically in normal use of libvirt nobody should have to look at the XML, the virt-viewer/virt-install etc... tools should generate and handle those for you. It happens that one may have to tweak something like a pathname in a definition or something but whatever the level of schemas available it won't help for that kind of tweaks. Things like changing the ethernet type adapter should be a pull down list in a gui like virt-manager, not loading the saved xml in emacs, finding the associated relax-ng, finding the place where it's defined and hoping emacs will get a list to pick from, which even if you did it right might just not work because the domain is running and your change would be discarded on restart... oops.
perfectly well what it's dealing with, other packages might not. Might not sound a major issue but it starts tickling my senses when this happens.
The configuration seem somewhat contrived in places like the disk configuration: if the disk is file-backed it require the file attribute to the <source> element, while it needs the dev attribute if it's a block device; given that it's a path in both cases it would have sounded easier on the user if a single path attribute was used. But this is opinable.
Yup, and there have been opinions. The good point is that all the pros and cons which led to the format definition (and I tell you we had much discussion on format issues!) are documented in the mailing list archives, so you can find out why we chose a given format. Sometimes we have remnants from history, but honestly the format is not that bad considering the heavy use, the incredible variation from hypervisor to hypervisor, and the few years of history!
The third problem I called out for in the block is a lack of a schema for the files; Daniel corrected me pointing out that the schemas are distributed with the sources and installed. Sure thing, I was wrong. On the other hand I maintain that there are problems with those schemas. The first is that both the version distributed with 0.7.4 and the git version as of today suffer from bug #546254 [2] (secret.rng being not well formed) so it means nobody has even tested them as of lately; then
As you pointed out, this is now fixed in git. The secret handling was added recently and apparently we didn't add regression testing; we usually use the Relax-NG schemas to validate the examples in "make check".
there is the fact that they are never referenced by the human-readable documentation [3] which is why I didn't find it the first time around;
There is a tool, /usr/bin/virt-xml-validate, on my system, part of the libvirt-client package, which does the validation for domain files. We don't expect people to know RNG or to know how to validate against a .rng file. The format is documented
add also to that some contrived syntax in those schema as well that causes trang to produce a non-valid rnc file out of them (nxml-mode uses rnc rather than rng).
Ah, well, this sounds like a bug in trang; you could try to reach James Clark to get it fixed. I thought there was formal equivalence with his compact syntax. In any case the XML syntax is the core ISO spec, and I find it normal to use that one.
But I guess the one big problem with the schemas is that they don't seem to properly encode what the human-readable documentation says, or what virt-manager does. For instance (please follow me with selector-like
Some mismatch is sometimes possible; we try to fix them when they are reported. Not everybody providing feature patches is Relax-NG aware, and sometimes it's forgotten and falls through the cracks, though if an instance is installed in the test suites, then the regression suite points out the schema mismatch.
syntax), virt-manager creates /domain/os/type[@machine='pc-0.11'] in the created XML; the same attribute seem to be documented: “There are also two optional attributes, arch specifying the CPU architecture to virtualization, and machine referring to the machine type”. The schema does not seem to accept that attribute though (“element type: Relax-NG validity error : Invalid attribute machine for element type” with xmllint, just to make sure that it's not a bug in any other piece of software, this is Daniel's libxml2).
There are various variants in the os type handling, making that part of the schemas rather complex. Best would be to open a new bug (with the example as an attachment) unless you can come up with a fix we can easily validate.
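A reduced example for such a bug report might look like the sketch below. The domain document is cut down to the element under discussion and is not a complete, bootable definition; the helper name is invented here. It uses Python's standard ElementTree, which understands the same selector-like path as in the quoted text.

```python
import xml.etree.ElementTree as ET

# Reduced domain document carrying the attribute that the schema rejected.
# Only the parts relevant to the report are kept.
DOMAIN_XML = """
<domain type='kvm'>
  <name>example</name>
  <os>
    <type arch='x86_64' machine='pc-0.11'>hvm</type>
  </os>
</domain>
"""


def machine_types(xml_text):
    """Return the machine attribute of every /domain/os/type that has one."""
    root = ET.fromstring(xml_text)
    return [node.get("machine") for node in root.findall("./os/type[@machine]")]


root = ET.fromstring(DOMAIN_XML)
# Equivalent of the selector /domain/os/type[@machine='pc-0.11'].
matches = root.findall("./os/type[@machine='pc-0.11']")
print(len(matches), machine_types(DOMAIN_XML))
```

Saving DOMAIN_XML to a file gives something small enough to attach to the report and to run through xmllint or virt-xml-validate.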
Now after voicing my opinions here, as Daniel dared me to do, I'd like to explain a second why I didn't post this on the list in the first place: of what I wrote here, my beefs for calling this aXML, the only things that can be solved easily are the schemas; schemas that, at the time I wrote the blog, I was unable to find. The syntax, and the lack of a “safe” identification of the files as libvirt's are the kind of legacy problems one has to deal with to avoid wasting users' time with migrations and corrections, so I don't really think they should be addressed unless a redesign of the configuration is intended.
I honestly think that trying to address the "editability" (for lack of a better term) is chasing the wrong target. XML is really not for human consumption (well, it's not digestible for most), and while it's extremely useful and powerful at the API level or for long-term description, the tools should really abstract the low-level syntactic details.
Just my two cents, you're free to take them as you wish, I cannot boast a curriculum like Daniel's, but I don't think I'm stepping out of place to point out these things.
No, it's fine, but the way our whole ecosystem works is that if you find a bug you should report it. You reported two which are valid problems in libvirt, and these need to be fixed, one already thanks to your patch! We probably ought to also add a reference to virt-xml-validate in the domain documentation page http://libvirt.org/formatdomain.html too. So thanks for reporting the existing bugs. I ended up on your blog post randomly and I thought the best was to get you here to actually solve the issues; it seems to have worked better than I expected :-) There is still that crash problem you wrote about, which would need some clarification; I'm not sure I really understood what went wrong! Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/

2009/12/10 Daniel Veillard <veillard@redhat.com>:
So thanks for reporting the existing bugs, I ended up on your blog post randomly and I thought the best was to get you here to actually solve the issues, seems to have worked better than I expected :-) There is still that crash problem you wrote about and which would need some clarification, I'm not sure I really understood what went wrong !
Daniel
The crash could be due to the QEMU monitor handling problem that was recently fixed. The asynchronous monitor handling code would free the monitor object if QEMU crashed or reported an error, but the caller was not aware of this and then accessed the freed memory, resulting in a crash. Matthias

Il giorno Thu, 10/12/2009 alle 18.28 +0100, Matthias Bolte ha scritto:
The crash could be due to the QEMU monitor handling problem that was recently fixed.
Yup that seems to be it, tied with a crash in qemu-kvm's 0.11.0 handling of virtio network (qemu crashed, which then caused libvirt to crash again) which is solved in 0.11.1. I'm sincerely not sure at this point if I should wait for 0.7.5 or look for a way to backport the needed change. -- Diego Elio Pettenò — “Flameeyes” http://blog.flameeyes.eu/ If you found a .asc file in this mail and know not what it is, it's a GnuPG digital signature: http://www.gnupg.org/

2009/12/10 Diego Elio “Flameeyes” <flameeyes@gmail.com>:
Il giorno Thu, 10/12/2009 alle 18.28 +0100, Matthias Bolte ha scritto:
The crash could be due to the QEMU monitor handling problem that was recently fixed.
Yup that seems to be it, tied with a crash in qemu-kvm's 0.11.0 handling of virtio network (qemu crashed, which then caused libvirt to crash again) which is solved in 0.11.1.
I'm sincerely not sure at this point if I should wait for 0.7.5 or look for a way to backport the needed change.
You can use the 0.7.4 release and apply these two commits to fix the crash: http://libvirt.org/git/?p=libvirt.git;a=commit;h=79533da1b36cc16e0f3f8aea798... http://libvirt.org/git/?p=libvirt.git;a=commit;h=8e7d14953ca6ef047112bc6ac17... I tested it and it works for me. Matthias

On Thu, Dec 10, 2009 at 05:44:43PM +0100, Daniel Veillard wrote:
Your viewpoint is that users should edit the XML. My viewpoint is that software should do this, basically in normal use of libvirt nobody should have to look at the XML, the virt-viewer/virt-install etc... tools should generate and handle those for you. It happens that one may have to tweak something like a pathname in a definition or something but whatever the level of schemas available it won't help for that kind of tweaks. Things like changing the ethernet type adapter should be a pull down list in a gui like virt-manager, not loading the saved xml in emacs, finding the associated relax-ng, finding the place where it's defined and hoping emacs will get a list to pick from,
I edit libvirt XML all the time. The 'virsh edit' command is most useful ... I'd like to add my own rant about this though: If an element isn't understood by libvirt, then libvirt just discards it (without any indication or error, and without just remembering the element in the XML). This caused me a great deal of pain yesterday when I was adding a <watchdog/> element to a domain on an F12 machine, but the watchdog didn't appear in the VM. Later I discovered that libvirt on F12 predates the watchdog feature, and so it was just tossing away the <watchdog/> element completely from the XML ...
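Until libvirt reports such drops itself, they can be spotted from the outside by comparing what was submitted with what was stored. The following is a rough sketch with invented helper names; it assumes the stored XML has been obtained separately (for example from 'virsh dumpxml') and only compares element paths, not attributes or text. The watchdog model in the sample data is illustrative.

```python
import xml.etree.ElementTree as ET


def element_paths(xml_text):
    """Collect the set of element paths (e.g. /domain/devices/watchdog)."""
    paths = set()

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}"
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return paths


def dropped_elements(submitted_xml, stored_xml):
    """Paths present in what the user wrote but absent from what was kept."""
    return sorted(element_paths(submitted_xml) - element_paths(stored_xml))


# Illustrative data: the second document stands in for what an older
# libvirt might hand back after discarding the unknown element.
submitted = "<domain><devices><watchdog model='i6300esb'/></devices></domain>"
stored = "<domain><devices/></domain>"
print(dropped_elements(submitted, stored))
```

A non-empty result after an edit is a hint that something was thrown away without a message.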
which even if you did it right might just not work because the domain is running and your change would be discarded on restart... oops.
Which is also a bug. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into Xen guests. http://et.redhat.com/~rjones/virt-p2v

On Thu, Dec 10, 2009 at 05:49:39PM +0000, Richard W.M. Jones wrote:
On Thu, Dec 10, 2009 at 05:44:43PM +0100, Daniel Veillard wrote:
Your viewpoint is that users should edit the XML. My viewpoint is that software should do this, basically in normal use of libvirt nobody should have to look at the XML, the virt-viewer/virt-install etc... tools should generate and handle those for you. It happens that one may have to tweak something like a pathname in a definition or something but whatever the level of schemas available it won't help for that kind of tweaks. Things like changing the ethernet type adapter should be a pull down list in a gui like virt-manager, not loading the saved xml in emacs, finding the associated relax-ng, finding the place where it's defined and hoping emacs will get a list to pick from,
I edit libvirt XML all the time. The 'virsh edit' command is most useful ...
I'd like to add my own rant about this though:
If an element isn't understood by libvirt, then libvirt just discards it (without any indication or error, and without just remembering the element in the XML).
There should be an option to validate the XML input, either by providing a VIR_DOMAIN_XML_VALIDATE flag with the APIs which accept XML as input, or by having virsh edit doing validation after the editor exits. This would also allow virsh to re-launch the editor upon error and let you correct the mistake instead of forcing you to start again from scratch.
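The edit-validate-retry loop could be shaped roughly like this. It is a sketch only, not virsh code: the function names are invented, the editor is passed in as a callable, and the default validator merely checks well-formedness with Python's standard ElementTree where a real implementation would run the Relax-NG schema.

```python
import xml.etree.ElementTree as ET


def well_formed_errors(text):
    """Stand-in validator: returns a list of error strings, empty if fine.
    A real check would validate against the Relax-NG schema instead."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return [str(exc)]
    return []


def edit_until_valid(text, edit, validate=well_formed_errors, max_attempts=3):
    """Run the editor, validate the result, and re-launch the editor on the
    user's own (broken) text rather than discarding it."""
    errors = ["editor was never run"]
    for _ in range(max_attempts):
        text = edit(text)       # the previous attempt is kept as the input
        errors = validate(text)
        if not errors:
            return text
    raise ValueError(f"still invalid after {max_attempts} attempts: {errors}")


# Simulated editor session: the first save is broken, the second is fixed.
saves = iter(["<domain><name>x</domain>", "<domain><name>x</name></domain>"])
print(edit_until_valid("<domain/>", lambda _text: next(saves)))
```

Passing the previous text back into the editor is the part that spares the user from starting again from scratch.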
This caused me a great deal of pain yesterday when I was adding a <watchdog/> element to a domain on an F12 machine, but the watchdog didn't appear in the VM. Later I discovered that libvirt on F12 predates the watchdog feature, and so it was just tossing away the <watchdog/> element completely from the XML ...
It is good that it throws away unknown elements. Having settings in the XML that are unused, but which might suddenly become active with any upgrade, is not good for ensuring a stable ABI for the guest.
which even if you did it right might just not work because the domain is running and your change would be discarded on restart... oops.
That is not correct actually. It is possible with many of the drivers (Xen, QEMU, LXC, UML) to change the persistent config, regardless of whether the domain is running. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On Thu, Dec 10, 2009 at 06:07:13PM +0000, Daniel P. Berrange wrote:
On Thu, Dec 10, 2009 at 05:49:39PM +0000, Richard W.M. Jones wrote:
On Thu, Dec 10, 2009 at 05:44:43PM +0100, Daniel Veillard wrote:
Your viewpoint is that users should edit the XML. My viewpoint is that software should do this, basically in normal use of libvirt nobody should have to look at the XML, the virt-viewer/virt-install etc... tools should generate and handle those for you. It happens that one may have to tweak something like a pathname in a definition or something but whatever the level of schemas available it won't help for that kind of tweaks. Things like changing the ethernet type adapter should be a pull down list in a gui like virt-manager, not loading the saved xml in emacs, finding the associated relax-ng, finding the place where it's defined and hoping emacs will get a list to pick from,
I edit libvirt XML all the time. The 'virsh edit' command is most useful ...
I'd like to add my own rant about this though:
If an element isn't understood by libvirt, then libvirt just discards it (without any indication or error, and without just remembering the element in the XML).
Remembering is a hard feature and probably not worth it in libvirt's case (but I'm sure I argued with Dan over this at the very beginning :-). It makes sense for editor-like applications or when you know you need to mix vocabularies; for libvirt, I think it's overkill and would make the code significantly more complex.
There should be an option to validate the XML input, either by providing a VIR_DOMAIN_XML_VALIDATE flag with the APIs which accept XML as input, or by having virsh edit doing validation after the editor exits.
I think I suggested a couple of times to have the input XML data validated at the API level, but we don't want to do this systematically; this would IMHO create more problems than it can solve. Using a flag and/or activating it when the libvirt conf is in debug mode would both make sense.
This would also allow virsh to re-launch the editor upon error and let you correct the mistake instead of forcing you to start again from scratch.
The schema validation won't be perfect in any way. For example, trying to limit the list of allowed ethernet adapters based on the hypervisor type is nearly impossible even with Relax-NG, since we differentiate based on an attribute in the top-level element (this would basically force us to write parallel schemas and become completely unmaintainable). Relax-NG validation will also provide out-of-context error messages, while the conf parser can give far better diagnostics.
This caused me a great deal of pain yesterday when I was adding a <watchdog/> element to a domain on an F12 machine, but the watchdog didn't appear in the VM. Later I discovered that libvirt on F12 predates the watchdog feature, and so it was just tossing away the <watchdog/> element completely from the XML ...
It is good that it throws away unknown elements. Having settings in the XML that are unused, but which might suddenly become active with any upgrade, is not good for ensuring a stable ABI for the guest.
which even if you did it right might just not work because the domain is running and your change would be discarded on restart... oops.
That is not correct actually. It is possible with many of the drivers (Xen, QEMU, LXC, UML) to change the persistent config, regardless of whether the domain is running.
I used that as an example. The point is that decoupling the config modification process from the actual application using the definition makes some mistakes uncheckable, either at the system level or even at the syntactic level. To take again the example of changing the ethernet adapter: while a specific tool like virt-manager may have the logic needed to provide a list based on the hypervisor type, the Relax-NG schema will never embed that logic. Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/

On Fri, Dec 11, 2009 at 08:09:33AM +0100, Daniel Veillard wrote:
On Thu, Dec 10, 2009 at 06:07:13PM +0000, Daniel P. Berrange wrote:
There should be an option to validate the XML input, either by providing a VIR_DOMAIN_XML_VALIDATE flag with the APIs which accept XML as input, or by having virsh edit doing validation after the editor exits.
I think I suggested a couple of times to have the input XML data validated at the API level, but we don't want to do this systematically; this would IMHO create more problems than it can solve. Using a flag and/or activating it when the libvirt conf is in debug mode would both make sense.
This would also allow virsh to re-launch the editor upon error and let you correct the mistake instead of forcing you to start again from scratch.
The schema validation won't be perfect in any way. For example, trying to limit the list of allowed ethernet adapters based on the hypervisor type is nearly impossible even with Relax-NG, since we differentiate based on an attribute in the top-level element (this would basically force us to write parallel schemas and become completely unmaintainable). Relax-NG validation will also provide out-of-context error messages, while the conf parser can give far better diagnostics.
I think it is a mistake that our current schemas try to validate the content of attributes such as the ethernet adapter name. Increasingly we are in a situation where the allowed values are dynamically determined on the fly. The schema would be more useful if it simply validated that the value was a string (a-z, A-Z, 0-9) and didn't try to check an explicit enumeration of values there, i.e. just validate basic syntax, and not semantics. The original poster's problem of 'pc' vs 'pc-0.11' is a good example of why validating the individual values is bad: 'pc-0.11' is dynamically pulled from the QEMU binary, so there's no hope of the schema ever being aware of that. Regards, Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
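The difference between an enumeration check and a syntax-only check can be shown in a few lines. This is a sketch, not the schema itself: the fixed list and the exact character class (letters, digits, dot, underscore, hyphen) are assumptions chosen so that a value like 'pc-0.11' passes.

```python
import re

# Semantic check: a fixed list that goes stale as soon as the hypervisor
# grows a new machine type. The list content is illustrative only.
KNOWN_MACHINES = {"pc", "isapc"}

# Syntactic check: any non-empty token from an assumed character set.
NAME_TOKEN = re.compile(r"[A-Za-z0-9._-]+")


def passes_enumeration(value):
    """True only for values the fixed list already knows about."""
    return value in KNOWN_MACHINES


def passes_syntax(value):
    """True for any value that merely looks like a name."""
    return NAME_TOKEN.fullmatch(value) is not None


for candidate in ("pc", "pc-0.11", "pc 0.11"):
    print(candidate, passes_enumeration(candidate), passes_syntax(candidate))
```

The syntax-only check still rejects obviously malformed values such as one containing a space, while leaving the question of which names exist to the hypervisor driver.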

On 12/10/2009 10:51 AM, Diego Elio “Flameeyes” Pettenò wrote:
Hello,
In a recent post on my blog [1] I ranted on about libvirt and in particular I complained that the configuration files look like what I call “almost XML”. The reasons why I say that are multiple, let me try to explain some.
In the configuration files, at least those created by virt-manager there is no specification of what the file should be (no document type, no namespace, and, IMHO, a too generic root element name); given that some kind of distinction is needed for software like Emacs's nxml-mode to know how to deal with the file, I think that's pretty bad for interaction between different applications. While libvirt knows perfectly well what it's dealing with, other packages might not. Might not sound a major issue but it starts tickling my senses when this happens.
The configuration seem somewhat contrived in places like the disk configuration: if the disk is file-backed it require the file attribute to the <source> element, while it needs the dev attribute if it's a block device; given that it's a path in both cases it would have sounded easier on the user if a single path attribute was used. But this is opinable.
This is something that has always bugged me as well, and is indeed a pain to deal with in virt-manager when doing things like changing cdrom media. I think we should just loosen the input restrictions, so that any path passed via the <source> properties file, dev, or dir is used, but will be dumped with the 'correct' property to maintain back compat. That said, I think the XML format is pretty straightforward. The caveat you mention above is the only really annoying thing. The one other stumbling block is that not all devices have a unique property to key off of in the XML (e.g. you can have two identical <video> devices). This makes life difficult for virt-manager when removing a device, but it hasn't been a big issue in practice.
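On the reading side, the loosened lookup could be sketched as below: accept the path from whichever of the three properties carries it. The helper name and the lookup order are assumptions for illustration, written with Python's standard ElementTree rather than taken from virt-manager or libvirt.

```python
import xml.etree.ElementTree as ET

# The three <source> properties named above; the order is an assumption.
SOURCE_PATH_ATTRS = ("file", "dev", "dir")


def disk_source_path(disk_xml):
    """Return the path of a <disk> source regardless of which property
    holds it, or None when there is no source (e.g. an empty cdrom)."""
    source = ET.fromstring(disk_xml).find("source")
    if source is None:
        return None
    for attr in SOURCE_PATH_ATTRS:
        value = source.get(attr)
        if value:
            return value
    return None


file_disk = "<disk type='file' device='disk'><source file='/var/lib/libvirt/images/a.img'/></disk>"
block_disk = "<disk type='block' device='disk'><source dev='/dev/sdb'/></disk>"
empty_cdrom = "<disk type='file' device='cdrom'><target dev='hdc'/></disk>"

for xml_text in (file_disk, block_disk, empty_cdrom):
    print(disk_source_path(xml_text))
```

Dumping with the 'correct' property would still require knowing the disk type; that half is left out of this sketch.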
The third problem I called out for in the block is a lack of a schema for the files; Daniel corrected me pointing out that the schemas are distributed with the sources and installed. Sure thing, I was wrong. On the other hand I maintain that there are problems with those schemas. The first is that both the version distributed with 0.7.4 and the git version as of today suffer from bug #546254 [2] (secret.rng being not well formed) so it means nobody has even tested them as of lately; then there is the fact that they are never referenced by the human-readable documentation [3] which is why I didn't find it the first time around; add also to that some contrived syntax in those schema as well that causes trang to produce a non-valid rnc file out of them (nxml-mode uses rnc rather than rng).
The bug you mention only exists because we don't have secret XML regression tests. Other schemas are in better shape: I'm pretty confident that the virtual network and storage pool/volume schemas have near complete coverage. Domain probably has very high coverage, but there are no doubt nuances of the format that aren't covered by our regression suite, and therefore may have incorrect schemas. For a long while we didn't use the XML schemas for anything, so they were horrendously out of date and practically useless. That has changed within the past year, but we are still playing catch-up to have the schemas match libvirt code reality. Putting a link to the schemas on the website is also a good idea.
But I guess the one big problem with the schemas is that they don't seem to properly encode what the human-readable documentation says, or what virt-manager does. For instance (please follow me with selector-like syntax), virt-manager creates /domain/os/type[@machine='pc-0.11'] in the created XML; the same attribute seem to be documented: “There are also two optional attributes, arch specifying the CPU architecture to virtualization, and machine referring to the machine type”. The schema does not seem to accept that attribute though (“element type: Relax-NG validity error : Invalid attribute machine for element type” with xmllint, just to make sure that it's not a bug in any other piece of software, this is Daniel's libxml2).
If there are any other pieces of the schema that you find are incorrect or don't match reality, please enumerate them here and I'll take some time to make sure they are tested in our regression suite. Thanks, Cole

On Thu, Dec 10, 2009 at 12:09:23PM -0500, Cole Robinson wrote:
This is something that has always bugged me as well, and is indeed a pain to deal with in virt-manager when doing things like changing cdrom media. I think we should just loosen the input restrictions, so that any path passed via the <source> properties file, dev, or dir is used, but will be dumped with the 'correct' property to maintain back compat.
That said, I think the XML format is pretty straightforward. The caveat you mention above is the only really annoying thing. The one other stumbling block is that not all devices have a unique property to key off of in the XML (e.g. you can have two identical <video> devices). This makes life difficult for virt-manager when removing a device, but it hasn't been a big issue in practice.
That is going to change in the near future. I'm about to post patches which give every single device an <address> element containing its unique PCI, USB, or disk controller address. In addition, every device will likely get a unique short name. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

participants (6)
- Cole Robinson
- Daniel P. Berrange
- Daniel Veillard
- Diego Elio “Flameeyes” Pettenò
- Matthias Bolte
- Richard W.M. Jones