On Thu, Dec 10, 2009 at 04:51:54PM +0100, Diego Elio “Flameeyes” Pettenò wrote:
Hello,
In a recent post on my blog [1] I ranted on about libvirt and in
particular I complained that the configuration files look like what I
call “almost XML”. The reasons why I say that are multiple, let me try
to explain some.
In the configuration files, at least those created by virt-manager there
is no specification of what the file should be (no document type, no
namespace, and, IMHO, a too generic root element name); given that some
Okay, I have been in the standardization space for XML for a decade
now, so I have an opinion on those issues :-)
Document type: DTDs are obsolete, the level of checking they provide
is not really useful.
Namespace: in that case that would be cumbersome. Namespaces are
useful when you want to mix vocabularies coming from different
places. In our use case I don't see any need to mix, the format
is driven by an application. The major drawback, had we used
namespaces, even a default one, is that all the XPath expressions and
queries we are doing in the code would then need a namespace
registration and a prefix, which is painful for no gain in our
case.
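To make the namespace cost concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree rather than the libxml2 XPath calls libvirt itself uses. The namespace URI and the "v" prefix are hypothetical, made up for illustration only; libvirt defines no such namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, for illustration only.
NS_URI = "http://example.org/hypothetical-libvirt-ns"

PLAIN = "<domain type='kvm'><name>demo</name></domain>"
NAMESPACED = f"<domain xmlns='{NS_URI}' type='kvm'><name>demo</name></domain>"

plain_root = ET.fromstring(PLAIN)
ns_root = ET.fromstring(NAMESPACED)

# Without a namespace the query is just the element name.
plain_name = plain_root.find("name")

# With a default namespace the same query silently matches nothing...
unprefixed = ns_root.find("name")

# ...until a prefix is registered and used in every expression.
prefixed = ns_root.find("v:name", {"v": NS_URI})

print(plain_name.text, unprefixed, prefixed.text)
```

Every query in the code base would need the same prefix mapping, which is the overhead described above.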
Generic element name: well, we used what was the most sensible based on
the vocabulary used in the goal documentation:
http://libvirt.org/goals.html
kind of distinction is needed for software like Emacs's nxml-mode to
know how to deal with the file, I think that's pretty bad for
interaction between different applications. While libvirt knows
Your viewpoint is that users should edit the XML. My viewpoint is that
software should do this: in normal use of libvirt nobody
should have to look at the XML, tools like virt-viewer, virt-install, etc.
should generate and handle it for you. It happens that one may
have to tweak something like a pathname in a definition, but
whatever the level of schemas available, it won't help for that kind of
tweak. Things like changing the ethernet adapter type should be a
pull-down list in a GUI like virt-manager, not loading the saved XML in
Emacs, finding the associated Relax-NG, finding the place where it's
defined and hoping Emacs will get a list to pick from, which even if you
did it right might just not work because the domain is running and
your change would be discarded on restart... oops.
perfectly well what it's dealing with, other packages might not. Might
not sound a major issue but it starts tickling my senses when this
happens.
The configuration seems somewhat contrived in places like the disk
configuration: if the disk is file-backed it requires the file attribute
on the <source> element, while it needs the dev attribute if it's a
block device; given that it's a path in both cases it would have been
easier on the user if a single path attribute was used. But this is
a matter of opinion.
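For what it's worth, a consumer of the format can paper over the file/dev split in a few lines. This is only a sketch in Python with the standard xml.etree.ElementTree; the helper name disk_source_path and the example paths are made up for illustration and are not part of libvirt.

```python
import xml.etree.ElementTree as ET

def disk_source_path(disk):
    """Return the backing path of a <disk> element, whichever attribute holds it.

    Hypothetical helper: file-backed disks carry the path in source/@file,
    block devices carry it in source/@dev.
    """
    source = disk.find("source")
    if source is None:
        return None
    return source.get("file") or source.get("dev")

# Example fragments with made-up paths.
FILE_DISK = ET.fromstring(
    "<disk type='file' device='disk'>"
    "<source file='/var/lib/libvirt/images/demo.img'/>"
    "<target dev='hda'/>"
    "</disk>"
)
BLOCK_DISK = ET.fromstring(
    "<disk type='block' device='disk'>"
    "<source dev='/dev/sdb1'/>"
    "<target dev='hdb'/>"
    "</disk>"
)

print(disk_source_path(FILE_DISK))
print(disk_source_path(BLOCK_DISK))
```

The distinction costs the reader of the XML one extra lookup; whether that is worth a format change is the part that is open to opinion.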
Yup, and there have been opinions. The good point is that all the
pros and cons which led to the format definition (and I tell you we got
much discussion on format issues!) are documented in the mailing
list archives, so you can find why we chose such or such format.
Sometimes we have remains from history, but honestly the format is
not that bad considering the heavy use, the incredible variations from
hypervisor to hypervisor and the few years of history!
The third problem I called out in the blog is the lack of a schema
for the files; Daniel corrected me pointing out that the schemas are
distributed with the sources and installed. Sure thing, I was wrong. On
the other hand I maintain that there are problems with those schemas.
The first is that both the version distributed with 0.7.4 and the git
version as of today suffer from bug #546254 [2] (secret.rng not being
well formed), which means nobody has even tested them lately; then
As you pointed out, this is now fixed in GIT. The secret handling was
added recently and apparently we didn't add regression testing; we
usually use the Relax-NG for validating the examples in "make check".
there is the fact that they are never referenced by the human-readable
documentation [3] which is why I didn't find it the first time around;
There is a tool, /usr/bin/virt-xml-validate, on my system, part of the
libvirt-client package, which does the validation for domain files; we
don't expect people to know RNG or know how to validate against a .rng
The format is documented
add also to that some contrived syntax in those schemas that
causes trang to produce a non-valid rnc file out of them (nxml-mode uses
rnc rather than rng).
Ah, well, this sounds like a bug in trang. You could try to reach James
Clark to get it fixed; I thought there was formal equivalence in his compact
syntax. In any case the XML syntax is the core ISO spec, and I find
it normal to use that one.
But I guess the one big problem with the schemas is that they don't seem
to properly encode what the human-readable documentation says, or what
virt-manager does. For instance (please follow me with selector-like
Some mismatch is sometimes possible, we try to fix them when they are
reported. Not everybody providing feature patches is Relax-NG aware
and sometimes it's forgotten and falls through the cracks, though if an
instance is installed in the test suite, then the regression suite
points out the schema mismatch.
syntax), virt-manager creates /domain/os/type[@machine='pc-0.11'] in the
created XML; the same attribute seems to be documented: “There are also
two optional attributes, arch specifying the CPU architecture to
virtualization, and machine referring to the machine type”. The schema
does not seem to accept that attribute though (“element type: Relax-NG
validity error : Invalid attribute machine for element type” with
xmllint, just to make sure that it's not a bug in any other piece of
software, this is Daniel's libxml2).
There are various variants in the os type handling, making that part
of the schemas rather complex. Best would be to open a new bug
(with the example as an attachment) unless you can come up with a fix
we can easily validate.
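For anyone preparing that bug report, the selector above can be checked against a candidate example with a few lines of Python and the standard xml.etree.ElementTree. This is only a sketch: the domain fragment is a made-up minimal example, and it only shows that the attribute is present in the XML, it does not run the Relax-NG validation that xmllint or virt-xml-validate perform.

```python
import xml.etree.ElementTree as ET

# Made-up minimal domain fragment carrying the attribute under discussion.
DOMAIN = """
<domain type='kvm'>
  <name>demo</name>
  <os>
    <type arch='x86_64' machine='pc-0.11'>hvm</type>
  </os>
</domain>
"""

root = ET.fromstring(DOMAIN)

# The root is <domain>, so /domain/os/type[@machine='pc-0.11'] becomes a
# path relative to it.
match = root.find("./os/type[@machine='pc-0.11']")

print(match is not None)
```

A file for which this selector matches but which the schema rejects is the kind of attachment that makes the mismatch easy to reproduce.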
Now after voicing my opinions here, as Daniel dared me to do, I'd like
to explain a second why I didn't post this on the list in the first
place: of what I wrote here, my beefs for calling this aXML, the only
things that can be solved easily are the schemas; schemas that, at the
time I wrote the blog, I was unable to find. The syntax, and the lack of
a “safe” identification of the files as libvirt's are the kind of legacy
problems one has to deal with to avoid wasting users' time with
migrations and corrections, so I don't really think they should be
addressed unless a redesign of the configuration is intended.
I honestly think that trying to address the "editability" (for lack
of a better term) is chasing the wrong target. XML is really not for
human consumption (well, it's not digestible for most), and while it's
extremely useful and powerful at the API level or for long term description,
the tools should really abstract the low level syntactic details.
Just my two cents, you're free to take them as you wish, I cannot boast
a curriculum like Daniel's, but I don't think I'm stepping out of place
to point out these things.
No, it's fine, but the way our whole ecosystem works is that if you find
a bug you should report it. You reported two which are valid problems in
libvirt, and these need to be fixed; one already is, thanks to your patch!
We probably ought to also add a reference to virt-xml-validate in the
domain documentation page http://libvirt.org/formatdomain.html too.
So thanks for reporting the existing bugs. I ended up on your blog
post randomly and I thought the best was to get you here to actually
solve the issues; it seems to have worked better than I expected :-)
There is still that crash problem you wrote about, which would need
some clarification, I'm not sure I really understood what went wrong!
Daniel
--
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel(a)veillard.com | Rpmfind RPM search engine  http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/