[libvirt] ignore vs. error when inappropriate or unrecognized attributes/elements are present in XML

16 Aug 2011

      This is related to: https://bugzilla.redhat.com/show_bug.cgi?id=638633#c14

I had started to reply to it in the comments of the bug, but my reply 
became too long, and expanded into an issue wider than that single bug, 
so I figured I'd better discuss it here instead.

A synopsis: It's possible to specify a script to be executed when 
bringing up a guest interface device. This is done by adding a <script 
path='xxx'/> element to the <interface> definition.

The bug report was that <script path='blah'/> was added to an <interface 
type='bridge'> definition in the config for a qemu domain, and nobody 
complained that 'blah' didn't exist. This eventually led to the 
realization that when interface type='bridge' and driver=qemu, <script> 
isn't used, so it's ignored. More exactly, the script element is only 
recognized by the parser when interface type == 'bridge' or 'ethernet', 
and is only used by the hypervisor driver when:

1) the interface type='bridge' and the xen driver is being used.

or

2) the interface type='ethernet' and the qemu driver is being used.

After this all came to light, in comment 14, Dave says:
...
That's being the case, we need to explicitly reject the attempt to 
specify a
script.
(implied: when interface type='bridge' and the hypervisor is qemu).

On the surface that sounds like a good idea, but it's opening a pretty 
big can of worms.

Specifically for this issue, this rejection must take place in the 
hypervisor because different hypervisors support scripts for different 
types of interfaces - a script for type='bridge' is perfectly acceptable 
for xen, but not for qemu. That can be handled with an error log in the 
qemu driver in the case of type='bridge'. But if someone defines 
<interface type='network'> and specifies a script, the parser ignores 
the <script> element (because normally it's stored either in a union 
that's only valid when type='bridge' or a different union only valid 
when type='ethernet'), so qemu has no way of knowing a script was 
specified and therefore can't log an error.

That's not a terribly difficult problem, though. Just a bit more code 
than a simple "else { log an error; }" somewhere in the code - the 
parser needs to be changed to move the script attribute out of the 
unions and into the main struct, and always populate it, then the 
drivers need to be taught to log an error when appropriate).

But if we're doing that for the <script> subelement of <interface>, 
we're creating a prerequisite for what's expected in the case of all the 
*other* attributes/elements that are currently ignored under similar 
circumstances by libvirt's parsers. Essentially anything that's in a 
union contained in any of the virXxxDef structs is probably treated the 
same way. As a simple first example, look at "keymap", which is an 
attribute of a domain's devices/graphics element, but only valid if the 
type is spice or vnc. The parser ignores it in all other cases, and 
since it's stored in data.spice or data.vnc, there's not even a 
possibility of qemu (or other hypervisor) indicating any kind of error 
when someone specifies e.g. <graphics type='sdl' 'keymap='blah' .../>. 
Similarly, if someone were to add "tlsPort='5910' to a <graphics> of any 
type other than spice. Of course, from the point of view of all the code 
associated with <graphics type='vnc' ... />, this is no different than 
if someone had added "frozzle='fib'" - it's totally unexpected (in this 
case by any type), so it's ignored.

Actually, I can see now there are several different classes of this 
problem. Here are the first few that come to mind:

1) an attribute/element is completely unknown/unexpected in all cases 
(e.g. "frozzle='fib'" anywhere, or more insidious, something that 
*looks* correct, but isn't, e.g. "<script name='/path/to/script'/>"*)

2) an attribute/element is useful/expected only when some other 
attribute is set to a particular value (usually one called "type", but 
could be something else), for example keymap='blah' is only expected in 
a <graphics> element when type='spice' or type='vnc'.

3) an attribute/element is useful/expected only for certain combinations 
of the value of some other attribute and which driver is using the 
element, e.g. the subject of this bug - script='blah' is only expected 
when type='bridge' and it's used by the Xen driver, or type='ethernet' 
and it's used by the qemu driver.

So what are the rules of engagement for these various cases? When do we 
ignore, when do we log an error during parsing, and when do we log an 
error in the code that's using the parsed data?

(*Another example of (1) from recent real-life - during testing of 
https://bugzilla.redhat.com/show_bug.cgi?id=727982, a QA person claimed 
that the bugt hadn't been fixed because the hosts file wasn't being 
updated; it turned out that they were adding the new <host> element at 
the toplevel under <network> rather than inside the <dhcp> subelement, 
so the parser was ignoring it (since it's unrecognized at that level) 
rather than logging an error).

[libvirt] ignore vs. error when inappropriate or unrecognized attributes/elements are present in XML

Laine Stump