This is related to:
https://bugzilla.redhat.com/show_bug.cgi?id=638633#c14
I had started to reply to it in the comments of the bug, but my reply
became too long, and expanded into an issue wider than that single bug,
so I figured I'd better discuss it here instead.
A synopsis: It's possible to specify a script to be executed when
bringing up a guest interface device. This is done by adding a <script
path='xxx'/> element to the <interface> definition.
The bug report was that <script path='blah'/> was added to an <interface
type='bridge'> definition in the config for a qemu domain, and nobody
complained that 'blah' didn't exist. This eventually led to the
realization that when interface type='bridge' and driver=qemu, <script>
isn't used, so it's ignored. More exactly, the script element is only
recognized by the parser when interface type == 'bridge' or 'ethernet',
and is only used by the hypervisor driver when:
1) the interface type='bridge' and the xen driver is being used.
or
2) the interface type='ethernet' and the qemu driver is being used.
After this all came to light, in comment 14, Dave says:
That's being the case, we need to explicitly reject the attempt
to
specify a
script.
(implied: when interface type='bridge' and the hypervisor is qemu).
On the surface that sounds like a good idea, but it's opening a pretty
big can of worms.
Specifically for this issue, this rejection must take place in the
hypervisor because different hypervisors support scripts for different
types of interfaces - a script for type='bridge' is perfectly acceptable
for xen, but not for qemu. That can be handled with an error log in the
qemu driver in the case of type='bridge'. But if someone defines
<interface type='network'> and specifies a script, the parser ignores
the <script> element (because normally it's stored either in a union
that's only valid when type='bridge' or a different union only valid
when type='ethernet'), so qemu has no way of knowing a script was
specified and therefore can't log an error.
That's not a terribly difficult problem, though. Just a bit more code
than a simple "else { log an error; }" somewhere in the code - the
parser needs to be changed to move the script attribute out of the
unions and into the main struct, and always populate it, then the
drivers need to be taught to log an error when appropriate).
But if we're doing that for the <script> subelement of <interface>,
we're creating a prerequisite for what's expected in the case of all the
*other* attributes/elements that are currently ignored under similar
circumstances by libvirt's parsers. Essentially anything that's in a
union contained in any of the virXxxDef structs is probably treated the
same way. As a simple first example, look at "keymap", which is an
attribute of a domain's devices/graphics element, but only valid if the
type is spice or vnc. The parser ignores it in all other cases, and
since it's stored in data.spice or data.vnc, there's not even a
possibility of qemu (or other hypervisor) indicating any kind of error
when someone specifies e.g. <graphics type='sdl' 'keymap='blah'
.../>.
Similarly, if someone were to add "tlsPort='5910' to a <graphics> of
any
type other than spice. Of course, from the point of view of all the code
associated with <graphics type='vnc' ... />, this is no different than
if someone had added "frozzle='fib'" - it's totally unexpected (in
this
case by any type), so it's ignored.
Actually, I can see now there are several different classes of this
problem. Here are the first few that come to mind:
1) an attribute/element is completely unknown/unexpected in all cases
(e.g. "frozzle='fib'" anywhere, or more insidious, something that
*looks* correct, but isn't, e.g. "<script
name='/path/to/script'/>"*)
2) an attribute/element is useful/expected only when some other
attribute is set to a particular value (usually one called "type", but
could be something else), for example keymap='blah' is only expected in
a <graphics> element when type='spice' or type='vnc'.
3) an attribute/element is useful/expected only for certain combinations
of the value of some other attribute and which driver is using the
element, e.g. the subject of this bug - script='blah' is only expected
when type='bridge' and it's used by the Xen driver, or type='ethernet'
and it's used by the qemu driver.
So what are the rules of engagement for these various cases? When do we
ignore, when do we log an error during parsing, and when do we log an
error in the code that's using the parsed data?
(*Another example of (1) from recent real-life - during testing of
https://bugzilla.redhat.com/show_bug.cgi?id=727982, a QA person claimed
that the bugt hadn't been fixed because the hosts file wasn't being
updated; it turned out that they were adding the new <host> element at
the toplevel under <network> rather than inside the <dhcp> subelement,
so the parser was ignoring it (since it's unrecognized at that level)
rather than logging an error).