On Tue, Sep 22, 2015 at 01:48:01PM +0100, Daniel P. Berrange wrote:
On Tue, Sep 22, 2015 at 02:15:53PM +0200, Martin Kletzander wrote:
> In order for the user to be able to fix broken domains function
> qemuDomainGetXMLDesc() needs to be able to lookup invalid domain
> definitions and handle them properly. When redefined, function
> qemuDomainDefineXMLFlags() must clear the 'invalid XML' reason. As a
> nice addition, qemuDomainGetState() can lookup such domains without any
> other change and that allows virsh not only to get their status, but
> also to list them.
Hmm, that's an interesting approach to the problem. I wonder though
if we could do things slightly differently such that we don't need
to change so many APIs.
eg, just have a 'bool error' field in virDomainDefPtr. When loading
the XML fails, populate a virDomainObjPtr/DefPtr as normal, but set
the error field. Now we merely need to change the qemuDomainStart
method, so it refuses to launch a VM with the 'error' flag set. All
the other APIs could be essentially unchanged. Sure it would not
be useful to allow things like virDomainAttachDevice, etc on such
broken domains, but for sake of simplicity we can avoid touching
all the methods except start.
To be honest, I'm a afraid that we might forget some API that needs to
be blocked as well. And we would have to go through all the APIs just
to see whether it accesses something that might be missing. Moreover,
how would you decide what to skip at each error during parsing? If,
for example, the <numatune> has some faulty attribute in one
subelement should we skip all the elements or just the one or just
skip that one particular attribute? We would also not format the
faulty attribute to the XML being dumped, so the user wouldn't see
what's missing, which is even worse when there are two problems and
they fix only one, so we skip the other one.
I thought about your approach as well, but that would require complete
rewrite of the whole parsing code (maybe changing it to generated one
based on RNG schema and some minor additional info) and it would touch
way more APIs. The approach I'm suggesting here keeps almost all APIs
untouched and it doesn't have an effect on anything, unless explicitly
specified.
Check last two patches to see how much qemu driver needs to be
changed, for others it will be similar, probably a little bit less..
The patches are 17+/9- when combined. I don't think you can do much
better than that. And that doesn't show the improvements that can be
made in the starting and parsing code after this series is applied.