On Tue, Sep 22, 2015 at 02:36:54PM +0100, Daniel P. Berrange wrote:
On Tue, Sep 22, 2015 at 03:26:26PM +0200, Martin Kletzander wrote:
> On Tue, Sep 22, 2015 at 02:17:14PM +0100, Daniel P. Berrange wrote:
> >On Tue, Sep 22, 2015 at 03:12:14PM +0200, Martin Kletzander wrote:
> >>On Tue, Sep 22, 2015 at 01:48:01PM +0100, Daniel P. Berrange wrote:
> >>>On Tue, Sep 22, 2015 at 02:15:53PM +0200, Martin Kletzander wrote:
> >>>>In order for the user to be able to fix broken domains function
> >>>>qemuDomainGetXMLDesc() needs to be able to lookup invalid domain
> >>>>definitions and handle them properly. When redefined, function
> >>>>qemuDomainDefineXMLFlags() must clear the 'invalid XML'
reason. As a
> >>>>nice addition, qemuDomainGetState() can lookup such domains without
any
> >>>>other change and that allows virsh not only to get their status, but
> >>>>also to list them.
> >>>
> >>>Hmm, that's an interesting approach to the problem. I wonder though
> >>>if we could do things slightly differently such that we don't need
> >>>to change so many APIs.
> >>>
> >>>eg, just have a 'bool error' field in virDomainDefPtr. When
loading
> >>>the XML fails, populate a virDomainObjPtr/DefPtr as normal, but set
> >>>the error field. Now we merely need to change the qemuDomainStart
> >>>method, so it refuses to launch a VM with the 'error' flag set.
All
> >>>the other APIs could be essentially unchanged. Sure it would not
> >>>be useful to allow things like virDomainAttachDevice, etc on such
> >>>broken domains, but for sake of simplicity we can avoid touching
> >>>all the methods except start.
> >>>
> >>
> >>To be honest, I'm a afraid that we might forget some API that needs to
> >>be blocked as well. And we would have to go through all the APIs just
> >>to see whether it accesses something that might be missing. Moreover,
> >>how would you decide what to skip at each error during parsing? If,
> >>for example, the <numatune> has some faulty attribute in one
> >>subelement should we skip all the elements or just the one or just
> >>skip that one particular attribute? We would also not format the
> >>faulty attribute to the XML being dumped, so the user wouldn't see
> >>what's missing, which is even worse when there are two problems and
> >>they fix only one, so we skip the other one.
> >
> >Oh, I wasn't suggesting changing the parser. I just mean that we would
> >create a virDomainDefPtr instance which only contains the name and
> >UUID, and ID field (if the guest is running from domain status XML).
> >
> >We'd leave all the rest of the config blank - our code ought to be
> >able to deal with that, since essentially all config is optional
> >aside from the name/uuid anyway.
> >
>
> Then probably I misunderstood because that's exactly what this series
> is doing.
Right, but I mean without converting all the drivers to use this new
qemuDomObjFromDomainInvalid() method. I'm suggesting just having a
check in the start method only.
I don't think virDomainDef with only name, id and uuid set would not
cause problems, but to be honest I haven't tried that. I just
remember how sometimes we rely on the fact that whatever we have in
virDomainDef went through parsing code. I believe that if we remove
the safeguard and only add a check to *CreateXML*() functions, we'll
find some flaws. Mainly information gathering functions that work
with enums that have no default values (e.g. virDomainVirtType).
Well, even without those, we will report false information. No
pinning, no tuning, everything would return success with all
information lost. So we'd add few checks here and there. Then we'd
realize that (offline) migration will fail weirdly since
qemuMigrationIsAllowed() will return true and the first thing we will
do after that is formatting the XML. We'd probably define way
different domain on the destination if that worked, I'm pretty sure
parsing would fail there. Anyway, I think we'll end up adding more
explicit checks to the code in your case. Way more than three calls
to a differently named function per driver.
Also, this can lead to a bigger cleanup, if we do some wrapper around
our APIs, we might move ACL checks, domain object gathering, job
creation and common API cleanups (*EndAPI()) and all other stuff in
it, so it won't be visible anyway. But that's a future that won't
happen until I have way to much free time.