On Wed, Jun 27, 2007 at 02:43:18PM +0100, Richard W.M. Jones wrote:
virt-install has some code which waits for a domain to appear just
after
it has been created. It looks like the loop attached to the end of this
email, and is functional but has two problems.
Problem (1) is that self.conn.lookupByName doesn't distinguish between a
"Not found" domain and an actual error. For example there is no way to
tell the difference between being unable to contact xend (an actual
error), and being able to contact xend, but xend not being able to find
the domain (not found).
It is possible to tell the difference, we just don't report it well.
As shown here:
>>> import libvirt
>>> conn = libvirt.open ("xen+tls:///")
>>> d = conn.lookupByName ("Domain-0")
>>> d = conn.lookupByName ("doesnotexist")
[...]
libvirt.libvirtError: virDomainLookupByName() failed
then I deliberately kill the remote daemon:
>>> d = conn.lookupByName ("doesnotexist")
libvir: Remote error : Error in the push function.
[...]
The first exception is a Not found condition (not an error) whereas the
second is an error.
There is no explicit libvirt error code for 'no such domain' so it
is basically impossible to catch this scenario in an app currently
This problem is actually not just an issue with virLookup* group of functions.
Basically any function which takes a virDomainPtr arg can have a 'no such
domain' error, since a domain could have gone away in the time since the
app got hold of a virDomainPtr object. The QEMU driver just throws a generic
'internal error', but I think its worth introducing an explicit 'no such
domain'
and 'no such network' error code and fixing all functions to report these
correctly.
Problem (2) is that virterror is over anxious to print error messages to
stderr, even if the caller can handle them and even if (as in the Not
found case) they don't indicate errors. In practical terms this means
that the virt-install loop attached below may print out 1 or 2 error
messages even when it is functioning normally. You'll see an error like
this appearing [sic]:
libvir: Xen Daemon error : GET operation failed:
In the python bindings all errors are converted into Exceptions, so
the python binding really shouldn't be printing out anything to the
console at all by default. It'll all be reported as exceptions. The
default error reporting func in libvirt is doing this so perhaps we
should register a no-op func.
Since it's difficult to change the LookupBy* functions without
changing
the ABI, I suspect that the best thing to do is going to be to add a new
call with better semantics. Therefore I suggest:
virDomainPtr *
virDomainLookup (virConnectPtr conn, int flags,
int id, char *str, int *error);
where flags is one of:
VIR_LOOKUP_BY_ID, VIR_LOOKUP_BY_NAME, VIR_LOOKUP_BY_UUID
or VIR_LOOKUP_BY_UUID_STRING
The return values are:
ret = domain, *error = 0 => found it
ret = NULL, *error = 0 => not found
ret = NULL, *error = 1 => error (check virterror)
I'd do it the other way around, returning the error code, and putting
the domain object into a parameter
int
virDomainLookup (virConnectPtr conn, int flags,
int id, char *str, virDomainPtr *dom);
That said, I'm not convinced we need this if we fix the error reporting
of the original functions to allow the 'no such domain' error to be
reliably caught & handled.
Dan,
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules:
http://search.cpan.org/~danberr/ -=|
|=- Projects:
http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|