virt-install has some code which waits for a domain to appear just after
it has been created. The loop is attached at the end of this email. It
works, but it has two problems.
Problem (1) is that self.conn.lookupByName doesn't distinguish between a
"Not found" domain and an actual error. For example there is no way to
tell the difference between being unable to contact xend (an actual
error), and being able to contact xend, but xend not being able to find
the domain (not found).
As shown here:

  >>> import libvirt
  >>> conn = libvirt.open ("xen+tls:///")
  >>> d = conn.lookupByName ("Domain-0")
  >>> d = conn.lookupByName ("doesnotexist")
  [...]
  libvirt.libvirtError: virDomainLookupByName() failed

Then I deliberately kill the remote daemon:

  >>> d = conn.lookupByName ("doesnotexist")
  libvir: Remote error : Error in the push function.
  [...]

The first exception is a Not found condition (not an error) whereas the
second is an error.

Problem (2) is that virterror is overanxious to print error messages to
stderr, even if the caller can handle them and even if (as in the Not
found case) they don't indicate errors. In practical terms this means
that the virt-install loop attached below may print 1 or 2 error
messages even when it is functioning normally. You'll see an error like
this appearing [sic]:

  libvir: Xen Daemon error : GET operation failed:

Since it's difficult to change the LookupBy* functions without changing
the ABI, I suspect that the best thing to do is going to be to add a new
call with better semantics. Therefore I suggest:

  virDomainPtr
  virDomainLookup (virConnectPtr conn, int flags,
                   int id, char *str, int *error);

where flags is one of:

  VIR_LOOKUP_BY_ID, VIR_LOOKUP_BY_NAME, VIR_LOOKUP_BY_UUID
  or VIR_LOOKUP_BY_UUID_STRING

The return values are:

  ret = domain, *error = 0  =>  found it
  ret = NULL,   *error = 0  =>  not found
  ret = NULL,   *error = 1  =>  error (check virterror)

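
To make the three outcomes concrete, here is a minimal Python model of the return convention above. None of this is libvirt code: FakeConnection, domain_lookup, describe and the LOOKUP_BY_NAME constant are hypothetical stand-ins, and the (ret, error) tuple stands in for the C return value plus the int *error out-parameter.

```python
# Hypothetical model of the proposed semantics; not the libvirt API.
LOOKUP_BY_NAME = 1  # stand-in for VIR_LOOKUP_BY_NAME


class FakeConnection:
    """Stand-in for a hypervisor connection, for illustration only."""

    def __init__(self, domains, reachable=True):
        self.domains = dict(domains)
        self.reachable = reachable


def domain_lookup(conn, flags, name):
    """Return (domain, error), mirroring ret and *error in the proposal."""
    if flags != LOOKUP_BY_NAME:
        return None, 1  # this sketch only models lookup by name
    if not conn.reachable:
        return None, 1  # real error: the daemon could not be contacted
    # Found, or not found; neither case counts as an error.
    return conn.domains.get(name), 0


def describe(conn, name):
    """Classify a lookup the way a caller of the proposed call would."""
    dom, error = domain_lookup(conn, LOOKUP_BY_NAME, name)
    if error:
        return "error"
    return "found" if dom is not None else "not found"
```

The point of the model is that "not found" and "error" take different paths through the caller, which the existing LookupBy* calls cannot offer.
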
Addition 1: There would be a similar function virNetworkLookup, but
without needing the 'id' parameter because networks don't have IDs.
Addition 2: Change the driver internals so that they don't call
virterror in the not found case. (This requires quite a bit of
rejigging in xend_internal, but is not too hard).
Addition 3: Language bindings could be modified to detect this function
and if present change their existing LookupBy* functions to use the new
interface.
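
Addition 3 might look roughly like this in a Python binding. This is only a sketch under assumptions: OldLibrary and NewLibrary are stubs standing in for the C library with and without the new call, and LookupFailed, DomainNotFound, domain_lookup and lookup_by_name are hypothetical names, not existing binding API. Raising a subclass for "not found" is one possible design choice; it keeps existing except clauses working while letting new callers tell the two cases apart.

```python
class LookupFailed(Exception):
    """Hypothetical exception a binding raises when a lookup fails."""


class DomainNotFound(LookupFailed):
    """Hypothetical subclass: the lookup worked, but no such domain."""


class OldLibrary:
    """Stub for a library that only has the existing lookup call."""

    def lookup_by_name(self, name):
        if name != "Domain-0":
            # The old call cannot say why it failed.
            raise LookupFailed("virDomainLookupByName() failed")
        return "Domain-0"


class NewLibrary(OldLibrary):
    """Stub for a library that also exports the proposed call."""

    def domain_lookup(self, name):
        # Returns (ret, error); (None, 0) means "not found".
        return ("Domain-0", 0) if name == "Domain-0" else (None, 0)


def lookup_by_name(lib, name):
    """Binding-level lookup that prefers the new call when it is present."""
    new_call = getattr(lib, "domain_lookup", None)
    if new_call is None:
        return lib.lookup_by_name(name)  # old behaviour, unchanged
    dom, error = new_call(name)
    if error:
        raise LookupFailed("lookup failed, check virterror")
    if dom is None:
        raise DomainNotFound(name)
    return dom
```
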
Thoughts?
Rich.
----------------------------------------------------
This is the troublesome loop:

    logging.debug("Created guest, looking to see if it is running")
    # sleep in .25 second increments until either a) we find
    # our domain or b) it's been 5 seconds.  this is so that
    # we can try to gracefully handle domain creation failures
    num = 0
    d = None
    while num < (5 / .25): # 5 seconds, .25 second sleeps
        try:
            d = self.conn.lookupByName(self.name)
            break
        except libvirt.libvirtError as e:
            # cannot tell "not found" apart from a real error here
            logging.debug("No guest running yet " + str(e))
        num += 1
        time.sleep(0.25)

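
For comparison, here is a sketch of how the loop could read if a lookup reported "not found" separately from real errors. It assumes a hypothetical lookup callable that returns None when the domain does not exist and raises on a genuine failure; wait_for_domain and its parameters are illustrative names, not virt-install or libvirt code.

```python
import time


def wait_for_domain(lookup, name, timeout=5.0, interval=0.25,
                    sleep=time.sleep):
    """Poll lookup(name) until it returns a domain or the timeout passes.

    lookup is assumed to return None for "not found" and to raise for
    real errors. Real errors are deliberately not caught here, so they
    reach the caller at once instead of being retried for 5 seconds.
    """
    attempts = int(timeout / interval)  # 20 tries with the defaults
    for _ in range(attempts):
        dom = lookup(name)
        if dom is not None:
            return dom
        sleep(interval)  # not found yet: wait and try again
    return None  # the domain never appeared
```

With this shape nothing needs to be logged or printed while the domain is merely not there yet, and a dead daemon is no longer mistaken for a slow guest.
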
--
Emerging Technologies, Red Hat -
http://et.redhat.com/~rjones/