Quoting Daniel P. Berrange (berrange(a)redhat.com):
> On Wed, Sep 28, 2011 at 02:14:52PM -0500, Serge E. Hallyn wrote:
> > Nova (openstack) calls libvirt to create a container, then
> > periodically checks using GetInfo to see whether the container
> > is up. If it does this too quickly, then libvirt returns an
> > error, which in libvirt.py causes an exception to be raised,
> > the same type as if the container was bad.
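For reference, a minimal sketch of the create-then-poll pattern described above, written against the libvirt C API rather than Nova's actual Python code; the URI, XML, and polling interval are placeholders:

    #include <stdio.h>
    #include <unistd.h>
    #include <libvirt/libvirt.h>
    #include <libvirt/virterror.h>

    int main(void)
    {
        virConnectPtr conn = virConnectOpen("lxc:///");   /* placeholder URI */
        if (!conn)
            return 1;

        /* "xml" stands in for the real container definition. */
        const char *xml = "<domain type='lxc'>...</domain>";
        virDomainPtr dom = virDomainCreateXML(conn, xml, 0);
        if (!dom) {
            virConnectClose(conn);
            return 1;
        }

        /* Poll until the container reports as running. */
        for (;;) {
            virDomainInfo info;
            if (virDomainGetInfo(dom, &info) < 0) {
                /* This failure is what libvirt.py turns into the exception
                 * Nova sees, indistinguishable from a broken container. */
                virErrorPtr err = virGetLastError();
                fprintf(stderr, "GetInfo failed: %s\n",
                        err && err->message ? err->message : "unknown error");
                break;
            }
            if (info.state == VIR_DOMAIN_RUNNING)
                break;
            sleep(1);
        }

        virDomainFree(dom);
        virConnectClose(conn);
        return 0;
    }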
> lxcDomainGetInfo() holds a mutex on 'dom' for the duration of
> its execution. It checks virDomainObjIsActive() before
> trying to use the cgroups.
>
> lxcDomainStart() holds the mutex on 'dom' for the duration of
> its execution, and does not return until the container is running
> and the cgroups are present.
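A simplified, self-contained sketch of the locking pattern being described; every name and type below is invented for illustration and is not libvirt's actual code. The point is that "get info" and "start" take the same per-domain mutex, so a caller should never observe the domain as active while its cgroup is still missing:

    #include <pthread.h>
    #include <stdbool.h>
    #include <stdio.h>

    struct domain {
        pthread_mutex_t lock;
        int id;                 /* -1 while not running, like vm->def->id */
        bool cgroup_ready;      /* stands in for the real cgroup state    */
    };

    static bool domain_is_active(struct domain *vm)
    {
        return vm->id != -1;
    }

    /* Analogue of lxcDomainStart(): only returns once the "cgroup" exists
     * and the domain has been marked active, all under the lock. */
    static void domain_start(struct domain *vm)
    {
        pthread_mutex_lock(&vm->lock);
        vm->cgroup_ready = true;
        vm->id = 1;
        pthread_mutex_unlock(&vm->lock);
    }

    /* Analogue of lxcDomainGetInfo(): the active check and the cgroup use
     * happen under the same lock, so start cannot slip in between them. */
    static int domain_get_info(struct domain *vm, unsigned long long *cputime)
    {
        int ret = -1;

        pthread_mutex_lock(&vm->lock);
        if (!domain_is_active(vm)) {
            *cputime = 0;                 /* not running: report zeros     */
            ret = 0;
        } else if (vm->cgroup_ready) {
            *cputime = 42;                /* would be read from the cgroup */
            ret = 0;
        }                                 /* active but no cgroup: error   */
        pthread_mutex_unlock(&vm->lock);

        return ret;
    }

    int main(void)
    {
        struct domain vm = { PTHREAD_MUTEX_INITIALIZER, -1, false };
        unsigned long long t;

        domain_get_info(&vm, &t);         /* before start: zeros, no error */
        domain_start(&vm);
        domain_get_info(&vm, &t);         /* after start: cgroup present   */
        printf("cputime %llu\n", t);
        return 0;
    }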
Yup, now that you mention it, I do see that. So this shouldn't be
happening. I can't explain it, but copious fprintf debugging still
suggests it is :)

Is it possible that vm->def->id is not being set to -1 when the domain
is first defined, and I'm catching it between define and start? I would
think that would show up as much more broken, though; on the other hand,
I'm not seeing where vm->def->id gets set to -1 during domain definition.

Well, I'll keep digging then. Thanks for setting me straight on
the mutex!
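To make the hypothesis concrete, a tiny hypothetical illustration of the window being speculated about. It assumes (as the question above does) that the "is active" test reduces to vm->def->id != -1, and that define might leave the id at some other value; neither point is verified here, which is exactly the open question:

    #include <stdio.h>

    struct def { int id; };                      /* stands in for vm->def */

    static int is_active(const struct def *d)
    {
        return d->id != -1;
    }

    int main(void)
    {
        struct def d = { 0 };   /* "defined" but never explicitly set to -1 */

        if (is_active(&d))
            printf("GetInfo would wrongly take the cgroup path before start\n");
        return 0;
    }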
> Similarly, when we delete the cgroups, we again hold the lock
> on 'dom'.
>
> Thus any time virDomainObjIsActive() returns true, AFAICT we have
> a guarantee that the cgroup does in fact exist.
>
> So I can't see how control gets to the 'else' branch of this
> condition if the cgroups don't exist as you describe:
>     if (!virDomainObjIsActive(vm) || driver->cgroup == NULL) {
>         info->cpuTime = 0;
>         info->memory = vm->def->mem.cur_balloon;
>     } else {
>         if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) != 0) {
>             lxcError(VIR_ERR_INTERNAL_ERROR,
>                      _("Unable to get cgroup for %s"), vm->def->name);
>             goto cleanup;
>         }
> What libvirt version were you seeing this behaviour with?
0.9.2
thanks,
-serge