The problem is the mutex lock on xenUnifiedPrivatePtr, which is held around the
call to xenDomainUsedCpus:
xenUnifiedDomainGetXMLDesc
  ...
  xenUnifiedLock(priv);
  cpus = xenDomainUsedCpus(dom);
  xenUnifiedUnlock(priv);
  ...
Unfortunately, the introduction of virDomainDefPtr added the following call paths:
xenDomainUsedCpus
  ...
  nb_vcpu = xenUnifiedDomainGetMaxVcpus(dom);
    return xenUnifiedDomainGetVcpusFlags(...)
      ...
      if (!(def = xenGetDomainDefForDom(dom)))
        return xenGetDomainDefForUUID(dom->conn, dom->uuid);
          ...
          ret = xenHypervisorLookupDomainByUUID(conn, uuid);
            ...
            xenUnifiedLock(priv);
            name = xenStoreDomainGetName(conn, id);
            xenUnifiedUnlock(priv);
  ...
  if ((ncpus = xenUnifiedDomainGetVcpus(dom, cpuinfo, nb_vcpu,
    ...
    if (!(def = xenGetDomainDefForDom(dom)))
      [again like above]
Right now, running the GetXMLDesc command for an active Xen domain locks up
inside the xenUnifiedDomainGetMaxVcpus call, because that path tries to take the
xenUnifiedPrivatePtr lock again while the caller already holds it. Any other
subcall that reaches xenGetDomainDefForDom while the xenUnifiedPrivatePtr lock
is held will meet the same fate.
I assume the lock around the xenDomainUsedCpus call is there to ensure all
accesses to the private pointer see consistent data. Otherwise it would be
possible to simply release the lock before the GetMaxVcpus and GetVcpus calls.
If that lock cannot be dropped, it feels like a much more painful rework is needed.
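
For discussion, a rough sketch of the "release the lock first" option. The
sketch_* names and the nvcpus field are made up for illustration and do not
correspond to real driver code. The idea is that calls which lock internally
run without the lock held, and the lock only covers direct access to the
private data:

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical private data; nvcpus stands in for whatever state the
 * real lock protects. */
struct sketch_priv {
    pthread_mutex_t lock;
    int nvcpus;
};

/* Stand-in for a call like GetMaxVcpus that takes the lock internally. */
static int sketch_get_max_vcpus(struct sketch_priv *priv)
{
    int n;

    assert(priv != NULL);
    pthread_mutex_lock(&priv->lock);
    n = priv->nvcpus;
    pthread_mutex_unlock(&priv->lock);
    return n;
}

/* Stand-in for xenDomainUsedCpus, restructured so the caller does NOT
 * hold the lock on entry. */
static int sketch_used_cpus(struct sketch_priv *priv)
{
    int nb_vcpu;
    int used;

    assert(priv != NULL);

    /* No lock held here, so the nested lock inside is safe. */
    nb_vcpu = sketch_get_max_vcpus(priv);
    if (nb_vcpu < 0)
        return -1;

    /* Take the lock only around direct access to the private data.
     * Note that nvcpus may have changed since the call above. */
    pthread_mutex_lock(&priv->lock);
    used = nb_vcpu <= priv->nvcpus ? nb_vcpu : priv->nvcpus;
    pthread_mutex_unlock(&priv->lock);

    return used;
}
```

The window between the two locked regions is exactly the consistency concern
mentioned above, so this would only be acceptable if nothing relies on the
private data staying unchanged across the whole call.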
What do others think?
-Stefan