On 04.08.2016 14:37, weifuqiang wrote:
The reason of this problem is that fdstream abort event or close
event occured at the same time, libvirtd doesn't deal with the synchronousness well
enough.
the flows about fdStream is bellow
1、qemuDomainDefineXMLFlags -> virDomainObjListAdd -> qemuDomainObjPrivateAlloc
-> virChrdevAlloc -> virHashCreate
2、qemuDomainOpenConsole -> virChrdevOpen -> virHashAddEntry(devs->hash, path,
st)
3、virDomainObjDispose -> privateDataFreeFunc (qemuDomainObjPrivateFree) - >
virChrdevFree(*dev locked*) -> virChrdevFreeClearCallbacks - >
virFDStreamSetInternalCloseCb(*fdst locked*)
4、virFDStreamCloseInt (*fdst locked*) -> icbFreeOpaque(virChrdevFDStreamCloseCb (*dev
locked*)) -> virHashRemoveEntry
Thank you for your very detailed description of the problem.
The AB lock problem is obviouse: in step 3, it locks chardev before fdst, and in step 4,
it's the opsite way.
The reason of libvirtd crash is that: in virFDStreamCloseInt function we set fdst to
NULL, while in virFDStreamSetInternalCloseCb we use fdst->lock, note that fdst has
already been freed.
another crash problem occurs because that when virChrdevFree was earlier finished,
dev->hash freed and all date of hash is freed, but fdstream event flow use fdStream
after hash free. Ahha~~, libvirtd coredump.
All of those problem is because clear vm flow and fdStream flow concurs synchronously.
then I fix this problem by modify virChrdevFree()
void virChrdevFree(virChrdevsPtr devs)
{
if (!devs || !devs->hash)
return;
for (;;) {
virMutexLock(&devs->lock);
if (0 == virHashSize(devs->hash)) {
virMutexUnlock(&devs->lock);
break;
}
virMutexUnlock(&devs->lock);
usleep(10 * 1000);
}
virMutexLock(&devs->lock);
virHashFree(devs->hash);
virMutexUnlock(&devs->lock);
virMutexDestroy(&devs->lock);
VIR_FREE(devs);
Usually, a race fix that contains usleep() is not really a fix.
}
If the chardev is removed by fdStream close or fdStream abort when vm destroy or
shutdown, the modification works well. But I'm not sure is that all chardev would be
removed when we clear vm, if not, it would always sleep here.
Another solution is as follows:
virMutexLock(vm);
virChrdevFree();
virMutexUnlock(vm);
virMutexLock(vm);
virFDStreamCloseInt();
virMutexUnlock(vm);
I like this one more.
However, what I'm thinking is:
1) looks like when qemuDomainObjPrivateFree() is called, it's a lost
game thereafter. I mean, qemuDomainObjPrivateFree() callsirChrdevFree()
which destroys the virChrdevs::mutex. Any later attempts to lock it will
result in crash (or something similarly nasty). Therefore I think in
virChrdevFree() we should unregister the stream close callbacks (icbCb
and icbFreeOpaque). We are doing the cleanup anyway.
2) I'm wondering whether it is necessary to have the close callbacks
(icbCb and icbFreeOpaque) guarded with stream mutex locked. Looks to me
like we could unlock the stream while the callback are running and then
lock it back again.
Michal