On Tue, Jul 23, 2024 at 13:51:09 +0100, Daniel P. Berrangé wrote:
> On Tue, Jul 23, 2024 at 02:41:13PM +0200, Peter Krempa wrote:
> > On Fri, Jul 19, 2024 at 14:46:49 +0200, Michal Privoznik wrote:
> > > Imagine two threads. Thread A is executing qemuProcessStop() and
> > > thread B is executing qemuDomainCreateXML(). To make things more
> > > interesting, the domain XML passed to qemuDomainCreateXML() matches
> > > the name and UUID of the virDomainObj that qemuProcessStop() is
> > > currently working with. Here's what happens (a simplified code
> > > sketch follows the list):
> > >
> > > 1) Thread A locks @vm and enters qemuProcessStop().
> > >
> > > 2) Thread B parses the given XML and calls virDomainObjListAdd() ->
> > >    virDomainObjListAddLocked() -> virDomainObjListFindByUUIDLocked().
> > >    Since there's a match on UUID, a virDomainObj object is returned
> > >    and the thread proceeds to call virObjectLock(). NB, it's the
> > >    same object as in thread A.
> > >
> > > 3) Thread A sets vm->def->id = -1; from this point on,
> > >    virDomainObjIsActive() will return false.
> > >
> > > 4) Thread A calls qemuDomainObjStopWorker() which unlocks the
> > >    @vm.
> > >
> > > 5) Thread B acquires the @vm lock and, since
> > >    virDomainObjIsActive() returns false, proceeds to call
> > >    virDomainObjAssignDef(), where vm->def is replaced.
> > >
> > > 6) Thread B then calls qemuProcessBeginJob() which unlocks the
> > >    @vm temporarily.
> > >
> > > 7) Thread A, still in qemuDomainObjStopWorker(), re-acquires the
> > >    @vm lock and proceeds with cleanup.
> > >
> > > 8) Thread A finds a different definition than the one needing
> > >    cleanup.
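> > >
> > > To make the window concrete, here's a minimal sketch of the
> > > ordering inside qemuProcessStop() (simplified pseudocode
> > > paraphrasing the flow above, not the exact libvirt source):
> > >
> > >     void qemuProcessStop(...)
> > >     {
> > >         ...
> > >         vm->def->id = -1;            /* step 3: VM now reads as
> > >                                       * inactive */
> > >
> > >         qemuDomainObjStopWorker(vm); /* steps 4 and 7: unlocks @vm,
> > >                                       * joins the per-VM worker
> > >                                       * thread, then re-locks @vm;
> > >                                       * thread B can run in that
> > >                                       * window and replace vm->def
> > >                                       * via virDomainObjAssignDef() */
> > >
> > >         /* step 8: the cleanup below may now operate on a different
> > >          * vm->def than the one the domain was started with */
> > >         ...
> > >     }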
> > >
> > > In my testing I've seen stale pointers, e.g.
> > > vm->def->nets[0]->priv was NULL, which led to a SIGSEGV on the
> > > 'QEMU_DOMAIN_NETWORK_PRIVATE(net)->created' line when
> > > cleaning up nets. Your mileage may vary.
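> > >
> > > Roughly why that line crashes (a sketch; the macro is essentially
> > > a cast of the net's private-data pointer):
> > >
> > >     /* with the net's private data NULL, this dereferences NULL */
> > >     if (QEMU_DOMAIN_NETWORK_PRIVATE(net)->created)
> > >         ...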
> > >
> > > Even if we did not crash, the plain fact that vm->def is changed
> > > in the middle of qemuProcessStop() means we might be cleaning up
> > > something other than what was intended.
> >
> > This paragraph is the important bit. The root cause of the problem here
> > is that 'virDomainObjListAdd' inside 'qemuDomainCreateXML' can modify
> > 'vm->def' without holding any _MODIFY-type JOB on the domain object,
> > which we normally require for any modification of 'vm->def'-related
> > data.
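> >
> > For reference, the expectation is roughly this pattern (a sketch
> > only; the begin-job API and job-constant names have varied across
> > libvirt versions):
> >
> >     /* acquire a MODIFY job before touching vm->def */
> >     if (virDomainObjBeginJob(vm, VIR_JOB_MODIFY) < 0)
> >         return -1;
> >
> >     /* ... only now is it safe to modify vm->def ... */
> >
> >     virDomainObjEndJob(vm);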
> >
> > This wasn't a problem until now, as we've relinquished the lock on @vm
> > only in situations when the @vm object was considered live:
> >
> > 1) Before the per-VM thread cleanup was added to qemuProcessStop(),
> >    it never unlocked @vm.
> > 2) After the per-VM thread cleanup was added, the unlock was done
> >    prior to setting vm->def->id to '-1'.
> > 3) All other cases unlock @vm only while the VM is live.
> >
> > > As a fix, I'm moving all lines that obviously touch vm->def to
> > > before the point where the domain object is unlocked. That left
> > > virHookCall(VIR_HOOK_QEMU_OP_STOPPED, VIR_HOOK_SUBOP_END) nearly
> > > adjacent to virHookCall(VIR_HOOK_QEMU_OP_RELEASE,
> > > VIR_HOOK_SUBOP_END), which I figured is not something we want. So
> > > I've shuffled things a bit more.
> >
> > This feels like a fix for the symptoms of 'virDomainObjListAdd' not
> > honouring the _MODIFY-type job expectation; we're shuffling code
> > around so that it no longer relies on the broken expectation.
> >
> > Since I don't currently have a better idea of how to fix this, I'm okay
> > with this patch given the following conditions:
> >
> > > Fixes: 3865410e7f67ca4ec66e9a905e75f452762a97f0
> >
> > Explain that the above commit inverted the order of setting the VM as
> > inactive and unlocking, thus allowing the above sequence of events to
> > happen, and,
> Why not just revert that change? It claimed to be making things
> safer, but did the opposite. Even with this fixup below I'm
> pretty uncomfortable with setting 'id = -1' and unlocking the
> @vm before we've done all our cleanup.
You'd have to also revert d29e0f3d4a5362d7b33261df1e55896396707de3, which
is the commit that moved the unlock of the VM object inside
qemuProcessStop() to after setting id = -1, and which does fix a crash
on migration.
We could possibly move the qemuDomainObjStopWorker(vm) call to be the
last thing in qemuProcessStop(), so that everything is cleaned up before
unlocking.
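
Something along these lines (an untested sketch of the idea, not a
patch):

    void qemuProcessStop(...)
    {
        ...
        /* all cleanup touching vm->def stays here, with @vm locked
         * throughout, leaving no window for other threads */

        /* stopping the per-VM worker, which unlocks and re-locks @vm,
         * becomes the very last step */
        qemuDomainObjStopWorker(vm);
    }
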
Either way, unlocking inside qemuProcessStop() is fragile, as it opens a
window for races that were not possible before.