Thanks for your valuable comments and suggestions, I will separate it to
three patches and send later.
On 2017/1/5 8:35, John Ferlan wrote:
Going through some older on list patches with no activity... There were
two which appeared to be the same thing, so I chose the later one.
Pardon the delay in processing this...
On 11/23/2016 11:29 PM, Wang King wrote:
> From: wangjing <king.wang(a)huawei.com>
>
> The virCloseCallbacksSet method increase object reference for
> VM, and decrease object reference in virCloseCallbacksUnset.
> But VM UUID will be deleted from closecallbacks list in
> virCloseCallbacksRun when connection disconnected, and then
> object reference cannot be decreased by virCloseCallbacksUnset
> in callback functions.
>
> Signed-off-by: Wang King <king.wang(a)huawei.com>
> ---
> src/util/virclosecallbacks.c | 29 +++++++++++++++++------------
> 1 file changed, 17 insertions(+), 12 deletions(-)
>
It seems there may be multiple things being fixed - if so then they
should each be separate patches.
Of the following paths (consumers of close callbacks), which was causing
the refcnt issue? Do you have a backtrace? How reproducible?
1. bhyveProcessAutoDestroy will destroy the transient vm, return NULL
2. lxcProcessAutoDestroy will destroy the transient vm, return NULL
3. qemuMigrationCleanup will not destroy the vm
4. qemuProcessAutoDestroy will call qemuDomainRemoveInactive which looks
to destroy the transient vm... if it does then last virDomainObjEndAPI
will cause the return to be NULL
> diff --git a/src/util/virclosecallbacks.c b/src/util/virclosecallbacks.c
> index 891a92b..26d5075 100644
> --- a/src/util/virclosecallbacks.c
> +++ b/src/util/virclosecallbacks.c
> @@ -300,7 +300,9 @@ virCloseCallbacksGetForConn(virCloseCallbacksPtr closeCallbacks,
> data.list = list;
> data.oom = false;
>
> + virObjectLock(closeCallbacks);
> virHashForEach(closeCallbacks->list, virCloseCallbacksGetOne, &data);
> + virObjectUnlock(closeCallbacks);
So this seems to be "a" bugfix for [1]
>
> if (data.oom) {
> VIR_FREE(list->entries);
> @@ -329,22 +331,15 @@ virCloseCallbacksRun(virCloseCallbacksPtr closeCallbacks,
> * them all from the hash. At that point we can release
> * the lock and run the callbacks safely. */
>
> - virObjectLock(closeCallbacks);
> list = virCloseCallbacksGetForConn(closeCallbacks, conn);
> if (!list)
> return;
[1]
Considering the former code, this return would return with the
closeCallbacks locked - so that's "one" bug...
>
> for (i = 0; i < list->nentries; i++) {
> - char uuidstr[VIR_UUID_STRING_BUFLEN];
> - virUUIDFormat(list->entries[i].uuid, uuidstr);
> - virHashRemoveEntry(closeCallbacks->list, uuidstr);
> - }
I'm trying to figure out why this hunk of code moved below. Is that a
separate issue? I would think the process in the comment is what is
desired... With a lock, grab all the callback entries into a local
structure by UUID and callback function, then clear out the callbacks
from the hash, then call the callback functions.
My concern is this change could have some other timing issue as this
path is called during/from a <driver>ConnectClose() function (bhyve,
qemu, lxc)
> - virObjectUnlock(closeCallbacks);
> -
> - for (i = 0; i < list->nentries; i++) {
> virDomainObjPtr vm;
> + virDomainObjPtr dom;
>
> - if (!(vm = virDomainObjListFindByUUID(domains,
> + if (!(vm = virDomainObjListFindByUUIDRef(domains,
> list->entries[i].uuid))) {
The second line needs to move 3 spaces to line up under domains... I
think this is the "second" bug fix. In this case, the bug fix is that we
don't want the callback function to remove our reference.
> char uuidstr[VIR_UUID_STRING_BUFLEN];
> virUUIDFormat(list->entries[i].uuid, uuidstr);
> @@ -352,10 +347,20 @@ virCloseCallbacksRun(virCloseCallbacksPtr closeCallbacks,
> continue;
> }
>
> - vm = list->entries[i].callback(vm, conn, opaque);
> - if (vm)
> - virObjectUnlock(vm);
> + dom = list->entries[i].callback(vm, conn, opaque);
> + if (dom)
> + virObjectUnlock(dom);
> + virObjectUnref(vm);
and the corollary for the Ref which probably causes 'vm' to be disposed.
This is more than likely what the issue is - calling something which
causes vm to ref down to 0.
John
> }
> +
> + virObjectLock(closeCallbacks);
> + for (i = 0; i < list->nentries; i++) {
> + char uuidstr[VIR_UUID_STRING_BUFLEN];
> + virUUIDFormat(list->entries[i].uuid, uuidstr);
> + virHashRemoveEntry(closeCallbacks->list, uuidstr);
> + }
> + virObjectUnlock(closeCallbacks);
> +
> VIR_FREE(list->entries);
> VIR_FREE(list);
> }
>
.