On Tue, May 20, 2008 at 10:11:57AM -0400, Cole Robinson wrote:
Richard W.M. Jones wrote:
> On Tue, May 20, 2008 at 09:57:02AM -0400, Cole Robinson wrote:
>> I'll try this again from the original unpatched state and try to
>> figure out exactly what was happening.
>
> Do you trigger an error at any point in the testing? You could try
> doing that (eg. try migrating a QEMU domain).
Yes the whole time I am triggering errors, as I am trying to shutdown
or destroy an already shutdown domain. That's the interesting part:
once destroy errors once, even subsequent shutdown errors start leaking.
Ok, so the problem is a very complex one and here's what's going on....
- Storing the domain/network objects in the virError struct does not
increment the reference count.
- When a domain ref count drops to zero, the virReleaseDomain internal
method will null-ify the dom/net field in any virError struct still
refering to them.
- In the remote daemon the paradigm for the helpers is usually
if (virDomainCreate (dom) == -1) {
virDomainFree(dom);
return -1;
}
virDomainFree(dom);
return 0;
Both of those virDomainFree calls will release the last reference
and thus, even if virDomainCreate() set the 'dom' field in the
virError object it will now get nullified.
- The remote daemon code which serializes the error object runs *after*
that snippet I show above.
Thus in the *absence* of memory leaks in the remote daemon helper methods
it is *impossible* for the 'dom' or 'net' fields to ever be set in the
error object serialized on the wire.
So there *is* a memory leak in server_error() on the client end, but it
is impossible to trigger it unless there is also a memory leak in the
server end :-) Now the virDoaminDestroy() method helper does currently
have a memory leak, hence why you saw the corresponding leak on the client
side. Fixing the server side leak, hid the client side leak. Also, once
a leak occurs on the server end, it will live forever, so once one method
leaks, you'll see the client side leaks for all other methods which trigger
an error too.
So we *do* need the patch to the server_error() method - at very least
for old daemons which will set the dom/net fields in the errors they
return.
We also ought to fix the daemon so that it correctly captures and
serializes the error objects. This is a very non-trivial fix, since
the code serializing the errors is running far too late. We need to
capture the error much earlier, which will probably require quite a
large change.
So I vote for applying all Cole's patches which do indeed fix a number
of memory leaks. Fixing the daemon to correctly serialize errors with
a dom/net object can be done later unless someone has burning desire
to tackle it now
Dan.
--
|: Red Hat, Engineering, Boston -o-
http://people.redhat.com/berrange/ :|
|:
http://libvirt.org -o-
http://virt-manager.org -o-
http://ovirt.org :|
|:
http://autobuild.org -o-
http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|