Re: [Libvir] [PATCH] deep copy dom structure

On Fri, Mar 28, 2008 at 04:39:31PM +0100, Guido Guenther wrote:
--- src/virterror.c | 14 ++++++++++++++ 1 files changed, 14 insertions(+), 0 deletions(-)
diff --git a/src/virterror.c b/src/virterror.c index 1463129..49f3b89 100644 --- a/src/virterror.c +++ b/src/virterror.c @@ -115,6 +115,11 @@ virResetError(virErrorPtr err) free(err->str1); free(err->str2); free(err->str3); + if(err->dom) { + if(err->dom->name) + free(err->dom->name); + free(err->dom); + } memset(err, 0, sizeof(virError)); }
@@ -383,6 +388,15 @@ __virRaiseError(virConnectPtr conn, virDomainPtr dom, virNetworkPtr net, virResetError(to); to->conn = conn; to->dom = dom; + if (dom) { + if (!(to->dom = malloc(sizeof(virDomain)))) { + fprintf(stderr, "cannot allocate memory in %s", __FUNCTION__); + } else { + to->dom->name = strdup(dom->name); + memcpy (to->dom->uuid, dom->uuid, VIR_UUID_BUFLEN); + to->dom->id = dom->id; + } + } to->net = net; to->domain = domain; to->code = code;
What problem are you trying to solve? The 'conn', 'dom' and 'net' fields in a virterror can't be accessed safely, particularly from a garbage-collected language. The only safe thing one could do with them is a physical equality test on the pointer against an existing conn/dom/net pointer. But the solution isn't to copy the dom object, since in a GC-d language someone could grab a handle to the copied dom which would have a different lifetime from the error object. If anything the solution would be to remove those fields from the error object since they are highly unlikely to be useful in any real world application. We can't do this because of backwards compatibility promises but we should instead deprecate them. Rich. -- Richard Jones, Emerging Technologies, Red Hat http://et.redhat.com/~rjones virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://et.redhat.com/~rjones/virt-top

On Mon, Mar 31, 2008 at 11:00:56AM +0100, Richard W.M. Jones wrote:
What problem are you trying to solve?
AFAICT, the problem is that the Domain/Network objects are being free'd before the remote daemon has a chance to serialze the Error object into XDR format. So it is accessing free'd memory when serializing the dom/net field in the Error object & geting back garbage.
The 'conn', 'dom' and 'net' fields in a virterror can't be accessed safely, particularly from a garbage-collected language. The only safe thing one could do with them is a physical equality test on the pointer against an existing conn/dom/net pointer.
But the solution isn't to copy the dom object, since in a GC-d language someone could grab a handle to the copied dom which would have a different lifetime from the error object. If anything the solution would be to remove those fields from the error object since they are highly unlikely to be useful in any real world application. We can't do this because of backwards compatibility promises but we should instead deprecate them.
We should at the very least NULL-ify the dom/net fields in the Error object associated with the Connection when we free the Domain/Network object. We also probably need to re-arrange the remote daemon code a little so that it serializes to XDR format before the Domain/Network object are free'd. I agree that deep copying isn't the answer here & messes up the reference counting and will cause potential memory leaks. Dan. -- |: Red Hat, Engineering, Boston -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On Mon, Mar 31, 2008 at 01:02:49PM +0100, Daniel P. Berrange wrote:
On Mon, Mar 31, 2008 at 11:00:56AM +0100, Richard W.M. Jones wrote:
The 'conn', 'dom' and 'net' fields in a virterror can't be accessed safely, particularly from a garbage-collected language. The only safe thing one could do with them is a physical equality test on the pointer against an existing conn/dom/net pointer.
But the solution isn't to copy the dom object, since in a GC-d language someone could grab a handle to the copied dom which would have a different lifetime from the error object. If anything the solution would be to remove those fields from the error object since they are highly unlikely to be useful in any real world application. We can't do this because of backwards compatibility promises but we should instead deprecate them.
We should at the very least NULL-ify the dom/net fields in the Error object associated with the Connection when we free the Domain/Network object.
agreed, very simple test but avoids dandling pointers.
We also probably need to re-arrange the remote daemon code a little so that it serializes to XDR format before the Domain/Network object are free'd.
I agree that deep copying isn't the answer here & messes up the reference counting and will cause potential memory leaks.
yes, this will become unmanageable very fast i'm afraid. Daniel -- Red Hat Virtualization group http://redhat.com/virtualization/ Daniel Veillard | virtualization library http://libvirt.org/ veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

On Mon, Mar 31, 2008 at 08:12:19AM -0400, Daniel Veillard wrote:
On Mon, Mar 31, 2008 at 01:02:49PM +0100, Daniel P. Berrange wrote:
We should at the very least NULL-ify the dom/net fields in the Error object associated with the Connection when we free the Domain/Network object.
agreed, very simple test but avoids dandling pointers.
My patch for this. it's a bit more complex because we also keep a global variable for the last variable, and that needed to be exported to the hash module, it's still rather simple, Daniel -- Red Hat Virtualization group http://redhat.com/virtualization/ Daniel Veillard | virtualization library http://libvirt.org/ veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

On Mon, Mar 31, 2008 at 08:49:12AM -0400, Daniel Veillard wrote:
On Mon, Mar 31, 2008 at 08:12:19AM -0400, Daniel Veillard wrote:
On Mon, Mar 31, 2008 at 01:02:49PM +0100, Daniel P. Berrange wrote:
We should at the very least NULL-ify the dom/net fields in the Error object associated with the Connection when we free the Domain/Network object.
agreed, very simple test but avoids dandling pointers.
My patch for this. it's a bit more complex because we also keep a global variable for the last variable, and that needed to be exported to the hash module, it's still rather simple,
Yes, that's pretty much what I was thinking we should do - ACK from me. Dan. -- |: Red Hat, Engineering, Boston -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On Mon, Mar 31, 2008 at 02:35:31PM +0100, Daniel P. Berrange wrote:
On Mon, Mar 31, 2008 at 08:49:12AM -0400, Daniel Veillard wrote:
On Mon, Mar 31, 2008 at 08:12:19AM -0400, Daniel Veillard wrote:
On Mon, Mar 31, 2008 at 01:02:49PM +0100, Daniel P. Berrange wrote:
We should at the very least NULL-ify the dom/net fields in the Error object associated with the Connection when we free the Domain/Network object.
agreed, very simple test but avoids dandling pointers.
My patch for this. it's a bit more complex because we also keep a global variable for the last variable, and that needed to be exported to the hash module, it's still rather simple,
Yes, that's pretty much what I was thinking we should do - ACK from me.
thanks, commited to CVS, Daniel -- Red Hat Virtualization group http://redhat.com/virtualization/ Daniel Veillard | virtualization library http://libvirt.org/ veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

On Mon, Mar 31, 2008 at 08:49:12AM -0400, Daniel Veillard wrote:
On Mon, Mar 31, 2008 at 08:12:19AM -0400, Daniel Veillard wrote:
On Mon, Mar 31, 2008 at 01:02:49PM +0100, Daniel P. Berrange wrote:
We should at the very least NULL-ify the dom/net fields in the Error object associated with the Connection when we free the Domain/Network object.
agreed, very simple test but avoids dandling pointers.
My patch for this. it's a bit more complex because we also keep a global variable for the last variable, and that needed to be exported to the hash module, it's still rather simple,
Yes, ack from me too. I'm still going to deprecate these in the OCaml bindings, or maybe just change them so you can only do physical pointer comparisons. I just can't see a real life use for them ... You already know which call failed and from that what domain it was operating on, so except perhaps in the case of migration there is no new information here. Also I've not seen any libvirt program yet which does more than just print the error message and either abort or ignore it. Rich. -- Richard Jones, Emerging Technologies, Red Hat http://et.redhat.com/~rjones virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://et.redhat.com/~rjones/virt-top

On Mon, Mar 31, 2008 at 02:40:54PM +0100, Richard W.M. Jones wrote:
On Mon, Mar 31, 2008 at 08:49:12AM -0400, Daniel Veillard wrote:
On Mon, Mar 31, 2008 at 08:12:19AM -0400, Daniel Veillard wrote:
On Mon, Mar 31, 2008 at 01:02:49PM +0100, Daniel P. Berrange wrote:
We should at the very least NULL-ify the dom/net fields in the Error object associated with the Connection when we free the Domain/Network object.
agreed, very simple test but avoids dandling pointers.
My patch for this. it's a bit more complex because we also keep a global variable for the last variable, and that needed to be exported to the hash module, it's still rather simple,
Yes, ack from me too.
I'm still going to deprecate these in the OCaml bindings, or maybe just change them so you can only do physical pointer comparisons. I just can't see a real life use for them ... You already know which call failed and from that what domain it was operating on, so except perhaps in the case of migration there is no new information here. Also I've not seen any libvirt program yet which does more than just print the error message and either abort or ignore it.
I just looked at the python binding, and it doesn't even bother to copy the dom/net fields from virErrorPtr into the python layer - just completely ignores them. Similarly I don't bother to do anything with them in the Perl bindings. IMHO, we should just mark them as deprecated in the API docs & tell people not to use them. Note, we don't even have the virStoragePoolPtr or virStorageVolPtr objects in the error either. Regards, Dan. -- |: Red Hat, Engineering, Boston -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On Mon, Mar 31, 2008 at 03:39:06PM +0100, Daniel P. Berrange wrote:
bindings. IMHO, we should just mark them as deprecated in the API docs & tell people not to use them. Note, we don't even have the virStoragePoolPtr or virStorageVolPtr objects in the error either.
Patch attached. Rich. -- Richard Jones, Emerging Technologies, Red Hat http://et.redhat.com/~rjones virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://et.redhat.com/~rjones/virt-top

On Mon, Mar 31, 2008 at 03:45:37PM +0100, Richard W.M. Jones wrote:
On Mon, Mar 31, 2008 at 03:39:06PM +0100, Daniel P. Berrange wrote:
bindings. IMHO, we should just mark them as deprecated in the API docs & tell people not to use them. Note, we don't even have the virStoragePoolPtr or virStorageVolPtr objects in the error either.
Patch attached.
There i disagree, maybe the bindings don't use them but it could be used perfectly fine if you process the error directly in the callback from the library, i.e. in a synchronous way. I would not flag them as deprecated, just that one need to be careful to use them asynchronously. Daniel -- Red Hat Virtualization group http://redhat.com/virtualization/ Daniel Veillard | virtualization library http://libvirt.org/ veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

The 'conn', 'dom' and 'net' fields in a virterror can't be accessed safely, particularly from a garbage-collected language. The only safe thing one could do with them is a physical equality test on the pointer against an existing conn/dom/net pointer.
The problem is that we're freeing the dom/net objects before serializing the result so we're accessing already freed memory (which only works by accident at the momemt). I'll have a look at moving around virDomainFree() and friends when back from vactions. -- Guido
participants (4)
-
Daniel P. Berrange
-
Daniel Veillard
-
guido@honk.sigxcpu.org
-
Richard W.M. Jones