[Libvir] [PATCH] Python bindings: Use virDomain/NetworkGetConnect instead of storing connection in domain/network object

Previously[1] I patched libvirt to add the virDomainGetConnect and virNetworkGetConnect functions. The rationale for these is explained here[2]. The attached patch makes the Python bindings use these functions instead of storing the connection object explicitly. Rich. [1]https://www.redhat.com/archives/libvir-list/2007-June/msg00365.html [2]https://www.redhat.com/archives/libvir-list/2007-June/msg00352.html

On Mon, Jul 23, 2007 at 04:00:59PM +0100, Richard W.M. Jones wrote:
Previously[1] I patched libvirt to add the virDomainGetConnect and virNetworkGetConnect functions. The rationale for these is explained here[2].
The attached patch makes the Python bindings use these functions instead of storing the connection object explicitly.
Hum ... The problem was the following: - you create a connection - you get a domain pointer from the connection - you delete the reference from the connection (e.g. by exiting the method) but keep the domain object - the connection count of the python object goes down to 0 - _del() is called on the object - the connection is closed at the C level - you have a dandling domain python object whose C pointer is dead Are you sure your patch is safe ? Daniel -- Red Hat Virtualization group http://redhat.com/virtualization/ Daniel Veillard | virtualization library http://libvirt.org/ veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

Daniel Veillard wrote:
On Mon, Jul 23, 2007 at 04:00:59PM +0100, Richard W.M. Jones wrote:
Previously[1] I patched libvirt to add the virDomainGetConnect and virNetworkGetConnect functions. The rationale for these is explained here[2].
The attached patch makes the Python bindings use these functions instead of storing the connection object explicitly.
Hum ... The problem was the following:
- you create a connection - you get a domain pointer from the connection
At the C level, conn->uses is incremented when we get a new domain object.
- you delete the reference from the connection (e.g. by exiting the method) but keep the domain object - the connection count of the python object goes down to 0 - _del() is called on the object - the connection is closed at the C level
conn->uses should still be >= 1 (reflecting the outstanding domain object), so the connection shouldn't be closed.
- you have a dandling domain python object whose C pointer is dead
Are you sure your patch is safe ?
Having said that, I will now go and construct a test case to see if this really is safe. garbage.collection++ Rich.

This is my test case: ------------------------------------------ #!/usr/bin/python import libvirt def get_dom(): conn = libvirt.open ("xen"); dom = conn.lookupByName ("debian32fv"); return dom dom = get_dom() print "name = %s\n" % dom.name() ------------------------------------------ I modified src/hash.c to add a little bit of debug of the state of conn->uses in virGetConnect, virFreeConnect, virGetDomain, virFreeDomain. When running the program I see: ------------------------------------------ libvirt: virInitialize () libvirt: virInitialize () libvirt: virInitialize () libvirt: virInitialize () libvirt: virInitialize () libvirt: virInitialize () libvirt: virInitialize () libvirt: virInitialize () libvirt: virConnectOpen (name=xen) conn(0x6748c0) uses = 1 libvirt: do_open: proceeding with name=xen:/// libvirt: do_open: trying driver 0 (Test) ... libvirt: do_open: driver 0 returned DECLINED libvirt: do_open: trying driver 1 (QEMU) ... libvirt: do_open: driver 1 returned DECLINED libvirt: do_open: trying driver 2 (Xen) ... libvirt: do_open: driver 2 returned SUCCESS libvir: Remote error : Connection refused libvir: warning : Failed to find the network: Is the daemon running ? libvirt: virDomainLookupByName (conn=0x6748c0, name=debian32fv) new domain: conn(0x6748c0) uses = 2 libvirt: virConnectClose (conn=0x6748c0) conn(0x6748c0) uses = 1 libvirt: virDomainGetName (domain=0x679e20) name = debian32fv libvirt: virDomainFree (domain=0x679e20) free domain: conn(0x6748c0) uses = 0 ------------------------------------------ I would say that it's working as intended. Rich.

On Mon, Jul 23, 2007 at 05:11:09PM +0100, Richard W.M. Jones wrote:
This is my test case:
------------------------------------------ #!/usr/bin/python
import libvirt
def get_dom(): conn = libvirt.open ("xen"); dom = conn.lookupByName ("debian32fv"); return dom
dom = get_dom() print "name = %s\n" % dom.name() ------------------------------------------
I would say that it's working as intended.
In this case yes, but if you add one further line at the end of this example conn2 = dom.get_connection() And then have python garbage collect 'dom' and the original 'conn' you're not safe because 'conn2' didn't increment the reference count: /* * virDomainGetConnect: * @dom: pointer to a domain * * Provides the connection pointer associated with a domain. The * reference counter on the connection is not increased by this * call. */ Regards, Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

Daniel P. Berrange wrote:
On Mon, Jul 23, 2007 at 05:11:09PM +0100, Richard W.M. Jones wrote:
This is my test case:
------------------------------------------ #!/usr/bin/python
import libvirt
def get_dom(): conn = libvirt.open ("xen"); dom = conn.lookupByName ("debian32fv"); return dom
dom = get_dom() print "name = %s\n" % dom.name() ------------------------------------------
I would say that it's working as intended.
In this case yes, but if you add one further line at the end of this example
conn2 = dom.get_connection()
And then have python garbage collect 'dom' and the original 'conn' you're not safe because 'conn2' didn't increment the reference count:
I finally managed to reproduce this. For reference, the reproduction code is below: -------------------- #!/usr/bin/python import libvirt import libvirtmod def get_dom(): conn = libvirt.open ("xen"); dom = conn.lookupByName ("debian32fv"); return dom dom = get_dom() print "name = %s" % dom.name() conn2 = libvirt.virConnect (_obj = libvirtmod.virDomainGetConnect (dom._o)) dom2 = conn2.lookupByName ("debian32fv") -------------------- This crashes at the last line because virConnectClose was called in the destructor of conn. It doesn't seem like changing virDomainGetConnect to play with reference counts will fix this. The damage was done when virConnectClose was called, which happened much earlier. The only solution is to keep the relation that domain contains a connection explicit in the Python code. Same probably applies to OCaml bindings too.
/* * virDomainGetConnect: * @dom: pointer to a domain * * Provides the connection pointer associated with a domain. The * reference counter on the connection is not increased by this * call. */
For this reason also I suggest that we remove virDomainGetConnect and virNetworkGetConnect calls entirely, to prevent anyone from thinking they can use them in a safe way. Patch is attached, and yes I know in this case it breaks strict ABI compatibility. Rich. -- Emerging Technologies, Red Hat - http://et.redhat.com/~rjones/ Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 03798903

On Tue, Jul 24, 2007 at 03:19:23PM +0100, Richard W.M. Jones wrote:
It doesn't seem like changing virDomainGetConnect to play with reference counts will fix this. The damage was done when virConnectClose was called, which happened much earlier. The only solution is to keep the relation that domain contains a connection explicit in the Python code. Same probably applies to OCaml bindings too.
/* * virDomainGetConnect: * @dom: pointer to a domain * * Provides the connection pointer associated with a domain. The * reference counter on the connection is not increased by this * call. */
For this reason also I suggest that we remove virDomainGetConnect and virNetworkGetConnect calls entirely, to prevent anyone from thinking they can use them in a safe way. Patch is attached, and yes I know in this case it breaks strict ABI compatibility.
I still find them useful in the context of C coding as you can get the connextion without having to carry it in all the calls (assuming you understand the reference counting associated to connections, this allow for simpler/cleaner code). Daniel -- Red Hat Virtualization group http://redhat.com/virtualization/ Daniel Veillard | virtualization library http://libvirt.org/ veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

Here is a patch which adds a large warning to those functions, and removes a comment from the Python code. Rich. -- Emerging Technologies, Red Hat - http://et.redhat.com/~rjones/ Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 03798903

On Tue, Jul 24, 2007 at 04:00:25PM +0100, Richard W.M. Jones wrote:
Here is a patch which adds a large warning to those functions, and removes a comment from the Python code.
Okay, I removed the generator comment, and added a part of the warning to the 2 function descriptions (only the first part though) for the release. If you really think one need to warn C users more then maybe we can get less threatening wording. thanks, and sorry I rushed this a bit, Daniel -- Red Hat Virtualization group http://redhat.com/virtualization/ Daniel Veillard | virtualization library http://libvirt.org/ veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
participants (3)
-
Daniel P. Berrange
-
Daniel Veillard
-
Richard W.M. Jones