[PATCH libvirt-python 0/5] Fixes from adding type annotation

Hello, as announced a long time ago with <https://www.redhat.com/archives/libvir-list/2018-November/msg00291.html> and recently refreshed with <https://www.redhat.com/archives/libvir-list/2020-April/msg00892.html> I'm working on adding PEP 484 type hints <https://www.python.org/dev/peps/pep-0484/> to the Python binding of libvirt. I have finished this work now and have a working version at <https://github.com/univention/libvirt-python/tree/typing> which consists of 90 patches in total as I has to go over evry file to understand and fix all things. As that patch bomb is quiet large I'm going to submit them in smaller chunks to make them more reviewable. Today I start with the first round consisting of "real" bugs in the current code: Philipp Hahn (5): generator: Fix undefined variables file generator: Fix string formatting generator: Fix domainSnapshot.listAllChildren() qemu-api: Fix return type libvirtaio: Fix return types of callback generator.py | 6 +++--- libvirt-override-virDomainSnapshot.py | 2 +- libvirt-qemu-override-api.xml | 4 ++-- libvirtaio.py | 18 ++++++++++++------ 4 files changed, 18 insertions(+), 12 deletions(-) After that I plan to continue with: 2. fix examples/ to work with Python 3 3. Cleanup code tree-wide 4. Cleanup generator.py 5. Cleanup sanitytest.py 6. Teach generator.py to add PEP 484 annotation 7. Assorted cleanups (the order and chunking is not final yet)
examples/domipaddrs: Convert to python 3 print() examples/domipaddrs: Fix Python 2 dict.iteritems() examples/*: Remove stray semicolon example/dhcp*: Fix None comparison examples/event-test: Remove unneeded global statement examples/event-test: Work with old version of python-libvirt examples/event-test: Use atexit for Python 3 examples/esxlist: Fix Python 2 raw_input() examples/consolecallback: Add var to save callback examples/consolecallback: Fix assorted errors examples: Add missing return values
libvirtaio: Drop object(*args, **kwargs) libvirtaio: Fix return type libvirtaio: assert callback type
Do not use bare except Cleanup imports Fix white space Remove legacy libvirtError arguments
stream: Fix exception traceback handling override: Simplify exception handling generator: Simplify exception handling generator: Change type of quiet to bool generator: Remove unneeded line continuation generator: Convert to 'not in' and 'is not' generator: Remove dead variable assignments generator: Remove skipped_modules generator: Remove useless sort key generator: Fix return type on failure generator: Merge now identical if-elif-else cases generator: Use more string formatting generator: Simplify string concatentaion generator: Use enumerate() generator: Use increment assignment generator: Use string concatenation generator: Remove global declarations generator: Initialize function_classes directly generator: Check contained in hash generator: Use dict.item() to walk keys and values generator: Walk only the values generator: Directly get dict length generator: Just walk the dict generator: Use splitlines() generator: Open file with context manager generator: Refactor parser creation generator: Remove unused SAX content handler methods generator: Use SAX method names generator: Use string formatting generator: Convert in_function to boolean generator: Simplify XML attribute fetching generator: Initialize with empty strings generator: Expand tuple to names in for loop generator: Store arguments and return as tuple generator: Fixed writing cached=None generator: Simplify sorting generator: Simplify loop break generator: Simplify boolean condition generator: Convert dict() to set() generator: Converto to defaultdict() generator: Add PEP 484 type annotations override: Add manual PEP 484 type annotations sanitytest: Skip type annotations stream: Simplify boolean condition domain: Fix None comparison stream: no type change stream: Convert type() to isinstance() stream: Return None from callback connect: Just clear all event handlers override: no type change sanitytest: Do not re-declare set sanitytest: Drop else:pass sanitytest: Drop Python 2 compatibility sanitytest: Add PEP 484 type annotations sanitytest: Use 3-tuple for basicklassmap sanitytest: Use 3-tuple for finalklassmap sanitytest: Use set for tracking used functions sanitytest: Use str.startswith() instead of str[0] generator: Generate PEP 484 type annotation override: Catch type error generator: Special handling for virStoragePool.listAllVolumes generator: Merge code for __init__ genration generator: Use empty string instead of None generator: break lines in generated code generator: Expand tuple to names in for loop generator: Work around type change generator: use pointer wrapper for all objects
examples: Add/fix PEP 484 type annotation -- 2.20.1

Signed-off-by: Philipp Hahn <hahn@univention.de> --- generator.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/generator.py b/generator.py index 426f007..7f4ae1f 100755 --- a/generator.py +++ b/generator.py @@ -928,7 +928,7 @@ def buildStubs(module, api_xml): parser.close() except IOError: msg = sys.exc_info()[1] - print(file, ":", msg) + print(api_xml, ":", msg) sys.exit(1) n = len(list(funcs.keys())) @@ -948,7 +948,7 @@ def buildStubs(module, api_xml): parser.close() except IOError: msg = sys.exc_info()[1] - print(file, ":", msg) + print(override_api_xml, ":", msg) if not quiet: # XXX: This is not right, same function already in @functions -- 2.20.1

remove excessive arguments. Signed-off-by: Philipp Hahn <hahn@univention.de> --- generator.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/generator.py b/generator.py index 7f4ae1f..0000a14 100755 --- a/generator.py +++ b/generator.py @@ -766,7 +766,7 @@ def print_function_wrapper(module, name, output, export, include): if file == "python_accessor": if args[1][1] == "char *": c_call = "\n VIR_FREE(%s->%s);\n" % ( - args[0][0], args[1][0], args[0][0], args[1][0]) + args[0][0], args[1][0]) c_call = c_call + " %s->%s = (%s)strdup((const xmlChar *)%s);\n" % (args[0][0], args[1][0], args[1][1], args[1][0]) else: -- 2.20.1

virDomainSnapshot(dom, _obj) expects a reference to the virDomain as its first argument, but virDomainSnapshot.listAllChildren() passes `self` instead: libvirt.py:6459: error: Argument 1 to "virDomainSnapshot" has incompatible type "virDomainSnapshot"; expected "virDomain"
import libvirt con = libvirt.open('test:///default') dom = con.lookupByName("test") first = dom.snapshotCreateXML("""<domainsnapshot><name>First</name></domainsnapshot>""") second = dom.snapshotCreateXML("""<domainsnapshot><name>Second</name></domainsnapshot>""") child, == first.listAllChildren() second.domain() <libvirt.virDomain object at 0x7fb32be3cfd0> ^^^^^^^^^ child.domain() <libvirt.virDomainSnapshot object at 0x7fb32bdb9080> ^^^^^^^^^^^^^^^^^
Signed-off-by: Philipp Hahn <hahn@univention.de> --- libvirt-override-virDomainSnapshot.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libvirt-override-virDomainSnapshot.py b/libvirt-override-virDomainSnapshot.py index ec53358..87eac86 100644 --- a/libvirt-override-virDomainSnapshot.py +++ b/libvirt-override-virDomainSnapshot.py @@ -14,6 +14,6 @@ retlist = list() for snapptr in ret: - retlist.append(virDomainSnapshot(self, _obj=snapptr)) + retlist.append(virDomainSnapshot(self.domain(), _obj=snapptr)) return retlist -- 2.20.1

The API XML description uses "C types": "str *" is not valid. Signed-off-by: Philipp Hahn <hahn@univention.de> --- libvirt-qemu-override-api.xml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/libvirt-qemu-override-api.xml b/libvirt-qemu-override-api.xml index ca0dae9..b280a88 100644 --- a/libvirt-qemu-override-api.xml +++ b/libvirt-qemu-override-api.xml @@ -3,14 +3,14 @@ <symbols> <function name='virDomainQemuMonitorCommand' file='python-qemu'> <info>Send an arbitrary monitor command through qemu monitor of domain</info> - <return type='str *' info='the command output or None in case of error'/> + <return type='char *' info='the command output or None in case of error'/> <arg name='domain' type='virDomainPtr' info='pointer to domain object'/> <arg name='cmd' type='const char *' info='the command which will be passed to QEMU monitor'/> <arg name='flags' type='unsigned int' info='an OR'ed set of virDomainQemuMonitorCommandFlags'/> </function> <function name='virDomainQemuAgentCommand' file='python-qemu'> <info>Send a Guest Agent command to domain</info> - <return type='str *' info='the command output'/> + <return type='char *' info='the command output'/> <arg name='domain' type='virDomainPtr' info='pointer to the domain'/> <arg name='cmd' type='const char *' info='guest agent command on domain'/> <arg name='timeout' type='int' info='timeout seconds'/> -- 2.20.1

libvirt defines the signature for the callback functions, e.g. the functions for remove() must return -1 on error and 0 on success. Raising an exception violates that contract. _remove_timeout() did not explicitly handle a double-remove and implicitly passed on the exception. update() expects no return value, so remove the pointless return to pass on None. Signed-off-by: Philipp Hahn <hahn@univention.de> --- libvirtaio.py | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/libvirtaio.py b/libvirtaio.py index 328e6f2..f1811c1 100644 --- a/libvirtaio.py +++ b/libvirtaio.py @@ -370,13 +370,13 @@ class virEventAsyncIOImpl(object): https://libvirt.org/html/libvirt-libvirt-event.html#virEventUpdateHandleFunc ''' self.log.debug('update_handle(watch=%d, event=%d)', watch, event) - return self.callbacks[watch].update(event=event) + self.callbacks[watch].update(event=event) def _remove_handle(self, watch): '''Unregister a callback from a file handle. :param int watch: file descriptor watch to stop listening on - :returns: None (see source for explanation) + :returns: -1 on error, 0 on success .. seealso:: https://libvirt.org/html/libvirt-libvirt-event.html#virEventRemoveHandleFunc @@ -386,12 +386,13 @@ class virEventAsyncIOImpl(object): callback = self.callbacks.pop(watch) except KeyError as err: self.log.warning('remove_handle(): no such handle: %r', err.args[0]) - raise + return -1 fd = callback.descriptor.fd assert callback is self.descriptors[fd].remove_handle(watch) if len(self.descriptors[fd].callbacks) == 0: del self.descriptors[fd] callback.close() + return 0 def _add_timeout(self, timeout, cb, opaque): '''Register a callback for a timer event @@ -425,20 +426,25 @@ class virEventAsyncIOImpl(object): https://libvirt.org/html/libvirt-libvirt-event.html#virEventUpdateTimeoutFun... ''' self.log.debug('update_timeout(timer=%d, timeout=%d)', timer, timeout) - return self.callbacks[timer].update(timeout=timeout) + self.callbacks[timer].update(timeout=timeout) def _remove_timeout(self, timer): '''Unregister a callback for a timer :param int timer: the timer to remove - :returns: None (see source for explanation) + :returns: -1 on error, 0 on success .. seealso:: https://libvirt.org/html/libvirt-libvirt-event.html#virEventRemoveTimeoutFun... ''' self.log.debug('remove_timeout(timer=%d)', timer) - callback = self.callbacks.pop(timer) + try: + callback = self.callbacks.pop(timer) + except KeyError as err: + self.log.warning('remove_timeout(): no such timeout: %r', err.args[0]) + return -1 callback.close() + return 0 _current_impl = None -- 2.20.1

Hello, Am 27.04.20 um 15:44 schrieb Philipp Hahn:
as announced a long time ago with <https://www.redhat.com/archives/libvir-list/2018-November/msg00291.html> and recently refreshed with <https://www.redhat.com/archives/libvir-list/2020-April/msg00892.html> I'm working on adding PEP 484 type hints <https://www.python.org/dev/peps/pep-0484/> to the Python binding of libvirt.
I have finished this work now and have a working version at <https://github.com/univention/libvirt-python/tree/typing> which consists of 90 patches in total as I has to go over evry file to understand and fix all things.
I just opened a merge request <https://gitlab.com/libvirt/libvirt-python/-/merge_requests/9> for my code at <https://gitlab.com/pmhahn/libvirt-python/-/tree/typing>. This also unearthed severl bugs where C-pointer point to the wrong types, which lead to strange crashes. The fixes for them are at the start of my current branch, except <https://gitlab.com/pmhahn/libvirt-python/-/commit/9fdf8e9ece4b6695bcddaeb2f998bc11f57e2735> which is more tricky. If you want to fast-pache them I can try to extract those patches into a separate branch, which can be merged without my other work. Philipp

Hello, Am 25.07.20 um 23:45 schrieb Philipp Hahn:
Am 27.04.20 um 15:44 schrieb Philipp Hahn:
I'm working on adding PEP 484 type hints <https://www.python.org/dev/peps/pep-0484/> to the Python binding of libvirt. ... I just opened a merge request <https://gitlab.com/libvirt/libvirt-python/-/merge_requests/9> for my code at <https://gitlab.com/pmhahn/libvirt-python/-/tree/typing>.
While working on that I stumbled over some annoyances in the current Python API: There are many methods which return a List[int], for example: virDomainGetInfo virDomainGetState virDomainGetControlInfo virDomainGetBlockInfo virDomainGetJobInfo virStoragePoolGetInfo virStorageVolGetInfo virStorageVolGetInfoFlags There are more function returning similar information as plain List: virNodeGetSecurityModel virNodeGetSecurityModel virDomainGetSecurityLabelList virDomainBlockStats virDomainInterfaceStats virDomainGetVcpus virNodeGetCPUMap The worst offender is `virNodeGetInfo`, which returns `Tuple[str, int*7]`: The problem for type annotation is that `List` is unlimited in the number of elements, that is you cannot specify the number of entries the list must or will contain. Also all elements of a list must have the same (super-)type; for "str" and "int" of "virNodeGetInfo()" that would be "Any", which is not very useful here. A better type for those `List`s would be `Tuple`, which has a fixed length and can have different types for each position. But that would be an API change: In most cases users of those functions should not notice the difference as indexing tuples or lists is the same. It would break code where someone would do something like this: ret = virFunction_returning_list() ret += [1, 2, 3] which breaks if that function would return a `Tuple` in the future:
$ python -c '() + []' Traceback (most recent call last): File "<string>", line 1, in <module> TypeError: can only concatenate tuple (not "list") to tuple
Even better then plain `Tuple`s would be <https://docs.python.org/3/library/collections.html#collections.namedtuple>, which would allow us to use the return value of `virNodeGetInfo()` as this:
from collections import namedtuple import libvirt virNodeInfo = namedtuple("virNodeInfo", ["model", "memory", "cpus", "mhz", "nodes", "sockets", "cores", "threads"]) c = libvirt.open('test:///default') info = virNodeInfo(*c.getInfo()) print(info.model) print(info) # virNodeInfo(model='i686', memory=3072, cpus=16, mhz=1400, nodes=2, sockets=2, cores=2, threads=2)
<https://libvirt.org/html/libvirt-libvirt-host.html#virNodeInfo> This could be improved even more by using <https://mypy.readthedocs.io/en/stable/kinds_of_types.html?highlight=NamedTuple#named-tuples>, which allows to add type information:
from typing import NamedTuple virNodeInfo = NamedTuple("virNodeInfo", [("model", str), ("memory", int), ("cpus", int), ("mhz", int), ("nodes", int), ("sockets", int), ("cores", int), ("threads", int)])
or with Python 3.6 the more readable form
class virNodeInfo(NamedTuple): model: str memory: int cpus: int mhz: int nodes: int sockets: int cores: int threads: int
IMHO that would be a real usability improvement as I have to lookup that information every time I have to use those functions myself. Indexing with `namedtuple` and `NamedTuple` still works, so you can still use `info[0]` above to get the model. What do you think of that? When would be a good time to make such a change? Philipp

On Mon, Jul 27, 2020 at 08:32:29AM +0200, Philipp Hahn wrote:
Hello,
Am 25.07.20 um 23:45 schrieb Philipp Hahn:
Am 27.04.20 um 15:44 schrieb Philipp Hahn:
I'm working on adding PEP 484 type hints <https://www.python.org/dev/peps/pep-0484/> to the Python binding of libvirt. ... I just opened a merge request <https://gitlab.com/libvirt/libvirt-python/-/merge_requests/9> for my code at <https://gitlab.com/pmhahn/libvirt-python/-/tree/typing>.
While working on that I stumbled over some annoyances in the current Python API: There are many methods which return a List[int], for example: virDomainGetInfo virDomainGetState virDomainGetControlInfo virDomainGetBlockInfo virDomainGetJobInfo virStoragePoolGetInfo virStorageVolGetInfo virStorageVolGetInfoFlags
There are more function returning similar information as plain List: virNodeGetSecurityModel virNodeGetSecurityModel virDomainGetSecurityLabelList virDomainBlockStats virDomainInterfaceStats virDomainGetVcpus virNodeGetCPUMap
The worst offender is `virNodeGetInfo`, which returns `Tuple[str, int*7]`: The problem for type annotation is that `List` is unlimited in the number of elements, that is you cannot specify the number of entries the list must or will contain. Also all elements of a list must have the same (super-)type; for "str" and "int" of "virNodeGetInfo()" that would be "Any", which is not very useful here.
A better type for those `List`s would be `Tuple`, which has a fixed length and can have different types for each position.
But that would be an API change: In most cases users of those functions should not notice the difference as indexing tuples or lists is the same. It would break code where someone would do something like this: ret = virFunction_returning_list() ret += [1, 2, 3]
I would argue that ^this was never the intended way of using the returned object and would lean towards using a named tuple, since like you've pointed out it would be a transparent change in the vast majority of cases. However, since we don't document anywhere how the return value should be treated, from that perspective it's still valid to change the returned list for whatever purposes and therefore we are breaking the API contract, which we can't do even though the change itself is reasonable. Regards, Erik

On Mon, Jul 27, 2020 at 11:19:46AM +0200, Erik Skultety wrote:
On Mon, Jul 27, 2020 at 08:32:29AM +0200, Philipp Hahn wrote:
Hello,
Am 25.07.20 um 23:45 schrieb Philipp Hahn:
Am 27.04.20 um 15:44 schrieb Philipp Hahn:
I'm working on adding PEP 484 type hints <https://www.python.org/dev/peps/pep-0484/> to the Python binding of libvirt. ... I just opened a merge request <https://gitlab.com/libvirt/libvirt-python/-/merge_requests/9> for my code at <https://gitlab.com/pmhahn/libvirt-python/-/tree/typing>.
While working on that I stumbled over some annoyances in the current Python API: There are many methods which return a List[int], for example: virDomainGetInfo virDomainGetState virDomainGetControlInfo virDomainGetBlockInfo virDomainGetJobInfo virStoragePoolGetInfo virStorageVolGetInfo virStorageVolGetInfoFlags
There are more function returning similar information as plain List: virNodeGetSecurityModel virNodeGetSecurityModel virDomainGetSecurityLabelList virDomainBlockStats virDomainInterfaceStats virDomainGetVcpus virNodeGetCPUMap
The worst offender is `virNodeGetInfo`, which returns `Tuple[str, int*7]`: The problem for type annotation is that `List` is unlimited in the number of elements, that is you cannot specify the number of entries the list must or will contain. Also all elements of a list must have the same (super-)type; for "str" and "int" of "virNodeGetInfo()" that would be "Any", which is not very useful here.
A better type for those `List`s would be `Tuple`, which has a fixed length and can have different types for each position.
But that would be an API change: In most cases users of those functions should not notice the difference as indexing tuples or lists is the same. It would break code where someone would do something like this: ret = virFunction_returning_list() ret += [1, 2, 3]
I would argue that ^this was never the intended way of using the returned object and would lean towards using a named tuple, since like you've pointed out it would be a transparent change in the vast majority of cases. However, since we don't document anywhere how the return value should be treated, from that perspective it's still valid to change the returned list for whatever purposes and therefore we are breaking the API contract, which we can't do even though the change itself is reasonable.
I think the above example is just plain crazy. While it may be technically possible, I don't think that example is a compelling usage to justify not doing the tuple change. Even if it does break some app (I very much doubt it will), it'll still be a net win for all the other sane apps to change to using a named tuple. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Hello, Am 25.07.20 um 23:45 schrieb Philipp Hahn:
Am 27.04.20 um 15:44 schrieb Philipp Hahn:
I'm working on adding PEP 484 type hints <https://www.python.org/dev/peps/pep-0484/> to the Python binding of libvirt. ... I just opened a merge request <https://gitlab.com/libvirt/libvirt-python/-/merge_requests/9> for my code at <https://gitlab.com/pmhahn/libvirt-python/-/tree/typing>.
While working on that I stumbled over two issues: 1. generator.skip_impl contained a list of the function names, which were also defined in `libvirt-override-api.xml`. Adding an override requires adding it in *two* locations. With <https://gitlab.com/libvirt/libvirt-python/-/merge_requests/9/diffs?commit_id=96e7414e488af84f7be04a50f5cd7457142c898d> it took the liberty to remove the list from generator.py and to extends it by parsing "api.xml" where file="python*". 2. Historically "api*.xml" used "char *" to indicate some custom Python type. Mapping this to "str" is obviously wrong, so I had to review them and changed many of them to "Any" or more specific types with <https://gitlab.com/libvirt/libvirt-python/-/merge_requests/9/diffs?commit_id=c3b26fccfc5e4e23ad93db51a41a85275f0ea5b5>. As that information is both used to generate the low-level "C-to-Python" mapping for "libvirtmod" but also the high-level Python module "libvirt", changing `type` to something other then a C-type might breaks the C-level wrapper. Using a Python-type there which must be declared in "generator.py" is also somehow cumbersome as there are many types which are used only once. So I would like to extend this file with a new attribute like "pytype", which then can be used to overwrite the type used by generator.py.
<function name='virDomainGetVcpus' file='python'> <info>Extract information about virtual CPUs of domain, store it in info array and also in cpumaps.</info> <return type='char *' info='None in case of error, returns a tuple of vcpu info and vcpu map.'/> <return type='PyAny' info='None in case of error, returns a tuple of vcpu info and vcpu map.'/><!-- Tuple[List[Tuple[int, int, int, int]], List[Tutple[int, ...]]] --> <arg name='domain' type='virDomainPtr' info='pointer to domain object, or NULL for Domain0'/> </function>
Is that okay? Philipp
participants (3)
-
Daniel P. Berrangé
-
Erik Skultety
-
Philipp Hahn