On Wed, Jun 10, 2020 at 09:20:29AM +0800, Shi Lei wrote:
> This tool is used to generate parsexml/formatbuf functions for structs.
> It is based on libclang and its python-binding.
> Some directives (such as genparse, xmlattr, etc.) need to be added on
> the declarations of structs to direct the tool.
>
> Signed-off-by: Shi Lei <shi_lei(a)massclouds.com>
> ---
> build-aux/generator/directive.py | 839 +++++++++++++++++++++++++++++++
> build-aux/generator/go | 14 +
> build-aux/generator/main.py | 416 +++++++++++++++
> build-aux/generator/utils.py | 100 ++++
FWIW, we've tried to standardize on scripts/ directory for any helper
scripts used during the build process. Calling it "generator" is probably
too broad, as we've lots of diffferent generators.
I think this would would be a good fit as scripts/xmlgen/
Okay. I move it there.
> po/POTFILES.in | 1 +
> 5 files changed, 1370 insertions(+)
> create mode 100644 build-aux/generator/directive.py
> create mode 100755 build-aux/generator/go
> create mode 100755 build-aux/generator/main.py
> create mode 100644 build-aux/generator/utils.py
> +T_CLEAR_FUNC_IMPL = '''
> +void
> +${typename}Clear(${typename}Ptr def)
> +{
> + if (!def)
> + return;
> +
> + ${body}
> +}
> +'''
> +
> +T_CLEAR_FUNC_DECL = '''
> +void
> +${typename}Clear(${typename}Ptr def);
In general I think we have a preference for ${typename}Free
functions, rather than ${typename}Clear.
A Clear() func is needed when a struct is often stack allocated,
or when the method allocating & free'ing the struct is different
from the method clearing it. The latter case is generally considered
bad practice - it is safer and easier to verify code when clearing
and freeing is done at the same same.
> +def clearMember(member):
> + mtype = TypeTable().get(member['type'])
> +
> + refname = 'def->%s' % member['name']
> + if member.get('array'):
> + refname += '[i]'
> +
> + code = ''
> + if mtype['meta'] == 'Struct':
> + if member['pointer'] and not mtype.get('external'):
> + code = '%sClear(%s);' % (mtype['name'], refname)
> + code += '\nVIR_FREE(%s);' % refname
We should use g_free() for any new code.
Okay.
> + else:
> + code = '%sClear(&%s);' % (mtype['name'], refname)
> + elif mtype['name'] == 'String':
> + code = 'VIR_FREE(%s);' % refname
> + elif mtype['name'] in ['Chars', 'UChars']:
> + code = 'memset(%s, 0, sizeof(%s));' % (refname, refname)
> + elif not member.get('array'):
> + code = '%s = 0;' % refname
> +
> + if member.get('specified'):
> + assert not member.get('array'), "'specified' can't
come with 'array'."
> + code += '\n%s_specified = false;' % refname
> +
> + if member.get('array') and code:
> + counter = counterName(member['name'])
> + if singleline(code):
> + code = render(T_LOOP_SINGLE, counter=counter, body=code)
> + else:
> + code = render(T_LOOP_MULTI,
> + counter=counter, body=indent(code, 2))
> + code += '\nVIR_FREE(def->%s);' % member['name']
> + code += '\ndef->%s = 0;' % counter
> +
> + return code
> +
> +
> +T_CLEAR_NAMESPACE = '''
> +if (def->namespaceData && def->ns.free)
> + (def->ns.free)(def->namespaceData);
> +'''
> +
> +
> +def makeClearFunc(writer, atype):
> + if 'genparse' not in atype:
> + return
> +
> + blocks = BlockAssembler()
> + for member in atype['members']:
> + blocks.append(clearMember(member))
> +
> + if 'namespace' in atype:
> + blocks.append(T_CLEAR_NAMESPACE.strip())
> +
> + body = indent(blocks.output('\n\n'), 1)
> +
> + impl = render(T_CLEAR_FUNC_IMPL, typename=atype['name'], body=body)
> + writer.write(atype, 'clearfunc', '.c', impl)
> +
> + decl = render(T_CLEAR_FUNC_DECL, typename=atype['name'])
> + writer.write(atype, 'clearfunc', '.h', decl)
Bearing in mind the above, I'd say we should create Free() funcs,
not Clear() funcs here.
> +
> +
> +#
> +# Templates for parsing member block
> +#
> +T_READ_ATTR_BY_PROP = '${mdvar} = virXMLPropString(node,
"${tname}");'
> +T_READ_ELEM_BY_PROP = '${mdvar} = virXMLChildNode(node,
"${tname}");'
> +T_READ_ELEM_CONTENT = '${mdvar} = virXMLChildNodeContent(node,
"${tname}");'
> +
> +T_PARSE_MEMBER_MORE = '''
> +${number} = virXMLChildNodeSet(node, "${tname}", &nodes);
> +if (${number} > 0) {
> + size_t i;
> +
> + if (VIR_ALLOC_N(def->${name}, ${number}) < 0)
> + goto error;
> +
> + for (i = 0; i < ${number}; i++) {
> + xmlNodePtr tnode = nodes[i];
> + ${item}
> + }
> + def->${counter} = ${number};
> + VIR_FREE(nodes);
> +} else if (${number} < 0) {
> + virReportError(VIR_ERR_XML_ERROR, "%s",
> + _("Invalid ${tname} element found."));
For error messages we should avoid using python string substitution.
Instead we want todo
virReportError(VIR_ERR_XML_ERROR,
_("Invalid %s element found.").
"${tname}");
This means that when we repeat this code with many different values
of ${tname}, we're only getting 1 string for the translation team
to deal with.
The same point for other virReportError error callsin this file,
I won't mention them all individually.
Got it. Thanks!
> +T_ALLOC_MEMORY = '''
> +if (VIR_ALLOC(def->${name}) < 0)
> + goto error;
> +'''
We prefer g_new0 for new code now.
As a general rule for any code, if there is a GLib API that works
then we should use that, instead of a libvirt API.
Okay.
> +T_PARSE_FUNC_DECL = '''
> +int
> +${funcname}(${args});
> +'''
> +
> +T_PARSE_FUNC_IMPL = '''
> +int
> +${funcname}(${args})
> +{
> + ${declare_vars}
> + VIR_USED(instname);
> + VIR_USED(opaque);
There should not be any need for this macro. Instead change
the function declaration to add G_GNUC_UNUSED for "instname"
and "opaque". This annotation is harmless if the variable
really is used.
Okay.
> +
> + if (!def)
> + goto error;
> +
> + ${body}
> +
> + return 0;
> +
> + error:
> + ${cleanup_vars}
> + ${typename}Clear(def);
IIUC, the caller of this method is responsible for allocating
'def', and so the caller should be responsible for running
${typename}Free(def). IOW, we should not need to call Clear
for this.
I suggest we can retain it.
Sometimes the caller doesn't call Free, so this clear function ensures that
members of 'def' will be cleared or freed on error.
And it will not hurt performance, since it is only called on error.
> + return -1;
> +}
> +def _handleTmpVars(tmpvars):
> + heads, tails = [], []
> + tmpvars = dedup(tmpvars)
> + for var in tmpvars:
> + if var == 'nodes':
> + heads.append('xmlNodePtr *nodes = NULL;')
> + tails.append('VIR_FREE(nodes);')
> + elif var.endswith('Str'):
> + heads.append('g_autofree char *%s = NULL;' % var)
> + elif var.endswith('Node'):
> + heads.append('xmlNodePtr %s = NULL;' % var)
> + else:
> + assert var.endswith('Nodes') and var.startswith('n')
> + heads.append('int %s = 0;' % var)
> +
> + return '\n'.join(heads), '\n'.join(tails)
One thing I noticed in the generated code is that we;re
duplicating the strings twice:
Consider this snippet of generated code:
g_autofree char *nameStr = NULL;
nameStr = virXMLPropString(node, "name");
if (nameStr == NULL) {
virReportError(VIR_ERR_XML_ERROR,
_("Missing 'name' setting in '%s'"),
instname);
goto error;
}
def->name = g_strdup(nameStr);
virXMLPropString already duplicates the string, and then
we g_strdup again. So we should just get rid of the
g_autofree and directly assign def->name = nameStr.
Okay. I would consider how to modify it.
> diff --git a/build-aux/generator/go b/build-aux/generator/go
> new file mode 100755
> index 0000000..9e30e08
> --- /dev/null
> +++ b/build-aux/generator/go
> @@ -0,0 +1,14 @@
> +# This is a command-line tool
> +
> +libclang_line=`ldconfig -p | grep libclang`
This needs to be libclang.so, otherwise it can match on the
wrong library:
$ ldconfig -p | grep libclang
libclang.so.10 (libc6,x86-64) => /lib64/libclang.so.10
libclang.so (libc6,x86-64) => /lib64/libclang.so
libclang-cpp.so.10 (libc6,x86-64) => /lib64/libclang-cpp.so.10
libclang-cpp.so (libc6,x86-64) => /lib64/libclang-cpp.so
Okay.
> +export libclang_path=`expr "$libclang_line" : '.*
=> \(.*\)$'`
> +if test -z "$libclang_path"; then
> + echo "libclang is required by libvirt\n"
> + exit -1
> +fi
> diff --git a/build-aux/generator/main.py b/build-aux/generator/main.py
> new file mode 100755
> index 0000000..78a90f1
> --- /dev/null
> +++ b/build-aux/generator/main.py
> @@ -0,0 +1,416 @@
> +#!/usr/bin/env python3
> +#
> +# Copyright (C) 2020 Shandong Massclouds Co.,Ltd.
> +#
> +# This library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2.1 of the License, or (at your option) any later version.
> +#
> +# This library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +# Lesser General Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with this library. If not, see
> +# <
http://www.gnu.org/licenses/>.
> +#
> +if __name__ == "__main__":
> +
> + libclang_path = os.environ.get('libclang_path')
> + assert libclang_path, 'No libclang library.'
> + Config.set_library_file(libclang_path)
I'm wondering if we really need this at all ? AFAICT, the libclang
python library works fine without it, so this is only required if
trying to use a non-standard libclang, which I think ought to be
possible already by setting LD_LIBRARY_PATH
I'm working on Ubuntu. On Ubuntu 20.04 LTS and 18.04, I find:
# ldconfig -p | grep libclang
libclang-10.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libclang-10.so.1
libclang-10.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libclang-10.so
There's no libclang.so!
If I install python3 bindings from the standard apt repository, it works!
By default, this version just find 'libclang-10.so'.
But if I use 'pip3' to install python3 bindings, it doesn't work.
Because it is another verison. This version find 'libclang.so' and it can't
find it!
So I think we can retain these lines to adapt to the distinction among platforms.
Regards,
Shi Lei