[libvirt] [PATCH 0/3] Speed up syntax-check a little

This shaves off 4 % of syntax-check runtime on my 8-core system, which is enough of a reason for me. Surely it saves more on our overloaded CI workers. Ján Tomko (3): scripts: speedup prohibit-duplicate-header scripts: check-aclrules: use in instead of find scripts: check-aclrules: use regular expressions less often scripts/check-aclrules.py | 38 ++++++++++++++++++---------- scripts/prohibit-duplicate-header.py | 4 +++ 2 files changed, 29 insertions(+), 13 deletions(-) -- 2.19.2

Running regular expressions with capture groups is expensive. Bail out early if the line does not start with a '#'. This reduces the runtime of the check by two thirds. Signed-off-by: Ján Tomko <jtomko@redhat.com> --- scripts/prohibit-duplicate-header.py | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/scripts/prohibit-duplicate-header.py b/scripts/prohibit-duplicate-header.py index dfdfa0bf0b..420311ccef 100644 --- a/scripts/prohibit-duplicate-header.py +++ b/scripts/prohibit-duplicate-header.py @@ -30,6 +30,10 @@ def check_file(filename): for line in fh: lineno = lineno + 1 + # skip non-matching lines early + if line[0] != '#': + continue + headermatch = re.search(r'''^# *include *[<"]([^>"]*\.h)[">]''', line) if headermatch is not None: inc = headermatch.group(1) -- 2.19.2

On Wed, Nov 20, 2019 at 07:32:44PM +0100, Ján Tomko wrote:
Running regular expressions with capture groups is expensive. Bail out early if the line does not start with a '#'.
This reduces the runtime of the check by two thirds.
Signed-off-by: Ján Tomko <jtomko@redhat.com> --- Reviewed-by: Erik Skultety <eskultet@redhat.com>

For checking whether a substring is present in a string, using the pattern: "str" in string is slightly faster than: string.find("str") != -1 Use it to shave off 4 % of the runtime of this script that processes quite a few long source files. Signed-off-by: Ján Tomko <jtomko@redhat.com> --- scripts/check-aclrules.py | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/scripts/check-aclrules.py b/scripts/check-aclrules.py index e06a019b00..d145e59164 100755 --- a/scripts/check-aclrules.py +++ b/scripts/check-aclrules.py @@ -97,7 +97,7 @@ def fixup_name(name): def name_to_ProcName(name): elems = [] - if name.find("_") != -1 or name.lower() in ["open", "close"]: + if "_" in name or name.lower() in ["open", "close"]: elems = [n.lower().capitalize() for n in name.split("_")] else: elems = [name] @@ -116,11 +116,11 @@ with open(proto, "r") as fh: filtered = False for line in fh: - if line.find("/**") != -1: + if "/**" in line: incomment = True filtered = False elif incomment: - if line.find("* @aclfilter") != -1: + if "* @aclfilter" in line: filtered = True elif filtered: m = re.search(r'''REMOTE_PROC_(.*)\s+=\s*\d+''', line) @@ -211,7 +211,7 @@ def process_file(filename): # an ACL function if intable: assign = re.search(r'''\.(\w+)\s*=\s*(\w+),?''', line) - if line.find("}") != -1: + if "}" in line: intable = False table = None elif assign is not None: @@ -247,9 +247,9 @@ def process_file(filename): intable = True table = name - if line.find("{") != -1: + if "{" in line: brace = brace + 1 - if line.find("}") != -1: + if "}" in line: brace = brace - 1 return errs -- 2.19.2

On Wed, Nov 20, 2019 at 07:32:45PM +0100, Ján Tomko wrote:
For checking whether a substring is present in a string, using the pattern: "str" in string is slightly faster than: string.find("str") != -1
Use it to shave off 4 % of the runtime of this script that processes quite a few long source files.
This is also the case in python FWIW...
Signed-off-by: Ján Tomko <jtomko@redhat.com> ---
Reviewed-by: Erik Skultety <eskultet@redhat.com>

On Thu, Nov 21, 2019 at 09:27:31AM +0100, Erik Skultety wrote:
On Wed, Nov 20, 2019 at 07:32:45PM +0100, Ján Tomko wrote:
For checking whether a substring is present in a string, using the pattern: "str" in string is slightly faster than: string.find("str") != -1
Use it to shave off 4 % of the runtime of this script that processes quite a few long source files.
This is also the case in python FWIW...
Derp...clearly I didn't have my cup of coffee in the morning, since I thought this was related to perl, sorry for the noise. Erik
Signed-off-by: Ján Tomko <jtomko@redhat.com> ---
Reviewed-by: Erik Skultety <eskultet@redhat.com>
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

Use a simple if "substr" in line before running a regular expression, which saves time, especially if the regex has a capture group. This reduces runtime of the check by almost 78 % for me. Signed-off-by: Ján Tomko <jtomko@redhat.com> --- scripts/check-aclrules.py | 26 +++++++++++++++++++------- 1 file changed, 19 insertions(+), 7 deletions(-) diff --git a/scripts/check-aclrules.py b/scripts/check-aclrules.py index d145e59164..40c47c1c6b 100755 --- a/scripts/check-aclrules.py +++ b/scripts/check-aclrules.py @@ -150,13 +150,21 @@ def process_file(filename): # Looks for anything which appears to be a function # body name. Doesn't matter if we pick up bogus stuff # here, as long as we don't miss valid stuff - m = re.search(r'''\b(\w+)\(''', line) + m = None + if "(" in line: + m = re.search(r'''\b(\w+)\(''', line) if m is not None: maybefunc = m.group(1) elif brace > 0: - ensureacl = re.search(r'''(\w+)EnsureACL''', line) - checkacl = re.search(r'''(\w+)CheckACL''', line) - stub = re.search(r'''\b(\w+)\(''', line) + ensureacl = None + checkacl = None + stub = None + if "EnsureACL" in line: + ensureacl = re.search(r'''(\w+)EnsureACL''', line) + if "CheckACL" in line: + checkacl = re.search(r'''(\w+)CheckACL''', line) + if "(" in line: + stub = re.search(r'''\b(\w+)\(''', line) if ensureacl is not None: # Record the fact that maybefunc contains an # ACL call, and make sure it is the right call! @@ -210,7 +218,9 @@ def process_file(filename): # every func listed there, has an impl which calls # an ACL function if intable: - assign = re.search(r'''\.(\w+)\s*=\s*(\w+),?''', line) + assign = None + if '"' in line: + assign = re.search(r'''\.(\w+)\s*=\s*(\w+),?''', line) if "}" in line: intable = False table = None @@ -237,8 +247,10 @@ def process_file(filename): file=sys.stderr) errs = True else: - m = re.search(r'''^(?:static\s+)?(vir(?:\w+)?Driver)\s+''', - line) + m = None + if "Driver" in line: + m = re.search(r'''^(?:static\s+)?(vir(?:\w+)?Driver)\s+''', + line) if m is not None: name = m.group(1) if name not in ["virNWFilterCallbackDriver", -- 2.19.2

On Wed, Nov 20, 2019 at 07:32:46PM +0100, Ján Tomko wrote:
Use a simple if "substr" in line before running a regular expression, which saves time, especially if the regex has a capture group.
This reduces runtime of the check by almost 78 % for me.
Signed-off-by: Ján Tomko <jtomko@redhat.com> --- scripts/check-aclrules.py | 26 +++++++++++++++++++------- 1 file changed, 19 insertions(+), 7 deletions(-) ...
# every func listed there, has an impl which calls # an ACL function if intable: - assign = re.search(r'''\.(\w+)\s*=\s*(\w+),?''', line) + assign = None + if '"' in line:
^This doesn't quite work, does it? I don't see any '"' in the driver API mapping structures. The only reliable pattern I see we could use is ', /* ' but in order to use that, we'd have to unify a few source files, mainly vz_driver.c, doing: git grep -Enpi ", \s+/\* " | grep driver.c should help identifying them. With this one addressed: Reviewed-by: Erik Skultety <eskultet@redhat.com>

On Fri, Nov 22, 2019 at 09:00:12AM +0100, Erik Skultety wrote:
On Wed, Nov 20, 2019 at 07:32:46PM +0100, Ján Tomko wrote:
Use a simple if "substr" in line before running a regular expression, which saves time, especially if the regex has a capture group.
This reduces runtime of the check by almost 78 % for me.
Signed-off-by: Ján Tomko <jtomko@redhat.com> --- scripts/check-aclrules.py | 26 +++++++++++++++++++------- 1 file changed, 19 insertions(+), 7 deletions(-) ...
# every func listed there, has an impl which calls # an ACL function if intable: - assign = re.search(r'''\.(\w+)\s*=\s*(\w+),?''', line) + assign = None + if '"' in line:
^This doesn't quite work, does it? I don't see any '"' in the driver API mapping structures. The only reliable pattern I see we could use is ', /* ' but in order to use that, we'd have to unify a few source files, mainly vz_driver.c, doing:
git grep -Enpi ", \s+/\* " | grep driver.c
should help identifying them.
Oops, that should've been "=". I have no idea how I managed to type that equal sign sideways. =) Jano
With this one addressed: Reviewed-by: Erik Skultety <eskultet@redhat.com>

On Fri, Nov 22, 2019 at 11:37:31AM +0100, Ján Tomko wrote:
On Fri, Nov 22, 2019 at 09:00:12AM +0100, Erik Skultety wrote:
On Wed, Nov 20, 2019 at 07:32:46PM +0100, Ján Tomko wrote:
Use a simple if "substr" in line before running a regular expression, which saves time, especially if the regex has a capture group.
This reduces runtime of the check by almost 78 % for me.
Signed-off-by: Ján Tomko <jtomko@redhat.com> --- scripts/check-aclrules.py | 26 +++++++++++++++++++------- 1 file changed, 19 insertions(+), 7 deletions(-) ...
# every func listed there, has an impl which calls # an ACL function if intable: - assign = re.search(r'''\.(\w+)\s*=\s*(\w+),?''', line) + assign = None + if '"' in line:
^This doesn't quite work, does it? I don't see any '"' in the driver API mapping structures. The only reliable pattern I see we could use is ', /* ' but in order to use that, we'd have to unify a few source files, mainly vz_driver.c, doing:
git grep -Enpi ", \s+/\* " | grep driver.c
should help identifying them.
Oops, that should've been "=".
I have no idea how I managed to type that equal sign sideways. =)
I thought so, but with the pattern I suggested, you can narrow down the final set of lines even more, so might turn out to be even faster than a plain "=" Erik

Signed-off-by: Ján Tomko <jtomko@redhat.com> --- scripts/check-aclrules.py | 2 +- scripts/check-driverimpls.py | 2 +- scripts/check-symfile.py | 6 +++--- scripts/dtrace2systemtap.py | 4 ++-- scripts/genpolkit.py | 4 ++-- scripts/gensystemtap.py | 2 +- 6 files changed, 10 insertions(+), 10 deletions(-) diff --git a/scripts/check-aclrules.py b/scripts/check-aclrules.py index 40c47c1c6b..1a76ebcb3b 100755 --- a/scripts/check-aclrules.py +++ b/scripts/check-aclrules.py @@ -128,7 +128,7 @@ with open(proto, "r") as fh: api = name_to_ProcName(m.group(1)) # Event filtering is handled in daemon/remote.c # instead of drivers - if line.find("_EVENT_REGISTER") == -1: + if "_EVENT_REGISTER" not in line: filteredmap[api] = True incomment = False diff --git a/scripts/check-driverimpls.py b/scripts/check-driverimpls.py index ca7ec3af7f..bc3f16a816 100755 --- a/scripts/check-driverimpls.py +++ b/scripts/check-driverimpls.py @@ -33,7 +33,7 @@ def checkdriverimpls(filename): for line in fh: lineno = lineno + 1 if intable: - if line.find("}") != -1: + if "}" in line: intable = False mainprefix = None continue diff --git a/scripts/check-symfile.py b/scripts/check-symfile.py index a543a0fbcd..0c02591991 100755 --- a/scripts/check-symfile.py +++ b/scripts/check-symfile.py @@ -36,9 +36,9 @@ ret = 0 with open(symfile, "r") as fh: for line in fh: line = line.strip() - if line.find("{") != -1: + if "{" in line: continue - if line.find("}") != -1: + if "}" in line: continue if line in ["global:", "local:"]: continue @@ -46,7 +46,7 @@ with open(symfile, "r") as fh: continue if line[0] == '#': continue - if line.find("*") != -1: + if "*" in line: continue line = line.strip(";") diff --git a/scripts/dtrace2systemtap.py b/scripts/dtrace2systemtap.py index 922713e599..cf4d8e1a79 100755 --- a/scripts/dtrace2systemtap.py +++ b/scripts/dtrace2systemtap.py @@ -49,9 +49,9 @@ with open(dtrace, "r") as fh: line = line.strip() if line == "": continue - if line.find("provider ") != -1 and line.find("{") != -1: + if "provider " in line and "{" in line: continue - if line.find("};") != -1: + if "};" in line: continue if line.startswith("#"): diff --git a/scripts/genpolkit.py b/scripts/genpolkit.py index 0cdba2bd3c..230d920f70 100755 --- a/scripts/genpolkit.py +++ b/scripts/genpolkit.py @@ -55,13 +55,13 @@ aclfile = sys.argv[1] with open(aclfile, "r") as fh: for line in fh: if in_opts: - if line.find("*/") != -1: + if "*/" in line: in_opts = False else: m = re.search(r'''\*\s*\@(\w+):\s*(.*?)\s*$''', line) if m is not None: opts[m.group(1)] = m.group(2) - elif line.find("**") != -1: + elif "**" in line: in_opts = True else: m = re.search(r'''VIR_ACCESS_PERM_(%s)_((?:\w|_)+),''' % diff --git a/scripts/gensystemtap.py b/scripts/gensystemtap.py index c4bc1620d0..7b391cc911 100755 --- a/scripts/gensystemtap.py +++ b/scripts/gensystemtap.py @@ -46,7 +46,7 @@ def load_file(fh): instatus = True elif re.search(r'''enum remote_auth_type''', line): inauth = True - elif line.find("}") != -1: + elif "}" in line: intype = False instatus = False inauth = False -- 2.19.2

On Wed, Nov 20, 2019 at 08:16:27PM +0100, Ján Tomko wrote:
Signed-off-by: Ján Tomko <jtomko@redhat.com> --- scripts/check-aclrules.py | 2 +- scripts/check-driverimpls.py | 2 +- scripts/check-symfile.py | 6 +++--- scripts/dtrace2systemtap.py | 4 ++-- scripts/genpolkit.py | 4 ++-- scripts/gensystemtap.py | 2 +- 6 files changed, 10 insertions(+), 10 deletions(-)
With the following squashed in: diff --git a/scripts/dtrace2systemtap.py b/scripts/dtrace2systemtap.py index cf4d8e1a79..d6bf1f8d1f 100755 --- a/scripts/dtrace2systemtap.py +++ b/scripts/dtrace2systemtap.py @@ -126,7 +126,7 @@ for file in filelist: for idx in range(len(argbits)): arg = argbits[idx] isstr = False - if arg.find("char *") != -1: + if 'char *' in arg: isstr = True m = re.search(r'''^.*\s\*?(\S+)$''', arg) Reviewed-by: Erik Skultety <eskultet@redhat.com>

On 11/20/19 3:32 PM, Ján Tomko wrote:
This shaves off 4 % of syntax-check runtime on my 8-core system, which is enough of a reason for me.
Around 10% increase on my system running 'make -j syntax-check'. All patches: Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>

On 11/20/19 4:34 PM, Daniel Henrique Barboza wrote:
On 11/20/19 3:32 PM, Ján Tomko wrote:
This shaves off 4 % of syntax-check runtime on my 8-core system, which is enough of a reason for me.
Around 10% increase on my system running 'make -j syntax-check'.
In time: *performance* increase. Did a few runs, on average these patches made syntax-check 10% faster for me (~11 seconds without, ~10 seconds with them). Thanks, DHB
participants (3)
-
Daniel Henrique Barboza
-
Erik Skultety
-
Ján Tomko