Re: [libvirt] [PATCH v3 03/22] build-aux: rewrite po file minimizer in Python

27 Sep 2019

On Thu, Sep 26, 2019 at 05:34:49PM +0200, Ján Tomko wrote:
...
On Thu, Sep 26, 2019 at 02:16:04PM +0100, Daniel P. Berrangé wrote:
...
On Thu, Sep 26, 2019 at 12:39:39PM +0200, Erik Skultety wrote:
...
On Tue, Sep 24, 2019 at 03:58:44PM +0100, Daniel P. Berrangé wrote:
Question 1) What's the benefit of compiling a regex and using it only once? Btw,
Python caches the most recently used patterns passed to re.match (and friends),
so compilation IMO hardly ever makes sense unless you're doing 1000s of searches for the same
Some of the scripts here are run on the whole libvirt codebase, so that
is the case here. For example, just removing the pre-compilation of
regexes for comments from the spacing check script bumped the execution
time from 6.5s to 7.4s.
Sadly, the one script where pre-compilation matters the most is also the
one where separating the regexes puts them so far from their usage that
the two no longer fit on one screen.
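A rough way to measure this kind of overhead on your own machine is
sketched below. The pattern and the input lines are made up for
illustration and are not taken from the spacing check script, so the
numbers will not match the 6.5s / 7.4s quoted above.

```python
import re
import timeit

PATTERN = r"/\*.*?\*/"  # hypothetical comment regex, not from libvirt
LINES = ["int x = 0; /* counter */", "return x;"] * 1000  # made-up input

COMPILED = re.compile(PATTERN)


def with_module_function():
    # re.search() has to look the pattern up in re's internal cache
    # on every call before it can run the search.
    return sum(1 for line in LINES if re.search(PATTERN, line))


def with_precompiled():
    # The compiled pattern object is used directly, with no cache lookup.
    return sum(1 for line in LINES if COMPILED.search(line))


t_module = timeit.timeit(with_module_function, number=20)
t_compiled = timeit.timeit(with_precompiled, number=20)
print(f"re.search: {t_module:.3f}s  precompiled: {t_compiled:.3f}s")
```

Both functions count the same matches; only the per-call lookup differs.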
I could do a little custom function that caches all regexes:

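  import re

  # Maps each regex string to its compiled pattern object.
  recache = {}

  def research(regex, line):
    # Compile on first use only; later calls reuse the cached object.
    if regex not in recache:
      recache[regex] = re.compile(regex)
    return recache[regex].search(line)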

then in the loop we can do a normal

     research(r'''some regex''', line)

so we get readability and full caching together. It is probably not worth
repeating this trick in every script, but the whitespace script and a few
others would probably benefit.
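
For reference, a self-contained sketch of the helper in use. The sample
lines and the comment regex are hypothetical stand-ins, not taken from
any libvirt script. The functools.lru_cache variant is only an
alternative from the standard library that does the same job as the
hand-rolled dict; it is not what the patch does.

```python
import functools
import re

# Hand-rolled cache: regex string -> compiled pattern object.
recache = {}


def research(regex, line):
    """Search line with regex, compiling each distinct pattern only once."""
    if regex not in recache:
        recache[regex] = re.compile(regex)
    return recache[regex].search(line)


# Alternative (an assumption, not part of the patch): let functools
# keep the unbounded cache of compiled patterns.
@functools.lru_cache(maxsize=None)
def compiled(regex):
    return re.compile(regex)


def research_lru(regex, line):
    return compiled(regex).search(line)


# Hypothetical input standing in for the lines of a source file.
sample_lines = [
    "int x = 0; /* counter */",
    "return x;",
    "/* trailing comment */",
]

# The regex literal stays next to its use, yet is compiled only on
# the first call.
comment_lines = [l for l in sample_lines if research(r"/\*.*\*/", l)]
lru_comment_lines = [l for l in sample_lines if research_lru(r"/\*.*\*/", l)]
```

Either way the pattern is written inline at the point of use, and the
cache ends up holding one entry per distinct regex string.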

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|