On 09/25/2015 01:27 PM, Daniel P. Berrange wrote:
So, I instrumented the netcf and augeas code to check timings.
The aug_get calls take less than a millisecond, as do the various
other calls. I found the bulk of the time is actually coming from
the netcf function "get_augeas", which in turn calls "aug_load"
for every single damn netcf function call. So when we have 500
interfaces, we're telling augeas to load all the config files
1000 times. That's where the slowness is coming from....
Either we need to stop loading config files on every function
call in netcf,
Right you are! netcf has a variable "load_augeas" that is set each time
an API call is started. I removed the line that sets it, and the time
for virsh iface-list --all for 309 toplevel interfaces (300 vlans
connected to 300 bridges + whatever is really on the host) went from
22.2 sec. to 1.47 sec!
But of course we can't blindly re-use the initial data forever. Looking
at aug_load() itself, I see a lot of comments about everything it does
to avoid unnecessary re-loading of files. This was added in 2010, and is
probably what I remember David talking about:
commit 5ee8163051be8214507c13c86171ac90ca7cb91f
Author: David Lutterkort <lutter(a)redhat.com>
Date:   Tue Jun 29 15:32:44 2010 -0700

    Avoid unnecessary file parsing when reloading the tree

    We used to reparse every file we knew about upon aug_load. Now, we only
    reparse files if the file has changed on disk.

    We test a few scenarios to make sure aug_load retains its behavior of
    obliterating the tree and filling it with the latest from disk. This
    includes throwing away unsaved changes or trees that have been deleted.

So now the question is whether something can be done to improve
aug_load() (maybe it used to perform better and there has been a
regression?), or if netcf should derive a list of files from the list
sent to augeas, and do its own checking of the timestamps.
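
To make the second option a bit more concrete, here is a rough sketch of
what a timestamp check on the netcf side might look like. Nothing in it is
taken from netcf or augeas: struct watched_file and files_changed() are
hypothetical names, and it assumes netcf already has a flat list of the
config file paths that it handed to augeas.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical cache entry, one per config file handed to augeas. */
struct watched_file {
    const char *path;
    bool exists;    /* did the file exist at the previous check? */
    time_t mtime;   /* values recorded at the previous check */
    off_t size;
    ino_t ino;
};

/*
 * Return true if any file appeared, disappeared, or changed its
 * mtime/size/inode since the previous call, and record the current
 * state. The very first call reports a change for every file that
 * exists, because nothing has been recorded yet.
 */
bool files_changed(struct watched_file *files, size_t n)
{
    bool changed = false;

    assert(files != NULL || n == 0);

    /* No early exit: every entry must be refreshed on each pass. */
    for (size_t i = 0; i < n; i++) {
        struct watched_file *f = &files[i];
        struct stat st;
        bool exists = (stat(f->path, &st) == 0);

        if (exists != f->exists) {
            changed = true;
        } else if (exists && (st.st_mtime != f->mtime ||
                              st.st_size != f->size ||
                              st.st_ino != f->ino)) {
            changed = true;
        }

        f->exists = exists;
        if (exists) {
            f->mtime = st.st_mtime;
            f->size = st.st_size;
            f->ino = st.st_ino;
        }
    }
    return changed;
}
```

The idea would be for get_augeas to call aug_load() only when
files_changed() returns true. Two caveats: st_mtime only has one-second
resolution, which is why size and inode are compared as well, and a newly
created file that matches a glob is not in the list yet, so the containing
directory would probably need to be watched too.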
(After this message I think we can remove libvir-list from the Cc, since
it's clear that everything is between netcf and augeas).