[libvirt] [PATCH 0/3] Speed up schema testing

Currently, we spawn a new xmllint process for every tested XML file.
Fortunately, xmllint accepts multiple files on its command line.

Use xargs on XML files we expect to be valid and only fall back to
file-by-file check if that fails (or if VIR_TEST_EXPENSIVE was requested).

This speeds up successfully passing domainschematest 32x (from ~13s to ~0.5s)
and make check over 2x (from over 16s to ~8s), making it even more fun to use
with git rebase -x.

Ján Tomko (3):
  schematestutils: split out file-by-file schema checking
  schematestutils: split out check_one_file
  schematestutils: Add check_schema_quick

 tests/schematestutils.sh | 79 ++++++++++++++++++++++++++++++++++++------------
 1 file changed, 59 insertions(+), 20 deletions(-)

-- 
2.7.3

[libvirt] [PATCH 1/3] schematestutils: split out file-by-file schema checking

Checking the XML files one-by-one is expensive, split it out
to make it easier to skip.
---
 tests/schematestutils.sh | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/tests/schematestutils.sh b/tests/schematestutils.sh
index e07e9b9..7ffc800 100644
--- a/tests/schematestutils.sh
+++ b/tests/schematestutils.sh
@@ -9,10 +9,21 @@ SCHEMA="$abs_top_srcdir/docs/schemas/$2"
 
 test_intro $this_test
 
-n=0
-f=0
+check_schema_by_one
+test_final $n $f
+
+ret=0
+test $f != 0 && ret=255
+exit $ret
+
+}
+
+check_schema_by_one () {
+
 for dir in $DIRS
 do
+    n=0
+    f=0
     XML=`find $abs_srcdir/$dir -name '*.xml'` || exit 1
 
     for xml in `echo "$XML" | sort`
@@ -37,11 +48,4 @@ do
         fi
     done
 done
-
-test_final $n $f
-
-ret=0
-test $f != 0 && ret=255
-exit $ret
-
 }
-- 
2.7.3

[libvirt] [PATCH 2/3] schematestutils: split out check_one_file

If we expect an XML file to be invalid, we cannot check multiple
files with a single xmllint command.

Split out the minimum needed to check a file into a separate function.
---
 tests/schematestutils.sh | 32 ++++++++++++++++++--------------
 1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/tests/schematestutils.sh b/tests/schematestutils.sh
index 7ffc800..498b7ec 100644
--- a/tests/schematestutils.sh
+++ b/tests/schematestutils.sh
@@ -28,24 +28,28 @@ do
 
     for xml in `echo "$XML" | sort`
     do
-        n=`expr $n + 1`
-        cmd="xmllint --relaxng $SCHEMA --noout $xml"
-        result=`$cmd 2>&1`
-        ret=$?
-
-        # Alter ret if error was expected.
-        case $xml:$ret in
-            *-invalid.xml:[34]) ret=0 ;;
-            *-invalid.xml:0) ret=3 ;;
-        esac
-
+        check_one_file
         test_result $n $(basename $(dirname $xml))"/"$(basename $xml) $ret
         if test "$verbose" = "1" && test $ret != 0 ; then
             printf '%s\n' "$cmd" "$result"
         fi
-        if test "$ret" != 0 ; then
-            f=`expr $f + 1`
-        fi
     done
 done
 }
+
+check_one_file () {
+n=`expr $n + 1`
+cmd="xmllint --relaxng $SCHEMA --noout $xml"
+result=`$cmd 2>&1`
+ret=$?
+
+# Alter ret if error was expected.
+case $xml:$ret in
+    *-invalid.xml:[34]) ret=0 ;;
+    *-invalid.xml:0) ret=3 ;;
+esac
+
+if test "$ret" != 0 ; then
+    f=`expr $f + 1`
+fi
+}
-- 
2.7.3
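
The status mapping that check_one_file applies can be tried in isolation. The following is only a sketch: the helper name adjust_ret is hypothetical and not part of the patch, and it prints the adjusted status instead of setting the global ret. Like the patch, it assumes xmllint reports a validation failure with exit status 3 or 4.

```shell
# Hypothetical helper, not in the patch: print the test result for FILE
# given the exit STATUS of the validator.
# Usage: adjust_ret FILE STATUS
adjust_ret () {
    case $1:$2 in
        # File is expected to be invalid and validation failed: pass.
        *-invalid.xml:[34]) echo 0 ;;
        # File is expected to be invalid but validated cleanly: fail.
        *-invalid.xml:0) echo 3 ;;
        # Anything else keeps the status it came with.
        *) echo "$2" ;;
    esac
}
```

For example, `adjust_ret foo-invalid.xml 3` prints 0, while `adjust_ret foo-invalid.xml 0` prints 3. This inversion is why files expected to be invalid cannot share one xmllint invocation with the valid ones.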

[libvirt] [PATCH 3/3] schematestutils: Add check_schema_quick

We have hundreds of XML files. Pass multiple of them on xmllint
command line to avoid spawning a process for every single one.

If any of them fails validation, or VIR_TEST_EXPENSIVE was set to 1,
fall back to the file-by-file checking.
---
 tests/schematestutils.sh | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/tests/schematestutils.sh b/tests/schematestutils.sh
index 498b7ec..81a1006 100644
--- a/tests/schematestutils.sh
+++ b/tests/schematestutils.sh
@@ -6,10 +6,19 @@ check_schema () {
 
 DIRS=$1
 SCHEMA="$abs_top_srcdir/docs/schemas/$2"
+SKIP_EXPENSIVE=0
 
 test_intro $this_test
 
-check_schema_by_one
+if test "$VIR_TEST_EXPENSIVE" != 1; then
+    check_schema_quick
+    test $f == 0 && SKIP_EXPENSIVE=1
+fi
+
+if test $SKIP_EXPENSIVE != 1; then
+    check_schema_by_one
+fi
+
 test_final $n $f
 
 ret=0
@@ -18,6 +27,28 @@ exit $ret
 
 }
 
+check_schema_quick () {
+    f=0
+    ABSDIRS=""
+    for dir in $DIRS; do
+        ABSDIRS="$ABSDIRS $abs_srcdir/$dir"
+    done
+    VALIDXMLS=`find $ABSDIRS -not -name '*-invalid.xml' -name '*.xml'` || exit 1
+    INVALIDXMLS=`find $ABSDIRS -name '*-invalid.xml'` || exit 1
+
+    n=`echo "$VALIDXMLS" | wc -l`
+    result=`echo "$VALIDXMLS" | xargs xmllint --relaxng $SCHEMA --noout 2>&1`
+    ret=$?
+    if test "$ret" != 0 ; then
+        f=`expr $f + 1`
+    fi
+
+    for xml in `echo "$INVALIDXMLS" | sort`
+    do
+        check_one_file
+    done
+}
+
 check_schema_by_one () {
 
 for dir in $DIRS
-- 
2.7.3
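
The batch-then-fallback idea can be sketched independently of the test harness. This is an illustration under stated assumptions, not the patch itself: the function names check_batch, check_each and check_quick_then_slow are hypothetical, the validator is passed in as a command string so that anything can stand in for xmllint, and file names are assumed to contain no whitespace.

```shell
# check_batch VALIDATOR FILE...
# Fast path: hand all files to the validator through xargs.
# Returns 0 only if every invocation succeeded.
check_batch () {
    validator=$1
    shift
    test $# -gt 0 || return 0   # nothing to check
    # $validator is deliberately unquoted so it may carry options.
    printf '%s\n' "$@" | xargs $validator >/dev/null 2>&1
}

# check_each VALIDATOR FILE...
# Slow path: one process per file. Sets the globals n (files checked)
# and f (files failed) and prints the name of every failing file.
check_each () {
    validator=$1
    shift
    n=0
    f=0
    for file in "$@"; do
        n=$((n + 1))
        if ! $validator "$file" >/dev/null 2>&1; then
            f=$((f + 1))
            printf 'FAIL: %s\n' "$file"
        fi
    done
}

# check_quick_then_slow VALIDATOR FILE...
# Try the batch first; only when it fails, rerun file by file to find
# out which files are responsible.
check_quick_then_slow () {
    if check_batch "$@"; then
        return 0
    fi
    check_each "$@"
    test "$f" = 0
}
```

A call could look like `check_quick_then_slow "xmllint --relaxng $SCHEMA --noout" $VALIDXMLS`. Note that xargs may split a long file list into several invocations, so the fast path is not guaranteed to be a single process; it still reports a non-zero status if any invocation failed.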

On Tue, Jun 07, 2016 at 05:15:55PM +0200, Ján Tomko wrote:
> Currently, we spawn a new xmllint process for every tested XML file.
> Fortunately, xmllint accepts multiple files on its command line.
>
> Use xargs on XML files we expect to be valid and only fall back to
> file-by-file check if that fails (or if VIR_TEST_EXPENSIVE was requested).
>
> This speeds up successfully passing domainschematest 32x (from ~13s to ~0.5s)
> and make check over 2x (over 16s to ~8s), making it even more fun to use
> with git rebase -x.

How about killing the horrible use of shell script and instead just
calling virXMLValidateAgainstSchema() from a regular C unit test?
That ought to reduce overhead even more and for added benefit
eliminate awful shell script code.

Regards,
Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

participants (2)
- Daniel P. Berrange
- Ján Tomko