On Wed, Jul 21, 2021 at 07:26:49PM +0100, Daniel P. Berrangé wrote:
> On Tue, Jul 20, 2021 at 02:19:25PM +0200, Erik Skultety wrote:
> > On Thu, Jul 01, 2021 at 07:04:32PM +0100, Daniel P. Berrangé wrote:
> > > Libvirt has consumers writing applications in a variety of
> > > languages, and periodically reporting bugs. My general wish
> > > for a test harness would be for something that can invoke
> > > and consume arbitrary test cases. Essentially if someone
> > > can provide us a piece of code that demonstrates a problem
> > > in libvirt, I would like to be able to use that directly as
> > > a functional test, regardless of language it was written
> > > in.
> >
> > Well, in theory yes - it is all nice to let anyone contribute tests in their
> > language of preference. Realistically, look at the nwfilter stuff in TCK: it's
> > a pile of unintelligible Bash which cannot be debugged. So, I understand why
> > you'd want to imagine or write tests in Go, but someone will have to maintain
> > them, otherwise we'd end up with a 3rd framework nobody will want to use.
> The nwfilter shell script was just badly written, and we should have
> pushed back on accepting it in that state, but it was so broadly
> functional that it was preferable to getting no testing. It could have
> been just as horrible if written in Perl or Python too - the core
> approach of the tests themselves was just wrong. The use of shell
> was just icing on the cake.
>
> The main point is that we have applications consuming libvirt in
> multiple languages. Python and Go are the most important ones in
> terms of usage today, and we get some quite detailed bug reports
> with demo code. If we get a bug report with a Go demo program,
> it can't be assumed to be easy to just convert that to Python,
> and vice versa.
If the bug is against the Go/whatever bindings, the test should be integrated
there. If it's against libvirt core, I don't see how the test case could not
be ported or reproduced in Python, and so I'd strictly mandate Python. Now,
I'm not saying we should reject anything. What I'm saying is that, in the same
way we don't accept just any code in any language in libvirt upstream (even if
people went the extra mile and made sure the whole project would just work),
and people have to re-spin a number of revisions until the community is
satisfied enough to merge it, for both stability and maintainability
reasons, we should approach the tests with the same care and set clear
expectations that must be met if a test case is to be contributed.

If a test case is contributed as part of some code changes, then it has to
conform to our expectations == Python.
On the other hand, if it's part of a bug report, then IMO we need to invest
the time into porting that test case to our selected language of preference
ourselves.
...
> > > I'm wondering if we can't bridge the divide between what we
> > > have already in libvirt-tck, and what you're bringing to the
> > > table with avocado here. While we've not done much development
> > > with the TCK recently, there are some very valuable tests
> > > there, especially related to firewall support and I don't
> > > fancy rewriting them.
> >
> > Nobody fancies rewriting those, but there's no way around it; we'll
> > have to do it at some point because, like I said, they're very error prone
> > since they match against the verbatim output of firewall CLI tools, which
> > leads to frequent breakages, and figuring out what went wrong is pure
> > torment.
> Agreed, we'll have to bite the bullet eventually. The hard thing is
> deciding just how we want to validate correctness. Comparing stdout
> format is an "easy" way. If we don't do that, we need to figure out a
> way to declare what the firewall state is supposed to be, and how to
> validate a match, as we want to validate that the correct set of
> rules is actually generated.
>
> Possibly we should have used a python binding for libxtables to
> query the firewall state and thus been in control of the expected
> data format.
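To illustrate the "declare the expected state and validate a match" idea rather than diffing verbatim CLI output, here is a minimal, hypothetical Python sketch. The rule fields and helper names are invented for the example; in reality the `observed` list would be built by querying the firewall (e.g. via a libxtables binding or by parsing `iptables -S`):

```python
# Hypothetical sketch: compare a declared set of expected firewall rules
# against observed state, ignoring ordering and cosmetic differences,
# instead of diffing verbatim iptables output.

def normalise(rule):
    """Reduce a rule dict to a canonical, hashable form."""
    return tuple(sorted((k, str(v)) for k, v in rule.items() if v is not None))

def validate_rules(expected, observed):
    """Return (missing, unexpected) rule sets after normalisation."""
    exp = {normalise(r) for r in expected}
    obs = {normalise(r) for r in observed}
    return exp - obs, obs - exp

expected = [
    {"chain": "FORWARD", "proto": "tcp", "dport": 22, "target": "ACCEPT"},
    {"chain": "FORWARD", "proto": "udp", "dport": 67, "target": "ACCEPT"},
]
# 'observed' would come from querying the live firewall state;
# hard-coded here purely for illustration.
observed = [
    {"chain": "FORWARD", "proto": "udp", "dport": 67, "target": "ACCEPT"},
    {"chain": "FORWARD", "proto": "tcp", "dport": 22, "target": "DROP"},
]

missing, unexpected = validate_rules(expected, observed)
```

A test would then fail with a readable diff of rule dicts instead of a wall of mismatched iptables text.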
I don't know anything about libxtables, so I'll happily leave that on the table
and hope for the best in terms of relying on a stable data format.
But I guess the proper way of handling networking is to come up with helper
code that constructs and sends raw packets to test whether the firewall rules
we put in place as part of a test case actually work.
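A helper like that could look roughly like the following sketch, which builds a TCP SYN probe using only the Python stdlib (a real helper would more likely use something like scapy). The addresses and ports are placeholders; actually sending the probe needs a raw socket and root privileges, so only packet construction is shown:

```python
# Sketch of a raw-packet helper: build a TCP SYN segment and (with root
# privileges) push it through a raw socket to check whether a firewall
# rule actually blocks or passes the traffic.
import socket
import struct

def checksum(data: bytes) -> int:
    """Standard ones'-complement Internet checksum (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    total = (total >> 16) + (total & 0xFFFF)
    total += total >> 16
    return ~total & 0xFFFF

def build_tcp_syn(src: str, dst: str, sport: int, dport: int) -> bytes:
    """Return a 20-byte TCP SYN header, checksummed over the IPv4 pseudo-header."""
    seq, ack = 0, 0
    offset_flags = (5 << 12) | 0x002          # data offset 5 words, SYN flag
    tcp = struct.pack("!HHIIHHHH", sport, dport, seq, ack,
                      offset_flags, 65535, 0, 0)
    pseudo = struct.pack("!4s4sBBH", socket.inet_aton(src),
                         socket.inet_aton(dst), 0, socket.IPPROTO_TCP, len(tcp))
    csum = checksum(pseudo + tcp)
    return tcp[:16] + struct.pack("!H", csum) + tcp[18:]

probe = build_tcp_syn("192.168.122.1", "192.168.122.10", 40000, 22)
# Sending would then be roughly (root required):
#   s = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_TCP)
#   s.sendto(probe, ("192.168.122.10", 0))
```

The test case would send such a probe from one side of the rule and listen on the other, asserting that the packet does or does not arrive.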
...
> > > Thus my suggestion is that we:
> > >
> > >  - Put this avocado code into the libvirt-tck repository,
> > >    with focus on the supporting infra for making it easy to
> > >    write Python tests
> >
> > Clearly I missed the section where you mentioned that we don't need to host
> > the framework in the main repo... Anyway, I disagree here. Again, we have a
> > precedent of 2 failed frameworks where nobody really knows where to contribute
> > a test nor wants to deal with any additional repos. A few remarks:
> > - it is much easier if the test suite lives alongside our unit test suite,
> >   which is an "environment" every developer is familiar with.
> I don't buy that argument. We already have countless repos and people
> who are motivated to contribute do so easily. The failures of the
> existing frameworks are down to a number of reasons. For the TCK, the
> big one is that people really didn't want to write Perl code. For the
> other framework, its complexity and the fact that it was not safe to
> run on precious hosts put people off ever using it.
>
> The biggest thing though is that we never even made an effort to
> develop a culture of writing functional tests as a prerequisite
> for merging features.
>
> In QEMU there is an avocado integration in-repo, but very few
> QEMU contributors ever do anything with it. The block iotests
> integration is another functional test suite and that is very
> successful, because the block maintainers have set a clear
> expectation that you must contribute tests to it if you want
> your features accepted.
>
> We can see the same with other projects which have a strong
> culture of testing. In OpenStack you'll not get your feature
> accepted unless you've made a credible effort to provide
> functional tests - which live in a completely separate git
> repo.
> > - integration into gitlab CI would be much easier with a single repo than
> >   with a multi-repo setup
> >   -> yes, we already do that with binding builds, but I'm talking about
> >      setups where we'd have to configure gitlab to clone 3 repos and build
> >      them in a chain, set up the framework config and execute the tests;
> As you say, we already deal with that. Cloning extra git repos is
> a complete non-issue in the grand scheme of everything needed to
> create and maintain a CI pipeline. If we really wanted to, we could
> create a pseudo-monolithic repo and graft in the individual repos
> using 'git subtree'.
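For what it's worth, the extra clone in CI really is a one-liner; a hypothetical GitLab CI fragment (the job name, repo URL, and the `run-tests.sh` entry point are all illustrative, not real libvirt CI config) might look like:

```yaml
# Hypothetical .gitlab-ci.yml fragment: fetch the separate test repo
# alongside the libvirt checkout, build libvirt, then run the suite.
integration:
  script:
    - git clone --depth 1 https://gitlab.com/libvirt/libvirt-tck.git tck
    - meson setup build && ninja -C build && sudo ninja -C build install
    - cd tck && ./run-tests.sh   # illustrative entry point
```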
Well, it's not that it cannot be done, it's about developer experience.
Let's say I'm working on a feature for which I'm expected to contribute a test
case.
I have to clone 2 repos and make changes to both - this is fine, even though
having it in a single repo would speed things up. Now I have to test my test
case, what then?
- I need to build and deploy libvirt myself
- I need to build/run the test suite against the libvirt I built in the
  previous step - all manually!
If we instead have everything in the same repo, we can (and we indeed plan to)
have automation helpers that would prepare the environment for you: with a
single "make integration_test" an environment would be created, the git repo
cloned, libvirt built and installed, and the tests run against that. Also the
review process would be much more predictable, because a feature could not be
merged before the corresponding test was merged, which is much easier to do in
a single repo than in 2 individual ones, especially with one that is solely
based on a merge request workflow, which on its own is enough for many core
libvirt developers to be discouraged from contributing.
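To make that concrete, here is a hedged sketch of what such a "make integration_test" helper could drive. The repo URL, build commands, and test path are placeholders, not an actual libvirt make target:

```python
# Illustrative sketch: each step of the proposed "make integration_test"
# flow becomes one command, run in order, aborting the chain on failure.
import subprocess

def plan_steps(workdir="build-integration"):
    """The command chain such a helper might run (all paths illustrative)."""
    return [
        ["git", "clone", "--depth", "1",
         "https://gitlab.com/libvirt/libvirt.git", workdir],
        ["meson", "setup", f"{workdir}/build", workdir],
        ["ninja", "-C", f"{workdir}/build"],
        ["sudo", "ninja", "-C", f"{workdir}/build", "install"],
        ["pytest", "tests/integration"],
    ]

def integration_test(workdir="build-integration"):
    for cmd in plan_steps(workdir):
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
```

The point is just that the whole clone/build/install/test chain is scriptable behind one entry point once everything lives in one repo.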
All I'm trying to say here is that we need to put the developers' needs and
workflow first, because in the end, it's they who are going to both contribute
and maintain those tests. And so we should get more developer voices here (so
far it has been just you Dan, Cleber, Beraldo, and myself) - if the overall
consensus is in favour of using arbitrary languages for test cases and hosting
the repo separately, then by all means let's do it, as long as the core
development team agrees on it; otherwise we'll just end up with another useless
framework and many human hours wasted on the CI efforts.
Erik