
On Mon, Feb 06, 2023 at 05:19:52PM +0100, Erik Skultety wrote:
On Mon, Feb 06, 2023 at 03:45:12PM +0000, Daniel P. Berrangé wrote:
On Mon, Feb 06, 2023 at 02:52:57PM +0100, Erik Skultety wrote:
This is a follow up to: https://listman.redhat.com/archives/libvir-list/2023-January/237201.html
The effort here is to unify the way builds/tests are executed in GitLab CI vs local container executions and make another step forward in terms of reproducibility of (specifically) GitLab environments.
Even though code to run all but one (coverity) jobs from GitLab via the build.sh script is added with this series, local behavior remains the same as before this series. The reason for that is that that will require more patches ridding of the Makefile which is currently used and instead integrate usage of lcitool with the ci/helper Python script which is currently the entry point for local container executions.
snip
.gitlab-ci.yml | 56 ++++++++++------------- ci/Makefile | 16 ++++--- ci/build.sh | 121 +++++++++++++++++++++++++++++++++++++++++++------ ci/helper | 21 ++++++--- 4 files changed, 155 insertions(+), 59 deletions(-) mode change 100644 => 100755 ci/build.sh
I'm really super unenthusiastic about this change, and also the similar change to add an ci/integration.sh. Shell scripting is something we worked hard to eliminate from libvirt. It is an awful language to do anything non-trivial with, error handling, lack of data structures, variable handling, portability are all bug generators.
I know the .gitlab-ci.yml 'script' commands are technically shell script, but they are pretty trivial bits and don't have to worry about portability for dash vs bash etc, or complex control logic. The majority of it is just a simple list of commands to invoke, with the occasional conditional.
The build.sh script is by contrast significantly more complex. By the very nature of taking "N" separate gitlab jobs and multiplexing them onto the one shell script, we're now having todo command line arg parsing in shell and de-multiplexing back to commands. The CI logic is now split up across even more sources - the gitlab config, the build.sh script and the meson.build files, which I think is worse for maintaining this too.
True, except that even despite that, and I know what I'm talking about since I wrote the integration jobs templates, GitLab is super picky; you can't debug the syntax problems properly; certain common syntactic sugars don't work and have to be replaced by less common counterparts; using YAML for bash formatting leads to many table-flipping moments; you have to wait for Nx10 minutes before it fails with some ridiculous error you would have caught early locally but you couldn't because of the stage and artifact dependencies. So, once you have it finally in place it's unarguably nice, but the road towards that isn't really rainbow and unicorns either, it's still shell after all. Since I'm mainly working on this part of libvirt, I myself would appreciate if I could easily just run a single script instead of copy pasting commands one-by-one to a local VM to catch stupid errors quickly rather than wait for GitLab to fail.
I appreciate the goal of being able to run CI commands locally, but I'm not feeling this approach is a net win. I'd much rather just having developers invoke the meson command directly, which is not that hard.
Well, the big picture here is about making running integration tests locally less painful than it is now...plus increase the amount of automation involved. That's the major driver here - libvirt needs to be built locally in a safe environment before integration tests could be run against it without touching the host. So, with this script, while I agree with everything you said about Bash, was just one step towards getting the automation I mentioned in place. I also considered Python (yeah..why), but that would be super akward.
TL;DR: if we don't do anything about how we currently maintain the build recipes (note we're maintaining 2 already), everybody will continue ignoring the integration test suite, unless we enable merge requests where the status quo would be somewhat (but not completely) eliminated. With integration tests you can't ignore the aspect of getting feedback early compared to waiting for GitLab CI.
If we do really want to provide something that 100% mirrors the gitlab CI job commands though, I'm even more convinced now that we should be using the .gitlab-ci.yml as the canonical source.
Since I last mentioned that idea, I found out that this should be something we can achieve via the gitlab runner 'exec' command.
Haven't heard of ^this one, but I wonder how we can get something meaningful out of it for the purposes I mentioned above.
Erik
So, I had a brief look at gihttps://gitlab.com/gitlab-org/gitlab-runner/-/issues/2797tlab-runner exec this morning. Long story short, it's deprecated [1] and though it was scheduled for complete removal, that plan was put on hold for now. Nevertheless, it's not getting any features that are introduced to .gitlab-ci.yml. The gist is that it doesn't understand 'include' nor 'extends' which we're using heavily across our gitlab configuration hierarchy, so it's a no-go. It also doesn't support artifacts in any form, so while it technically should be possible to save RPM builds in a volume (not sure if the bind mount is cleared on job completion or not) we could not pass them into a VM running the integration tests the same convenient way as we do in the CI. Another thing to consider is that this solution also requires to have the gitlab-runner binary installed locally, which many in the community may not be comfortable with, but that's up for a debate, but hey, you can always build it yourself. There's a plan to rework GitLab's local CI execution functionality heavily [2] before the 'exec' command is sunset, but ultimately their plan is have something working by Q3 FY24, which can be anywhere from March 2024 until September 2024, that's more than a year ahead. In your reply you mentioned the unappealing complexity of the script potentially leading to bugs. At the same time though one currently can't consume .gitlab-ci.yml recipes to run a local integration test in a VM. So, how about I get rid of the multiplexing and CLI parsing by placing each job recipe in a standalone script or even going one step further and extract the commonalities to a separate file that would be source from within each job script? Would you consider that meeting each other halfway? As for POSIX compliance, I guess this would be a soft-requirement based on whether shellcheck was run during review, does gitlab do something better? IIUC there's no pre-check and you'll only know after a job was executed that there was an incompatible statement. [1] https://docs.gitlab.com/runner/commands/#gitlab-runner-exec-deprecated [2] https://gitlab.com/gitlab-org/gitlab-runner/-/issues/2797 Regards, Erik