[libvirt PATCH 0/5] gitlab: expand the CI job coverage (RESEND)

There are two main goals with this series:

 - Introduce a minimal job building the website and publishing an artifact
   which can be deployed onto libvirt.org
 - Expand the CI jobs to get coverage closer to Travis/Jenkins

The previous posting lost its last two mails due to a transient SMTP problem.

Daniel P. Berrangé (5):
  gitlab: use CI for building website contents
  gitlab: reduce number of cross-build CI jobs
  gitlab: group jobs into stages
  gitlab: add several native CI jobs
  gitlab: rename the cross build jobs

 .gitlab-ci.yml | 105 +++++++++++++++++++++++++++++++++++--------------
 1 file changed, 75 insertions(+), 30 deletions(-)

--
2.24.1

Run the bare minimum build that is possible to create the docs. Ideally the
'--without-remote' arg would be passed, but there are several bugs preventing
a build from succeeding without the remote driver built.

The generated website is published as an artifact and thus is browsable on
build completion and can be downloaded as a zip file.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 .gitlab-ci.yml | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index ea49c6178b..6f7e0ce135 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -44,3 +44,26 @@ debian-sid-cross-i686:
 debian-sid-cross-mipsel:
   <<: *job_definition
   image: quay.io/libvirt/buildenv-libvirt-debian-sid-cross-mipsel:latest
+
+# This artifact published by this job is downloaded by libvirt.org to
+# be deployed to the web root:
+# https://gitlab.com/libvirt/libvirt/-/jobs/artifacts/master/download?job=webs...
+website:
+  script:
+    - mkdir build
+    - cd build
+    - ../autogen.sh $CONFIGURE_OPTS --prefix=$(pwd)/../vroot || (cat config.log && exit 1)
+    - make -j $(getconf _NPROCESSORS_ONLN)
+    - make -j $(getconf _NPROCESSORS_ONLN) install
+    - cd ..
+    - mv vroot/share/doc/libvirt/html/ website
+  image: quay.io/libvirt/buildenv-libvirt-fedora-31:latest
+  variables:
+    CONFIGURE_OPTS: --without-libvirtd --without-esx --without-hyperv --without-test --without-dtrace --without-openvz --without-vmware --without-attr --without-audit --without-blkid --without-bash-completion --without-capng --without-curl --without-dbus --without-firewalld --without-fuse --without-glusterfs --without-libiscsi --without-libssh --without-numactl --without-openwsman --without-pciaccess --without-readline --without-sanlock --without-sasl --without-selinux --without-ssh2 --without-udev
+  artifacts:
+    expose_as: 'Website'
+    name: 'website'
+    when: on_success
+    expire_in: 30 days
+    paths:
+      - website

--
2.24.1

On Tue, 2020-03-10 at 10:09 +0000, Daniel P. Berrangé wrote:
+# This artifact published by this job is downloaded by libvirt.org to
+# be deployed to the web root:
+# https://gitlab.com/libvirt/libvirt/-/jobs/artifacts/master/download?job=webs...
+website:
+  script:
+    - mkdir build
+    - cd build
+    - ../autogen.sh $CONFIGURE_OPTS --prefix=$(pwd)/../vroot || (cat config.log && exit 1)
+    - make -j $(getconf _NPROCESSORS_ONLN)
+    - make -j $(getconf _NPROCESSORS_ONLN) install
+    - cd ..
+    - mv vroot/share/doc/libvirt/html/ website
+  image: quay.io/libvirt/buildenv-libvirt-fedora-31:latest
+  variables:
+    CONFIGURE_OPTS: --without-libvirtd --without-esx --without-hyperv --without-test --without-dtrace --without-openvz --without-vmware --without-attr --without-audit --without-blkid --without-bash-completion --without-capng --without-curl --without-dbus --without-firewalld --without-fuse --without-glusterfs --without-libiscsi --without-libssh --without-numactl --without-openwsman --without-pciaccess --without-readline --without-sanlock --without-sasl --without-selinux --without-ssh2 --without-udev
+  artifacts:
+    expose_as: 'Website'
+    name: 'website'
+    when: on_success
+    expire_in: 30 days
+    paths:
+      - website
The overall idea of building the website as a CI job is a reasonable one,
especially because it will allow us to stop periodically spending time
convincing libvirt to build just enough on what has long been an unsupported
target.

A couple of more technical questions:

* why do we care about whether all those features are enabled or not? It's
  pretty ugly to have that list hardcoded in our build scripts, and I don't
  quite get the point in having it in the first place;

* as a follow-up, why would this be a separate job? We are always going to
  build the website on one of our supported targets, so basically we end up
  building it twice... Can't we just generate the artifact as a side effect
  of the regular Fedora 31 build?

--
Andrea Bolognani / Red Hat / Virtualization

On Fri, Mar 20, 2020 at 06:07:47PM +0100, Andrea Bolognani wrote:
On Tue, 2020-03-10 at 10:09 +0000, Daniel P. Berrangé wrote:
+# This artifact published by this job is downloaded by libvirt.org to
+# be deployed to the web root:
+# https://gitlab.com/libvirt/libvirt/-/jobs/artifacts/master/download?job=webs...
+website:
+  script:
+    - mkdir build
+    - cd build
+    - ../autogen.sh $CONFIGURE_OPTS --prefix=$(pwd)/../vroot || (cat config.log && exit 1)
+    - make -j $(getconf _NPROCESSORS_ONLN)
+    - make -j $(getconf _NPROCESSORS_ONLN) install
+    - cd ..
+    - mv vroot/share/doc/libvirt/html/ website
+  image: quay.io/libvirt/buildenv-libvirt-fedora-31:latest
+  variables:
+    CONFIGURE_OPTS: --without-libvirtd --without-esx --without-hyperv --without-test --without-dtrace --without-openvz --without-vmware --without-attr --without-audit --without-blkid --without-bash-completion --without-capng --without-curl --without-dbus --without-firewalld --without-fuse --without-glusterfs --without-libiscsi --without-libssh --without-numactl --without-openwsman --without-pciaccess --without-readline --without-sanlock --without-sasl --without-selinux --without-ssh2 --without-udev
+  artifacts:
+    expose_as: 'Website'
+    name: 'website'
+    when: on_success
+    expire_in: 30 days
+    paths:
+      - website
The overall idea of building the website as a CI job is a reasonable one, especially because it will allow us to stop periodically spending time convincing libvirt to build just enough on what has long been an unsupported target.
A couple of more technical questions:
* why do we care about whether all those features are enabled or not? It's pretty ugly to have that list hardcoded in our build scripts, and I don't quite get the point in having it in the first place;
It is to reduce the build time - it cuts the time for the job in half, which is a worthwhile win.
* as a follow up, why would this be a separate job? We are always going to build the website on one of our supported targets, so basically we end up building it twice...
Can't we just generate the artifact as a side-effect of the regular Fedora 31 build?
The other jobs run "distcheck" for building, and as such the built artifacts
are all deleted upon success, as it's all internal to the distcheck target.

Using gitlab stages for different types of builds gives us a more friendly
output view. We can distinguish what aspect of the build has failed at a
glance instead of having to peer into hundreds of KB of build logs.

Eventually we can make use of filters so that when people submit a patch
which only touches the docs, we can skip all the native build and cross
build jobs entirely, only running the docs stage.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
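As an illustration of the filtering idea mentioned above (this is not part of
the posted series), GitLab can restrict a job to pipelines where particular
paths changed via its rules/changes mechanism; the job shown and the docs/
path are assumptions:

# Hypothetical sketch: run the website job only when something under
# docs/ was modified. Skipping the native/cross jobs for docs-only
# pushes would need the inverse condition, which rules:changes does
# not express directly.
website:
  stage: website
  rules:
    - changes:
        - docs/**/*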

On Fri, 2020-03-20 at 17:27 +0000, Daniel P. Berrangé wrote:
On Fri, Mar 20, 2020 at 06:07:47PM +0100, Andrea Bolognani wrote:
* why do we care about whether all those features are enabled or not? It's pretty ugly to have that list hardcoded in our build scripts, and I don't quite get the point in having it in the first place;
It is to reduce the build time - it cuts the time for the job in half, which is a worthwhile win.
On my laptop, where make is configured to use parallel builds by default
through $MAKEFLAGS:

$ git clean -xdf && time sh -c 'mkdir build && cd build && ../autogen.sh && make && make install DESTDIR=$(pwd)/../install'

real    2m52.997s
user    14m46.604s
sys     1m56.444s

$ git clean -xdf && time sh -c 'mkdir build && cd build && ../autogen.sh --without-libvirtd --without-esx --without-hyperv --without-test --without-dtrace --without-openvz --without-vmware --without-attr --without-audit --without-blkid --without-bash-completion --without-capng --without-curl --without-dbus --without-firewalld --without-fuse --without-glusterfs --without-libiscsi --without-libssh --without-numactl --without-openwsman --without-pciaccess --without-readline --without-sanlock --without-sasl --without-selinux --without-ssh2 --without-udev && make && make install DESTDIR=$(pwd)/../install'

real    1m59.594s
user    9m4.929s
sys     1m13.152s

$ git clean -xdf && time sh -c 'mkdir build && cd build && ../autogen.sh && make -C docs/ && make -C docs/ install DESTDIR=$(pwd)/../install'

real    0m33.350s
user    0m54.281s
sys     0m10.986s

So we can basically have our cake and eat it too! :)
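A sketch of how the website job could take advantage of this (this is not the
posted patch; the install prefix and the resulting html path are assumptions):

website:
  stage: website
  image: quay.io/libvirt/buildenv-libvirt-fedora-31:latest
  script:
    - mkdir build
    - cd build
    - ../autogen.sh || (cat config.log && exit 1)
    # Build and install only the docs, as in the timings above
    - make -j $(getconf _NPROCESSORS_ONLN) -C docs
    - make -j $(getconf _NPROCESSORS_ONLN) -C docs install DESTDIR=$(pwd)/../install
    - cd ..
    # With the default /usr/local prefix the generated html would land here
    - mv install/usr/local/share/doc/libvirt/html/ website
  artifacts:
    paths:
      - website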
Using gitlab stages for different types of builds gives us a more friendly output view. We can distinguish what aspect of the build has failed at a glance instead of having to peer into the 100's of KB of build logs.
Are we eventually going to have the same syntax-check / build / check split as we currently have in Jenkins?
Eventually we can make use of filters so that when people submit a patch which only touches the docs, we can skip all the native build and cross build jobs entirely, only running the docs stage.
Sounds like a nice little optimization, assuming we can get it to work reliably, but I have to wonder how frequent it really is that the documentation is updated outside of a series that touches the code as well... -- Andrea Bolognani / Red Hat / Virtualization

On Mon, Mar 23, 2020 at 03:35:03PM +0100, Andrea Bolognani wrote:
On Fri, 2020-03-20 at 17:27 +0000, Daniel P. Berrangé wrote:
On Fri, Mar 20, 2020 at 06:07:47PM +0100, Andrea Bolognani wrote:
* why do we care about whether all those features are enabled or not? It's pretty ugly to have that list hardcoded in our build scripts, and I don't quite get the point in having it in the first place;
It is to reduce the build time - it cuts the time for the job in half, which is a worthwhile win.
On my laptop, where make is configured to use parallel builds by default through $MAKEFLAGS:
$ git clean -xdf && time sh -c 'mkdir build && cd build && ../autogen.sh && make && make install DESTDIR=$(pwd)/../install'
real    2m52.997s
user    14m46.604s
sys     1m56.444s
$ git clean -xdf && time sh -c 'mkdir build && cd build && ../autogen.sh --without-libvirtd --without-esx --without-hyperv --without-test --without-dtrace --without-openvz --without-vmware --without-attr --without-audit --without-blkid --without-bash-completion --without-capng --without-curl --without-dbus --without-firewalld --without-fuse --without-glusterfs --without-libiscsi --without-libssh --without-numactl --without-openwsman --without-pciaccess --without-readline --without-sanlock --without-sasl --without-selinux --without-ssh2 --without-udev && make && make install DESTDIR=$(pwd)/../install'
real    1m59.594s
user    9m4.929s
sys     1m13.152s
$ git clean -xdf && time sh -c 'mkdir build && cd build && ../autogen.sh && make -C docs/ && make -C docs/ install DESTDIR=$(pwd)/../install'
real    0m33.350s
user    0m54.281s
sys     0m10.986s
So we can basically have our cake and eat it too! :)
Using gitlab stages for different types of builds gives us a more friendly output view. We can distinguish what aspect of the build has failed at a glance instead of having to peer into the 100's of KB of build logs.
Are we eventually going to have the same syntax-check / build / check split as we currently have in Jenkins?
This isn't a desirable approach, because in general you're not going to be
sharing the git checkout between build jobs in the pipeline. You can
selectively publish data from one stage to another, but we don't really want
to publish the entire build dir output, which is what would be required to
split off the build & check stages.

The main benefit for having them separate is to make it easier to view the
logs to see what part failed.

GitLab has a mechanism for publishing artifacts, and the GNOME projects use
this to publish their unit test results in junit format IIUC. If we can get
something like this wired up then we can solve the problem of making it easy
to view test failures as a distinct thing from general build failures.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
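For reference, GitLab's junit report support could be wired into one of the
native jobs along these lines; the results file path is an assumption and
this is not something the posted series does:

fedora-31:
  <<: *native_build_job_definition
  image: quay.io/libvirt/buildenv-libvirt-fedora-31:latest
  artifacts:
    # Publish test results even when the job fails, so GitLab can show
    # test failures separately from the raw build log; the file name
    # below is illustrative only.
    when: always
    reports:
      junit: build/tests/test-results.xml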

On Mon, 2020-03-23 at 15:27 +0000, Daniel P. Berrangé wrote:
On Mon, Mar 23, 2020 at 03:35:03PM +0100, Andrea Bolognani wrote:
Are we eventually going to have the same syntax-check / build / check split as we currently have in Jenkins?
This isn't a desirable approach, because in general you're not going to be sharing the git checkout between build jobs in the pipeline. You can selectively publish data from one stage to another, but we don't really want to publish the entire build dir output, which is what would be required to split off the build & check stages.
Makes sense. I was asking mostly out of curiosity anyway.
The main benefit for having them separate is to make it easier to view the logs to see what part failed.
GitLab has a mechanism for publishing artifacts, and the GNOME projects use this to publish their unit test results in junit format IIUC. If we can get something like this wired up then we can solve the problem of making it easy to view test failures as a distinct thing from general build failures.
That'd be neat :) -- Andrea Bolognani / Red Hat / Virtualization

On Tue, Mar 10, 2020 at 10:09:41AM +0000, Daniel P. Berrangé wrote:
Run the bare minimum build that is possible to create the docs. Ideally the '--without-remote' arg would be passed, but there are several bugs preventing a build from succeeding without the remote driver built.
The generated website is published as an artifact and thus is browsable on build completion and can be downloaded as a zip file.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> ---
Reviewed-by: Erik Skultety <skultety.erik@gmail.com>

We're going to add more build jobs to CI, and users have limited time granted
on the shared CI runners. The number of cross-build jobs currently present is
not sustainable, so cut it down to two interesting jobs to cover big endian
and 32-bit platform variants.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 .gitlab-ci.yml | 37 ++++++-------------------------------
 1 file changed, 6 insertions(+), 31 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 6f7e0ce135..b6a8db7881 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -5,29 +5,12 @@
     - ../autogen.sh $CONFIGURE_OPTS || (cat config.log && exit 1)
     - make -j $(getconf _NPROCESSORS_ONLN)
 
-# We could run every arch on every versions, but it is a little
-# overkill. Instead we split jobs evenly across 9, 10 and sid
-# to achieve reasonable cross-coverage.
-
-debian-9-cross-armv6l:
-  <<: *job_definition
-  image: quay.io/libvirt/buildenv-libvirt-debian-9-cross-armv6l:latest
-
-debian-9-cross-mips64el:
-  <<: *job_definition
-  image: quay.io/libvirt/buildenv-libvirt-debian-9-cross-mips64el:latest
-
-debian-9-cross-mips:
-  <<: *job_definition
-  image: quay.io/libvirt/buildenv-libvirt-debian-9-cross-mips:latest
-
-debian-10-cross-aarch64:
-  <<: *job_definition
-  image: quay.io/libvirt/buildenv-libvirt-debian-10-cross-aarch64:latest
-
-debian-10-cross-ppc64le:
-  <<: *job_definition
-  image: quay.io/libvirt/buildenv-libvirt-debian-10-cross-ppc64le:latest
+# There are many possible cross-arch jobs we could do, but to preserve
+# limited CI resource time allocated to users, we cut it down to two
+# interesting variants. The default jobs are x86_64, which means 64-bit
+# and little endian. We thus pick armv7l as an interesting 32-bit
+# platform, and s390x as an interesting big endian platform. We split
+# between Debian 10 and sid to help detect problems on the horizon.
 
 debian-10-cross-s390x:
   <<: *job_definition
@@ -37,14 +20,6 @@ debian-sid-cross-armv7l:
   <<: *job_definition
   image: quay.io/libvirt/buildenv-libvirt-debian-sid-cross-armv7l:latest
 
-debian-sid-cross-i686:
-  <<: *job_definition
-  image: quay.io/libvirt/buildenv-libvirt-debian-sid-cross-i686:latest
-
-debian-sid-cross-mipsel:
-  <<: *job_definition
-  image: quay.io/libvirt/buildenv-libvirt-debian-sid-cross-mipsel:latest
-
 # This artifact published by this job is downloaded by libvirt.org to
 # be deployed to the web root:
 # https://gitlab.com/libvirt/libvirt/-/jobs/artifacts/master/download?job=webs...

--
2.24.1

On Tue, Mar 10, 2020 at 10:09:42AM +0000, Daniel P. Berrangé wrote:
We're going to add more build jobs to CI, and users have limited time granted on the shared CI runners. The number of cross-build jobs currently present is not sustainable, so cut it down to two interesting jobs to cover big endian and 32-bit platform variants.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- .gitlab-ci.yml | 37 ++++++------------------------------- 1 file changed, 6 insertions(+), 31 deletions(-)
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml index 6f7e0ce135..b6a8db7881 100644 --- a/.gitlab-ci.yml +++ b/.gitlab-ci.yml @@ -5,29 +5,12 @@ - ../autogen.sh $CONFIGURE_OPTS || (cat config.log && exit 1) - make -j $(getconf _NPROCESSORS_ONLN)
-# We could run every arch on every versions, but it is a little
-# overkill. Instead we split jobs evenly across 9, 10 and sid
-# to achieve reasonable cross-coverage.
-
-debian-9-cross-armv6l:
-  <<: *job_definition
-  image: quay.io/libvirt/buildenv-libvirt-debian-9-cross-armv6l:latest
-
-debian-9-cross-mips64el:
-  <<: *job_definition
-  image: quay.io/libvirt/buildenv-libvirt-debian-9-cross-mips64el:latest
-
-debian-9-cross-mips:
-  <<: *job_definition
-  image: quay.io/libvirt/buildenv-libvirt-debian-9-cross-mips:latest
-
-debian-10-cross-aarch64:
-  <<: *job_definition
-  image: quay.io/libvirt/buildenv-libvirt-debian-10-cross-aarch64:latest
-
-debian-10-cross-ppc64le:
-  <<: *job_definition
-  image: quay.io/libvirt/buildenv-libvirt-debian-10-cross-ppc64le:latest
+# There are many possible cross-arch jobs we could do, but to preserve
+# limited CI resource time allocated to users, we cut it down to two
+# interesting variants. The default jobs are x86_64, which means 64-bit
+# and little endian. We thus pick armv7l as an interesting 32-bit
+# platform, and s390x as an interesting big endian platform. We split
+# between Debian 10 and sid to help detect problems on the horizon.
Will the sid actually be reliable? I've been trying to install sid into a VM with lcitool for quite a while and it was broken every time I tried. On the other hand, those container images are static, so we're going to run "older" container builds and not get the latest development stuff, but I guess that's fine for these purposes. Erik

On Fri, Mar 20, 2020 at 03:57:40PM +0100, Erik Skultety wrote:
On Tue, Mar 10, 2020 at 10:09:42AM +0000, Daniel P. Berrangé wrote:
We're going to add more build jobs to CI, and users have limited time granted on the shared CI runners. The number of cross-build jobs currently present is not sustainable, so cut it down to two interesting jobs to cover big endian and 32-bit platform variants.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- .gitlab-ci.yml | 37 ++++++------------------------------- 1 file changed, 6 insertions(+), 31 deletions(-)
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml index 6f7e0ce135..b6a8db7881 100644 --- a/.gitlab-ci.yml +++ b/.gitlab-ci.yml @@ -5,29 +5,12 @@ - ../autogen.sh $CONFIGURE_OPTS || (cat config.log && exit 1) - make -j $(getconf _NPROCESSORS_ONLN)
-# We could run every arch on every versions, but it is a little
-# overkill. Instead we split jobs evenly across 9, 10 and sid
-# to achieve reasonable cross-coverage.
-
-debian-9-cross-armv6l:
-  <<: *job_definition
-  image: quay.io/libvirt/buildenv-libvirt-debian-9-cross-armv6l:latest
-
-debian-9-cross-mips64el:
-  <<: *job_definition
-  image: quay.io/libvirt/buildenv-libvirt-debian-9-cross-mips64el:latest
-
-debian-9-cross-mips:
-  <<: *job_definition
-  image: quay.io/libvirt/buildenv-libvirt-debian-9-cross-mips:latest
-
-debian-10-cross-aarch64:
-  <<: *job_definition
-  image: quay.io/libvirt/buildenv-libvirt-debian-10-cross-aarch64:latest
-
-debian-10-cross-ppc64le:
-  <<: *job_definition
-  image: quay.io/libvirt/buildenv-libvirt-debian-10-cross-ppc64le:latest
+# There are many possible cross-arch jobs we could do, but to preserve
+# limited CI resource time allocated to users, we cut it down to two
+# interesting variants. The default jobs are x86_64, which means 64-bit
+# and little endian. We thus pick armv7l as an interesting 32-bit
+# platform, and s390x as an interesting big endian platform. We split
+# between Debian 10 and sid to help detect problems on the horizon.
Will the sid actually be reliable? I've been trying to install sid into a VM with lcitool for quite a while and it was broken every time I tried. On the other hand, those container images are static, so we're going to run "older" container builds and not get the latest development stuff, but I guess that's fine for these purposes.
Yeah, right now we're insulated from any instability in sid. In the future
I'd like to dynamically generate containers on demand as part of the gitlab
job, at which point instability will be critical, but we can worry about it
when we get there.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
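A rough sketch of what such an on-demand container build stage might look
like, using the common docker-in-docker pattern; the stage name, Dockerfile
path and image tag are assumptions, and runner-specific DOCKER_* variables
may also be needed:

containers:
  stage: containers
  image: docker:stable
  services:
    - docker:dind
  script:
    # Build the Debian sid build-env image on demand so that sid
    # instability shows up in the pipeline itself; pushing it to the
    # project registry would additionally need a docker login using
    # the CI job token.
    - docker build -t $CI_REGISTRY_IMAGE/buildenv-debian-sid:latest -f ci/containers/debian-sid.Dockerfile .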

Within a stage all jobs run in parallel. Stages are ordered so later stages
are only executed if previous stages succeeded. By using separate stages for
the cross builds, we can avoid wasting CI resources if the relatively simple
website build fails. Later we can avoid running cross builds, if the native
build fails too.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 .gitlab-ci.yml | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index b6a8db7881..e28ec584ea 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -1,4 +1,9 @@
-.job_template: &job_definition
+stages:
+  - website
+  - cross_build
+
+.cross_build_job_template: &cross_build_job_definition
+  stage: cross_build
   script:
     - mkdir build
     - cd build
@@ -13,17 +18,18 @@
 # between Debian 10 and sid to help detect problems on the horizon.
 
 debian-10-cross-s390x:
-  <<: *job_definition
+  <<: *cross_build_job_definition
   image: quay.io/libvirt/buildenv-libvirt-debian-10-cross-s390x:latest
 
 debian-sid-cross-armv7l:
-  <<: *job_definition
+  <<: *cross_build_job_definition
   image: quay.io/libvirt/buildenv-libvirt-debian-sid-cross-armv7l:latest
 
 # This artifact published by this job is downloaded by libvirt.org to
 # be deployed to the web root:
 # https://gitlab.com/libvirt/libvirt/-/jobs/artifacts/master/download?job=webs...
 website:
+  stage: website
   script:
     - mkdir build
     - cd build

--
2.24.1

On Tue, Mar 10, 2020 at 10:09:43AM +0000, Daniel P. Berrangé wrote:
Within a stage all jobs run in parallel. Stages are ordered so later stages are only executed if previous stages succeeded. By using separate stages for the cross builds, we can avoid wasting CI resources if the relatively simple website build fails. Later we can avoid running cross builds, if the native build fails too.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- .gitlab-ci.yml | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
Reviewed-by: Erik Skultety <eskultet@redhat.com>

With GitLab CI aiming to replace Jenkins and Travis for CI purposes, we
need to expand the coverage to include native builds. This patch adds
all the jobs currently run in Travis. Compared to Jenkins we obviously
miss the FreeBSD jobs, but also Debian 10 and Fedora 30, but we gain the
Ubuntu 1804 job as a substitute for Debian.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 .gitlab-ci.yml | 41 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index e28ec584ea..3e15d08d17 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -1,7 +1,39 @@
 stages:
   - website
+  - native_build
   - cross_build
 
+
+.native_build_job_template: &native_build_job_definition
+  stage: native_build
+  script:
+    - mkdir build
+    - cd build
+    - ../autogen.sh $CONFIGURE_OPTS || (cat config.log && exit 1)
+    - make -j $(getconf _NPROCESSORS_ONLN) syntax-check
+    - make -j $(getconf _NPROCESSORS_ONLN) distcheck
+
+debian-9:
+  <<: *native_build_job_definition
+  image: quay.io/libvirt/buildenv-libvirt-debian-9:latest
+
+centos-7:
+  <<: *native_build_job_definition
+  image: quay.io/libvirt/buildenv-libvirt-centos-7:latest
+
+fedora-31:
+  <<: *native_build_job_definition
+  image: quay.io/libvirt/buildenv-libvirt-fedora-31:latest
+
+fedora-rawhide:
+  <<: *native_build_job_definition
+  image: quay.io/libvirt/buildenv-libvirt-fedora-rawhide:latest
+
+ubuntu-1804:
+  <<: *native_build_job_definition
+  image: quay.io/libvirt/buildenv-libvirt-ubuntu-1804:latest
+
+
 .cross_build_job_template: &cross_build_job_definition
   stage: cross_build
   script:
@@ -10,6 +42,7 @@ stages:
     - ../autogen.sh $CONFIGURE_OPTS || (cat config.log && exit 1)
     - make -j $(getconf _NPROCESSORS_ONLN)
 
+
 # There are many possible cross-arch jobs we could do, but to preserve
 # limited CI resource time allocated to users, we cut it down to two
 # interesting variants. The default jobs are x86_64, which means 64-bit
@@ -25,6 +58,14 @@ debian-sid-cross-armv7l:
   <<: *cross_build_job_definition
   image: quay.io/libvirt/buildenv-libvirt-debian-sid-cross-armv7l:latest
 
+fedora-30-cross-mingw32:
+  <<: *cross_build_job_definition
+  image: quay.io/libvirt/buildenv-libvirt-fedora-30-cross-mingw32:latest
+
+fedora-30-cross-mingw64:
+  <<: *cross_build_job_definition
+  image: quay.io/libvirt/buildenv-libvirt-fedora-30-cross-mingw64:latest
+
 # This artifact published by this job is downloaded by libvirt.org to
 # be deployed to the web root:
 # https://gitlab.com/libvirt/libvirt/-/jobs/artifacts/master/download?job=webs...

--
2.24.1

On Tue, Mar 10, 2020 at 10:09:44AM +0000, Daniel P. Berrangé wrote:
With GitLab CI aiming to replace Jenkins and Travis for CI purposes, we need to expand the coverage to include native builds. This patch adds all the jobs currently run in Travis. Compared to Jenkins we obviously miss the FreeBSD jobs, but also Debian 10 and Fedora 30, but we gain the Ubuntu 1804 job as a substitute for Debian.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- .gitlab-ci.yml | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+)
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index e28ec584ea..3e15d08d17 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -1,7 +1,39 @@
 stages:
   - website
+  - native_build
   - cross_build
 
+
+.native_build_job_template: &native_build_job_definition
+  stage: native_build
+  script:
+    - mkdir build
+    - cd build
+    - ../autogen.sh $CONFIGURE_OPTS || (cat config.log && exit 1)
+    - make -j $(getconf _NPROCESSORS_ONLN) syntax-check
+    - make -j $(getconf _NPROCESSORS_ONLN) distcheck
I think ^this should more closely follow what we have in the lcitool
playbooks, e.g. start with:

  - rm -rf build

Also, since I've been playing with migrating other machines to PSI for a
while, 'make' should be replaced with $MAKE, otherwise the native_build job
reference won't work on FreeBSD. Maybe even do make install to VIRT_PREFIX?

Otherwise looks good to me.

Reviewed-by: Erik Skultety <skultety.erik@gmail.com>
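As a rough illustration of the suggestion above (not the actual patch), the
template might end up looking something like this, assuming $MAKE is defined
in the build environment:

.native_build_job_template: &native_build_job_definition
  stage: native_build
  script:
    # Clean out any previous build directory before starting, since a
    # persistent (non-container) builder may still have one lying around
    - rm -rf build
    - mkdir build
    - cd build
    - ../autogen.sh $CONFIGURE_OPTS || (cat config.log && exit 1)
    # Use $MAKE so non-Linux builders can point it at gmake
    - $MAKE -j $(getconf _NPROCESSORS_ONLN) syntax-check
    - $MAKE -j $(getconf _NPROCESSORS_ONLN) distcheck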

On Fri, Mar 20, 2020 at 03:52:15PM +0100, Erik Skultety wrote:
On Tue, Mar 10, 2020 at 10:09:44AM +0000, Daniel P. Berrangé wrote:
With GitLab CI aiming to replace Jenkins and Travis for CI purposes, we need to expand the coverage to include native builds. This patch adds all the jobs currently run in Travis. Compared to Jenkins we obviously miss the FreeBSD jobs, but also Debian 10 and Fedora 30, but we gain the Ubuntu 1804 job as a substitute for Debian.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- .gitlab-ci.yml | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+)
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index e28ec584ea..3e15d08d17 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -1,7 +1,39 @@
 stages:
   - website
+  - native_build
   - cross_build
 
+
+.native_build_job_template: &native_build_job_definition
+  stage: native_build
+  script:
+    - mkdir build
+    - cd build
+    - ../autogen.sh $CONFIGURE_OPTS || (cat config.log && exit 1)
+    - make -j $(getconf _NPROCESSORS_ONLN) syntax-check
+    - make -j $(getconf _NPROCESSORS_ONLN) distcheck
I think ^this should more closely follow what we have in the lcitool playbooks, e.g. start with: - rm -rf build
The source tree is already pristine because this is always executed in a fresh container environment, so there's nothing that will need deleting.
Also, since I've been playing with migrating other machines to PSI for a while, 'make' should be replaced with $MAKE otherwise native_build job reference won't work on FreeBSD.
I'll need to check if $MAKE is actually set or not.
Maybe even do make install to VIRT_PREFIX?
'distcheck' does an install step. There's no shared install tree between jobs, so the VIRT_PREFIX concept isn't applicable.
Otherwise looks good to me. Reviewed-by: Erik Skultety <skultety.erik@gmail.com>
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Fri, Mar 20, 2020 at 02:59:36PM +0000, Daniel P. Berrangé wrote:
On Fri, Mar 20, 2020 at 03:52:15PM +0100, Erik Skultety wrote:
On Tue, Mar 10, 2020 at 10:09:44AM +0000, Daniel P. Berrangé wrote:
With GitLab CI aiming to replace Jenkins and Travis for CI purposes, we need to expand the coverage to include native builds. This patch adds all the jobs currently run in Travis. Compared to Jenkins we obviously miss the FreeBSD jobs, but also Debian 10 and Fedora 30, but we gain the Ubuntu 1804 job as a substitute for Debian.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- .gitlab-ci.yml | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+)
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index e28ec584ea..3e15d08d17 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -1,7 +1,39 @@
 stages:
   - website
+  - native_build
   - cross_build
 
+
+.native_build_job_template: &native_build_job_definition
+  stage: native_build
+  script:
+    - mkdir build
+    - cd build
+    - ../autogen.sh $CONFIGURE_OPTS || (cat config.log && exit 1)
+    - make -j $(getconf _NPROCESSORS_ONLN) syntax-check
+    - make -j $(getconf _NPROCESSORS_ONLN) distcheck
I think ^this should more closely follow what we have in the lcitool playbooks, e.g. start with: - rm -rf build
The source tree is already pristine because this is always executed in a fresh container environment, so there's nothing that will need deleting.
Right, but my point was that if e.g. we introduce a FreeBSD builder, we'd want to reference the same job template in which case the directory will exist.
Also, since I've been playing with migrating other machines to PSI for a while, 'make' should be replaced with $MAKE otherwise native_build job reference won't work on FreeBSD.
I'll need to check if $MAKE is actually set or not.
Huh, it's actually not...we need to fix that, but that is a patch for another day.
Maybe even do make install to VIRT_PREFIX?
'distcheck' does an install step. There's no shared install tree between jobs, so the VIRT_PREFIX concept isn't applicable.
Oh, didn't realize that.
Otherwise looks good to me. Reviewed-by: Erik Skultety <skultety.erik@gmail.com>
Damn, this should have been: Reviewed-by: Erik Skultety <eskultet@redhat.com> #combinedworkingenvironment

On Fri, Mar 20, 2020 at 04:18:58PM +0100, Erik Skultety wrote:
On Fri, Mar 20, 2020 at 02:59:36PM +0000, Daniel P. Berrangé wrote:
On Fri, Mar 20, 2020 at 03:52:15PM +0100, Erik Skultety wrote:
On Tue, Mar 10, 2020 at 10:09:44AM +0000, Daniel P. Berrangé wrote:
With GitLab CI aiming to replace Jenkins and Travis for CI purposes, we need to expand the coverage to include native builds. This patch adds all the jobs currently run in Travis. Compared to Jenkins we obviously miss the FreeBSD jobs, but also Debian 10 and Fedora 30, but we gain the Ubuntu 1804 job as a substitute for Debian.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- .gitlab-ci.yml | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+)
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index e28ec584ea..3e15d08d17 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -1,7 +1,39 @@
 stages:
   - website
+  - native_build
   - cross_build
 
+
+.native_build_job_template: &native_build_job_definition
+  stage: native_build
+  script:
+    - mkdir build
+    - cd build
+    - ../autogen.sh $CONFIGURE_OPTS || (cat config.log && exit 1)
+    - make -j $(getconf _NPROCESSORS_ONLN) syntax-check
+    - make -j $(getconf _NPROCESSORS_ONLN) distcheck
I think ^this should more closely follow what we have in the lcitool playbooks, e.g. start with: - rm -rf build
The source tree is already pristine because this is always executed in a fresh container environment, so there's nothing that will need deleting.
Right, but my point was that if e.g. we introduce a FreeBSD builder, we'd want to reference the same job template in which case the directory will exist.
I've not looked in great detail at how gitlab runners work, but something on
the runner needs to be responsible for checking out the git repo before any
of this gitlab-ci.yml script runs.

IMHO that can ensure it is "git clean -xdf" so that its state matches what we
see with the built-in shared runners build environment.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Fri, Mar 20, 2020 at 03:31:24PM +0000, Daniel P. Berrangé wrote:
On Fri, Mar 20, 2020 at 04:18:58PM +0100, Erik Skultety wrote:
On Fri, Mar 20, 2020 at 02:59:36PM +0000, Daniel P. Berrangé wrote:
On Fri, Mar 20, 2020 at 03:52:15PM +0100, Erik Skultety wrote:
On Tue, Mar 10, 2020 at 10:09:44AM +0000, Daniel P. Berrangé wrote:
With GitLab CI aiming to replace Jenkins and Travis for CI purposes, we need to expand the coverage to include native builds. This patch adds all the jobs currently run in Travis. Compared to Jenkins we obviously miss the FreeBSD jobs, but also Debian 10 and Fedora 30, but we gain the Ubuntu 1804 job as a substitute for Debian.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- .gitlab-ci.yml | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+)
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index e28ec584ea..3e15d08d17 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -1,7 +1,39 @@
 stages:
   - website
+  - native_build
   - cross_build
 
+
+.native_build_job_template: &native_build_job_definition
+  stage: native_build
+  script:
+    - mkdir build
+    - cd build
+    - ../autogen.sh $CONFIGURE_OPTS || (cat config.log && exit 1)
+    - make -j $(getconf _NPROCESSORS_ONLN) syntax-check
+    - make -j $(getconf _NPROCESSORS_ONLN) distcheck
I think ^this should more closely follow what we have in the lcitool playbooks, e.g. start with: - rm -rf build
The source tree is already pristine because this is always executed in a fresh container environment, so there's nothing that will need deleting.
Right, but my point was that if e.g. we introduce a FreeBSD builder, we'd want to reference the same job template in which case the directory will exist.
I've not looked in great detail at how gitlab runners work, but something on the runner needs to be responsible for checking out the git repo before any of this gitlab-ci.yml script runs.
IMHO that can ensure it is "git clean -xdf" so that its state matches what we see with the built-in shared runners build environment.
This is achieved by the GIT_CLEAN_FLAGS variable which, if unspecified,
defaults to -ffdx, so we're covered there.

One more thing that crossed my mind and may speed up the builds a bit - some
time ago I proposed a patch against libvirt-jenkins-ci to enable shallow
cloning, which was rejected for compelling reasons. However, for gitlab
purposes, we can't really do any debugging on the shared runners, so I
believe we may want to apply shallow cloning there, especially since libvirt
is quite a large repo. Gitlab claims [1] that the default strategy is 'fetch'
if the worktree can be found on the disk (this is when used with the 'docker'
executor, which should be the case with shared runners). What I couldn't find
is how long these worktrees are cached, which could therefore be a factor and
we might want to favour the shallow clones instead.

[1] https://docs.gitlab.com/ee/ci/large_repositories/#git-strategy

On Mon, Mar 23, 2020 at 12:50:45PM +0100, Erik Skultety wrote:
On Fri, Mar 20, 2020 at 03:31:24PM +0000, Daniel P. Berrangé wrote:
On Fri, Mar 20, 2020 at 04:18:58PM +0100, Erik Skultety wrote:
On Fri, Mar 20, 2020 at 02:59:36PM +0000, Daniel P. Berrangé wrote:
On Fri, Mar 20, 2020 at 03:52:15PM +0100, Erik Skultety wrote:
On Tue, Mar 10, 2020 at 10:09:44AM +0000, Daniel P. Berrangé wrote:
With GitLab CI aiming to replace Jenkins and Travis for CI purposes, we need to expand the coverage to include native builds. This patch adds all the jobs currently run in Travis. Compared to Jenkins we obviously miss the FreeBSD jobs, but also Debian 10 and Fedora 30, but we gain the Ubuntu 1804 job as a substitute for Debian.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- .gitlab-ci.yml | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+)
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index e28ec584ea..3e15d08d17 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -1,7 +1,39 @@
 stages:
   - website
+  - native_build
   - cross_build
 
+
+.native_build_job_template: &native_build_job_definition
+  stage: native_build
+  script:
+    - mkdir build
+    - cd build
+    - ../autogen.sh $CONFIGURE_OPTS || (cat config.log && exit 1)
+    - make -j $(getconf _NPROCESSORS_ONLN) syntax-check
+    - make -j $(getconf _NPROCESSORS_ONLN) distcheck
I think ^this should more closely follow what we have in the lcitool playbooks, e.g. start with: - rm -rf build
The source tree is already pristine because this is always executed in a fresh container environment, so there's nothing that will need deleting.
Right, but my point was that if e.g. we introduce a FreeBSD builder, we'd want to reference the same job template in which case the directory will exist.
I've not looked in great detail at how gitlab runners work, but something on the runner needs to be responsible for checking out the git repo before any of this gitlab-ci.yml script runs.
IMHO that can ensure it is "git clean -xdf" so that its state matches what we see with the built-in shared runners build environment.
This is achieved by the GIT_CLEAN_FLAGS which, if unspecified, defaults to -ffdx, so we're covered there.
One more thing that crossed my mind and may speedup the builds a bit - some time ago I proposed a patch against libvirt-jenkins-ci to enable shallow cloning which was rejected because of compelling reasons. However, for gitlab purposes, we can't really do any debugging on the shared runners, so I believe we may want to apply shallow cloning there, especially since libvirt is quite a large repo. Gitlab claims [1] the default strategy to be 'fetch' if the worktree can be found on the disk (this is when used with the 'docker' executor which should be the case with shared runners). What I couldn't find is how long are these worktrees cached which could therefore be a factor and we might want to favour the shallow clones instead.
[1] https://docs.gitlab.com/ee/ci/large_repositories/#git-strategy
I wonder if GitLab has the same problem with shallow cloning as we hit
before. I can't remember if it was with Travis or Jenkins now, but there was
a race where it would schedule the job, but before the git repo was checked
out someone would push another series. The shallow clone would thus get the
commit of the newly pushed series, and would not have enough history to get
to the commit that actually needed testing.

I guess we could get a win even if we set the clone depth to something fairly
large like 100 commits of history.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
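For illustration, GitLab exposes the clone depth to jobs via the GIT_DEPTH
variable, so the idea above could be expressed roughly as follows (the value
of 100 is just the figure floated in the discussion, not something the series
sets):

variables:
  # Shallow clone, but keep enough history to cope with pushes that land
  # between a job being scheduled and the repo being checked out
  GIT_DEPTH: 100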

On Mon, Mar 23, 2020 at 12:01:49PM +0000, Daniel P. Berrangé wrote:
On Mon, Mar 23, 2020 at 12:50:45PM +0100, Erik Skultety wrote:
On Fri, Mar 20, 2020 at 03:31:24PM +0000, Daniel P. Berrangé wrote:
On Fri, Mar 20, 2020 at 04:18:58PM +0100, Erik Skultety wrote:
On Fri, Mar 20, 2020 at 02:59:36PM +0000, Daniel P. Berrangé wrote:
On Fri, Mar 20, 2020 at 03:52:15PM +0100, Erik Skultety wrote:
On Tue, Mar 10, 2020 at 10:09:44AM +0000, Daniel P. Berrangé wrote:
> With GitLab CI aiming to replace Jenkins and Travis for CI purposes, we
> need to expand the coverage to include native builds. This patch adds
> all the jobs currently run in Travis. Compared to Jenkins we obviously
> miss the FreeBSD jobs, but also Debian 10 and Fedora 30, but we gain the
> Ubuntu 1804 job as a substitute for Debian.
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  .gitlab-ci.yml | 41 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 41 insertions(+)
>
> diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
> index e28ec584ea..3e15d08d17 100644
> --- a/.gitlab-ci.yml
> +++ b/.gitlab-ci.yml
> @@ -1,7 +1,39 @@
>  stages:
>    - website
> +  - native_build
>    - cross_build
>
> +
> +.native_build_job_template: &native_build_job_definition
> +  stage: native_build
> +  script:
> +    - mkdir build
> +    - cd build
> +    - ../autogen.sh $CONFIGURE_OPTS || (cat config.log && exit 1)
> +    - make -j $(getconf _NPROCESSORS_ONLN) syntax-check
> +    - make -j $(getconf _NPROCESSORS_ONLN) distcheck
I think ^this should more closely follow what we have in the lcitool playbooks, e.g. start with: - rm -rf build
The source tree is already pristine because this is always executed in a fresh container environment, so there's nothing that will need deleting.
Right, but my point was that if e.g. we introduce a FreeBSD builder, we'd want to reference the same job template in which case the directory will exist.
I've not looked in great detail at how gitlab runners work, but something on the runner needs to be responsible for checking out the git repo before any of this gitlab-ci.yml script runs.
IMHO that can ensure it is "git clean -xdf" so that its state matches what we see with the built-in shared runners build environment.
This is achieved by the GIT_CLEAN_FLAGS which, if unspecified, defaults to -ffdx, so we're covered there.
One more thing that crossed my mind and may speedup the builds a bit - some time ago I proposed a patch against libvirt-jenkins-ci to enable shallow cloning which was rejected because of compelling reasons. However, for gitlab purposes, we can't really do any debugging on the shared runners, so I believe we may want to apply shallow cloning there, especially since libvirt is quite a large repo. Gitlab claims [1] the default strategy to be 'fetch' if the worktree can be found on the disk (this is when used with the 'docker' executor which should be the case with shared runners). What I couldn't find is how long are these worktrees cached which could therefore be a factor and we might want to favour the shallow clones instead.
[1] https://docs.gitlab.com/ee/ci/large_repositories/#git-strategy
I wonder if GitLab has the same problem with shallow cloning as we hit before. I can't remember if it was with Travis or Jenkins now, but there was a race where it would schedule the job, but before the git repo was checked out someone would push another series. The shallow clone would thus get the commit of the newly pushed series, and would not have enough history to get to the commit that actually needed testing.
I guess we could get a win even if we set the clone depth to something fairly large like 100 commits of history.
Google didn't return anything relevant in this regard, except for a few
resources:

[1] https://docs.gitlab.com/ce/ci/pipelines/settings.html#git-shallow-clone
    Although ^this doesn't apply to libvirt

[2] https://gitlab.com/gitlab-org/gitlab-runner/issues/3460
    ^This I believe would be the long term solution, but I wouldn't hold my
    breath

I'm okay with anything shorter than the full history, branches and tags.

On Tue, 2020-03-10 at 10:09 +0000, Daniel P. Berrangé wrote:
With GitLab CI aiming to replace Jenkins and Travis for CI purposes, we need to expand the coverage to include native builds. This patch adds all the jobs currently run in Travis. Compared to Jenkins we obviously miss the FreeBSD jobs, but also Debian 10 and Fedora 30, but we gain the Ubuntu 1804 job as a substitute for Debian.
Once again, two questions:

* why are we replicating the jobs that already exist in Travis CI? I get it
  that we eventually want to move everything over to GitLab CI, but for the
  time being it looks like we're not really gaining anything from culling
  jobs - we're just significantly reducing our test coverage;

* what's the endgame here? Are we going to rely solely on the runners that
  GitLab provides for free, or are we going to plug our own runners into the
  infrastructure? Because the former would severely limit our test coverage,
  and bring it down to a much, much worse state than what we currently have.

--
Andrea Bolognani / Red Hat / Virtualization

On Fri, Mar 20, 2020 at 06:13:13PM +0100, Andrea Bolognani wrote:
On Tue, 2020-03-10 at 10:09 +0000, Daniel P. Berrangé wrote:
With GitLab CI aiming to replace Jenkins and Travis for CI purposes, we need to expand the coverage to include native builds. This patch adds all the jobs currently run in Travis. Compared to Jenkins we obviously miss the FreeBSD jobs, but also Debian 10 and Fedora 30, but we gain the Ubuntu 1804 job as a substitute for Debian.
Once again, two questions:
* why are we replicating the jobs that already exist in Travis CI? I get it that we eventually want to move everything over to GitLab CI, but for the time being it looks like we're not really gaining anything from culling jobs - we're just significantly reducing our test coverage;
* what's the endgame here? Are we going to rely solely on the runners that GitLab provides for free, or are we going to plug our own runners into the infrastructure? Because the former would severely limit our test coverage, and bring it down to a much, much worse state than what we currently have.
The goal is to consolidate everything onto GitLab CI so that we have a single
source of truth for people to look at to determine build status. IOW, the end
game is to eliminate both Jenkins and Travis.

The only technical gaps that GitLab has compared to our current platforms are
FreeBSD and macOS. Linux coverage is fine since it can run arbitrary
containers. For FreeBSD we have virtual hardware we can use to provide custom
runners for GitLab to enable use of VMs with FreeBSD installs.

I'm not proposing to turn off the Linux Jenkins jobs just yet, as the
sub-package chain builds still rely on that. The sub-packages will all need
wiring up into GitLab CI too. Ultimately though there's nothing Jenkins does
that we can't replicate in GitLab.

We don't have an immediate solution for macOS due to the licensing issues
around use of macOS, so Travis will unfortunately still be needed for that
platform, but the remainder of jobs are not required. It might be possible to
wire up a GitLab job that pushes the changes over to Travis, and pulls its
build result back for macOS, so we still have a single viewing point, even if
we still use Travis.

This series has cut down on the number of cross-build jobs we're running, but
my experience of looking at the results we've seen with them is that they've
almost never identified a bug we didn't already see with one of the other
jobs. So I think it is reasonable to cut cross-build jobs down to focus on
the important unique aspects - a 32-bit build and a big endian build.

Overall we're not losing any notable CI coverage with this series, and we'll
gain an improved view of our results.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Fri, 2020-03-20 at 17:44 +0000, Daniel P. Berrangé wrote:
On Fri, Mar 20, 2020 at 06:13:13PM +0100, Andrea Bolognani wrote:
On Tue, 2020-03-10 at 10:09 +0000, Daniel P. Berrangé wrote:
With GitLab CI aiming to replace Jenkins and Travis for CI purposes, we need to expand the coverage to include native builds. This patch adds all the jobs currently run in Travis. Compared to Jenkins we obviously miss the FreeBSD jobs, but also Debian 10 and Fedora 30, but we gain the Ubuntu 1804 job as a substitute for Debian.
Once again, two questions:
* why are we replicating the jobs that already exist in Travis CI? I get it that we eventually want to move everything over to GitLab CI, but for the time being it looks like we're not really gaining anything from culling jobs - we're just significantly reducing our test coverage;
* what's the endgame here? Are we going to rely solely on the runners that GitLab provides for free, or are we going to plug our own runners into the infrastructure? Because the former would severely limit our test coverage, and bring it down to a much, much worse state than what we currently have.
The goal is to consolidate everything onto GitLab CI so that we have a single source of truth for people to look at to determine build status. IOW, the end game is to eliminate both Jenkins and Travis.
The only technical gaps that GitLab has compared to our current platforms are FreeBSD and macOS. Linux coverage is fine since it can run arbitrary containers.
But we are limited in the number of containers we can run for each build.
For FreeBSD we have virtual hardware we can use to provide custom runners for GitLab to enable use of VMs with FreeBSD installs.
Since we're going to need a FreeBSD machine configured as a GitLab runner, wouldn't it make sense for us to also create one or more Linux machines that would run jobs on behalf of GitLab CI? That way we'd remove the limitation on the number of containers we can have running at the same time, and still have everything controlled by GitLab.
I'm not proposing to turn off the Linux Jenkins jobs just yet, as the sub-package chain builds still rely on that. The sub-packages will all need wiring up into GitLab CI too. Ultimately though there's nothing Jenkins does that we can't replicate in GitLab.
We don't have an immediate solution for macOS due to the licensing issues around use of macOS, so Travis will unfortunately still be needed for that platform, but the remainder of jobs are not required. It might be possible to wire up a GitLab job that pushes the changes over to Travis, and pulls its build result back for macOS, so we still have a single viewing point, even if we still use Travis.
That sounds like it would get very hairy, very quickly. But it would surely be nice, and an improvement over our current status quo, to have a consolidated view of all build jobs.
This series has cut down on the number of cross-build jobs we're running, but my experience of looking at the results we've seen with them is that they've almost never identified a bug we didn't already see with one of the other jobs. So I think it is reasonable to cut cross-build jobs down to focus on the important unique aspects - a 32-bit build and a big endian build.
Overall we're not losing any notable CI coverage with this series, and we'll gain an improved view of our results.
While, for example, Ubuntu 18.04 is an acceptable enough stand-in for Debian 10, I'd really much rather test on the real thing. I'm okay with this kind of cutback in the short term, but only if the long term plan is to restore (and hopefully increase!) coverage, and I don't see how that can be achieved without bringing up our own Linux runners. -- Andrea Bolognani / Red Hat / Virtualization

On Mon, Mar 23, 2020 at 04:00:17PM +0100, Andrea Bolognani wrote:
On Fri, 2020-03-20 at 17:44 +0000, Daniel P. Berrangé wrote:
On Fri, Mar 20, 2020 at 06:13:13PM +0100, Andrea Bolognani wrote:
On Tue, 2020-03-10 at 10:09 +0000, Daniel P. Berrangé wrote:
With GitLab CI aiming to replace Jenkins and Travis for CI purposes, we need to expand the coverage to include native builds. This patch adds all the jobs currently run in Travis. Compared to Jenkins we obviously miss the FreeBSD jobs, but also Debian 10 and Fedora 30, but we gain the Ubuntu 1804 job as a substitute for Debian.
Once again, two questions:
* why are we replicating the jobs that already exist in Travis CI? I get it that we eventually want to move everything over to GitLab CI, but for the time being it looks like we're not really gaining anything from culling jobs - we're just significantly reducing our test coverage;
* what's the endgame here? Are we going to rely solely on the runners that GitLab provides for free, or are we going to plug our own runners into the infrastructure? Because the former would severely limit our test coverage, and bring it down to a much, much worse state than what we currently have.
The goal is to consolidate everything onto GitLab CI so that we have a single source of truth for people to look at to determine build status. IOW, the end game is to eliminate both Jenkins and Travis.
The only technical gaps that GitLab has compared to our current platforms are FreeBSD and macOS. Linux coverage is fine since it can run arbitrary containers.
But we are limited in the number of containers we can run for each build.
For FreeBSD we have virtual hardware we can use to provide custom runners for GitLab to enable use of VMs with FreeBSD installs.
Since we're going to need a FreeBSD machine configured as a GitLab runner, wouldn't it make sense for us to also create one or more Linux machines that would run jobs on behalf of GitLab CI? That way we'd remove the limitation on the number of containers we can have running at the same time, and still have everything controlled by GitLab.
It is a bit more complicated. Custom GitLab runners are associated with a
project (and its git repos). Those runners will be used for any CI jobs run
in the context of repositories owned by that project. Obviously this covers
post-merge build jobs, or regularly scheduled build jobs.

It gets more interesting with merge requests though, because in a normal
project, the merge request is not being submitted from a branch on the main
repo, instead it is submitted from a branch on the user's forked repo copy.
As such the CI jobs for the merge request run in the context of the forked
repo and do not have access to the custom runners owned by the project.

This is an intentional security restriction to prevent denial of service by
arbitrary users submitting (malicious) pull requests that trigger builds on
3rd party runners.

Thus, we want to maximise the amount of build coverage we get on the shared
runners, as this is what merge requests will be using, and what contributors
will be using for tests before submitting code for review.

We'll have custom runners for providing coverage of platforms where
containers aren't viable (non-Linux), as well as providing additional
capacity. We can selectively grant access to the custom runners to regular
contributors (whom we trust and who would benefit from broader access), but
we cannot make them freely available to everyone by default.

There are gitlab RFEs about making custom runners more flexible, while still
avoiding the security issues. For example, by allowing a project maintainer
to explicitly approve CI start for a merge request. No ETA for this though.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Mon, 2020-03-23 at 15:16 +0000, Daniel P. Berrangé wrote:
On Mon, Mar 23, 2020 at 04:00:17PM +0100, Andrea Bolognani wrote:
Since we're going to need a FreeBSD machine configured as a GitLab runner, wouldn't it make sense for us to also create one or more Linux machines that would run jobs on behalf of GitLab CI? That way we'd remove the limitation on the number of containers we can have running at the same time, and still have everything controlled by GitLab.
It is a bit more complicated. Custom GitLab runners are associated with a project (and its git repos). Those runners will be used for any CI jobs run in the context of repositories owned by that project. Obviously this covers post-merge build jobs, or regularly scheduled build jobs.
It gets more interesting with merge requests though, because in a normal project the merge request is not submitted from a branch on the main repo; instead it is submitted from a branch on the user's forked repo copy. As such, the CI jobs for the merge request run in the context of the forked repo and do not have access to the custom runners owned by the project.
Where is this documented? I've been looking through the gitlab-runner documentation but I haven't been able to find anything about this.
This is an intentional security restriction to prevent denial of service attacks from arbitrary users submitting (malicious) pull requests that trigger builds on 3rd party runners.
It should be possible to prevent a DoS by configuring a limit on the number of concurrent jobs that are allowed for the runner (see [1]), so this looks like a moot point. As for running the build jobs securely, the Docker executor should take care of that when it comes to Linux; for FreeBSD, I think we'd have to use a custom executor that drops privileges instead, because Docker is not a possibility there.
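Just to make the suggestion concrete, a minimal config.toml sketch of the kind of per-runner limit I have in mind - the runner name, token and numbers are placeholders, not a real configuration:

  concurrent = 4        # global cap on jobs this gitlab-runner host will run

  [[runners]]
    name = "libvirt-linux-1"          # hypothetical runner name
    url = "https://gitlab.com/"
    token = "REDACTED"
    executor = "docker"
    limit = 2                         # at most 2 concurrent jobs for this runner
    [runners.docker]
      image = "quay.io/libvirt/buildenv-libvirt-fedora-31:latest"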
Thus, we want to maximise the amount of build coverage we get on the shared runners, as this is what merge requests will be using, and what contributors will be using for tests before submitting code for review.
We'll have custom runners for providing coverage of platforms where containers aren't viable (non-Linux), as well as providing additional capacity.
I don't understand how this would work. If the CI configuration contains say 20 jobs, 18 of which are to be run on Linux and the remaining two are for FreeBSD, and the shared runners are limited to 10 jobs per build, how would GitLab decide which 10 jobs to ignore when testing a merge request?
We can selectively grant access to the custom runners to regular contributors (whom we trust and who would benefit from broader access), but we cannot make them freely available to everyone by default.
Can you please point me to the relevant documentation?
There are GitLab RFEs about making custom runners more flexible while still avoiding the security issues, for example by allowing a project maintainer to explicitly approve CI start for a merge request. No ETA for this though.
Do you have links to these RFEs handy? [1] https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the... -- Andrea Bolognani / Red Hat / Virtualization

On Tue, Mar 24, 2020 at 04:41:20PM +0100, Andrea Bolognani wrote:
On Mon, 2020-03-23 at 15:16 +0000, Daniel P. Berrangé wrote:
On Mon, Mar 23, 2020 at 04:00:17PM +0100, Andrea Bolognani wrote:
Since we're going to need a FreeBSD machine configured as a GitLab runner, wouldn't it make sense for us to also create one or more Linux machines that would run jobs on behalf of GitLab CI? That way we'd remove the limitation on the number of containers we can have running at the same time, and still have everything controlled by GitLab.
It is a bit more complicated. Custom GitLab runners are associated with a project (and its git repos). Those runners will be used for any CI jobs run in the context of repositories owned by that project. Obviously this covers post-merge build jobs, or regularly scheduled build jobs.
It gets more interesting with merge requests though, because in a normal project the merge request is not submitted from a branch on the main repo; instead it is submitted from a branch on the user's forked repo copy. As such, the CI jobs for the merge request run in the context of the forked repo and do not have access to the custom runners owned by the project.
Where is this documented? I've been looking through the gitlab-runner documentation but I haven't been able to find anything about this.
It isn't clearly documented anywhere, but you can see it in practice quite easily. Fork a project, and open a merge request. You'll see the pipelines are reported against your fork, and not against the original repo. Your fork won't have access to anything except the standard shared runners.
This is an intentional security restriction to prevent denial of service attacks from arbitrary users submitting (malicious) pull requests that trigger builds on 3rd party runners.
It should be possible to prevent a DoS by configuring a limit on the number of concurrent jobs that are allowed for the runner (see [1]), so this looks like a moot point.
As for running the build jobs securely, the Docker executor should take care of that when it comes to Linux; for FreeBSD, I think we'd have to use a custom executor that drops privileges instead, because Docker is not a possibility there.
That's all fine, but there's nothing we can do about it. This is not a restriction we have the ability to relax, as it is policy set by the GitLab CI infra on runner usage.
Thus, we want to maximise the amount of build coverage we get on the shared runners, as this is what merge requests will be using, and what contributors will be using for tests before submitting code for review.
We'll have custom runners for providing coverage of platforms where containers aren't viable (non-Linux), as well as providing additional capacity.
I don't understand how this would work. If the CI configuration contains say 20 jobs, 18 of which are to be run on Linux and the remaining two are for FreeBSD, and the shared runners are limited to 10 jobs per build, how would GitLab decide which 10 jobs to ignore when testing a merge request?
I don't know what settings you're referring to, but there's no hard limit on total job count on the shared runners. There is some limit on the concurrent jobs for fairness between users, so further jobs merely have to wait for the previous jobs to complete, just as we see with Travis. https://docs.gitlab.com/ee/ci/runners/README.html#how-shared-runners-pick-jo...
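To make the shared vs custom split concrete, a sketch only - the 'freebsd' tag and job name are assumptions, nothing like this exists in our config today:

  # Jobs without 'tags' continue to run on the shared Linux runners.
  # A job tagged 'freebsd' is only picked up by a runner registered
  # with a matching tag, i.e. our custom FreeBSD machine.
  x86_64-freebsd-12:
    script:
      - mkdir build
      - cd build
      - ../autogen.sh || (cat config.log && exit 1)
      - gmake -j $(getconf NPROCESSORS_ONLN)
    tags:
      - freebsd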
We can selectively grant access to the custom runners to regular contributors (whom we trust and who would benefit from broader access), but we cannot make them freely available to everyone by default.
Can you please point me to the relevant documentation?
There are some docs here, but they're not comprehensive: https://docs.gitlab.com/ee/ci/runners/README.html
There are GitLab RFEs about making custom runners more flexible while still avoiding the security issues, for example by allowing a project maintainer to explicitly approve CI start for a merge request. No ETA for this though.
Do you have links to these RFEs handy?
I don't have the links handy any more, as this was a few weeks back, and anyway we need to set things up with what exists today, not what might exist in future. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

The pipeline UI will truncate the names of jobs after about 15 characters. As a result with the cross-builds, we truncate the most important part of the job name. Putting the most important part first is robust against truncation.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 .gitlab-ci.yml | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 3e15d08d17..3254ec4d4f 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -50,19 +50,19 @@ ubuntu-1804:
 # platform, and s390x as an interesting big endian platform. We split
 # between Debian 10 and sid to help detect problems on the horizon.

-debian-10-cross-s390x:
+s390x-cross-debian-10:
   <<: *cross_build_job_definition
   image: quay.io/libvirt/buildenv-libvirt-debian-10-cross-s390x:latest

-debian-sid-cross-armv7l:
+armv7l-cross-debian-sid:
   <<: *cross_build_job_definition
   image: quay.io/libvirt/buildenv-libvirt-debian-sid-cross-armv7l:latest

-fedora-30-cross-mingw32:
+mingw32-cross-fedora-30:
   <<: *cross_build_job_definition
   image: quay.io/libvirt/buildenv-libvirt-fedora-30-cross-mingw32:latest

-fedora-30-cross-mingw64:
+mingw64-cross-fedora-30:
   <<: *cross_build_job_definition
   image: quay.io/libvirt/buildenv-libvirt-fedora-30-cross-mingw64:latest
--
2.24.1

On Tue, Mar 10, 2020 at 10:09:45AM +0000, Daniel P. Berrangé wrote:
The pipeline UI will truncate the names of jobs after about 15 characters. As a result with the cross-builds, we truncate the most important part of the job name. Putting the most important part first is robust against truncation.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 .gitlab-ci.yml | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 3e15d08d17..3254ec4d4f 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -50,19 +50,19 @@ ubuntu-1804:
 # platform, and s390x as an interesting big endian platform. We split
 # between Debian 10 and sid to help detect problems on the horizon.

-debian-10-cross-s390x:
+s390x-cross-debian-10:
   <<: *cross_build_job_definition
   image: quay.io/libvirt/buildenv-libvirt-debian-10-cross-s390x:latest

-debian-sid-cross-armv7l:
+armv7l-cross-debian-sid:
   <<: *cross_build_job_definition
   image: quay.io/libvirt/buildenv-libvirt-debian-sid-cross-armv7l:latest

-fedora-30-cross-mingw32:
+mingw32-cross-fedora-30:
   <<: *cross_build_job_definition
   image: quay.io/libvirt/buildenv-libvirt-fedora-30-cross-mingw32:latest

-fedora-30-cross-mingw64:
+mingw64-cross-fedora-30:
   <<: *cross_build_job_definition
   image: quay.io/libvirt/buildenv-libvirt-fedora-30-cross-mingw64:latest
Alternatively, we could drop the -cross- part completely, since it can be deduced from the job template being pulled in (see the sketch below).

Reviewed-by: Erik Skultety <eskultet@redhat.com>
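Something along these lines, purely as an illustration and not part of the patch:

  s390x-debian-10:
    <<: *cross_build_job_definition
    image: quay.io/libvirt/buildenv-libvirt-debian-10-cross-s390x:latest

  mingw32-fedora-30:
    <<: *cross_build_job_definition
    image: quay.io/libvirt/buildenv-libvirt-fedora-30-cross-mingw32:latest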

ping

On Tue, Mar 10, 2020 at 10:09:40AM +0000, Daniel P. Berrangé wrote:
There are two main goals with this series
- Introduce a minimal job building the website and publishing an artifact which can be deployed onto libvirt.org - Expanding CI jobs to get coverage closer to Travis/Jenkins
Previous posting lost last two mails due to transient SMTP problem
Daniel P. Berrangé (5): gitlab: use CI for building website contents gitlab: reduce number of cross-build CI jobs gitlab: group jobs into stages gitlab: add several native CI jobs gitlab: rename the cross build jobs
.gitlab-ci.yml | 105 +++++++++++++++++++++++++++++++++++-------------- 1 file changed, 75 insertions(+), 30 deletions(-)
-- 2.24.1
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|