[libvirt PATCH 0/4] RFE: ci: A couple of minor improvements

First of all, this builds on top of [1]. As I was testing some builds in local containers I noticed that when one does: $ make -C ci ci-list-images it will return images that we no longer support in lcitool nor in libvirt, but one can still pull them and run a build in those. So I created a Python script that compares the current registry with what we support in lcitool and if the= re images of distro we no longer support, it will flag those (see below). Now, I wonder where the proper place to run this would be, originally I creat= ed for our GitLab containers and ran it during the sanity-check phase [2]. But having it marked as a "soft" failure didn't seem correct. So I'm proposing to run this within the context of the "refresh" script. I also pondered whether I should go ahead and rewrite the refresh script to Python as well (like this series does for list-images.sh), but I actually like how straight forward the refresh script is, unlike list-images, it would not look better in Python for sure. The 3 resulting script hierarchy isn't perfec= t, but I'd like to hear some comments first. [1] https://listman.redhat.com/archives/libvir-list/2021-February/msg00636.ht= ml [2] https://gitlab.com/eskultety/libvirt/-/pipelines/253570058 EXAMPLE OUTPUT OF THE REGISTRY CHECKER: The following images can be purged from the registry: libvirt/libvirt/ci-debian-9 libvirt/libvirt/ci-debian-9-cross-aarch64 libvirt/libvirt/ci-debian-9-cross-mipsel libvirt/libvirt/ci-debian-9-cross-armv7l libvirt/libvirt/ci-debian-9-cross-armv6l libvirt/libvirt/ci-debian-9-cross-mips libvirt/libvirt/ci-debian-9-cross-ppc64le libvirt/libvirt/ci-debian-9-cross-s390x libvirt/libvirt/ci-debian-9-cross-mips64el libvirt/libvirt/ci-fedora-31 libvirt/libvirt/ci-opensuse-151 You can remove the above images over the API with the following code snippet: $ for image_id in 1154661 1154667 1154669 1154671 1154676 1154678 1154682 11= 54683 1154686 1154687 1154724 \ ;do \ curl --request DELETE --header "PRIVATE-TOKEN: <access_token>" \ https://gitlab.com/api/v4/projects/192693/registry/repositories/$image_id \ ;done Erik Skultety (4): ci: Makefile: Specify a help target to replace ci-help ci: Rewrite list-images from Bash to Python ci: list-images: Split some generic logic to a util module ci: Introduce a new checker script 'check-registry.py' ci/Makefile | 12 ++--- ci/containers/check-registry.py | 96 +++++++++++++++++++++++++++++++++ ci/containers/refresh | 5 ++ ci/containers/util.py | 25 +++++++++ ci/list-images.py | 32 +++++++++++ 5 files changed, 161 insertions(+), 9 deletions(-) create mode 100644 ci/containers/check-registry.py create mode 100644 ci/containers/util.py create mode 100644 ci/list-images.py --=20 2.29.2

It's quite pointless to have a 'ci-help' target in the Makefile when one needs to actually open the Makefile to go through the list of targets to know what functionality the Makefile actually provides. It's much more intuitive to run "make help" in that case. Therefore, add a 'help' target and replace the old 'ci-help' target with it. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- ci/Makefile | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/ci/Makefile b/ci/Makefile index 7938e14c15..1f71701047 100644 --- a/ci/Makefile +++ b/ci/Makefile @@ -226,7 +226,7 @@ ci-list-images: @sh list-images.sh "$(CI_IMAGE_PREFIX)" | grep cross @echo -ci-help: +help: @echo "Build libvirt inside containers used for CI" @echo @echo "Available targets:" @@ -235,7 +235,7 @@ ci-help: @echo " ci-check@\$$IMAGE - run a 'ninja test'" @echo " ci-shell@\$$IMAGE - run an interactive shell" @echo " ci-list-images - list available images" - @echo " ci-help - show this help message" + @echo " help - show this help message" @echo @echo "Available make variables:" @echo -- 2.29.2

On Wed, 2021-02-10 at 18:24 +0100, Erik Skultety wrote:
It's quite pointless to have a 'ci-help' target in the Makefile when one needs to actually open the Makefile to go through the list of targets to know what functionality the Makefile actually provides. It's much more intuitive to run "make help" in that case. Therefore, add a 'help' target and replace the old 'ci-help' target with it.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- ci/Makefile | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
Reviewed-by: Andrea Bolognani <abologna@redhat.com> -- Andrea Bolognani / Red Hat / Virtualization

Convert the script to some human readable form. Speed-wise, both are comparable. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- ci/Makefile | 8 +------- ci/list-images.py | 29 +++++++++++++++++++++++++++++ 2 files changed, 30 insertions(+), 7 deletions(-) create mode 100644 ci/list-images.py diff --git a/ci/Makefile b/ci/Makefile index 1f71701047..92032e8ca0 100644 --- a/ci/Makefile +++ b/ci/Makefile @@ -217,13 +217,7 @@ ci-test@%: ci-list-images: @echo - @echo "Available x86 container images:" - @echo - @sh list-images.sh "$(CI_IMAGE_PREFIX)" | grep -v cross - @echo - @echo "Available cross-compiler container images:" - @echo - @sh list-images.sh "$(CI_IMAGE_PREFIX)" | grep cross + @python3 list-images.py @echo help: diff --git a/ci/list-images.py b/ci/list-images.py new file mode 100644 index 0000000000..6862ac347f --- /dev/null +++ b/ci/list-images.py @@ -0,0 +1,29 @@ +#!/usr/bin/env python3 + +import json +import urllib.request as urllib + +PROJECT_ID = 192693 +DOMAIN = "https://gitlab.com" +API = "api/v4" + +uri = f"{DOMAIN}/{API}/projects/{PROJECT_ID}/registry/repositories" +r = urllib.urlopen(uri + "?per_page=100") + +# read the HTTP response and load the JSON part of it +data = json.loads(r.read().decode()) + +# skip the "ci-" prefix each of our container images' name has +names = [i["name"][3:] for i in data] +names.sort() + +names_native = [name for name in names if "cross" not in name] +names_cross = [name for name in names if "cross" in name] + +print("Available x86 container images:\n") +print("\t" + "\n\t".join(names_native)) + +if names_cross: + print() + print("Available cross-compiler container images:\n") + print("\t" + "\n\t".join(names_cross)) -- 2.29.2

On Wed, 2021-02-10 at 18:24 +0100, Erik Skultety wrote:
+# skip the "ci-" prefix each of our container images' name has +names = [i["name"][3:] for i in data]
I would prefer something like PREFIX = "ci-" names = [i["name"][len(PREFIX):] for i in data] here, rather than hardcoding the length of the string.
+names_native = [name for name in names if "cross" not in name] +names_cross = [name for name in names if "cross" in name] + +print("Available x86 container images:\n") +print("\t" + "\n\t".join(names_native))
Please use two or four spaces instead of tabs. More high-level feedback about the patch: * you should drop the sh implementation at the same time as you're adding the python one; * adding this standalone script in one patch only to refactor most of its code away to a separate module in the next one leads to unnecessary churn and more work for the reviewer: please just add the script in its intended form in the first place. -- Andrea Bolognani / Red Hat / Virtualization

On Tue, Feb 16, 2021 at 02:40:32PM +0100, Andrea Bolognani wrote:
On Wed, 2021-02-10 at 18:24 +0100, Erik Skultety wrote:
+# skip the "ci-" prefix each of our container images' name has +names = [i["name"][3:] for i in data]
I would prefer something like
PREFIX = "ci-" names = [i["name"][len(PREFIX):] for i in data]
here, rather than hardcoding the length of the string.
+names_native = [name for name in names if "cross" not in name] +names_cross = [name for name in names if "cross" in name] + +print("Available x86 container images:\n") +print("\t" + "\n\t".join(names_native))
Please use two or four spaces instead of tabs.
Sure, will do.
More high-level feedback about the patch:
* you should drop the sh implementation at the same time as you're adding the python one;
* adding this standalone script in one patch only to refactor most of its code away to a separate module in the next one leads to unnecessary churn and more work for the reviewer: please just add the script in its intended form in the first place.
As you wish, I approached the way I'd appreciate it - introduce a drop 1:1 replacement and then change individual bits later...oh well, all of us have different taste. Thanks, Erik

Parts of the code can be reused by another registry checker script introduced in the next patch. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- ci/containers/util.py | 25 +++++++++++++++++++++++++ ci/list-images.py | 43 +++++++++++++++++++++++-------------------- 2 files changed, 48 insertions(+), 20 deletions(-) create mode 100644 ci/containers/util.py diff --git a/ci/containers/util.py b/ci/containers/util.py new file mode 100644 index 0000000000..1ce326bba2 --- /dev/null +++ b/ci/containers/util.py @@ -0,0 +1,25 @@ +import json +import urllib.request as urllib + + +PROJECT_ID = 192693 + + +def get_registry_uri(project_id, + base_uri="https://gitlab.com", + api_version=4): + uri = f"{base_uri}/api/v{str(api_version)}/projects/{project_id}/registry/repositories" + return uri + + +def list_images(uri): + """ + Returns all container images as currently available for the given GitLab + project. + + ret: list of container image names + """ + r = urllib.urlopen(uri) + + # read the HTTP response and load the JSON part of it + return json.loads(r.read().decode()) diff --git a/ci/list-images.py b/ci/list-images.py index 6862ac347f..1a1d87c10b 100644 --- a/ci/list-images.py +++ b/ci/list-images.py @@ -1,29 +1,32 @@ #!/usr/bin/env python3 -import json -import urllib.request as urllib +import containers.util as util -PROJECT_ID = 192693 -DOMAIN = "https://gitlab.com" -API = "api/v4" -uri = f"{DOMAIN}/{API}/projects/{PROJECT_ID}/registry/repositories" -r = urllib.urlopen(uri + "?per_page=100") +def list_image_names(uri): + images = util.list_images(uri) -# read the HTTP response and load the JSON part of it -data = json.loads(r.read().decode()) + # skip the "ci-" prefix each of our container images' name has + names = [i["name"][3:] for i in images] + names.sort() -# skip the "ci-" prefix each of our container images' name has -names = [i["name"][3:] for i in data] -names.sort() + return names -names_native = [name for name in names if "cross" not in name] -names_cross = [name for name in names if "cross" in name] -print("Available x86 container images:\n") -print("\t" + "\n\t".join(names_native)) +def main(): + names = list_image_names(util.get_registry_uri(util.PROJECT_ID) + "?per_page=100") -if names_cross: - print() - print("Available cross-compiler container images:\n") - print("\t" + "\n\t".join(names_cross)) + names_native = [name for name in names if "cross" not in name] + names_cross = [name for name in names if "cross" in name] + + print("Available x86 container images:\n") + print("\t" + "\n\t".join(names_native)) + + if names_cross: + print() + print("Available cross-compiler container images:\n") + print("\t" + "\n\t".join(names_cross)) + + +if __name__ == "__main__": + main() -- 2.29.2

On Wed, 2021-02-10 at 18:24 +0100, Erik Skultety wrote:
+++ b/ci/list-images.py @@ -1,29 +1,32 @@ #!/usr/bin/env python3
-import json -import urllib.request as urllib +import containers.util as util
As we add more Python code here, I think it makes sense to have a single util module in the top-level ci/ directory instead of one for the ci/containers/ directory and potentially another one for the ci/cirrus directory. -- Andrea Bolognani / Red Hat / Virtualization

When removing the dependency on sudo in our CI Makefile, I realized that the list of images that we pull contains some old images that we no longer support in lcitool, but can be still pulled from the registry. This is a simple checker which runs in context of the refresh script which informs the developer/maintainer that there are some old redundant images that should be purged from the registry, so that people don't try to run local builds in those (which will most likely fail anyway). Signed-off-by: Erik Skultety <eskultet@redhat.com> --- ci/containers/check-registry.py | 96 +++++++++++++++++++++++++++++++++ ci/containers/refresh | 5 ++ 2 files changed, 101 insertions(+) create mode 100644 ci/containers/check-registry.py diff --git a/ci/containers/check-registry.py b/ci/containers/check-registry.py new file mode 100644 index 0000000000..f0159da54d --- /dev/null +++ b/ci/containers/check-registry.py @@ -0,0 +1,96 @@ +#!/usr/bin/env python3 + +import argparse +import subprocess +import shutil +import sys + +import util + + +def get_image_distro(image_name: str): + name_prefix = "ci-" + name_suffix = "-cross-" + + distro = image_name[len(name_prefix):] + + index = distro.find(name_suffix) + if index > 0: + distro = distro[:index] + + return distro + + +def get_undesirables(registry_distros_d, hosts_l): + """ Returns a dictionary of 'id':'name' pairs of images that can be purged""" + + undesirables_d = {} + for distro in registry_distros_d: + if distro not in hosts_l: + for image in registry_distros_d[distro]: + undesirables_d[str(image["id"])] = image["path"] + + return undesirables_d if undesirables_d else None + + +def get_lcitool_hosts(lcitool_path): + """ Returns a list of supported hosts by lcitool + @param lcitool_path: absolute path to local copy of lcitool, if omitted it is + assumed lcitool is installed in the current environment""" + + if not lcitool_path: + lcitool_path = "lcitool" + if not shutil.which(lcitool_path): + sys.exit("error: 'lcitool' not installed") + + lcitool_out = subprocess.check_output([lcitool_path, "hosts"]) + return lcitool_out.decode("utf-8").splitlines() + + +def main(): + parser = argparse.ArgumentParser() + parser.add_argument("project_id", + help="GitLab project ID") + parser.add_argument("--lcitool_path", + dest="lcitool", + metavar="PATH", + help="absolute path to lcitool, CWD is used if omitted") + + args = parser.parse_args() + + uri = util.get_registry_uri(util.PROJECT_ID) + images_json = util.list_images(uri + "?per_page=100") + + registry_distros_d = {} + for image in images_json: + distro_name = get_image_distro(image["name"]) + + try: + registry_distros_d[distro_name].append(image) + except KeyError: + registry_distros_d[distro_name] = [image] + + hosts_l = get_lcitool_hosts(args.lcitool) + + # print the list of images that we can safely purge from the registry + undesirables_d = get_undesirables(registry_distros_d, hosts_l) + if undesirables_d: + undesirable_image_names = "\t" + "\n\t".join(undesirables_d.values()) + undesirable_image_ids = " ".join(undesirables_d.keys()) + + sys.exit(f""" +The following images can be purged from the registry: + +{undesirable_image_names} + +You can remove the above images over the API with the following code snippet: + +\t$ for image_id in {undesirable_image_ids} \\ +\t;do \\ +\t\tcurl --request DELETE --header "PRIVATE-TOKEN: <access_token>" \\ +\t\t{util.get_registry_uri(util.PROJECT_ID)}/$image_id \\ +\t;done""") + + +if __name__ == "__main__": + main() diff --git a/ci/containers/refresh b/ci/containers/refresh index a39d0b0996..9dfaf75300 100755 --- a/ci/containers/refresh +++ b/ci/containers/refresh @@ -40,3 +40,8 @@ do $LCITOOL dockerfile $host libvirt > ci-$host.Dockerfile done + +# Check whether there are some old images in the container registry that could +# be purged - mainly because the list-images script would list them as +# available +/usr/bin/env python3 check-registry.py --lcitool_path "$LCITOOL" 192693 -- 2.29.2

On Wed, 2021-02-10 at 18:24 +0100, Erik Skultety wrote:
+def get_undesirables(registry_distros_d, hosts_l): + """ Returns a dictionary of 'id':'name' pairs of images that can be purged"""
Is the var_d and var_l a Python convention that I'm not aware of? I don't think we're using it anywhere in either libvirt or libvirt-ci, so I'd rather avoid it here as well. I see you're using some type hints for the get_image_distro() function, though. Can we perhaps have more of that? Suggestion: "stale" is both shorter and harder to misspell than "undesirable" :)
+def main(): + parser = argparse.ArgumentParser() + parser.add_argument("project_id", + help="GitLab project ID")
Is passing the project ID necessary? We're hardcoding the one for libvirt/libvirt in util.py and we don't provide a way to override it for ci-list-images, so I think this can be skipped. Having a way for the developer to point to their own fork rather than libvirt/libvirt makes sense, but in that case we probably also want to make it so the user passes "myusername/libvirt" and we convert that to a project ID using the GitLab API under the hood.
+ parser.add_argument("--lcitool_path", + dest="lcitool", + metavar="PATH", + help="absolute path to lcitool, CWD is used if omitted")
Underscores in command line arguments are inconvenient and ugly, so please don't use them. You can just call the option "--lcitool". Regarding the documentation for the option, I'm not convinced it's accurate: AFAICT shutil.which() will use $PATH, not $CWD, to look for the named binary if an absolute path to it has not been provided.
+ uri = util.get_registry_uri(util.PROJECT_ID) + images_json = util.list_images(uri + "?per_page=100")
We have "?per_page=100" hardcoded in two places now, so it looks like we can probably just fold its usage into list_images()?
+ sys.exit(f""" +The following images can be purged from the registry: + +{undesirable_image_names}
I would like to see the image ID printed along with the image name, so that I can tweak the shell snippet below to selectively remove only *some* of the images.
+You can remove the above images over the API with the following code snippet: + +\t$ for image_id in {undesirable_image_ids} \\ +\t;do \\
Having the "do" on a separate line is weird. Also please indent using either two or four spaces rather than tabs. Aside from the (mostly minor) issues that I've pointed out, I think these patches go in the right direction. However, reviewing them made me wonder what the long-term status of these scripts should be. Right now we have the following entry points: * a Makefile, which can be used to kick off local builds; * a couple of 'refresh' shell script that are used to regenerate contents derived from the libvirt-ci source of truth; * some Python bits that are used by both. We also know that, at some point in the hopefully not too distant future, we'll have the results of https://gitlab.com/libvirt/libvirt-ci/-/merge_requests/59 available, and that will allow us to turn the refresh scripts into trivial wrappers. I think it would make sense to converge towards having a single Python based entry point, which serves as the counterpart of lcitool, and which can be used to do all of the above. We can even get there gradually: the first version of the Python script could just call out to the Makefile to start builds, and then we could reimplement the logic in Python at a later point in time. What do you think? -- Andrea Bolognani / Red Hat / Virtualization

On Tue, 2021-02-16 at 16:28 +0100, Andrea Bolognani wrote:
I think it would make sense to converge towards having a single Python based entry point, which serves as the counterpart of lcitool, and which can be used to do all of the above.
We can even get there gradually: the first version of the Python script could just call out to the Makefile to start builds, and then we could reimplement the logic in Python at a later point in time.
Initial, pretty rough implementation of this idea: https://listman.redhat.com/archives/libvir-list/2021-February/msg00900.html -- Andrea Bolognani / Red Hat / Virtualization
participants (2)
-
Andrea Bolognani
-
Erik Skultety