[libvirt PATCH v2 0/3] ci: Check spelling

This is a wrapper for codespell [1], a spell checker for source code. Codespell does not compare words to a dictionary, but rather works by checking words against a list of common typos, making it produce fewer false positives than other solutions. The script in this patch works around the lack of per-directory ignore lists and some oddities regarding capitalization in ignore lists. [1] (https://github.com/codespell-project/codespell/) V1: https://listman.redhat.com/archives/libvir-list/2021-October/msg00015.html The wrapper script changed since it was Reviewed-by: Ján, hence I did not mark patch #2 as already reviewed. Regards, Tim Tim Wiederhake (3): scripts: Check spelling ci: Add spell checking Fix some typos .gitlab-ci.yml | 12 +++ scripts/check-spelling.py | 119 ++++++++++++++++++++++++++++++ src/qemu/qemu_cgroup.c | 2 +- tests/qemucapabilitiesnumbering.c | 2 +- tests/qemucapabilitiestest.c | 4 +- 5 files changed, 135 insertions(+), 4 deletions(-) create mode 100755 scripts/check-spelling.py -- 2.31.1

This is a wrapper for codespell [1], a spell checker for source code. Codespell does not compare words to a dictionary, but rather works by checking words against a list of common typos, making it produce fewer false positives than other solutions. The script in this patch works around the lack of per-directory ignore lists and some oddities regarding capitalization in ignore lists. [1] (https://github.com/codespell-project/codespell/) Signed-off-by: Tim Wiederhake <twiederh@redhat.com> --- scripts/check-spelling.py | 119 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 119 insertions(+) create mode 100755 scripts/check-spelling.py diff --git a/scripts/check-spelling.py b/scripts/check-spelling.py new file mode 100755 index 0000000000..0480a506e8 --- /dev/null +++ b/scripts/check-spelling.py @@ -0,0 +1,119 @@ +#!/usr/bin/env python3 + +import argparse +import re +import subprocess +import os + + +IGNORE_LIST = [ + # ignore all translation files + ("/po/", []), + + # ignore all git files + ("/.git/", []), + + # ignore this script + ("/scripts/check-spelling.py", []), + + # 3rd-party: keycodemapdb + ("/src/keycodemapdb/", []), + + # 3rd-party: VirtualBox SDK + ("/src/vbox/vbox_CAPI", []), + + # 3rd-party: qemu + ("/tests/qemucapabilitiesdata/caps_", []), + + # other + ("/", ["msdos", "MSDOS", "wan", "WAN", "hda", "HDA", "inout"]), + ("/NEWS.rst", "crashers"), + ("/docs/gitdm/companies/others", "Archiv"), + ("/docs/glib-adoption.rst", "preferrable"), + ("/docs/js/main.js", "whats"), + ("/examples/polkit/libvirt-acl.rules", ["userA", "userB", "userC"]), + ("/src/libvirt-domain.c", "PTD"), + ("/src/libxl/libxl_logger.c", "purposedly"), + ("/src/nwfilter/nwfilter_dhcpsnoop.c", "ether"), + ("/src/nwfilter/nwfilter_ebiptables_driver.c", "parm"), + ("/src/nwfilter/nwfilter_learnipaddr.c", "ether"), + ("/src/qemu/qemu_agent.c", "crypted"), + ("/src/qemu/qemu_agent.h", "crypted"), + ("/src/qemu/qemu_process.c", "wee"), + ("/src/security/apparmor/libvirt-lxc", "devic"), + ("/src/security/apparmor/libvirt-qemu", "readby"), + ("/src/storage_file/storage_file_probe.c", "conectix"), + ("/src/util/virnetdevmacvlan.c", "calld"), + ("/src/util/virtpm.c", "parm"), + ("/tests/qemuagenttest.c", "IST"), + ("/tests/storagepoolxml2xml", "cant"), + ("/tests/sysinfodata/", "sie"), + ("/tests/testutils.c", "nIn"), + ("/tests/vircgroupdata/ovirt-node-6.6.mounts", "hald"), + ("/tests/virhostcpudata/", "sie"), + ("/tools/virt-host-validate-common.c", "sie"), +] + + +def ignore(filename, linenumber, word, suggestion): + if len(word) <= 2: + return True + + for f, w in IGNORE_LIST: + if not filename.startswith(f): + continue + if word in w or not w: + return True + return False + + +def main(): + line_pattern = re.compile("^(.*):(.*): (.*) ==> (.*)$") + output_template = "(\"{0}\", \"{2}\"),\t# line {1}, \"{3}\"?" + + parser = argparse.ArgumentParser(description="Check spelling") + parser.add_argument( + "dir", + help="Path to source directory. " + "Defaults to parent directory of this script", + type=os.path.realpath, + nargs='?') + args = parser.parse_args() + + if not args.dir: + args.dir = os.path.dirname(os.path.dirname(os.path.realpath(__file__))) + + process = subprocess.run( + ["codespell", args.dir], + stdout=subprocess.PIPE, + stderr=subprocess.PIPE, + universal_newlines=True) + + if process.returncode not in (0, 65): + exit("error: unexpected returncode %s" % process.returncode) + + if process.stderr: + exit("error: unexpected output to stderr: \"%s\"" % process.stderr) + + findings = 0 + for line in process.stdout.split("\n"): + line = line.strip().replace(args.dir, "") + if not line: + continue + + match = line_pattern.match(line) + if not match: + exit("error: unexpected line: \"%s\"" % line) + + if ignore(*match.groups()): + continue + + print(output_template.format(*match.groups())) + findings += 1 + + if findings: + exit("error: %s spelling errors" % findings) + + +if __name__ == "__main__": + main() -- 2.31.1

On Mon, Jan 10, 2022 at 16:41:25 +0100, Tim Wiederhake wrote:
This is a wrapper for codespell [1], a spell checker for source code. Codespell does not compare words to a dictionary, but rather works by checking words against a list of common typos, making it produce fewer false positives than other solutions.
The script in this patch works around the lack of per-directory ignore lists and some oddities regarding capitalization in ignore lists.
[1] (https://github.com/codespell-project/codespell/)
Signed-off-by: Tim Wiederhake <twiederh@redhat.com> --- scripts/check-spelling.py | 119 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 119 insertions(+) create mode 100755 scripts/check-spelling.py
diff --git a/scripts/check-spelling.py b/scripts/check-spelling.py new file mode 100755 index 0000000000..0480a506e8 --- /dev/null +++ b/scripts/check-spelling.py @@ -0,0 +1,119 @@ +#!/usr/bin/env python3 + +import argparse +import re +import subprocess +import os + + +IGNORE_LIST = [ + # ignore all translation files + ("/po/", []), + + # ignore all git files + ("/.git/", []),
IMO this should be working on an equivalent of 'git ls-files' rather than excluding files which _might_ interfere as if you run it in a not-so-clean checkout it also looks for files which are not part of the repo: $ ./scripts/check-spelling.py ("/tags", "aLocation"), # line 10516, "allocation"? ("/tags", "aLocation"), # line 10517, "allocation"? ("/tags", "aLocation"), # line 10518, "allocation"? ("/tags", "aLocation"), # line 10521, "allocation"? ("/tags", "aLocation"), # line 10522, "allocation"? ("/tags", "aLocation"), # line 10523, "allocation"? ("/tags", "aLocation"), # line 10526, "allocation"? ("/tags", "aLocation"), # line 10527, "allocation"? ("/tags", "aLocation"), # line 10528, "allocation"? ("/tags", "independant"), # line 48256, "independent"? ("/tags", "parm"), # line 60938, "param, pram, parma"? ("/tags", "parm"), # line 60938, "param, pram, parma"? ("/tags", "calld"), # line 80099, "called"? ("/tests/qemucapabilitiesnumbering.c", "occurence"), # line 60, "occurrence"? ("/tests/virstoragetest.c.orig", "folowing"), # line 103, "following"? ("/tests/qemucapabilitiestest.c", "programatic"), # line 216, "programmatic"? error: 16 spelling errors 'tags' is a file generated by 'ctags' and "/tests/virstoragetest.c.orig" is an artifact of an editor. Also since the script is lacking integration with meson and is also missing from documentation it's not applied to local builds, only on stuff that gets CI'd. Catching these early will hopefully prevent a few pointless CI runs just to fix typos.

On Mon, Jan 10, 2022 at 04:41:25PM +0100, Tim Wiederhake wrote:
+ ("/docs/glib-adoption.rst", "preferrable"),
This is an actual typo, isn't it?
+ ("/docs/js/main.js", "whats"), + ("/src/libxl/libxl_logger.c", "purposedly"), + ("/src/qemu/qemu_process.c", "wee"), + ("/tests/storagepoolxml2xml", "cant"),
These are a few cases where I feel that rewording the existing comment or piece of code, even if it wouldn't strictly speaking count as fixing a typo, would still be preferable to having to list it as an exception.
+ ("/src/util/virnetdevmacvlan.c", "calld"),
Same for this one, but I appreciate that others might consider renaming the variable as unnecessary churn and not worth the effort.
+ ("/src/security/apparmor/libvirt-lxc", "devic"),
Looking at the context where this appears: deny /sys/d[^e]*{,/**} wklx, deny /sys/de[^v]*{,/**} wklx, deny /sys/dev[^i]*{,/**} wklx, deny /sys/devi[^c]*{,/**} wklx, deny /sys/devic[^e]*{,/**} wklx, deny /sys/device[^s]*{,/**} wklx, deny /sys/devices/[^v]*{,/**} wklx, deny /sys/devices/v[^i]*{,/**} wklx, deny /sys/devices/vi[^r]*{,/**} wklx, deny /sys/devices/vir[^t]*{,/**} wklx, deny /sys/devices/virt[^u]*{,/**} wklx, deny /sys/devices/virtu[^a]*{,/**} wklx, deny /sys/devices/virtua[^l]*{,/**} wklx, deny /sys/devices/virtual/[^n]*{,/**} wklx, deny /sys/devices/virtual/n[^e]*{,/**} wklx, deny /sys/devices/virtual/ne[^t]*{,/**} wklx, deny /sys/devices/virtual/net?*{,/**} wklx, deny /sys/devices/virtual?*{,/**} wklx, deny /sys/devices?*{,/**} wklx, I mean, I don't speak AppArmor but this can't be right, can it? :D Jim, do you think we actually need such a slippery slope of deny rules, or can we simplify things a bit?
+ ("/src/security/apparmor/libvirt-qemu", "readby"),
This should probably be made to apply to all libvirt-* files in that directory, as it's apparently part of the format specification.
+ ("/tests/vircgroupdata/ovirt-node-6.6.mounts", "hald"),
In this case I think it's perfectly fine to just drop the offending line and move on. -- Andrea Bolognani / Red Hat / Virtualization

On 1/10/22 11:21, Andrea Bolognani wrote:
On Mon, Jan 10, 2022 at 04:41:25PM +0100, Tim Wiederhake wrote:
+ ("/docs/glib-adoption.rst", "preferrable"),
This is an actual typo, isn't it?
+ ("/docs/js/main.js", "whats"), + ("/src/libxl/libxl_logger.c", "purposedly"), + ("/src/qemu/qemu_process.c", "wee"), + ("/tests/storagepoolxml2xml", "cant"),
These are a few cases where I feel that rewording the existing comment or piece of code, even if it wouldn't strictly speaking count as fixing a typo, would still be preferable to having to list it as an exception.
+ ("/src/util/virnetdevmacvlan.c", "calld"),
Same for this one, but I appreciate that others might consider renaming the variable as unnecessary churn and not worth the effort.
+ ("/src/security/apparmor/libvirt-lxc", "devic"),
Looking at the context where this appears:
deny /sys/d[^e]*{,/**} wklx, deny /sys/de[^v]*{,/**} wklx, deny /sys/dev[^i]*{,/**} wklx, deny /sys/devi[^c]*{,/**} wklx, deny /sys/devic[^e]*{,/**} wklx, deny /sys/device[^s]*{,/**} wklx, deny /sys/devices/[^v]*{,/**} wklx, deny /sys/devices/v[^i]*{,/**} wklx, deny /sys/devices/vi[^r]*{,/**} wklx, deny /sys/devices/vir[^t]*{,/**} wklx, deny /sys/devices/virt[^u]*{,/**} wklx, deny /sys/devices/virtu[^a]*{,/**} wklx, deny /sys/devices/virtua[^l]*{,/**} wklx, deny /sys/devices/virtual/[^n]*{,/**} wklx, deny /sys/devices/virtual/n[^e]*{,/**} wklx, deny /sys/devices/virtual/ne[^t]*{,/**} wklx, deny /sys/devices/virtual/net?*{,/**} wklx, deny /sys/devices/virtual?*{,/**} wklx, deny /sys/devices?*{,/**} wklx,
I mean, I don't speak AppArmor but this can't be right, can it? :D
It's valid apparmor. At least the apparmor parser doesn't complain :-). ISTM the last rule should cover the others.
Jim, do you think we actually need such a slippery slope of deny rules, or can we simplify things a bit?
I don't know why all of these deny rules are defined in this manner. /sys/class, /proc/sys/kernel, and others are defined similarly. They were added by Cedric in commit 9265f8ab67d. Cedric, do you recall the purpose of defining the rules in this way?
+ ("/src/security/apparmor/libvirt-qemu", "readby"),
This should probably be made to apply to all libvirt-* files in that directory, as it's apparently part of the format specification.
Agree, it's a valid permissions value https://gitlab.com/apparmor/apparmor/-/wikis/TechnicalDoc_Proc_and_ptrace Regards, Jim

On Mon, Jan 10, 2022 at 03:58:55PM -0700, Jim Fehlig wrote:
On 1/10/22 11:21, Andrea Bolognani wrote:
On Mon, Jan 10, 2022 at 04:41:25PM +0100, Tim Wiederhake wrote:
+ ("/src/security/apparmor/libvirt-lxc", "devic"),
Looking at the context where this appears:
deny /sys/d[^e]*{,/**} wklx, deny /sys/de[^v]*{,/**} wklx, deny /sys/dev[^i]*{,/**} wklx, deny /sys/devi[^c]*{,/**} wklx, deny /sys/devic[^e]*{,/**} wklx, deny /sys/device[^s]*{,/**} wklx, deny /sys/devices/[^v]*{,/**} wklx, deny /sys/devices/v[^i]*{,/**} wklx, deny /sys/devices/vi[^r]*{,/**} wklx, deny /sys/devices/vir[^t]*{,/**} wklx, deny /sys/devices/virt[^u]*{,/**} wklx, deny /sys/devices/virtu[^a]*{,/**} wklx, deny /sys/devices/virtua[^l]*{,/**} wklx, deny /sys/devices/virtual/[^n]*{,/**} wklx, deny /sys/devices/virtual/n[^e]*{,/**} wklx, deny /sys/devices/virtual/ne[^t]*{,/**} wklx, deny /sys/devices/virtual/net?*{,/**} wklx, deny /sys/devices/virtual?*{,/**} wklx, deny /sys/devices?*{,/**} wklx,
I mean, I don't speak AppArmor but this can't be right, can it? :D
It's valid apparmor. At least the apparmor parser doesn't complain :-). ISTM the last rule should cover the others.
I was not really suggesting that it was not a valid configuration, it's just that looking at it immediately triggered a "that can't be the best way to do it" reaction in me ;)
Jim, do you think we actually need such a slippery slope of deny rules, or can we simplify things a bit?
I don't know why all of these deny rules are defined in this manner. /sys/class, /proc/sys/kernel, and others are defined similarly. They were added by Cedric in commit 9265f8ab67d. Cedric, do you recall the purpose of defining the rules in this way?
The script that generated those rules is https://github.com/lxc/lxc/blob/master/config/apparmor/lxc-generate-aa-rules... and that's apparently its intended behavior. So there has to be a reason why it's done this way, right? I just have no idea what it could possibly be. -- Andrea Bolognani / Red Hat / Virtualization

Allow failure to not have false-positives fail the builds. Signed-off-by: Tim Wiederhake <twiederh@redhat.com> --- .gitlab-ci.yml | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml index 6ba11a0431..5e46f3dcb8 100644 --- a/.gitlab-ci.yml +++ b/.gitlab-ci.yml @@ -87,6 +87,18 @@ codestyle: - meson test -C build --suite syntax-check --no-rebuild --print-errorlogs +spellcheck: + stage: sanity_checks + image: $CI_REGISTRY_IMAGE/ci-fedora-35:latest + allow_failure: true + needs: + - x86_64-fedora-35-container + before_script: + - *script_variables + script: + - scripts/check-spelling.py + + # This artifact published by this job is downloaded to push to Weblate # for translation usage: # https://gitlab.com/libvirt/libvirt/-/jobs/artifacts/master/download?job=potf... -- 2.31.1

Signed-off-by: Tim Wiederhake <twiederh@redhat.com> --- src/qemu/qemu_cgroup.c | 2 +- tests/qemucapabilitiesnumbering.c | 2 +- tests/qemucapabilitiestest.c | 4 ++-- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index 308be5b00f..22a6f56cf9 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -447,7 +447,7 @@ qemuSetupHostdevCgroup(virDomainObj *vm, /** * qemuTeardownHostdevCgroup: - * @vm: doamin object + * @vm: domain object * @dev: device to tear down * * For given host device @dev deny access to it in CGroups. diff --git a/tests/qemucapabilitiesnumbering.c b/tests/qemucapabilitiesnumbering.c index eff7dbe2a1..aae21893c2 100644 --- a/tests/qemucapabilitiesnumbering.c +++ b/tests/qemucapabilitiesnumbering.c @@ -57,7 +57,7 @@ modify(struct qmpCommandList *list G_GNUC_UNUSED) if (STREQ_NULLABLE(cmdname, "qom-list-properties")) { found = i; - // break; /* uncomment if you want to find the first occurence */ + // break; /* uncomment if you want to find the first occurrence */ } } diff --git a/tests/qemucapabilitiestest.c b/tests/qemucapabilitiestest.c index ae208f442c..79dd358ef4 100644 --- a/tests/qemucapabilitiestest.c +++ b/tests/qemucapabilitiestest.c @@ -213,8 +213,8 @@ mymain(void) * * If you manually edit replies files you can run * VIR_TEST_REGENERATE_OUTPUT=1 tests/qemucapabilitiesnumbering - * to fix the replies ids. The tool also allows for programatic modification - * of the replies file. + * to fix the replies ids. The tool also allows for programmatic + * modification of the replies file. * * Once a replies file has been generated and tweaked if necessary, * you can drop it into tests/qemucapabilitiesdata/ (with a sensible -- 2.31.1

On Mon, Jan 10, 2022 at 04:41:27PM +0100, Tim Wiederhake wrote:
Signed-off-by: Tim Wiederhake <twiederh@redhat.com> --- src/qemu/qemu_cgroup.c | 2 +- tests/qemucapabilitiesnumbering.c | 2 +- tests/qemucapabilitiestest.c | 4 ++-- 3 files changed, 4 insertions(+), 4 deletions(-)
This patch should have been first in the series. In fact, you can push it right away :) Reviewed-by: Andrea Bolognani <abologna@redhat.com> and safe for freeze. -- Andrea Bolognani / Red Hat / Virtualization
participants (4)
-
Andrea Bolognani
-
Jim Fehlig
-
Peter Krempa
-
Tim Wiederhake