On Tue, Jun 8, 2021 at 8:51 AM Michal Prívozník <mprivozn(a)redhat.com> wrote:
On 6/7/21 6:22 PM, Fabiano Fidêncio wrote:
> Currently `virt-host-validate` will fail whenever one of its calls fail,
> regardless of virHostValidateLevel set.
>
> This behaviour is not optimal and makes it not exactly reliable as a
> command line tool as other tools or scripts using it would have to check
> its output to figure out whether something really failed or if a warning
> was mistakenly treated as failure.
>
> With this change, the behaviour of whether to fail or not, is defined by
> the caller of those functions, based on the virHostValidateLevel passed
> to them.
>
>
https://gitlab.com/libvirt/libvirt/-/issues/175
>
> Signed-off-by: Fabiano Fidêncio <fabiano(a)fidencio.org>
> ---
> Changes since v1:
> * Replace the `goto out;` and the `out` labels by the
> `VIR_HOST_VALIDATE_FAILURE` macro
Yeah, this opened pandora's box...
Oh yeah, it looked pretty much like that when I first looked at this
code over the weekend.
> ---
> tools/virt-host-validate-common.c | 30 +++++++++++++++---------------
> tools/virt-host-validate-common.h | 14 ++++++++++++++
> 2 files changed, 29 insertions(+), 15 deletions(-)
>
> diff --git a/tools/virt-host-validate-common.c b/tools/virt-host-validate-common.c
> index 6dd851f07d..9412bb7514 100644
> --- a/tools/virt-host-validate-common.c
> +++ b/tools/virt-host-validate-common.c
> @@ -142,7 +142,7 @@ int virHostValidateDeviceExists(const char *hvname,
>
> if (access(dev_name, F_OK) < 0) {
> virHostMsgFail(level, "%s", hint);
> - return -1;
> + return VIR_HOST_VALIDATE_FAILURE(level);
> }
>
> virHostMsgPass();
> @@ -159,7 +159,7 @@ int virHostValidateDeviceAccessible(const char *hvname,
>
> if (access(dev_name, R_OK|W_OK) < 0) {
> virHostMsgFail(level, "%s", hint);
> - return -1;
> + return VIR_HOST_VALIDATE_FAILURE(level);
> }
>
> virHostMsgPass();
> @@ -180,7 +180,7 @@ int virHostValidateNamespace(const char *hvname,
>
> if (access(nspath, F_OK) < 0) {
> virHostMsgFail(level, "%s", hint);
> - return -1;
> + return VIR_HOST_VALIDATE_FAILURE(level);
> }
>
> virHostMsgPass();
> @@ -264,17 +264,17 @@ int virHostValidateLinuxKernel(const char *hvname,
>
> if (STRNEQ(uts.sysname, "Linux")) {
> virHostMsgFail(level, "%s", hint);
> - return -1;
> + return VIR_HOST_VALIDATE_FAILURE(level);
> }
>
> if (virParseVersionString(uts.release, &thisversion, true) < 0) {
> virHostMsgFail(level, "%s", hint);
> - return -1;
> + return VIR_HOST_VALIDATE_FAILURE(level);
> }
>
> if (thisversion < version) {
> virHostMsgFail(level, "%s", hint);
> - return -1;
> + return VIR_HOST_VALIDATE_FAILURE(level);
Up until here the use of VIR_HOST_VALIDATE_FAILURE() is good.
> } else {
> virHostMsgPass();
> return 0;
> @@ -291,7 +291,7 @@ int virHostValidateCGroupControllers(const char *hvname,
> size_t i;
>
> if (virCgroupNew("/", -1, &group) < 0)
> - return -1;
> + return VIR_HOST_VALIDATE_FAILURE(level);
But this looks somewhat suspicious. What virCgroupNew() does is it
detects controllers and their mountpoints (and return this in @group).
Upon error (e.g. unable to open /proc/mounts or /proc/self/cgroups or
failure in their parsing) a libvirt error is reported. And this reminds
me of pandora's box. Firstly, the error object is never initialized
(i.e. there's no virInitialize() call from main(), or better said before
the first function that has potential of reporting an error). This leads
to virResetError() (called from virReportError()) to free() random
pointers.
Secondly, I'm unsure what to do in this case. I mean, caller told us
severity of this check (@level), so if we failed to initialize an
internal structure we should honour caller's wish and do/do not return
an error. BUT, with the way this code is currently written this function
exits early, not printing anything anywhere. E.g.:
QEMU: Checking if device /dev/net/tun exists : PASS
QEMU: Checking for device assignment IOMMU support : PASS
Whereas this is from a regular run:
QEMU: Checking if device /dev/net/tun exists : PASS
QEMU: Checking for cgroup 'cpu' controller support :
PASS
QEMU: Checking for cgroup 'cpuacct' controller support :
PASS
QEMU: Checking for cgroup 'cpuset' controller support :
PASS
QEMU: Checking for cgroup 'memory' controller support :
PASS
QEMU: Checking for cgroup 'devices' controller support :
PASS
QEMU: Checking for cgroup 'blkio' controller support :
PASS
QEMU: Checking for device assignment IOMMU support : PASS
This becomes more obvious [1]
>
> for (i = 0; i < VIR_CGROUP_CONTROLLER_LAST; i++) {
> int flag = 1 << i;
> @@ -303,7 +303,7 @@ int virHostValidateCGroupControllers(const char *hvname,
> virHostMsgCheck(hvname, "for cgroup '%s' controller
support", cg_name);
>
> if (!virCgroupHasController(group, i)) {
> - ret = -1;
> + ret = VIR_HOST_VALIDATE_FAILURE(level);
> virHostMsgFail(level, "Enable '%s' in kernel Kconfig file
or "
> "mount/enable cgroup controller in your
system",
> cg_name);
> @@ -320,7 +320,7 @@ int virHostValidateCGroupControllers(const char *hvname
G_GNUC_UNUSED,
> virHostValidateLevel level)
> {
> virHostMsgFail(level, "%s", "This platform does not support
cgroups");
> - return -1;
> + return VIR_HOST_VALIDATE_FAILURE(level);
> }
> #endif /* !__linux__ */
>
> @@ -354,7 +354,7 @@ int virHostValidateIOMMU(const char *hvname,
> "No ACPI DMAR table found, IOMMU either "
> "disabled in BIOS or not supported by this "
> "hardware platform");
> - return -1;
> + return VIR_HOST_VALIDATE_FAILURE(level);
> }
> } else if (isAMD) {
> virHostMsgCheck(hvname, "%s", _("for device assignment IOMMU
support"));
> @@ -366,7 +366,7 @@ int virHostValidateIOMMU(const char *hvname,
> "No ACPI IVRS table found, IOMMU either "
> "disabled in BIOS or not supported by this "
> "hardware platform");
> - return -1;
> + return VIR_HOST_VALIDATE_FAILURE(level);
> }
> } else if (ARCH_IS_PPC64(arch)) {
> /* Empty Block */
> @@ -385,7 +385,7 @@ int virHostValidateIOMMU(const char *hvname,
> } else {
> virHostMsgFail(level,
> "Unknown if this platform has IOMMU support");
> - return -1;
> + return VIR_HOST_VALIDATE_FAILURE(level);
> }
>
>
> @@ -404,7 +404,7 @@ int virHostValidateIOMMU(const char *hvname,
> "Add %s to kernel cmdline arguments",
bootarg);
> else
> virHostMsgFail(level, "IOMMU capability not compiled into
kernel.");
> - return -1;
> + return VIR_HOST_VALIDATE_FAILURE(level);
> }
> virHostMsgPass();
> return 0;
> @@ -468,7 +468,7 @@ int virHostValidateSecureGuests(const char *hvname,
> }
>
> if (virFileReadValueString(&cmdline, "/proc/cmdline")
< 0)
> - return -1;
> + return VIR_HOST_VALIDATE_FAILURE(level);
1: here. IIUC, at the beginning of this function (not included in this
context) a message is printed out:
virHostMsgCheck(hvname, "%s", _("for secure guest support"));
but this 'return' makes it exit early (even without terminating the
line, corrupting the output).
Mind that this is something that's been exposed, rather than
introduced by this patch.
```
[fidencio@dentola ~]$ virt-host-validate qemu
QEMU: comprobando if device /dev/kvm exists
: PASA
QEMU: comprobando if device /dev/kvm is accessible
: PASA
QEMU: comprobando if device /dev/vhost-net exists
: PASA
QEMU: comprobando if device /dev/net/tun exists
: PASA
QEMU: comprobando for cgroup 'cpu' controller support
: PASA
QEMU: comprobando for cgroup 'cpuacct' controller support
: PASA
QEMU: comprobando for cgroup 'cpuset' controller support
: PASA
QEMU: comprobando for cgroup 'memory' controller support
: PASA
QEMU: comprobando for cgroup 'devices' controller support
: ADVERTENCIA (Enable 'devices' in kernel Kconfig file or
mount/enable cgroup controller in your system)
QEMU: comprobando for cgroup 'blkio' controller support
: PASA
ADVERTENCIA (Unknown if this platform has IOMMU support) -----> THIS
IS ALREADY WRONG!
QEMU: comprobando for secure guest support
: ADVERTENCIA (Unknown if this platform has Secure Guest
support)
```
Right now we've been lucky enough that the majority of the users never
hit those, but those problems are existent problems.
Let me send a v3, just for the sake of not adding more bugs to this
bucket, so we'll only respect the caller's choice when verifying the
support, but not when we fail to initialize an internal structure.
This will leave the code 1:1 with what it was before, and then
clean-ups can come later. What do you think?
So, long story short, I'm inclined to merge your patch and will
post
some cleanups shortly.
See my suggestion above.
I don't have many free cycles in my hands to work on those cleanups,
but I'd be super happy to review & test them in case you have the
time.
Best Regards,
--
Fabiano Fidêncio