[libvirt] [PATCH v3 0/3] unplug timeout changes for PPC64

changes in v3:
- redesigned patch 1 based on Cole Robinson feedback

v2: https://www.redhat.com/archives/libvir-list/2019-September/msg00447.html
v1: https://www.redhat.com/archives/libvir-list/2019-August/msg00698.html

Daniel Henrique Barboza (3):
  qemu_hotplug.c: adding qemuDomainGetUnplugTimeout
  qemu: Remove qemu_hotplugpriv.h and qemuDomainRemoveDeviceWaitTime
  qemu_hotplug.c: user-friendlier setvcpus timeout error message

 src/qemu/Makefile.inc.am                      |  1 -
 src/qemu/qemu_hotplug.c                       | 28 ++++++++++++++-----
 src/qemu/qemu_hotplug.h                       |  2 ++
 tests/Makefile.am                             | 13 ++++++++-
 .../qemuhotplugmock.c                         | 27 +++++++++---------
 tests/qemuhotplugtest.c                       |  7 ++---
 6 files changed, 51 insertions(+), 27 deletions(-)
 rename src/qemu/qemu_hotplugpriv.h => tests/qemuhotplugmock.c (61%)

--
2.21.0

For some architectures and setups, device removal can take longer
than the default 5 seconds. This causes commands such as
'virsh setvcpus' to report a timeout even if the operation was
successful in the guest, confusing the user.

This patch sets a new 10 second unplug timeout for PPC64 guests.
All other archs keep the default 5 second timeout.

Instead of putting 'if PPC64' conditionals inside qemu_hotplug.c
to set the new timeout value, a new function called
qemuDomainGetUnplugTimeout was added. The timeout value is then
retrieved when needed, by passing the corresponding domain object,
whose DomainDef determines the timeout. This approach allows
different guest architectures to have distinct unplug timeout
intervals, regardless of the host architecture. This design also
makes it easier to modify or enhance the unplug timeout logic in
the future (allowing special timeouts for TCG domains, for example).

A new mock file was created to work with qemuhotplugtest.c, given
that the test timeout is significantly shorter than the actual
timeout value in qemu_hotplug.c.

The now unused 'qemuDomainRemoveDeviceWaitTime' global can't simply
be erased from qemu_hotplug.c though. The next patch will remove it
properly.

Suggested-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
---
 src/qemu/qemu_hotplug.c | 20 +++++++++++++++++++-
 src/qemu/qemu_hotplug.h |  2 ++
 tests/Makefile.am       | 13 ++++++++++++-
 tests/qemuhotplugmock.c | 33 +++++++++++++++++++++++++++++++++
 tests/qemuhotplugtest.c |  3 ++-
 5 files changed, 68 insertions(+), 3 deletions(-)
 create mode 100644 tests/qemuhotplugmock.c

diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index bf301919cc..a7955e8062 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -63,6 +63,13 @@ VIR_LOG_INIT("qemu.qemu_hotplug");
 
 #define CHANGE_MEDIA_TIMEOUT 5000
 
+/* Timeout in milliseconds for device removal. PPC64 domains
+ * can experience a bigger delay in unplug operations during
+ * heavy guest activity (vcpu being the most notable case), thus
+ * the timeout for PPC64 is also bigger. */
+#define QEMU_UNPLUG_TIMEOUT 1000ull * 5
+#define QEMU_UNPLUG_TIMEOUT_PPC64 1000ull * 10
+
 /* Wait up to 5 seconds for device removal to finish. */
 unsigned long long qemuDomainRemoveDeviceWaitTime = 1000ull * 5;
 
@@ -5112,6 +5119,17 @@ qemuDomainResetDeviceRemoval(virDomainObjPtr vm)
     priv->unplug.eventSeen = false;
 }
 
+
+unsigned long long
+qemuDomainGetUnplugTimeout(virDomainObjPtr vm)
+{
+    if (qemuDomainIsPSeries(vm->def))
+        return QEMU_UNPLUG_TIMEOUT_PPC64;
+
+    return QEMU_UNPLUG_TIMEOUT;
+}
+
+
 /* Returns:
  *  -1 Unplug of the device failed
  *
@@ -5130,7 +5148,7 @@ qemuDomainWaitForDeviceRemoval(virDomainObjPtr vm)
 
     if (virTimeMillisNow(&until) < 0)
         return 1;
-    until += qemuDomainRemoveDeviceWaitTime;
+    until += qemuDomainGetUnplugTimeout(vm);
 
     while (priv->unplug.alias) {
         if ((rc = virDomainObjWaitUntil(vm, until)) == 1)
diff --git a/src/qemu/qemu_hotplug.h b/src/qemu/qemu_hotplug.h
index 6d2cd34dbc..1dfc601110 100644
--- a/src/qemu/qemu_hotplug.h
+++ b/src/qemu/qemu_hotplug.h
@@ -161,3 +161,5 @@ int qemuDomainDetachDBusVMState(virQEMUDriverPtr driver,
                                 virDomainObjPtr vm,
                                 const char *id,
                                 qemuDomainAsyncJob asyncJob);
+
+unsigned long long qemuDomainGetUnplugTimeout(virDomainObjPtr vm);
diff --git a/tests/Makefile.am b/tests/Makefile.am
index a9acd88670..8674b1f9da 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -223,6 +223,7 @@ test_libraries = libshunload.la \
 	libvirhostcpumock.la \
 	libdomaincapsmock.la \
 	libvirfilecachemock.la \
+	libqemuhotplugmock.la \
 	$(NULL)
 
 if WITH_REMOTE
@@ -565,6 +566,11 @@ libqemucpumock_la_SOURCES = \
 libqemucpumock_la_LDFLAGS = $(MOCKLIBS_LDFLAGS)
 libqemucpumock_la_LIBADD = $(MOCKLIBS_LIBS)
 
+libqemuhotplugmock_la_SOURCES = \
+	qemuhotplugmock.c
+libqemuhotplugmock_la_LDFLAGS = $(MOCKLIBS_LDFLAGS)
+libqemuhotplugmock_la_LIBADD = $(MOCKLIBS_LIBS)
+
 qemuxml2argvtest_SOURCES = \
 	qemuxml2argvtest.c testutilsqemu.c testutilsqemu.h \
 	testutils.c testutils.h \
@@ -642,7 +648,11 @@ qemuhotplugtest_SOURCES = \
 	testutils.c testutils.h \
 	testutilsqemu.c testutilsqemu.h \
 	$(NULL)
-qemuhotplugtest_LDADD = libqemumonitortestutils.la $(qemu_LDADDS)
+qemuhotplugtest_LDADD = \
+	libqemutestdriver.la \
+	libqemumonitortestutils.la \
+	$(qemu_LDADDS) \
+	$(NULL)
 
 qemublocktest_SOURCES = \
 	qemublocktest.c \
@@ -716,6 +726,7 @@ EXTRA_DIST += qemuxml2argvtest.c qemuxml2xmltest.c \
 	qemusecuritymock.c \
 	qemufirmwaretest.c \
 	qemuvhostusertest.c \
+	qemuhotplugmock.c \
 	$(QEMUMONITORTESTUTILS_SOURCES)
 endif ! WITH_QEMU
 
diff --git a/tests/qemuhotplugmock.c b/tests/qemuhotplugmock.c
new file mode 100644
index 0000000000..43a9d79051
--- /dev/null
+++ b/tests/qemuhotplugmock.c
@@ -0,0 +1,33 @@
+/*
+ * Copyright (C) 2019 IBM Corporation
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#include <config.h>
+
+#include "qemu/qemu_hotplug.h"
+#include "conf/domain_conf.h"
+
+unsigned long long
+qemuDomainGetUnplugTimeout(virDomainObjPtr vm G_GNUC_UNUSED)
+{
+    /* Wait only 100ms for DEVICE_DELETED event. Give a greater
+     * timeout in case of PSeries guest to be consistent with the
+     * original logic. */
+    if (qemuDomainIsPSeries(vm->def))
+        return 200;
+    return 100;
+}
diff --git a/tests/qemuhotplugtest.c b/tests/qemuhotplugtest.c
index d3da08875a..5f2fc6a598 100644
--- a/tests/qemuhotplugtest.c
+++ b/tests/qemuhotplugtest.c
@@ -888,4 +888,5 @@ mymain(void)
 
 VIR_TEST_MAIN_PRELOAD(mymain,
                       VIR_TEST_MOCK("virpci"),
-                      VIR_TEST_MOCK("virprocess"));
+                      VIR_TEST_MOCK("virprocess"),
+                      VIR_TEST_MOCK("qemuhotplug"));
--
2.21.0
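
For readers who want to experiment with the timeout selection outside
the libvirt tree, the idea can be sketched as a standalone C fragment.
This is only an illustration under stated assumptions: sketchDomainDef,
its isPSeries flag and every sketch* name are hypothetical stand-ins and
not libvirt APIs. Only the 5000 ms and 10000 ms values and the
"now + timeout" deadline computation are taken from the patch. The
macros are parenthesized here, which the patch does not do, so that
they stay safe if used inside a larger expression.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the domain definition. The flag plays the
 * role of the qemuDomainIsPSeries() check used in the patch. */
typedef struct {
    bool isPSeries;
} sketchDomainDef;

/* Unplug timeouts in milliseconds (values from the patch). */
#define SKETCH_UNPLUG_TIMEOUT        (1000ull * 5)
#define SKETCH_UNPLUG_TIMEOUT_PPC64  (1000ull * 10)

/* Pick the unplug timeout from the guest definition, not the host. */
unsigned long long
sketchGetUnplugTimeout(const sketchDomainDef *def)
{
    assert(def != NULL);

    if (def->isPSeries)
        return SKETCH_UNPLUG_TIMEOUT_PPC64;

    return SKETCH_UNPLUG_TIMEOUT;
}

/* Absolute deadline for the wait loop: current time plus the
 * per-guest timeout, both in milliseconds. */
unsigned long long
sketchUnplugDeadline(unsigned long long nowMs, const sketchDomainDef *def)
{
    return nowMs + sketchGetUnplugTimeout(def);
}
```

Because the timeout is looked up through a function rather than read
from a global, a test build can supply its own definition with much
shorter values, which is the role tests/qemuhotplugmock.c plays in
this patch.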

qemu_hotplugpriv.h is a header file created to share a global
variable called 'qemuDomainRemoveDeviceWaitTime', defined in
qemu_hotplug.c, with other files that want to change the timeout
value (currently, only tests/qemuhotplugtest.c).

The previous patch made the variable obsolete: the timeout is now
obtained from qemuDomainGetUnplugTimeout() instead. This means that
the header file is now unused as well, and can be safely discarded.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
---
 src/qemu/Makefile.inc.am    |  1 -
 src/qemu/qemu_hotplug.c     |  5 -----
 src/qemu/qemu_hotplugpriv.h | 32 --------------------------------
 tests/qemuhotplugtest.c     |  4 ----
 4 files changed, 42 deletions(-)
 delete mode 100644 src/qemu/qemu_hotplugpriv.h

diff --git a/src/qemu/Makefile.inc.am b/src/qemu/Makefile.inc.am
index e66da76c0a..1827ef6e04 100644
--- a/src/qemu/Makefile.inc.am
+++ b/src/qemu/Makefile.inc.am
@@ -29,7 +29,6 @@ QEMU_DRIVER_SOURCES = \
 	qemu/qemu_hostdev.h \
 	qemu/qemu_hotplug.c \
 	qemu/qemu_hotplug.h \
-	qemu/qemu_hotplugpriv.h \
 	qemu/qemu_conf.c \
 	qemu/qemu_conf.h \
 	qemu/qemu_interop_config.c \
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index a7955e8062..32100b140e 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -23,8 +23,6 @@
 #include <config.h>
 
 #include "qemu_hotplug.h"
-#define LIBVIRT_QEMU_HOTPLUGPRIV_H_ALLOW
-#include "qemu_hotplugpriv.h"
 #include "qemu_alias.h"
 #include "qemu_capabilities.h"
 #include "qemu_domain.h"
@@ -70,9 +68,6 @@ VIR_LOG_INIT("qemu.qemu_hotplug");
 #define QEMU_UNPLUG_TIMEOUT 1000ull * 5
 #define QEMU_UNPLUG_TIMEOUT_PPC64 1000ull * 10
 
-/* Wait up to 5 seconds for device removal to finish. */
-unsigned long long qemuDomainRemoveDeviceWaitTime = 1000ull * 5;
-
 static void
 qemuDomainResetDeviceRemoval(virDomainObjPtr vm);
 
diff --git a/src/qemu/qemu_hotplugpriv.h b/src/qemu/qemu_hotplugpriv.h
deleted file mode 100644
index a5c443ba85..0000000000
--- a/src/qemu/qemu_hotplugpriv.h
+++ /dev/null
@@ -1,32 +0,0 @@
-/*
- * qemu_hotplugpriv.h: private declarations for QEMU device hotplug management
- *
- * Copyright (C) 2013 Red Hat, Inc.
- *
- * This library is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation; either
- * version 2.1 of the License, or (at your option) any later version.
- *
- * This library is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with this library.  If not, see
- * <http://www.gnu.org/licenses/>.
- *
- */
-
-#ifndef LIBVIRT_QEMU_HOTPLUGPRIV_H_ALLOW
-# error "qemu_hotplugpriv.h may only be included by qemu_hotplug.c or test suites"
-#endif /* LIBVIRT_QEMU_HOTPLUGPRIV_H_ALLOW */
-
-#pragma once
-
-/*
- * This header file should never be used outside unit tests.
- */
-
-extern unsigned long long qemuDomainRemoveDeviceWaitTime;
diff --git a/tests/qemuhotplugtest.c b/tests/qemuhotplugtest.c
index 5f2fc6a598..140ab902ce 100644
--- a/tests/qemuhotplugtest.c
+++ b/tests/qemuhotplugtest.c
@@ -22,8 +22,6 @@
 #include "qemu/qemu_alias.h"
 #include "qemu/qemu_conf.h"
 #include "qemu/qemu_hotplug.h"
-#define LIBVIRT_QEMU_HOTPLUGPRIV_H_ALLOW
-#include "qemu/qemu_hotplugpriv.h"
 #include "qemumonitortestutils.h"
 #include "testutils.h"
 #include "testutilsqemu.h"
@@ -643,8 +641,6 @@ mymain(void)
 
     driver.hostdevMgr = virHostdevManagerGetDefault();
 
-    /* wait only 100ms for DEVICE_DELETED event */
-    qemuDomainRemoveDeviceWaitTime = 100;
 
 #define DO_TEST(file, ACTION, dev, fial, kep, ...) \
     do { \
--
2.21.0

The current 'setvcpus' timeout message requires a deeper
understanding of QEMU/Libvirt internals to properly react to it.
Someone who knows how setvcpus unplug works (it is an asynchronous
operation between QEMU and the guest, and Libvirt can't know for
sure whether it failed unless an explicit error happened during the
timeout period) will read the message and not assume a failed
operation. But the regular user, more often than not, will read it
and believe that the unplug operation failed.

This leads to situations where the user isn't exactly relieved
when accessing the guest and seeing that the unplug operation
worked. Instead, the user feels misled by the timeout message
setvcpus threw.

Changing the timeout message to let the user know that the
unplug status is not known, and that manual inspection in the guest
is required, is not a silver bullet. But it gives a more realistic
expectation of what happened, as best as we can tell from the
Libvirt side anyway.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
---
 src/qemu/qemu_hotplug.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index 32100b140e..72015e02e2 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -6007,8 +6007,9 @@ qemuDomainHotplugDelVcpu(virQEMUDriverPtr driver,
 
     if ((rc = qemuDomainWaitForDeviceRemoval(vm)) <= 0) {
         if (rc == 0)
-            virReportError(VIR_ERR_OPERATION_FAILED, "%s",
-                           _("vcpu unplug request timed out"));
+            virReportError(VIR_ERR_OPERATION_TIMEOUT, "%s",
+                           _("vcpu unplug request timed out. Unplug result "
+                             "must be manually inspected in the domain"));
         goto cleanup;
     }
 
--
2.21.0
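
The three-way outcome that this message depends on can be sketched in
isolation. This is an assumption-laden illustration and not libvirt
code: the function name is hypothetical, the reading of a positive
return value as "removal confirmed" is inferred from the `<= 0` check
in the diff, and the failure string is invented for the sketch. Only
the timeout wording is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map the return value of a wait-for-removal call to a user-facing
 * message. Assumed convention (inferred from the diff):
 *   -1  unplug failed
 *    0  timed out, outcome unknown
 *    1  removal confirmed
 */
const char *
sketchUnplugMessage(int rc)
{
    /* The sketch only models the three outcomes listed above. */
    assert(rc >= -1 && rc <= 1);

    if (rc < 0)
        return "vcpu unplug failed";   /* invented placeholder text */

    if (rc == 0)
        return "vcpu unplug request timed out. Unplug result "
               "must be manually inspected in the domain";

    return NULL;   /* nothing to report */
}
```

The point of the separate rc == 0 branch is that a timeout says
nothing about whether the guest completed the unplug, so the message
asks the user to check instead of implying a failure.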

On 10/18/19 2:36 PM, Daniel Henrique Barboza wrote:
> changes in v3:
> - redesigned patch 1 based on Cole Robinson feedback
>
> v2: https://www.redhat.com/archives/libvir-list/2019-September/msg00447.html
> v1: https://www.redhat.com/archives/libvir-list/2019-August/msg00698.html
>
> Daniel Henrique Barboza (3):
>   qemu_hotplug.c: adding qemuDomainGetUnplugTimeout
>   qemu: Remove qemu_hotplugpriv.h and qemuDomainRemoveDeviceWaitTime
>   qemu_hotplug.c: user-friendlier setvcpus timeout error message
>
>  src/qemu/Makefile.inc.am                      |  1 -
>  src/qemu/qemu_hotplug.c                       | 28 ++++++++++++++-----
>  src/qemu/qemu_hotplug.h                       |  2 ++
>  tests/Makefile.am                             | 13 ++++++++-
>  .../qemuhotplugmock.c                         | 27 +++++++++---------
>  tests/qemuhotplugtest.c                       |  7 ++---
>  6 files changed, 51 insertions(+), 27 deletions(-)
>  rename src/qemu/qemu_hotplugpriv.h => tests/qemuhotplugmock.c (61%)

Reviewed and pushed. Sorry for the delay.

One comment though: patch #1 was a mix of refactoring and the PPC
timeout change. In the future, please separate changes like that into
two patches.

Thanks,
Cole