[libvirt] [PATCH v4 00/23] Introduce metadata locking

Technically, this is v4 of: https://www.redhat.com/archives/libvir-list/2018-August/msg01627.html However, this is implementing different approach than any of the previous versions. One of the problems with previous version was that it was too complicated. The main reason for that was that we could not close the connection whilst there was a file locked. So we had to invent a mechanism that would prevent that (on the client side). These patches implement different approach. They rely on secdriver's transactions which bring all the paths we want to label into one place so that they can be relabelled within different namespace. I'm extending this idea so that transactions run all the time (regardless of domain namespacing) and only at the very last moment is decided which namespace would the relabeling run in. Metadata locking is then as easy as putting lock/unlock calls around one function. You can find the patches at my github too: https://github.com/zippy2/libvirt/tree/disk_metadata_lock_v4_alt Michal Prívozník (23): qemu_security: Fully implement qemuSecurityDomainSetPathLabel qemu_security: Fully implement qemuSecurity{Set,Restore}SavedStateLabel qemu_security: Require full wrappers for APIs that might touch a file virSecurityManagerTransactionCommit: Accept pid == -1 qemu_security: Run transactions more frequently virlockspace: Allow caller to specify start and length offset in virLockSpaceAcquireResource lock_driver_lockd: Introduce VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_METADATA flag lock_driver: Introduce new VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON _virLockManagerLockDaemonPrivate: Move @hasRWDisks into dom union lock_driver: Introduce VIR_LOCK_MANAGER_RESOURCE_TYPE_METADATA lock_driver: Introduce VIR_LOCK_MANAGER_ACQUIRE_ROLLBACK lock_daemon_dispatch: Check for ownerPid rather than ownerId lock_manager: Allow disabling configFile for virLockManagerPluginNew qemu_conf: Introduce metadata_lock_manager security_manager: Load lock plugin on init security_manager: Introduce metadata locking APIs security_dac: Move transaction handling up one level security_dac: Fix info messages when chown()-ing security_dac: Lock metadata when running transaction virSecuritySELinuxRestoreFileLabel: Rename 'err' label virSecuritySELinuxRestoreFileLabel: Adjust code pattern security_selinux: Move transaction handling up one level security_dac: Lock metadata when running transaction cfg.mk | 4 +- src/locking/lock_daemon_dispatch.c | 25 ++- src/locking/lock_driver.h | 12 ++ src/locking/lock_driver_lockd.c | 417 +++++++++++++++++++++++++------------ src/locking/lock_driver_lockd.h | 1 + src/locking/lock_driver_sanlock.c | 44 ++-- src/locking/lock_manager.c | 10 +- src/lxc/lxc_controller.c | 3 +- src/lxc/lxc_driver.c | 2 +- src/qemu/libvirtd_qemu.aug | 1 + src/qemu/qemu.conf | 8 + src/qemu/qemu_conf.c | 13 ++ src/qemu/qemu_conf.h | 1 + src/qemu/qemu_domain.c | 3 +- src/qemu/qemu_driver.c | 10 +- src/qemu/qemu_process.c | 15 +- src/qemu/qemu_security.c | 272 +++++++++++++++++------- src/qemu/qemu_security.h | 18 +- src/qemu/test_libvirtd_qemu.aug.in | 1 + src/security/security_dac.c | 134 ++++++++---- src/security/security_manager.c | 171 ++++++++++++++- src/security/security_manager.h | 9 + src/security/security_selinux.c | 118 ++++++++--- src/util/virlockspace.c | 15 +- src/util/virlockspace.h | 4 + tests/seclabeltest.c | 2 +- tests/securityselinuxlabeltest.c | 2 +- tests/securityselinuxtest.c | 2 +- tests/testutilsqemu.c | 2 +- tests/virlockspacetest.c | 29 ++- 30 files changed, 1006 insertions(+), 342 deletions(-) -- 2.16.4

Even though the current use of the function does not require full implementation with transactions (none of the callers pass a path somewhere under /dev), it doesn't hurt either. Moreover, in future patches the paradigm is going to shift so that any API that touches a file is required to use transactions. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 3 +-- src/qemu/qemu_process.c | 15 ++++++--------- src/qemu/qemu_security.c | 30 ++++++++++++++++++++++++++++++ src/qemu/qemu_security.h | 6 +++++- 4 files changed, 42 insertions(+), 12 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 5329899b13..6425c886a3 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -808,8 +808,7 @@ qemuDomainWriteMasterKeyFile(virQEMUDriverPtr driver, goto cleanup; } - if (qemuSecurityDomainSetPathLabel(driver->securityManager, - vm->def, path, false) < 0) + if (qemuSecurityDomainSetPathLabel(driver, vm, path, false) < 0) goto cleanup; ret = 0; diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index eb9904b7ba..3820e04f91 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2790,8 +2790,7 @@ qemuProcessStartManagedPRDaemon(virDomainObjPtr vm) virCgroupAddMachineTask(priv->cgroup, cpid) < 0) goto cleanup; - if (qemuSecurityDomainSetPathLabel(driver->securityManager, - vm->def, socketPath, true) < 0) + if (qemuSecurityDomainSetPathLabel(driver, vm, socketPath, true) < 0) goto cleanup; priv->prDaemonRunning = true; @@ -3653,7 +3652,7 @@ qemuProcessNeedMemoryBackingPath(virDomainDefPtr def, static int qemuProcessBuildDestroyMemoryPathsImpl(virQEMUDriverPtr driver, - virDomainDefPtr def, + virDomainObjPtr vm, const char *path, bool build) { @@ -3668,8 +3667,7 @@ qemuProcessBuildDestroyMemoryPathsImpl(virQEMUDriverPtr driver, return -1; } - if (qemuSecurityDomainSetPathLabel(driver->securityManager, - def, path, true) < 0) + if (qemuSecurityDomainSetPathLabel(driver, vm, path, true) < 0) return -1; } else { if (virFileDeleteTree(path) < 0) @@ -3705,7 +3703,7 @@ qemuProcessBuildDestroyMemoryPaths(virQEMUDriverPtr driver, if (!path) goto cleanup; - if (qemuProcessBuildDestroyMemoryPathsImpl(driver, vm->def, + if (qemuProcessBuildDestroyMemoryPathsImpl(driver, vm, path, build) < 0) goto cleanup; @@ -3717,7 +3715,7 @@ qemuProcessBuildDestroyMemoryPaths(virQEMUDriverPtr driver, if (qemuGetMemoryBackingDomainPath(vm->def, cfg, &path) < 0) goto cleanup; - if (qemuProcessBuildDestroyMemoryPathsImpl(driver, vm->def, + if (qemuProcessBuildDestroyMemoryPathsImpl(driver, vm, path, build) < 0) goto cleanup; @@ -4909,8 +4907,7 @@ qemuProcessMakeDir(virQEMUDriverPtr driver, goto cleanup; } - if (qemuSecurityDomainSetPathLabel(driver->securityManager, - vm->def, path, true) < 0) + if (qemuSecurityDomainSetPathLabel(driver, vm, path, true) < 0) goto cleanup; ret = 0; diff --git a/src/qemu/qemu_security.c b/src/qemu/qemu_security.c index af3be42854..268def309a 100644 --- a/src/qemu/qemu_security.c +++ b/src/qemu/qemu_security.c @@ -493,3 +493,33 @@ qemuSecurityCleanupTPMEmulator(virQEMUDriverPtr driver, { virSecurityManagerRestoreTPMLabels(driver->securityManager, def); } + + +int +qemuSecurityDomainSetPathLabel(virQEMUDriverPtr driver, + virDomainObjPtr vm, + const char *path, + bool allowSubtree) +{ + int ret = -1; + + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionStart(driver->securityManager) < 0) + goto cleanup; + + if (virSecurityManagerDomainSetPathLabel(driver->securityManager, + vm->def, + path, + allowSubtree) < 0) + goto cleanup; + + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionCommit(driver->securityManager, + vm->pid) < 0) + goto cleanup; + + ret = 0; + cleanup: + virSecurityManagerTransactionAbort(driver->securityManager); + return ret; +} diff --git a/src/qemu/qemu_security.h b/src/qemu/qemu_security.h index a189b63828..fd11fbdd9d 100644 --- a/src/qemu/qemu_security.h +++ b/src/qemu/qemu_security.h @@ -95,12 +95,16 @@ int qemuSecurityStartTPMEmulator(virQEMUDriverPtr driver, void qemuSecurityCleanupTPMEmulator(virQEMUDriverPtr driver, virDomainDefPtr def); +int qemuSecurityDomainSetPathLabel(virQEMUDriverPtr driver, + virDomainObjPtr vm, + const char *path, + bool allowSubtree); + /* Please note that for these APIs there is no wrapper yet. Do NOT blindly add * new APIs here. If an API can touch a /dev file add a proper wrapper instead. */ # define qemuSecurityCheckAllLabel virSecurityManagerCheckAllLabel # define qemuSecurityClearSocketLabel virSecurityManagerClearSocketLabel -# define qemuSecurityDomainSetPathLabel virSecurityManagerDomainSetPathLabel # define qemuSecurityGenLabel virSecurityManagerGenLabel # define qemuSecurityGetBaseLabel virSecurityManagerGetBaseLabel # define qemuSecurityGetDOI virSecurityManagerGetDOI -- 2.16.4

On 09/10/2018 05:36 AM, Michal Privoznik wrote:
Even though the current use of the function does not require full implementation with transactions (none of the callers pass a path somewhere under /dev), it doesn't hurt either. Moreover, in future patches the paradigm is going to shift so that any API that touches a file is required to use transactions.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 3 +-- src/qemu/qemu_process.c | 15 ++++++--------- src/qemu/qemu_security.c | 30 ++++++++++++++++++++++++++++++ src/qemu/qemu_security.h | 6 +++++- 4 files changed, 42 insertions(+), 12 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com> John

Even though the current use of the functions does not require full implementation with transactions (none of the callers passes a path somewhere under /dev), it doesn't hurt either. Moreover, in future patches the paradigm is going to shift so that any API that touches a file is required to use transactions. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_driver.c | 7 +++--- src/qemu/qemu_security.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_security.h | 10 +++++++-- 3 files changed, 67 insertions(+), 6 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 2f8d6915e1..6763c8cddc 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -4043,7 +4043,7 @@ qemuDomainScreenshot(virDomainPtr dom, } unlink_tmp = true; - qemuSecuritySetSavedStateLabel(driver->securityManager, vm->def, tmp); + qemuSecuritySetSavedStateLabel(driver, vm, tmp); qemuDomainObjEnterMonitor(driver, vm); if (qemuMonitorScreendump(priv->mon, videoAlias, screen, tmp) < 0) { @@ -6662,8 +6662,7 @@ qemuDomainSaveImageStartVM(virConnectPtr conn, virObjectUnref(cookie); virCommandFree(cmd); VIR_FREE(errbuf); - if (qemuSecurityRestoreSavedStateLabel(driver->securityManager, - vm->def, path) < 0) + if (qemuSecurityRestoreSavedStateLabel(driver, vm, path) < 0) VIR_WARN("failed to restore save state label on %s", path); virObjectUnref(cfg); return ret; @@ -11828,7 +11827,7 @@ qemuDomainMemoryPeek(virDomainPtr dom, goto endjob; } - qemuSecuritySetSavedStateLabel(driver->securityManager, vm->def, tmp); + qemuSecuritySetSavedStateLabel(driver, vm, tmp); priv = vm->privateData; qemuDomainObjEnterMonitor(driver, vm); diff --git a/src/qemu/qemu_security.c b/src/qemu/qemu_security.c index 268def309a..c64fbdda38 100644 --- a/src/qemu/qemu_security.c +++ b/src/qemu/qemu_security.c @@ -523,3 +523,59 @@ qemuSecurityDomainSetPathLabel(virQEMUDriverPtr driver, virSecurityManagerTransactionAbort(driver->securityManager); return ret; } + + +int +qemuSecuritySetSavedStateLabel(virQEMUDriverPtr driver, + virDomainObjPtr vm, + const char *savefile) +{ + int ret = -1; + + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionStart(driver->securityManager) < 0) + goto cleanup; + + if (virSecurityManagerSetSavedStateLabel(driver->securityManager, + vm->def, + savefile) < 0) + goto cleanup; + + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionCommit(driver->securityManager, + vm->pid) < 0) + goto cleanup; + + ret = 0; + cleanup: + virSecurityManagerTransactionAbort(driver->securityManager); + return ret; +} + + +int +qemuSecurityRestoreSavedStateLabel(virQEMUDriverPtr driver, + virDomainObjPtr vm, + const char *savefile) +{ + int ret = -1; + + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionStart(driver->securityManager) < 0) + goto cleanup; + + if (virSecurityManagerRestoreSavedStateLabel(driver->securityManager, + vm->def, + savefile) < 0) + goto cleanup; + + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionCommit(driver->securityManager, + vm->pid) < 0) + goto cleanup; + + ret = 0; + cleanup: + virSecurityManagerTransactionAbort(driver->securityManager); + return ret; +} diff --git a/src/qemu/qemu_security.h b/src/qemu/qemu_security.h index fd11fbdd9d..c57774deba 100644 --- a/src/qemu/qemu_security.h +++ b/src/qemu/qemu_security.h @@ -100,6 +100,14 @@ int qemuSecurityDomainSetPathLabel(virQEMUDriverPtr driver, const char *path, bool allowSubtree); +int qemuSecuritySetSavedStateLabel(virQEMUDriverPtr driver, + virDomainObjPtr vm, + const char *savefile); + +int qemuSecurityRestoreSavedStateLabel(virQEMUDriverPtr driver, + virDomainObjPtr vm, + const char *savefile); + /* Please note that for these APIs there is no wrapper yet. Do NOT blindly add * new APIs here. If an API can touch a /dev file add a proper wrapper instead. */ @@ -119,11 +127,9 @@ int qemuSecurityDomainSetPathLabel(virQEMUDriverPtr driver, # define qemuSecurityPreFork virSecurityManagerPreFork # define qemuSecurityReleaseLabel virSecurityManagerReleaseLabel # define qemuSecurityReserveLabel virSecurityManagerReserveLabel -# define qemuSecurityRestoreSavedStateLabel virSecurityManagerRestoreSavedStateLabel # define qemuSecuritySetChildProcessLabel virSecurityManagerSetChildProcessLabel # define qemuSecuritySetDaemonSocketLabel virSecurityManagerSetDaemonSocketLabel # define qemuSecuritySetImageFDLabel virSecurityManagerSetImageFDLabel -# define qemuSecuritySetSavedStateLabel virSecurityManagerSetSavedStateLabel # define qemuSecuritySetSocketLabel virSecurityManagerSetSocketLabel # define qemuSecuritySetTapFDLabel virSecurityManagerSetTapFDLabel # define qemuSecurityStackAddNested virSecurityManagerStackAddNested -- 2.16.4

On 09/10/2018 05:36 AM, Michal Privoznik wrote:
Even though the current use of the functions does not require full implementation with transactions (none of the callers passes a path somewhere under /dev), it doesn't hurt either. Moreover, in future patches the paradigm is going to shift so that any API that touches a file is required to use transactions.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_driver.c | 7 +++--- src/qemu/qemu_security.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_security.h | 10 +++++++-- 3 files changed, 67 insertions(+), 6 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com> John

In the future, the transactions are not going to be optional and they will be run regardless of domain using namespace to collect list of paths to be relabeled. To make sure there won't be an API that goes behind transaction code back update the comment that serves as decision manual whether an API must be fully implemented or plain #define is sufficient. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_security.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/qemu/qemu_security.h b/src/qemu/qemu_security.h index c57774deba..ba12eb3caf 100644 --- a/src/qemu/qemu_security.h +++ b/src/qemu/qemu_security.h @@ -109,7 +109,7 @@ int qemuSecurityRestoreSavedStateLabel(virQEMUDriverPtr driver, const char *savefile); /* Please note that for these APIs there is no wrapper yet. Do NOT blindly add - * new APIs here. If an API can touch a /dev file add a proper wrapper instead. + * new APIs here. If an API can touch a file add a proper wrapper instead. */ # define qemuSecurityCheckAllLabel virSecurityManagerCheckAllLabel # define qemuSecurityClearSocketLabel virSecurityManagerClearSocketLabel -- 2.16.4

On 09/10/2018 05:36 AM, Michal Privoznik wrote:
In the future, the transactions are not going to be optional and they will be run regardless of domain using namespace to collect list of paths to be relabeled.
To make sure there won't be an API that goes behind transaction code back update the comment that serves as decision manual whether an API must be fully implemented or plain #define is sufficient.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_security.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
Reviewed-by: John Ferlan <jferlan@redhat.com> John

It will be desirable to run transactions more often than we currently do. Even if the domain we're relabeling the paths for does not run in a namespace. If that's the case, there is no need to fork() as we are already running in the right namespace. To differentiate whether transaction code should fork() or not the @pid argument now accepts -1 (which means do not fork). Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_dac.c | 22 ++++++++++++++-------- src/security/security_manager.c | 14 +++++++++----- src/security/security_selinux.c | 23 +++++++++++++++-------- 3 files changed, 38 insertions(+), 21 deletions(-) diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 2a5f8639fe..926c9a33c1 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -485,11 +485,14 @@ virSecurityDACTransactionStart(virSecurityManagerPtr mgr) * @mgr: security manager * @pid: domain's PID * - * Enters the @pid namespace (usually @pid refers to a domain) and - * performs all the chown()-s on the list. Note that the transaction is - * also freed, therefore new one has to be started after successful - * return from this function. Also it is considered as error if there's - * no transaction set and this function is called. + * If @pid is not -1 then enter the @pid namespace (usually @pid refers + * to a domain) and perform all the chown()-s on the list. If @pid is -1 + * then the transaction is performed in the namespace of the caller. + * + * Note that the transaction is also freed, therefore new one has to be + * started after successful return from this function. Also it is + * considered as error if there's no transaction set and this function + * is called. * * Returns: 0 on success, * -1 otherwise. @@ -514,9 +517,12 @@ virSecurityDACTransactionCommit(virSecurityManagerPtr mgr ATTRIBUTE_UNUSED, goto cleanup; } - if (virProcessRunInMountNamespace(pid, - virSecurityDACTransactionRun, - list) < 0) + if ((pid == -1 && + virSecurityDACTransactionRun(pid, list) < 0) || + (pid != -1 && + virProcessRunInMountNamespace(pid, + virSecurityDACTransactionRun, + list) < 0)) goto cleanup; ret = 0; diff --git a/src/security/security_manager.c b/src/security/security_manager.c index 21eb6f7452..9f770d8c53 100644 --- a/src/security/security_manager.c +++ b/src/security/security_manager.c @@ -267,11 +267,15 @@ virSecurityManagerTransactionStart(virSecurityManagerPtr mgr) * @mgr: security manager * @pid: domain's PID * - * Enters the @pid namespace (usually @pid refers to a domain) and - * performs all the operations on the transaction list. Note that the - * transaction is also freed, therefore new one has to be started after - * successful return from this function. Also it is considered as error - * if there's no transaction set and this function is called. + * If @pid is not -1 then enter the @pid namespace (usually @pid refers + * to a domain) and perform all the operations on the transaction list. + * If @pid is -1 then the transaction is performed in the namespace of + * the caller. + * + * Note that the transaction is also freed, therefore new one has to be + * started after successful return from this function. Also it is + * considered as error if there's no transaction set and this function + * is called. * * Returns: 0 on success, * -1 otherwise. diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index 96944d0202..288f3628f7 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -1040,11 +1040,15 @@ virSecuritySELinuxTransactionStart(virSecurityManagerPtr mgr) * @mgr: security manager * @pid: domain's PID * - * Enters the @pid namespace (usually @pid refers to a domain) and - * performs all the sefilecon()-s on the list. Note that the - * transaction is also freed, therefore new one has to be started after - * successful return from this function. Also it is considered as error - * if there's no transaction set and this function is called. + * If @pis is not -1 then enter the @pid namespace (usually @pid refers + * to a domain) and perform all the sefilecon()-s on the list. If @pid + * is -1 then the transaction is performed in the namespace of the + * caller. + * + * Note that the transaction is also freed, therefore new one has to be + * started after successful return from this function. Also it is + * considered as error if there's no transaction set and this function + * is called. * * Returns: 0 on success, * -1 otherwise. @@ -1066,9 +1070,12 @@ virSecuritySELinuxTransactionCommit(virSecurityManagerPtr mgr ATTRIBUTE_UNUSED, goto cleanup; } - if (virProcessRunInMountNamespace(pid, - virSecuritySELinuxTransactionRun, - list) < 0) + if ((pid == -1 && + virSecuritySELinuxTransactionRun(pid, list) < 0) || + (pid != -1 && + virProcessRunInMountNamespace(pid, + virSecuritySELinuxTransactionRun, + list) < 0)) goto cleanup; ret = 0; -- 2.16.4

On 09/10/2018 05:36 AM, Michal Privoznik wrote:
It will be desirable to run transactions more often than we currently do. Even if the domain we're relabeling the paths for does not run in a namespace. If that's the case, there is no need to fork() as we are already running in the right namespace. To differentiate whether transaction code should fork() or not the @pid argument now accepts -1 (which means do not fork).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_dac.c | 22 ++++++++++++++-------- src/security/security_manager.c | 14 +++++++++----- src/security/security_selinux.c | 23 +++++++++++++++-------- 3 files changed, 38 insertions(+), 21 deletions(-)
I do have a dislike for the code format chosen - I prefer: if (pid == -1) else is so much easier for me to read. Reviewed-by: John Ferlan <jferlan@redhat.com> John

And by "more frequently" I mean always. This is needed so that we have a single place where all the paths a thread wants to relabel are stored. This enables us to lock them all at once (for metadata), do the relabel and unlock at once again. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_security.c | 216 ++++++++++++++++++++++++++++------------------- 1 file changed, 129 insertions(+), 87 deletions(-) diff --git a/src/qemu/qemu_security.c b/src/qemu/qemu_security.c index c64fbdda38..b538e08616 100644 --- a/src/qemu/qemu_security.c +++ b/src/qemu/qemu_security.c @@ -39,9 +39,12 @@ qemuSecuritySetAllLabel(virQEMUDriverPtr driver, { int ret = -1; qemuDomainObjPrivatePtr priv = vm->privateData; + pid_t pid = -1; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + pid = vm->pid; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) goto cleanup; if (virSecurityManagerSetAllLabel(driver->securityManager, @@ -50,9 +53,7 @@ qemuSecuritySetAllLabel(virQEMUDriverPtr driver, priv->chardevStdioLogd) < 0) goto cleanup; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) + if (virSecurityManagerTransactionCommit(driver->securityManager, pid) < 0) goto cleanup; ret = 0; @@ -69,16 +70,21 @@ qemuSecurityRestoreAllLabel(virQEMUDriverPtr driver, { qemuDomainObjPrivatePtr priv = vm->privateData; - /* In contrast to qemuSecuritySetAllLabel, do not use - * secdriver transactions here. This function is called from - * qemuProcessStop() which is meant to do cleanup after qemu - * process died. If it did do, the namespace is gone as qemu - * was the only process running there. We would not succeed - * in entering the namespace then. */ + /* In contrast to qemuSecuritySetAllLabel, do not use vm->pid + * here. This function is called from qemuProcessStop() which + * is meant to do cleanup after qemu process died. The + * domain's namespace is gone as qemu was the only process + * running there. We would not succeed in entering the + * namespace then. */ + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) + return; + virSecurityManagerRestoreAllLabel(driver->securityManager, vm->def, migrated, priv->chardevStdioLogd); + + virSecurityManagerTransactionCommit(driver->securityManager, -1); } @@ -87,10 +93,13 @@ qemuSecuritySetDiskLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainDiskDefPtr disk) { + pid_t pid = -1; int ret = -1; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + pid = vm->pid; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) goto cleanup; if (virSecurityManagerSetDiskLabel(driver->securityManager, @@ -98,9 +107,7 @@ qemuSecuritySetDiskLabel(virQEMUDriverPtr driver, disk) < 0) goto cleanup; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) + if (virSecurityManagerTransactionCommit(driver->securityManager, pid) < 0) goto cleanup; ret = 0; @@ -115,10 +122,13 @@ qemuSecurityRestoreDiskLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainDiskDefPtr disk) { + pid_t pid = -1; int ret = -1; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + pid = vm->pid; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) goto cleanup; if (virSecurityManagerRestoreDiskLabel(driver->securityManager, @@ -126,9 +136,7 @@ qemuSecurityRestoreDiskLabel(virQEMUDriverPtr driver, disk) < 0) goto cleanup; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) + if (virSecurityManagerTransactionCommit(driver->securityManager, pid) < 0) goto cleanup; ret = 0; @@ -143,10 +151,13 @@ qemuSecuritySetImageLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, virStorageSourcePtr src) { + pid_t pid = -1; int ret = -1; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + pid = vm->pid; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) goto cleanup; if (virSecurityManagerSetImageLabel(driver->securityManager, @@ -154,9 +165,7 @@ qemuSecuritySetImageLabel(virQEMUDriverPtr driver, src) < 0) goto cleanup; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) + if (virSecurityManagerTransactionCommit(driver->securityManager, pid) < 0) goto cleanup; ret = 0; @@ -171,10 +180,13 @@ qemuSecurityRestoreImageLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, virStorageSourcePtr src) { + pid_t pid = -1; int ret = -1; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + pid = vm->pid; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) goto cleanup; if (virSecurityManagerRestoreImageLabel(driver->securityManager, @@ -182,9 +194,7 @@ qemuSecurityRestoreImageLabel(virQEMUDriverPtr driver, src) < 0) goto cleanup; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) + if (virSecurityManagerTransactionCommit(driver->securityManager, pid) < 0) goto cleanup; ret = 0; @@ -199,10 +209,13 @@ qemuSecuritySetHostdevLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainHostdevDefPtr hostdev) { + pid_t pid = -1; int ret = -1; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + pid = vm->pid; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) goto cleanup; if (virSecurityManagerSetHostdevLabel(driver->securityManager, @@ -211,9 +224,7 @@ qemuSecuritySetHostdevLabel(virQEMUDriverPtr driver, NULL) < 0) goto cleanup; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) + if (virSecurityManagerTransactionCommit(driver->securityManager, pid) < 0) goto cleanup; ret = 0; @@ -228,10 +239,13 @@ qemuSecurityRestoreHostdevLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainHostdevDefPtr hostdev) { + pid_t pid = -1; int ret = -1; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + pid = vm->pid; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) goto cleanup; if (virSecurityManagerRestoreHostdevLabel(driver->securityManager, @@ -240,9 +254,7 @@ qemuSecurityRestoreHostdevLabel(virQEMUDriverPtr driver, NULL) < 0) goto cleanup; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) + if (virSecurityManagerTransactionCommit(driver->securityManager, pid) < 0) goto cleanup; ret = 0; @@ -257,10 +269,13 @@ qemuSecuritySetMemoryLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainMemoryDefPtr mem) { + pid_t pid = -1; int ret = -1; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + pid = vm->pid; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) goto cleanup; if (virSecurityManagerSetMemoryLabel(driver->securityManager, @@ -268,9 +283,7 @@ qemuSecuritySetMemoryLabel(virQEMUDriverPtr driver, mem) < 0) goto cleanup; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) + if (virSecurityManagerTransactionCommit(driver->securityManager, pid) < 0) goto cleanup; ret = 0; @@ -285,10 +298,13 @@ qemuSecurityRestoreMemoryLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainMemoryDefPtr mem) { + pid_t pid = -1; int ret = -1; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + pid = vm->pid; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) goto cleanup; if (virSecurityManagerRestoreMemoryLabel(driver->securityManager, @@ -296,9 +312,7 @@ qemuSecurityRestoreMemoryLabel(virQEMUDriverPtr driver, mem) < 0) goto cleanup; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) + if (virSecurityManagerTransactionCommit(driver->securityManager, pid) < 0) goto cleanup; ret = 0; @@ -314,10 +328,13 @@ qemuSecuritySetInputLabel(virDomainObjPtr vm, { qemuDomainObjPrivatePtr priv = vm->privateData; virQEMUDriverPtr driver = priv->driver; + pid_t pid = -1; int ret = -1; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + pid = vm->pid; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) goto cleanup; if (virSecurityManagerSetInputLabel(driver->securityManager, @@ -325,9 +342,7 @@ qemuSecuritySetInputLabel(virDomainObjPtr vm, input) < 0) goto cleanup; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) + if (virSecurityManagerTransactionCommit(driver->securityManager, pid) < 0) goto cleanup; ret = 0; @@ -343,10 +358,13 @@ qemuSecurityRestoreInputLabel(virDomainObjPtr vm, { qemuDomainObjPrivatePtr priv = vm->privateData; virQEMUDriverPtr driver = priv->driver; + pid_t pid = -1; int ret = -1; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + pid = vm->pid; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) goto cleanup; if (virSecurityManagerRestoreInputLabel(driver->securityManager, @@ -354,9 +372,7 @@ qemuSecurityRestoreInputLabel(virDomainObjPtr vm, input) < 0) goto cleanup; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) + if (virSecurityManagerTransactionCommit(driver->securityManager, pid) < 0) goto cleanup; ret = 0; @@ -373,9 +389,12 @@ qemuSecuritySetChardevLabel(virQEMUDriverPtr driver, { int ret = -1; qemuDomainObjPrivatePtr priv = vm->privateData; + pid_t pid = -1; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + pid = vm->pid; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) goto cleanup; if (virSecurityManagerSetChardevLabel(driver->securityManager, @@ -384,9 +403,7 @@ qemuSecuritySetChardevLabel(virQEMUDriverPtr driver, priv->chardevStdioLogd) < 0) goto cleanup; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) + if (virSecurityManagerTransactionCommit(driver->securityManager, pid) < 0) goto cleanup; ret = 0; @@ -403,9 +420,12 @@ qemuSecurityRestoreChardevLabel(virQEMUDriverPtr driver, { int ret = -1; qemuDomainObjPrivatePtr priv = vm->privateData; + pid_t pid = -1; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + pid = vm->pid; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) goto cleanup; if (virSecurityManagerRestoreChardevLabel(driver->securityManager, @@ -414,9 +434,7 @@ qemuSecurityRestoreChardevLabel(virQEMUDriverPtr driver, priv->chardevStdioLogd) < 0) goto cleanup; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) + if (virSecurityManagerTransactionCommit(driver->securityManager, pid) < 0) goto cleanup; ret = 0; @@ -454,10 +472,21 @@ qemuSecurityStartTPMEmulator(virQEMUDriverPtr driver, int *cmdret) { int ret = -1; + bool transactionStarted = false; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) + return -1; + transactionStarted = true; if (virSecurityManagerSetTPMLabels(driver->securityManager, - def) < 0) + def) < 0) { + virSecurityManagerTransactionAbort(driver->securityManager); + return -1; + } + + if (virSecurityManagerTransactionCommit(driver->securityManager, -1) < 0) goto cleanup; + transactionStarted = false; if (virSecurityManagerSetChildProcessLabel(driver->securityManager, def, cmd) < 0) @@ -481,8 +510,13 @@ qemuSecurityStartTPMEmulator(virQEMUDriverPtr driver, return 0; cleanup: + if (!transactionStarted) + virSecurityManagerTransactionStart(driver->securityManager); + virSecurityManagerRestoreTPMLabels(driver->securityManager, def); + virSecurityManagerTransactionCommit(driver->securityManager, -1); + virSecurityManagerTransactionAbort(driver->securityManager); return ret; } @@ -491,7 +525,12 @@ void qemuSecurityCleanupTPMEmulator(virQEMUDriverPtr driver, virDomainDefPtr def) { + virSecurityManagerTransactionStart(driver->securityManager); + virSecurityManagerRestoreTPMLabels(driver->securityManager, def); + + virSecurityManagerTransactionCommit(driver->securityManager, -1); + virSecurityManagerTransactionAbort(driver->securityManager); } @@ -501,10 +540,13 @@ qemuSecurityDomainSetPathLabel(virQEMUDriverPtr driver, const char *path, bool allowSubtree) { + pid_t pid = -1; int ret = -1; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + pid = vm->pid; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) goto cleanup; if (virSecurityManagerDomainSetPathLabel(driver->securityManager, @@ -513,9 +555,7 @@ qemuSecurityDomainSetPathLabel(virQEMUDriverPtr driver, allowSubtree) < 0) goto cleanup; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) + if (virSecurityManagerTransactionCommit(driver->securityManager, pid) < 0) goto cleanup; ret = 0; @@ -530,10 +570,13 @@ qemuSecuritySetSavedStateLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, const char *savefile) { + pid_t pid = -1; int ret = -1; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + pid = vm->pid; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) goto cleanup; if (virSecurityManagerSetSavedStateLabel(driver->securityManager, @@ -541,9 +584,7 @@ qemuSecuritySetSavedStateLabel(virQEMUDriverPtr driver, savefile) < 0) goto cleanup; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) + if (virSecurityManagerTransactionCommit(driver->securityManager, pid) < 0) goto cleanup; ret = 0; @@ -558,10 +599,13 @@ qemuSecurityRestoreSavedStateLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, const char *savefile) { + pid_t pid = -1; int ret = -1; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + pid = vm->pid; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) goto cleanup; if (virSecurityManagerRestoreSavedStateLabel(driver->securityManager, @@ -569,9 +613,7 @@ qemuSecurityRestoreSavedStateLabel(virQEMUDriverPtr driver, savefile) < 0) goto cleanup; - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) + if (virSecurityManagerTransactionCommit(driver->securityManager, pid) < 0) goto cleanup; ret = 0; -- 2.16.4

On 09/10/2018 05:36 AM, Michal Privoznik wrote:
And by "more frequently" I mean always. This is needed so that we have a single place where all the paths a thread wants to relabel are stored. This enables us to lock them all at once (for metadata), do the relabel and unlock at once again.
Perhaps the first sentence or so "differently said": Now that committing transactions using pid == -1 means that we're not fork()-ing to run the transaction in a specific namespace, we can utilize the transaction processing semantics in order to start, run a or multiple commands, and then commit the transaction without being concerned with other interactions or transactions interrupting the processing. This will eventually allow us to have a single place where all the paths... (as you've already written).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_security.c | 216 ++++++++++++++++++++++++++++------------------- 1 file changed, 129 insertions(+), 87 deletions(-)
This patch made Coverity unhappy because some of the void callers to virSecurityManagerTransactionStart or virSecurityManagerTransactionCommit don't check status: qemuSecurityStartTPMEmulator (cleanup: processing) qemuSecurityCleanupTPMEmulator qemuSecurityRestoreAllLabel
diff --git a/src/qemu/qemu_security.c b/src/qemu/qemu_security.c index c64fbdda38..b538e08616 100644 --- a/src/qemu/qemu_security.c +++ b/src/qemu/qemu_security.c @@ -39,9 +39,12 @@ qemuSecuritySetAllLabel(virQEMUDriverPtr driver, { int ret = -1; qemuDomainObjPrivatePtr priv = vm->privateData; + pid_t pid = -1;
- if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + pid = vm->pid; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) goto cleanup;
I know existing, if a transaction fails to start does it need to be aborted... I know it doesn't matter by looking at the code, but that thought could be important later on.
if (virSecurityManagerSetAllLabel(driver->securityManager, @@ -50,9 +53,7 @@ qemuSecuritySetAllLabel(virQEMUDriverPtr driver, priv->chardevStdioLogd) < 0) goto cleanup;
- if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) + if (virSecurityManagerTransactionCommit(driver->securityManager, pid) < 0) goto cleanup;
ret = 0; @@ -69,16 +70,21 @@ qemuSecurityRestoreAllLabel(virQEMUDriverPtr driver, { qemuDomainObjPrivatePtr priv = vm->privateData;
- /* In contrast to qemuSecuritySetAllLabel, do not use - * secdriver transactions here. This function is called from - * qemuProcessStop() which is meant to do cleanup after qemu - * process died. If it did do, the namespace is gone as qemu - * was the only process running there. We would not succeed - * in entering the namespace then. */ + /* In contrast to qemuSecuritySetAllLabel, do not use vm->pid + * here. This function is called from qemuProcessStop() which + * is meant to do cleanup after qemu process died. The + * domain's namespace is gone as qemu was the only process + * running there. We would not succeed in entering the + * namespace then. */ + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) + return; +
Interesting that in this instance you choose to honor the status, but for another void function qemuSecurityCleanupTPMEmulator the choice was to not care. IDC which way this is done, but it should be consistent. I can also see a reason that if a TransactionStart fails, that we at least attempt the following rather than just returning. In that case, I'd say to use the < 0 status as a way to not call TransactionCommit and perhaps throw a VIR_WARN to indicate that transactions weren't being used. Probably also want to save/restore LastError around all this so that a failure in TransactionStart or TransactionCommit doesn't overwrite an error we already have.
virSecurityManagerRestoreAllLabel(driver->securityManager, vm->def, migrated, priv->chardevStdioLogd); + + virSecurityManagerTransactionCommit(driver->securityManager, -1);
If we wrap this in ignore_value, then Coverity will be "happy". Since the only consumer is ProcessStop, at least we don't have the fear that a Start succeeds and a Commit fails that we'll be "stuck" the next time we go to do a Start... I guess to some degree I wonder if the transaction even matters - meaning - I think you can stick with the original code so that Start/Commit failure doesn't mean anything. I also think this function should be separated from the others since its adding transaction semantics where they weren't present previously. That way we can debate need or decide later to revert if necessary...
}
[...]
@@ -454,10 +472,21 @@ qemuSecurityStartTPMEmulator(virQEMUDriverPtr driver, int *cmdret)
I almost want to say this is "different" than any of the other functions which you are converting the existing transaction model. In this instance you are introducing a transaction model. I think this should be a separate patch to make it easier for a later decision to revert just in case that decision needs to be made. The only current consumer is virSecuritySELinuxSetTPMLabels and it already has the multiple operation type semantics, so I'm wondering if transaction semantics are actually necessary.
{ int ret = -1; + bool transactionStarted = false; + + if (virSecurityManagerTransactionStart(driver->securityManager) < 0) + return -1; + transactionStarted = true;
if (virSecurityManagerSetTPMLabels(driver->securityManager, - def) < 0) + def) < 0) { + virSecurityManagerTransactionAbort(driver->securityManager); + return -1; + } + + if (virSecurityManagerTransactionCommit(driver->securityManager, -1) < 0) goto cleanup; + transactionStarted = false;
if (virSecurityManagerSetChildProcessLabel(driver->securityManager, def, cmd) < 0) @@ -481,8 +510,13 @@ qemuSecurityStartTPMEmulator(virQEMUDriverPtr driver, return 0;
cleanup:
We should change this to "error:" because it's not cleanup in normal path...
+ if (!transactionStarted) + virSecurityManagerTransactionStart(driver->securityManager);
I think this should be: if (!transactionStarted) { if (virSecurityManagerTransactionStart(...) == 0) transactionStarted = true; }
+ virSecurityManagerRestoreTPMLabels(driver->securityManager, def);
Making the subsequent be: if (transactionStarted) { ignore_value(virSecurityManagerTransactionCommit(...); virSecurityManagerTransactionAbort(...); } Because if the "second" Start fails, then next ones aren't necessary. I'd also be inclined to say, why bother with a transaction. If something in the Set processing failed, I'd assume we're going to get a similar failure in Restore processing - so let the TPM specific code handle that. We can document the crap out of it^w^w^w^w, err, ah decision made.
+ virSecurityManagerTransactionCommit(driver->securityManager, -1); + virSecurityManagerTransactionAbort(driver->securityManager); return ret; }
@@ -491,7 +525,12 @@ void qemuSecurityCleanupTPMEmulator(virQEMUDriverPtr driver, virDomainDefPtr def) { + virSecurityManagerTransactionStart(driver->securityManager);
So this is different than qemuSecurityRestoreAllLabel... Similar to above - we should know if Start fails, because if it does, then commit and abort won't be happy either... This should be a separate patch with the StartTPMEmulator...
+ virSecurityManagerRestoreTPMLabels(driver->securityManager, def); + + virSecurityManagerTransactionCommit(driver->securityManager, -1); + virSecurityManagerTransactionAbort(driver->securityManager); }
[...] So, let's do this for any function that is not adding transaction semantics, but merely modifying existing semantics, Reviewed-by: John Ferlan <jferlan@redhat.com> But for qemuSecurityRestoreAllLabel, qemuSecurityStartTPMEmulator, and qemuSecurityCleanupTPMEmulator - we need to extract those and consider them separately. Whether you alter the commit message to my suggestion above is your choice. John

So far the virLockSpaceAcquireResource() locks the first byte in the underlying file. But caller might want to lock other range. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> --- src/locking/lock_daemon_dispatch.c | 3 +++ src/util/virlockspace.c | 15 ++++++++++----- src/util/virlockspace.h | 4 ++++ tests/virlockspacetest.c | 29 ++++++++++++++++++++++++----- 4 files changed, 41 insertions(+), 10 deletions(-) diff --git a/src/locking/lock_daemon_dispatch.c b/src/locking/lock_daemon_dispatch.c index 1b479db55d..10248ec0b5 100644 --- a/src/locking/lock_daemon_dispatch.c +++ b/src/locking/lock_daemon_dispatch.c @@ -50,6 +50,8 @@ virLockSpaceProtocolDispatchAcquireResource(virNetServerPtr server ATTRIBUTE_UNU virNetServerClientGetPrivateData(client); virLockSpacePtr lockspace; unsigned int newFlags; + off_t start = 0; + off_t len = 1; virMutexLock(&priv->lock); @@ -84,6 +86,7 @@ virLockSpaceProtocolDispatchAcquireResource(virNetServerPtr server ATTRIBUTE_UNU if (virLockSpaceAcquireResource(lockspace, args->name, priv->ownerPid, + start, len, newFlags) < 0) goto cleanup; diff --git a/src/util/virlockspace.c b/src/util/virlockspace.c index 3364c843aa..60bfef4c5f 100644 --- a/src/util/virlockspace.c +++ b/src/util/virlockspace.c @@ -115,8 +115,10 @@ static void virLockSpaceResourceFree(virLockSpaceResourcePtr res) static virLockSpaceResourcePtr virLockSpaceResourceNew(virLockSpacePtr lockspace, const char *resname, - unsigned int flags, - pid_t owner) + pid_t owner, + off_t start, + off_t len, + unsigned int flags) { virLockSpaceResourcePtr res; bool shared = !!(flags & VIR_LOCK_SPACE_ACQUIRE_SHARED); @@ -157,7 +159,7 @@ virLockSpaceResourceNew(virLockSpacePtr lockspace, goto error; } - if (virFileLock(res->fd, shared, 0, 1, false) < 0) { + if (virFileLock(res->fd, shared, start, len, false) < 0) { if (errno == EACCES || errno == EAGAIN) { virReportError(VIR_ERR_RESOURCE_BUSY, _("Lockspace resource '%s' is locked"), @@ -204,7 +206,7 @@ virLockSpaceResourceNew(virLockSpacePtr lockspace, goto error; } - if (virFileLock(res->fd, shared, 0, 1, false) < 0) { + if (virFileLock(res->fd, shared, start, len, false) < 0) { if (errno == EACCES || errno == EAGAIN) { virReportError(VIR_ERR_RESOURCE_BUSY, _("Lockspace resource '%s' is locked"), @@ -612,6 +614,8 @@ int virLockSpaceDeleteResource(virLockSpacePtr lockspace, int virLockSpaceAcquireResource(virLockSpacePtr lockspace, const char *resname, pid_t owner, + off_t start, + off_t len, unsigned int flags) { int ret = -1; @@ -641,7 +645,8 @@ int virLockSpaceAcquireResource(virLockSpacePtr lockspace, goto cleanup; } - if (!(res = virLockSpaceResourceNew(lockspace, resname, flags, owner))) + if (!(res = virLockSpaceResourceNew(lockspace, resname, + owner, start, len, flags))) goto cleanup; if (virHashAddEntry(lockspace->resources, resname, res) < 0) { diff --git a/src/util/virlockspace.h b/src/util/virlockspace.h index 041cf20396..24f2c89be6 100644 --- a/src/util/virlockspace.h +++ b/src/util/virlockspace.h @@ -22,6 +22,8 @@ #ifndef __VIR_LOCK_SPACE_H__ # define __VIR_LOCK_SPACE_H__ +# include <sys/types.h> + # include "internal.h" # include "virjson.h" @@ -50,6 +52,8 @@ typedef enum { int virLockSpaceAcquireResource(virLockSpacePtr lockspace, const char *resname, pid_t owner, + off_t start, + off_t len, unsigned int flags); int virLockSpaceReleaseResource(virLockSpacePtr lockspace, diff --git a/tests/virlockspacetest.c b/tests/virlockspacetest.c index 75ad98a02c..2409809353 100644 --- a/tests/virlockspacetest.c +++ b/tests/virlockspacetest.c @@ -99,6 +99,8 @@ static int testLockSpaceResourceLockExcl(const void *args ATTRIBUTE_UNUSED) { virLockSpacePtr lockspace; int ret = -1; + const off_t start = 0; + const off_t len = 1; rmdir(LOCKSPACE_DIR); @@ -111,13 +113,13 @@ static int testLockSpaceResourceLockExcl(const void *args ATTRIBUTE_UNUSED) if (virLockSpaceCreateResource(lockspace, "foo") < 0) goto cleanup; - if (virLockSpaceAcquireResource(lockspace, "foo", geteuid(), 0) < 0) + if (virLockSpaceAcquireResource(lockspace, "foo", geteuid(), start, len, 0) < 0) goto cleanup; if (!virFileExists(LOCKSPACE_DIR "/foo")) goto cleanup; - if (virLockSpaceAcquireResource(lockspace, "foo", geteuid(), 0) == 0) + if (virLockSpaceAcquireResource(lockspace, "foo", geteuid(), start, len, 0) == 0) goto cleanup; if (virLockSpaceDeleteResource(lockspace, "foo") == 0) @@ -145,6 +147,8 @@ static int testLockSpaceResourceLockExclAuto(const void *args ATTRIBUTE_UNUSED) { virLockSpacePtr lockspace; int ret = -1; + const off_t start = 0; + const off_t len = 1; rmdir(LOCKSPACE_DIR); @@ -158,6 +162,7 @@ static int testLockSpaceResourceLockExclAuto(const void *args ATTRIBUTE_UNUSED) goto cleanup; if (virLockSpaceAcquireResource(lockspace, "foo", geteuid(), + start, len, VIR_LOCK_SPACE_ACQUIRE_AUTOCREATE) < 0) goto cleanup; @@ -183,6 +188,8 @@ static int testLockSpaceResourceLockShr(const void *args ATTRIBUTE_UNUSED) { virLockSpacePtr lockspace; int ret = -1; + const off_t start = 0; + const off_t len = 1; rmdir(LOCKSPACE_DIR); @@ -196,13 +203,16 @@ static int testLockSpaceResourceLockShr(const void *args ATTRIBUTE_UNUSED) goto cleanup; if (virLockSpaceAcquireResource(lockspace, "foo", geteuid(), + start, len, VIR_LOCK_SPACE_ACQUIRE_SHARED) < 0) goto cleanup; - if (virLockSpaceAcquireResource(lockspace, "foo", geteuid(), 0) == 0) + if (virLockSpaceAcquireResource(lockspace, "foo", + geteuid(), start, len, 0) == 0) goto cleanup; if (virLockSpaceAcquireResource(lockspace, "foo", geteuid(), + start, len, VIR_LOCK_SPACE_ACQUIRE_SHARED) < 0) goto cleanup; @@ -237,6 +247,8 @@ static int testLockSpaceResourceLockShrAuto(const void *args ATTRIBUTE_UNUSED) { virLockSpacePtr lockspace; int ret = -1; + const off_t start = 0; + const off_t len = 1; rmdir(LOCKSPACE_DIR); @@ -250,6 +262,7 @@ static int testLockSpaceResourceLockShrAuto(const void *args ATTRIBUTE_UNUSED) goto cleanup; if (virLockSpaceAcquireResource(lockspace, "foo", geteuid(), + start, len, VIR_LOCK_SPACE_ACQUIRE_SHARED | VIR_LOCK_SPACE_ACQUIRE_AUTOCREATE) < 0) goto cleanup; @@ -258,6 +271,7 @@ static int testLockSpaceResourceLockShrAuto(const void *args ATTRIBUTE_UNUSED) goto cleanup; if (virLockSpaceAcquireResource(lockspace, "foo", geteuid(), + start, len, VIR_LOCK_SPACE_ACQUIRE_AUTOCREATE) == 0) goto cleanup; @@ -265,6 +279,7 @@ static int testLockSpaceResourceLockShrAuto(const void *args ATTRIBUTE_UNUSED) goto cleanup; if (virLockSpaceAcquireResource(lockspace, "foo", geteuid(), + start, len, VIR_LOCK_SPACE_ACQUIRE_SHARED | VIR_LOCK_SPACE_ACQUIRE_AUTOCREATE) < 0) goto cleanup; @@ -297,6 +312,8 @@ static int testLockSpaceResourceLockPath(const void *args ATTRIBUTE_UNUSED) { virLockSpacePtr lockspace; int ret = -1; + const off_t start = 0; + const off_t len = 1; rmdir(LOCKSPACE_DIR); @@ -309,13 +326,15 @@ static int testLockSpaceResourceLockPath(const void *args ATTRIBUTE_UNUSED) if (virLockSpaceCreateResource(lockspace, LOCKSPACE_DIR "/foo") < 0) goto cleanup; - if (virLockSpaceAcquireResource(lockspace, LOCKSPACE_DIR "/foo", geteuid(), 0) < 0) + if (virLockSpaceAcquireResource(lockspace, LOCKSPACE_DIR "/foo", + geteuid(), start, len, 0) < 0) goto cleanup; if (!virFileExists(LOCKSPACE_DIR "/foo")) goto cleanup; - if (virLockSpaceAcquireResource(lockspace, LOCKSPACE_DIR "/foo", geteuid(), 0) == 0) + if (virLockSpaceAcquireResource(lockspace, LOCKSPACE_DIR "/foo", + geteuid(), start, len, 0) == 0) goto cleanup; if (virLockSpaceDeleteResource(lockspace, LOCKSPACE_DIR "/foo") == 0) -- 2.16.4

This flag causes virtlockd to use different offset when locking the file. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com> --- src/locking/lock_daemon_dispatch.c | 10 ++++++++-- src/locking/lock_driver_lockd.c | 3 ++- src/locking/lock_driver_lockd.h | 1 + 3 files changed, 11 insertions(+), 3 deletions(-) diff --git a/src/locking/lock_daemon_dispatch.c b/src/locking/lock_daemon_dispatch.c index 10248ec0b5..a683ad3d6b 100644 --- a/src/locking/lock_daemon_dispatch.c +++ b/src/locking/lock_daemon_dispatch.c @@ -37,6 +37,9 @@ VIR_LOG_INIT("locking.lock_daemon_dispatch"); #include "lock_daemon_dispatch_stubs.h" +#define DEFAULT_OFFSET 0 +#define METADATA_OFFSET 1 + static int virLockSpaceProtocolDispatchAcquireResource(virNetServerPtr server ATTRIBUTE_UNUSED, virNetServerClientPtr client, @@ -50,13 +53,14 @@ virLockSpaceProtocolDispatchAcquireResource(virNetServerPtr server ATTRIBUTE_UNU virNetServerClientGetPrivateData(client); virLockSpacePtr lockspace; unsigned int newFlags; - off_t start = 0; + off_t start = DEFAULT_OFFSET; off_t len = 1; virMutexLock(&priv->lock); virCheckFlagsGoto(VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_SHARED | - VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_AUTOCREATE, cleanup); + VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_AUTOCREATE | + VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_METADATA, cleanup); if (priv->restricted) { virReportError(VIR_ERR_OPERATION_DENIED, "%s", @@ -82,6 +86,8 @@ virLockSpaceProtocolDispatchAcquireResource(virNetServerPtr server ATTRIBUTE_UNU newFlags |= VIR_LOCK_SPACE_ACQUIRE_SHARED; if (flags & VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_AUTOCREATE) newFlags |= VIR_LOCK_SPACE_ACQUIRE_AUTOCREATE; + if (flags & VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_METADATA) + start = METADATA_OFFSET; if (virLockSpaceAcquireResource(lockspace, args->name, diff --git a/src/locking/lock_driver_lockd.c b/src/locking/lock_driver_lockd.c index 16fce551c3..ca825e6026 100644 --- a/src/locking/lock_driver_lockd.c +++ b/src/locking/lock_driver_lockd.c @@ -723,7 +723,8 @@ static int virLockManagerLockDaemonRelease(virLockManagerPtr lock, args.flags &= ~(VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_SHARED | - VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_AUTOCREATE); + VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_AUTOCREATE | + VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_METADATA); if (virNetClientProgramCall(program, client, diff --git a/src/locking/lock_driver_lockd.h b/src/locking/lock_driver_lockd.h index 6931fe7425..bebd804365 100644 --- a/src/locking/lock_driver_lockd.h +++ b/src/locking/lock_driver_lockd.h @@ -25,6 +25,7 @@ enum virLockSpaceProtocolAcquireResourceFlags { VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_SHARED = (1 << 0), VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_AUTOCREATE = (1 << 1), + VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_METADATA = (1 << 2), }; #endif /* __VIR_LOCK_DRIVER_LOCKD_H__ */ -- 2.16.4

We will want virtlockd to lock files on behalf of libvirtd and not qemu process, because it is libvirtd that needs an exclusive access not qemu. This requires new lock context. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/locking/lock_driver.h | 2 + src/locking/lock_driver_lockd.c | 291 ++++++++++++++++++++++++-------------- src/locking/lock_driver_sanlock.c | 37 +++-- 3 files changed, 214 insertions(+), 116 deletions(-) diff --git a/src/locking/lock_driver.h b/src/locking/lock_driver.h index 8b7cccc521..a9d2041c30 100644 --- a/src/locking/lock_driver.h +++ b/src/locking/lock_driver.h @@ -42,6 +42,8 @@ typedef enum { typedef enum { /* The managed object is a virtual guest domain */ VIR_LOCK_MANAGER_OBJECT_TYPE_DOMAIN = 0, + /* The managed object is a daemon (e.g. libvirtd) */ + VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON = 1, } virLockManagerObjectType; typedef enum { diff --git a/src/locking/lock_driver_lockd.c b/src/locking/lock_driver_lockd.c index ca825e6026..8580d12340 100644 --- a/src/locking/lock_driver_lockd.c +++ b/src/locking/lock_driver_lockd.c @@ -56,10 +56,21 @@ struct _virLockManagerLockDaemonResource { }; struct _virLockManagerLockDaemonPrivate { - unsigned char uuid[VIR_UUID_BUFLEN]; - char *name; - int id; - pid_t pid; + virLockManagerObjectType type; + union { + struct { + unsigned char uuid[VIR_UUID_BUFLEN]; + char *name; + int id; + pid_t pid; + } dom; + + struct { + unsigned char uuid[VIR_UUID_BUFLEN]; + char *name; + pid_t pid; + } daemon; + } t; size_t nresources; virLockManagerLockDaemonResourcePtr resources; @@ -156,10 +167,24 @@ virLockManagerLockDaemonConnectionRegister(virLockManagerPtr lock, memset(&args, 0, sizeof(args)); args.flags = 0; - memcpy(args.owner.uuid, priv->uuid, VIR_UUID_BUFLEN); - args.owner.name = priv->name; - args.owner.id = priv->id; - args.owner.pid = priv->pid; + + switch (priv->type) { + case VIR_LOCK_MANAGER_OBJECT_TYPE_DOMAIN: + memcpy(args.owner.uuid, priv->t.dom.uuid, VIR_UUID_BUFLEN); + args.owner.name = priv->t.dom.name; + args.owner.id = priv->t.dom.id; + args.owner.pid = priv->t.dom.pid; + break; + + case VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON: + memcpy(args.owner.uuid, priv->t.daemon.uuid, VIR_UUID_BUFLEN); + args.owner.name = priv->t.daemon.name; + args.owner.pid = priv->t.daemon.pid; + break; + + default: + return -1; + } if (virNetClientProgramCall(program, client, @@ -391,7 +416,18 @@ virLockManagerLockDaemonPrivateFree(virLockManagerLockDaemonPrivatePtr priv) } VIR_FREE(priv->resources); - VIR_FREE(priv->name); + switch (priv->type) { + case VIR_LOCK_MANAGER_OBJECT_TYPE_DOMAIN: + VIR_FREE(priv->t.dom.name); + break; + + case VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON: + VIR_FREE(priv->t.daemon.name); + break; + + default: + break; + } VIR_FREE(priv); } @@ -420,46 +456,82 @@ static int virLockManagerLockDaemonNew(virLockManagerPtr lock, if (VIR_ALLOC(priv) < 0) return -1; - switch (type) { + priv->type = type; + + switch ((virLockManagerObjectType) type) { case VIR_LOCK_MANAGER_OBJECT_TYPE_DOMAIN: for (i = 0; i < nparams; i++) { if (STREQ(params[i].key, "uuid")) { - memcpy(priv->uuid, params[i].value.uuid, VIR_UUID_BUFLEN); + memcpy(priv->t.dom.uuid, params[i].value.uuid, VIR_UUID_BUFLEN); } else if (STREQ(params[i].key, "name")) { - if (VIR_STRDUP(priv->name, params[i].value.str) < 0) + if (VIR_STRDUP(priv->t.dom.name, params[i].value.str) < 0) goto cleanup; } else if (STREQ(params[i].key, "id")) { - priv->id = params[i].value.iv; + priv->t.dom.id = params[i].value.iv; } else if (STREQ(params[i].key, "pid")) { - priv->pid = params[i].value.iv; + priv->t.dom.pid = params[i].value.iv; } else if (STREQ(params[i].key, "uri")) { /* ignored */ } else { virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unexpected parameter %s for object"), + _("Unexpected parameter %s for domain object"), params[i].key); goto cleanup; } } - if (priv->id == 0) { + if (priv->t.dom.id == 0) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Missing ID parameter for domain object")); goto cleanup; } - if (priv->pid == 0) + if (priv->t.dom.pid == 0) VIR_DEBUG("Missing PID parameter for domain object"); - if (!priv->name) { + if (!priv->t.dom.name) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Missing name parameter for domain object")); goto cleanup; } - if (!virUUIDIsValid(priv->uuid)) { + if (!virUUIDIsValid(priv->t.dom.uuid)) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Missing UUID parameter for domain object")); goto cleanup; } break; + case VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON: + for (i = 0; i < nparams; i++) { + if (STREQ(params[i].key, "uuid")) { + memcpy(priv->t.daemon.uuid, params[i].value.uuid, VIR_UUID_BUFLEN); + } else if (STREQ(params[i].key, "name")) { + if (VIR_STRDUP(priv->t.daemon.name, params[i].value.str) < 0) + goto cleanup; + } else if (STREQ(params[i].key, "pid")) { + priv->t.daemon.pid = params[i].value.iv; + } else { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unexpected parameter %s for daemon object"), + params[i].key); + goto cleanup; + } + } + + if (!virUUIDIsValid(priv->t.daemon.uuid)) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Missing UUID parameter for daemon object")); + goto cleanup; + } + if (!priv->t.daemon.name) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Missing name parameter for daemon object")); + goto cleanup; + } + if (priv->t.daemon.pid == 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Missing PID parameter for daemon object")); + goto cleanup; + } + break; + default: virReportError(VIR_ERR_INTERNAL_ERROR, _("Unknown lock manager object type %d"), @@ -494,107 +566,119 @@ static int virLockManagerLockDaemonAddResource(virLockManagerPtr lock, if (flags & VIR_LOCK_MANAGER_RESOURCE_READONLY) return 0; - switch (type) { - case VIR_LOCK_MANAGER_RESOURCE_TYPE_DISK: - if (params || nparams) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Unexpected parameters for disk resource")); - goto cleanup; - } - if (!driver->autoDiskLease) { - if (!(flags & (VIR_LOCK_MANAGER_RESOURCE_SHARED | - VIR_LOCK_MANAGER_RESOURCE_READONLY))) - priv->hasRWDisks = true; - return 0; - } + switch (priv->type) { + case VIR_LOCK_MANAGER_OBJECT_TYPE_DOMAIN: - /* XXX we should somehow pass in TYPE=BLOCK info - * from the domain_lock code, instead of assuming /dev - */ - if (STRPREFIX(name, "/dev") && - driver->lvmLockSpaceDir) { - VIR_DEBUG("Trying to find an LVM UUID for %s", name); - if (virStorageFileGetLVMKey(name, &newName) < 0) + switch (type) { + case VIR_LOCK_MANAGER_RESOURCE_TYPE_DISK: + if (params || nparams) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Unexpected parameters for disk resource")); goto cleanup; + } + if (!driver->autoDiskLease) { + if (!(flags & (VIR_LOCK_MANAGER_RESOURCE_SHARED | + VIR_LOCK_MANAGER_RESOURCE_READONLY))) + priv->hasRWDisks = true; + return 0; + } - if (newName) { - VIR_DEBUG("Got an LVM UUID %s for %s", newName, name); - if (VIR_STRDUP(newLockspace, driver->lvmLockSpaceDir) < 0) + /* XXX we should somehow pass in TYPE=BLOCK info + * from the domain_lock code, instead of assuming /dev + */ + if (STRPREFIX(name, "/dev") && + driver->lvmLockSpaceDir) { + VIR_DEBUG("Trying to find an LVM UUID for %s", name); + if (virStorageFileGetLVMKey(name, &newName) < 0) goto cleanup; - autoCreate = true; - break; + + if (newName) { + VIR_DEBUG("Got an LVM UUID %s for %s", newName, name); + if (VIR_STRDUP(newLockspace, driver->lvmLockSpaceDir) < 0) + goto cleanup; + autoCreate = true; + break; + } + virResetLastError(); + /* Fallback to generic non-block code */ } - virResetLastError(); - /* Fallback to generic non-block code */ - } - if (STRPREFIX(name, "/dev") && - driver->scsiLockSpaceDir) { - VIR_DEBUG("Trying to find an SCSI ID for %s", name); - if (virStorageFileGetSCSIKey(name, &newName) < 0) - goto cleanup; + if (STRPREFIX(name, "/dev") && + driver->scsiLockSpaceDir) { + VIR_DEBUG("Trying to find an SCSI ID for %s", name); + if (virStorageFileGetSCSIKey(name, &newName) < 0) + goto cleanup; + + if (newName) { + VIR_DEBUG("Got an SCSI ID %s for %s", newName, name); + if (VIR_STRDUP(newLockspace, driver->scsiLockSpaceDir) < 0) + goto cleanup; + autoCreate = true; + break; + } + virResetLastError(); + /* Fallback to generic non-block code */ + } - if (newName) { - VIR_DEBUG("Got an SCSI ID %s for %s", newName, name); - if (VIR_STRDUP(newLockspace, driver->scsiLockSpaceDir) < 0) + if (driver->fileLockSpaceDir) { + if (VIR_STRDUP(newLockspace, driver->fileLockSpaceDir) < 0) + goto cleanup; + if (virCryptoHashString(VIR_CRYPTO_HASH_SHA256, name, &newName) < 0) goto cleanup; autoCreate = true; - break; + VIR_DEBUG("Using indirect lease %s for %s", newName, name); + } else { + if (VIR_STRDUP(newLockspace, "") < 0) + goto cleanup; + if (VIR_STRDUP(newName, name) < 0) + goto cleanup; + VIR_DEBUG("Using direct lease for %s", name); } - virResetLastError(); - /* Fallback to generic non-block code */ - } - if (driver->fileLockSpaceDir) { - if (VIR_STRDUP(newLockspace, driver->fileLockSpaceDir) < 0) - goto cleanup; - if (virCryptoHashString(VIR_CRYPTO_HASH_SHA256, name, &newName) < 0) + break; + case VIR_LOCK_MANAGER_RESOURCE_TYPE_LEASE: { + size_t i; + char *path = NULL; + char *lockspace = NULL; + for (i = 0; i < nparams; i++) { + if (STREQ(params[i].key, "offset")) { + if (params[i].value.ul != 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Offset must be zero for this lock manager")); + goto cleanup; + } + } else if (STREQ(params[i].key, "lockspace")) { + lockspace = params[i].value.str; + } else if (STREQ(params[i].key, "path")) { + path = params[i].value.str; + } else { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unexpected parameter %s for lease resource"), + params[i].key); + goto cleanup; + } + } + if (!path || !lockspace) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Missing path or lockspace for lease resource")); goto cleanup; - autoCreate = true; - VIR_DEBUG("Using indirect lease %s for %s", newName, name); - } else { - if (VIR_STRDUP(newLockspace, "") < 0) + } + if (virAsprintf(&newLockspace, "%s/%s", + path, lockspace) < 0) goto cleanup; if (VIR_STRDUP(newName, name) < 0) goto cleanup; - VIR_DEBUG("Using direct lease for %s", name); - } + } break; + default: + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unknown lock manager object type %d for domain lock object"), + type); + goto cleanup; + } break; - case VIR_LOCK_MANAGER_RESOURCE_TYPE_LEASE: { - size_t i; - char *path = NULL; - char *lockspace = NULL; - for (i = 0; i < nparams; i++) { - if (STREQ(params[i].key, "offset")) { - if (params[i].value.ul != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Offset must be zero for this lock manager")); - goto cleanup; - } - } else if (STREQ(params[i].key, "lockspace")) { - lockspace = params[i].value.str; - } else if (STREQ(params[i].key, "path")) { - path = params[i].value.str; - } else { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unexpected parameter %s for lease resource"), - params[i].key); - goto cleanup; - } - } - if (!path || !lockspace) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Missing path or lockspace for lease resource")); - goto cleanup; - } - if (virAsprintf(&newLockspace, "%s/%s", - path, lockspace) < 0) - goto cleanup; - if (VIR_STRDUP(newName, name) < 0) - goto cleanup; - } break; + case VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON: default: virReportError(VIR_ERR_INTERNAL_ERROR, _("Unknown lock manager object type %d"), @@ -639,7 +723,8 @@ static int virLockManagerLockDaemonAcquire(virLockManagerPtr lock, virCheckFlags(VIR_LOCK_MANAGER_ACQUIRE_REGISTER_ONLY | VIR_LOCK_MANAGER_ACQUIRE_RESTRICT, -1); - if (priv->nresources == 0 && + if (priv->type == VIR_LOCK_MANAGER_OBJECT_TYPE_DOMAIN && + priv->nresources == 0 && priv->hasRWDisks && driver->requireLeaseForDisks) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", diff --git a/src/locking/lock_driver_sanlock.c b/src/locking/lock_driver_sanlock.c index 39c2f94a76..fe422d3be6 100644 --- a/src/locking/lock_driver_sanlock.c +++ b/src/locking/lock_driver_sanlock.c @@ -513,21 +513,32 @@ static int virLockManagerSanlockNew(virLockManagerPtr lock, priv->flags = flags; - for (i = 0; i < nparams; i++) { - param = ¶ms[i]; + switch ((virLockManagerObjectType) type) { + case VIR_LOCK_MANAGER_OBJECT_TYPE_DOMAIN: + for (i = 0; i < nparams; i++) { + param = ¶ms[i]; - if (STREQ(param->key, "uuid")) { - memcpy(priv->vm_uuid, param->value.uuid, 16); - } else if (STREQ(param->key, "name")) { - if (VIR_STRDUP(priv->vm_name, param->value.str) < 0) - goto error; - } else if (STREQ(param->key, "pid")) { - priv->vm_pid = param->value.iv; - } else if (STREQ(param->key, "id")) { - priv->vm_id = param->value.ui; - } else if (STREQ(param->key, "uri")) { - priv->vm_uri = param->value.cstr; + if (STREQ(param->key, "uuid")) { + memcpy(priv->vm_uuid, param->value.uuid, 16); + } else if (STREQ(param->key, "name")) { + if (VIR_STRDUP(priv->vm_name, param->value.str) < 0) + goto error; + } else if (STREQ(param->key, "pid")) { + priv->vm_pid = param->value.iv; + } else if (STREQ(param->key, "id")) { + priv->vm_id = param->value.ui; + } else if (STREQ(param->key, "uri")) { + priv->vm_uri = param->value.cstr; + } } + break; + + case VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON: + default: + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unknown lock manager object type %d"), + type); + goto error; } /* Sanlock needs process registration, but the only way how to probe -- 2.16.4

On 09/10/2018 05:36 AM, Michal Privoznik wrote:
We will want virtlockd to lock files on behalf of libvirtd and not qemu process, because it is libvirtd that needs an exclusive access not qemu. This requires new lock context.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/locking/lock_driver.h | 2 + src/locking/lock_driver_lockd.c | 291 ++++++++++++++++++++++++-------------- src/locking/lock_driver_sanlock.c | 37 +++-- 3 files changed, 214 insertions(+), 116 deletions(-)
[...]
@@ -156,10 +167,24 @@ virLockManagerLockDaemonConnectionRegister(virLockManagerPtr lock, memset(&args, 0, sizeof(args));
args.flags = 0; - memcpy(args.owner.uuid, priv->uuid, VIR_UUID_BUFLEN); - args.owner.name = priv->name; - args.owner.id = priv->id; - args.owner.pid = priv->pid; + + switch (priv->type) { + case VIR_LOCK_MANAGER_OBJECT_TYPE_DOMAIN: + memcpy(args.owner.uuid, priv->t.dom.uuid, VIR_UUID_BUFLEN); + args.owner.name = priv->t.dom.name; + args.owner.id = priv->t.dom.id; + args.owner.pid = priv->t.dom.pid; + break; + + case VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON: + memcpy(args.owner.uuid, priv->t.daemon.uuid, VIR_UUID_BUFLEN); + args.owner.name = priv->t.daemon.name; + args.owner.pid = priv->t.daemon.pid; + break; + + default:
Should there be an error message? Since virNetClientProgramCall would provide one on error and it seems callers would expect it... So we don't end up with an error happened, but we have no clue where.
+ return -1; + }
if (virNetClientProgramCall(program, client,
[...] I suspect you can make the right choice above... Reviewed-by: John Ferlan <jferlan@redhat.com> John [...]

On 09/17/2018 06:17 PM, John Ferlan wrote:
On 09/10/2018 05:36 AM, Michal Privoznik wrote:
We will want virtlockd to lock files on behalf of libvirtd and not qemu process, because it is libvirtd that needs an exclusive access not qemu. This requires new lock context.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/locking/lock_driver.h | 2 + src/locking/lock_driver_lockd.c | 291 ++++++++++++++++++++++++-------------- src/locking/lock_driver_sanlock.c | 37 +++-- 3 files changed, 214 insertions(+), 116 deletions(-)
[...]
@@ -156,10 +167,24 @@ virLockManagerLockDaemonConnectionRegister(virLockManagerPtr lock, memset(&args, 0, sizeof(args));
args.flags = 0; - memcpy(args.owner.uuid, priv->uuid, VIR_UUID_BUFLEN); - args.owner.name = priv->name; - args.owner.id = priv->id; - args.owner.pid = priv->pid; + + switch (priv->type) { + case VIR_LOCK_MANAGER_OBJECT_TYPE_DOMAIN: + memcpy(args.owner.uuid, priv->t.dom.uuid, VIR_UUID_BUFLEN); + args.owner.name = priv->t.dom.name; + args.owner.id = priv->t.dom.id; + args.owner.pid = priv->t.dom.pid; + break; + + case VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON: + memcpy(args.owner.uuid, priv->t.daemon.uuid, VIR_UUID_BUFLEN); + args.owner.name = priv->t.daemon.name; + args.owner.pid = priv->t.daemon.pid; + break; + + default:
Should there be an error message? Since virNetClientProgramCall would provide one on error and it seems callers would expect it... So we don't end up with an error happened, but we have no clue where.
The 'default' label exists merely to shut gcc up. However, I'll copy the error message from virLockManagerLockDaemonNew().
+ return -1; + }
if (virNetClientProgramCall(program, client,
[...]
I suspect you can make the right choice above...
Reviewed-by: John Ferlan <jferlan@redhat.com>
Michal

The fact whether domain has or doesn't have RW disks is specific to VIR_LOCK_MANAGER_OBJECT_TYPE_DOMAIN and therefore should reside in union specific to it. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/locking/lock_driver_lockd.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/locking/lock_driver_lockd.c b/src/locking/lock_driver_lockd.c index 8580d12340..98953500b7 100644 --- a/src/locking/lock_driver_lockd.c +++ b/src/locking/lock_driver_lockd.c @@ -63,6 +63,8 @@ struct _virLockManagerLockDaemonPrivate { char *name; int id; pid_t pid; + + bool hasRWDisks; } dom; struct { @@ -74,8 +76,6 @@ struct _virLockManagerLockDaemonPrivate { size_t nresources; virLockManagerLockDaemonResourcePtr resources; - - bool hasRWDisks; }; @@ -579,7 +579,7 @@ static int virLockManagerLockDaemonAddResource(virLockManagerPtr lock, if (!driver->autoDiskLease) { if (!(flags & (VIR_LOCK_MANAGER_RESOURCE_SHARED | VIR_LOCK_MANAGER_RESOURCE_READONLY))) - priv->hasRWDisks = true; + priv->t.dom.hasRWDisks = true; return 0; } @@ -725,7 +725,7 @@ static int virLockManagerLockDaemonAcquire(virLockManagerPtr lock, if (priv->type == VIR_LOCK_MANAGER_OBJECT_TYPE_DOMAIN && priv->nresources == 0 && - priv->hasRWDisks && + priv->t.dom.hasRWDisks && driver->requireLeaseForDisks) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Read/write, exclusive access, disks were present, but no leases specified")); -- 2.16.4

On 09/10/2018 05:36 AM, Michal Privoznik wrote:
The fact whether domain has or doesn't have RW disks is specific to VIR_LOCK_MANAGER_OBJECT_TYPE_DOMAIN and therefore should reside in union specific to it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/locking/lock_driver_lockd.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com> John

This is a new type of object that lock drivers can handle. Currently, it is supported by lockd driver only. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com> --- src/locking/lock_driver.h | 2 ++ src/locking/lock_driver_lockd.c | 47 ++++++++++++++++++++++++++++----------- src/locking/lock_driver_sanlock.c | 3 ++- 3 files changed, 38 insertions(+), 14 deletions(-) diff --git a/src/locking/lock_driver.h b/src/locking/lock_driver.h index a9d2041c30..9be0abcfba 100644 --- a/src/locking/lock_driver.h +++ b/src/locking/lock_driver.h @@ -51,6 +51,8 @@ typedef enum { VIR_LOCK_MANAGER_RESOURCE_TYPE_DISK = 0, /* A lease against an arbitrary resource */ VIR_LOCK_MANAGER_RESOURCE_TYPE_LEASE = 1, + /* The resource to be locked is a metadata */ + VIR_LOCK_MANAGER_RESOURCE_TYPE_METADATA = 2, } virLockManagerResourceType; typedef enum { diff --git a/src/locking/lock_driver_lockd.c b/src/locking/lock_driver_lockd.c index 98953500b7..cb294ac694 100644 --- a/src/locking/lock_driver_lockd.c +++ b/src/locking/lock_driver_lockd.c @@ -557,7 +557,7 @@ static int virLockManagerLockDaemonAddResource(virLockManagerPtr lock, virLockManagerLockDaemonPrivatePtr priv = lock->privateData; char *newName = NULL; char *newLockspace = NULL; - bool autoCreate = false; + int newFlags = 0; int ret = -1; virCheckFlags(VIR_LOCK_MANAGER_RESOURCE_READONLY | @@ -569,7 +569,7 @@ static int virLockManagerLockDaemonAddResource(virLockManagerPtr lock, switch (priv->type) { case VIR_LOCK_MANAGER_OBJECT_TYPE_DOMAIN: - switch (type) { + switch ((virLockManagerResourceType) type) { case VIR_LOCK_MANAGER_RESOURCE_TYPE_DISK: if (params || nparams) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", @@ -596,7 +596,7 @@ static int virLockManagerLockDaemonAddResource(virLockManagerPtr lock, VIR_DEBUG("Got an LVM UUID %s for %s", newName, name); if (VIR_STRDUP(newLockspace, driver->lvmLockSpaceDir) < 0) goto cleanup; - autoCreate = true; + newFlags |= VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_AUTOCREATE; break; } virResetLastError(); @@ -613,7 +613,7 @@ static int virLockManagerLockDaemonAddResource(virLockManagerPtr lock, VIR_DEBUG("Got an SCSI ID %s for %s", newName, name); if (VIR_STRDUP(newLockspace, driver->scsiLockSpaceDir) < 0) goto cleanup; - autoCreate = true; + newFlags |= VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_AUTOCREATE; break; } virResetLastError(); @@ -625,7 +625,7 @@ static int virLockManagerLockDaemonAddResource(virLockManagerPtr lock, goto cleanup; if (virCryptoHashString(VIR_CRYPTO_HASH_SHA256, name, &newName) < 0) goto cleanup; - autoCreate = true; + newFlags |= VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_AUTOCREATE; VIR_DEBUG("Using indirect lease %s for %s", newName, name); } else { if (VIR_STRDUP(newLockspace, "") < 0) @@ -670,6 +670,8 @@ static int virLockManagerLockDaemonAddResource(virLockManagerPtr lock, goto cleanup; } break; + + case VIR_LOCK_MANAGER_RESOURCE_TYPE_METADATA: default: virReportError(VIR_ERR_INTERNAL_ERROR, _("Unknown lock manager object type %d for domain lock object"), @@ -679,6 +681,29 @@ static int virLockManagerLockDaemonAddResource(virLockManagerPtr lock, break; case VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON: + switch ((virLockManagerResourceType) type) { + case VIR_LOCK_MANAGER_RESOURCE_TYPE_METADATA: + if (params || nparams) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Unexpected parameters for metadata resource")); + goto cleanup; + } + if (VIR_STRDUP(newLockspace, "") < 0 || + VIR_STRDUP(newName, name) < 0) + goto cleanup; + newFlags |= VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_METADATA; + break; + + case VIR_LOCK_MANAGER_RESOURCE_TYPE_DISK: + case VIR_LOCK_MANAGER_RESOURCE_TYPE_LEASE: + default: + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unknown lock manager object type %d for daemon lock object"), + type); + goto cleanup; + } + break; + default: virReportError(VIR_ERR_INTERNAL_ERROR, _("Unknown lock manager object type %d"), @@ -686,19 +711,15 @@ static int virLockManagerLockDaemonAddResource(virLockManagerPtr lock, goto cleanup; } + if (flags & VIR_LOCK_MANAGER_RESOURCE_SHARED) + newFlags |= VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_SHARED; + if (VIR_EXPAND_N(priv->resources, priv->nresources, 1) < 0) goto cleanup; VIR_STEAL_PTR(priv->resources[priv->nresources-1].lockspace, newLockspace); VIR_STEAL_PTR(priv->resources[priv->nresources-1].name, newName); - - if (flags & VIR_LOCK_MANAGER_RESOURCE_SHARED) - priv->resources[priv->nresources-1].flags |= - VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_SHARED; - - if (autoCreate) - priv->resources[priv->nresources-1].flags |= - VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_AUTOCREATE; + priv->resources[priv->nresources-1].flags = newFlags; ret = 0; cleanup: diff --git a/src/locking/lock_driver_sanlock.c b/src/locking/lock_driver_sanlock.c index fe422d3be6..9393e7d9a2 100644 --- a/src/locking/lock_driver_sanlock.c +++ b/src/locking/lock_driver_sanlock.c @@ -815,7 +815,7 @@ static int virLockManagerSanlockAddResource(virLockManagerPtr lock, if (flags & VIR_LOCK_MANAGER_RESOURCE_READONLY) return 0; - switch (type) { + switch ((virLockManagerResourceType) type) { case VIR_LOCK_MANAGER_RESOURCE_TYPE_DISK: if (driver->autoDiskLease) { if (virLockManagerSanlockAddDisk(driver, lock, name, nparams, params, @@ -839,6 +839,7 @@ static int virLockManagerSanlockAddResource(virLockManagerPtr lock, return -1; break; + case VIR_LOCK_MANAGER_RESOURCE_TYPE_METADATA: default: virReportError(VIR_ERR_INTERNAL_ERROR, _("Unknown lock manager object type %d for domain lock object"), -- 2.16.4

Soon there will be a virtlockd client that wants to either lock all the resources or none (in order to avoid virtlockd killing the client on connection close). Because on the RPC layer we can only acquire one resource at a time, we have to perform a rollback once we hit a resource that can't be acquired. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/locking/lock_driver.h | 4 ++ src/locking/lock_driver_lockd.c | 84 +++++++++++++++++++++++++++++------------ 2 files changed, 64 insertions(+), 24 deletions(-) diff --git a/src/locking/lock_driver.h b/src/locking/lock_driver.h index 9be0abcfba..8d236471d4 100644 --- a/src/locking/lock_driver.h +++ b/src/locking/lock_driver.h @@ -67,6 +67,10 @@ typedef enum { VIR_LOCK_MANAGER_ACQUIRE_REGISTER_ONLY = (1 << 0), /* Prevent further lock/unlock calls from this process */ VIR_LOCK_MANAGER_ACQUIRE_RESTRICT = (1 << 1), + /* In case when acquiring more resources which one of them + * can't be acquired, perform a rollback and release all + * resources acquired so far. */ + VIR_LOCK_MANAGER_ACQUIRE_ROLLBACK = (1 << 2), } virLockManagerAcquireFlags; typedef enum { diff --git a/src/locking/lock_driver_lockd.c b/src/locking/lock_driver_lockd.c index cb294ac694..3068a72507 100644 --- a/src/locking/lock_driver_lockd.c +++ b/src/locking/lock_driver_lockd.c @@ -729,6 +729,34 @@ static int virLockManagerLockDaemonAddResource(virLockManagerPtr lock, } +static int virLockManagerLockDaemonReleaseImpl(virNetClientPtr client, + virNetClientProgramPtr program, + int *counter, + virLockManagerLockDaemonResourcePtr res) +{ + virLockSpaceProtocolReleaseResourceArgs args; + + memset(&args, 0, sizeof(args)); + + args.path = res->lockspace; + args.name = res->name; + args.flags = res->flags; + + args.flags &= + ~(VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_SHARED | + VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_AUTOCREATE | + VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_METADATA); + + return virNetClientProgramCall(program, + client, + (*counter)++, + VIR_LOCK_SPACE_PROTOCOL_PROC_RELEASE_RESOURCE, + 0, NULL, NULL, NULL, + (xdrproc_t)xdr_virLockSpaceProtocolReleaseResourceArgs, &args, + (xdrproc_t)xdr_void, NULL); +} + + static int virLockManagerLockDaemonAcquire(virLockManagerPtr lock, const char *state ATTRIBUTE_UNUSED, unsigned int flags, @@ -739,10 +767,13 @@ static int virLockManagerLockDaemonAcquire(virLockManagerPtr lock, virNetClientProgramPtr program = NULL; int counter = 0; int rv = -1; + ssize_t i; + ssize_t lastGood = -1; virLockManagerLockDaemonPrivatePtr priv = lock->privateData; virCheckFlags(VIR_LOCK_MANAGER_ACQUIRE_REGISTER_ONLY | - VIR_LOCK_MANAGER_ACQUIRE_RESTRICT, -1); + VIR_LOCK_MANAGER_ACQUIRE_RESTRICT | + VIR_LOCK_MANAGER_ACQUIRE_ROLLBACK, -1); if (priv->type == VIR_LOCK_MANAGER_OBJECT_TYPE_DOMAIN && priv->nresources == 0 && @@ -761,7 +792,6 @@ static int virLockManagerLockDaemonAcquire(virLockManagerPtr lock, goto cleanup; if (!(flags & VIR_LOCK_MANAGER_ACQUIRE_REGISTER_ONLY)) { - size_t i; for (i = 0; i < priv->nresources; i++) { virLockSpaceProtocolAcquireResourceArgs args; @@ -779,6 +809,7 @@ static int virLockManagerLockDaemonAcquire(virLockManagerPtr lock, (xdrproc_t)xdr_virLockSpaceProtocolAcquireResourceArgs, &args, (xdrproc_t)xdr_void, NULL) < 0) goto cleanup; + lastGood = i; } } @@ -789,8 +820,30 @@ static int virLockManagerLockDaemonAcquire(virLockManagerPtr lock, rv = 0; cleanup: - if (rv != 0 && fd) - VIR_FORCE_CLOSE(*fd); + if (rv < 0) { + int saved_errno = errno; + virErrorPtr origerr; + + virErrorPreserveLast(&origerr); + if (fd) + VIR_FORCE_CLOSE(*fd); + + if (client && program && + flags & VIR_LOCK_MANAGER_ACQUIRE_ROLLBACK && + !(flags & VIR_LOCK_MANAGER_ACQUIRE_REGISTER_ONLY)) { + for (i = lastGood; i >= 0; i--) { + virLockManagerLockDaemonResourcePtr res = &priv->resources[i]; + + if (virLockManagerLockDaemonReleaseImpl(client, program, + &counter, res) < 0) + VIR_WARN("Unable to release resource lockspace=%s name=%s", + res->lockspace, res->name); + } + } + + virErrorRestore(&origerr); + errno = saved_errno; + } virNetClientClose(client); virObjectUnref(client); virObjectUnref(program); @@ -818,27 +871,10 @@ static int virLockManagerLockDaemonRelease(virLockManagerPtr lock, goto cleanup; for (i = 0; i < priv->nresources; i++) { - virLockSpaceProtocolReleaseResourceArgs args; + virLockManagerLockDaemonResourcePtr res = &priv->resources[i]; - memset(&args, 0, sizeof(args)); - - if (priv->resources[i].lockspace) - args.path = priv->resources[i].lockspace; - args.name = priv->resources[i].name; - args.flags = priv->resources[i].flags; - - args.flags &= - ~(VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_SHARED | - VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_AUTOCREATE | - VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_METADATA); - - if (virNetClientProgramCall(program, - client, - counter++, - VIR_LOCK_SPACE_PROTOCOL_PROC_RELEASE_RESOURCE, - 0, NULL, NULL, NULL, - (xdrproc_t)xdr_virLockSpaceProtocolReleaseResourceArgs, &args, - (xdrproc_t)xdr_void, NULL) < 0) + if (virLockManagerLockDaemonReleaseImpl(client, program, + &counter, res) < 0) goto cleanup; } -- 2.16.4

On 09/10/2018 05:36 AM, Michal Privoznik wrote:
Soon there will be a virtlockd client that wants to either lock all the resources or none (in order to avoid virtlockd killing the client on connection close). Because on the RPC layer we can only acquire one resource at a time, we have to perform a rollback once we hit a resource that can't be acquired.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/locking/lock_driver.h | 4 ++ src/locking/lock_driver_lockd.c | 84 +++++++++++++++++++++++++++++------------ 2 files changed, 64 insertions(+), 24 deletions(-)
diff --git a/src/locking/lock_driver.h b/src/locking/lock_driver.h index 9be0abcfba..8d236471d4 100644 --- a/src/locking/lock_driver.h +++ b/src/locking/lock_driver.h @@ -67,6 +67,10 @@ typedef enum { VIR_LOCK_MANAGER_ACQUIRE_REGISTER_ONLY = (1 << 0), /* Prevent further lock/unlock calls from this process */ VIR_LOCK_MANAGER_ACQUIRE_RESTRICT = (1 << 1), + /* In case when acquiring more resources which one of them
s/In case/Used/ s/which/in which/
+ * can't be acquired, perform a rollback and release all + * resources acquired so far. */ + VIR_LOCK_MANAGER_ACQUIRE_ROLLBACK = (1 << 2), } virLockManagerAcquireFlags;
typedef enum { diff --git a/src/locking/lock_driver_lockd.c b/src/locking/lock_driver_lockd.c index cb294ac694..3068a72507 100644 --- a/src/locking/lock_driver_lockd.c +++ b/src/locking/lock_driver_lockd.c @@ -729,6 +729,34 @@ static int virLockManagerLockDaemonAddResource(virLockManagerPtr lock, }
+static int virLockManagerLockDaemonReleaseImpl(virNetClientPtr client, + virNetClientProgramPtr program, + int *counter, + virLockManagerLockDaemonResourcePtr res) +{ + virLockSpaceProtocolReleaseResourceArgs args; + + memset(&args, 0, sizeof(args)); + + args.path = res->lockspace; + args.name = res->name; + args.flags = res->flags; + + args.flags &= + ~(VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_SHARED | + VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_AUTOCREATE | + VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_METADATA); + + return virNetClientProgramCall(program, + client, + (*counter)++,
This sticks out like a sore thumb in the annals of C-isms for me. Something about "when" one can assume/expect the autoincr to happen when there's a *value involved. Regardless, I don't think it needs to be done this way, just pass counter and do the autoincr in the caller... see my continued comments below [1]
+ VIR_LOCK_SPACE_PROTOCOL_PROC_RELEASE_RESOURCE, + 0, NULL, NULL, NULL, + (xdrproc_t)xdr_virLockSpaceProtocolReleaseResourceArgs, &args, + (xdrproc_t)xdr_void, NULL); +} + + static int virLockManagerLockDaemonAcquire(virLockManagerPtr lock, const char *state ATTRIBUTE_UNUSED, unsigned int flags, @@ -739,10 +767,13 @@ static int virLockManagerLockDaemonAcquire(virLockManagerPtr lock, virNetClientProgramPtr program = NULL; int counter = 0; int rv = -1; + ssize_t i; + ssize_t lastGood = -1; virLockManagerLockDaemonPrivatePtr priv = lock->privateData;
virCheckFlags(VIR_LOCK_MANAGER_ACQUIRE_REGISTER_ONLY | - VIR_LOCK_MANAGER_ACQUIRE_RESTRICT, -1); + VIR_LOCK_MANAGER_ACQUIRE_RESTRICT | + VIR_LOCK_MANAGER_ACQUIRE_ROLLBACK, -1);
if (priv->type == VIR_LOCK_MANAGER_OBJECT_TYPE_DOMAIN && priv->nresources == 0 && @@ -761,7 +792,6 @@ static int virLockManagerLockDaemonAcquire(virLockManagerPtr lock, goto cleanup;
if (!(flags & VIR_LOCK_MANAGER_ACQUIRE_REGISTER_ONLY)) { - size_t i; for (i = 0; i < priv->nresources; i++) { virLockSpaceProtocolAcquireResourceArgs args;
@@ -779,6 +809,7 @@ static int virLockManagerLockDaemonAcquire(virLockManagerPtr lock, (xdrproc_t)xdr_virLockSpaceProtocolAcquireResourceArgs, &args, (xdrproc_t)xdr_void, NULL) < 0) goto cleanup; + lastGood = i; } }
@@ -789,8 +820,30 @@ static int virLockManagerLockDaemonAcquire(virLockManagerPtr lock, rv = 0;
cleanup: - if (rv != 0 && fd) - VIR_FORCE_CLOSE(*fd); + if (rv < 0) { + int saved_errno = errno; + virErrorPtr origerr; + + virErrorPreserveLast(&origerr); + if (fd) + VIR_FORCE_CLOSE(*fd); + + if (client && program && + flags & VIR_LOCK_MANAGER_ACQUIRE_ROLLBACK && + !(flags & VIR_LOCK_MANAGER_ACQUIRE_REGISTER_ONLY)) {
Not sure any of the above 3 mean anything since lastGood is only set > 0 in one place so I would think the subsequent loop is good alone. I haven't looked ahead though ;-)
+ for (i = lastGood; i >= 0; i--) { + virLockManagerLockDaemonResourcePtr res = &priv->resources[i]; + + if (virLockManagerLockDaemonReleaseImpl(client, program, + &counter, res) < 0) + VIR_WARN("Unable to release resource lockspace=%s name=%s", + res->lockspace, res->name); + } + } + + virErrorRestore(&origerr); + errno = saved_errno; + } virNetClientClose(client); virObjectUnref(client); virObjectUnref(program); @@ -818,27 +871,10 @@ static int virLockManagerLockDaemonRelease(virLockManagerPtr lock, goto cleanup;
for (i = 0; i < priv->nresources; i++) {
[1] I think you can get away with this: s/i++/i++, counter++/ and not pass &counter for it to be incremented in the helper. There's no way "out of" the helper without incrementing it (thus far at least). So since this code owns it and can increment more sanely, I'd stick with it here.
- virLockSpaceProtocolReleaseResourceArgs args; + virLockManagerLockDaemonResourcePtr res = &priv->resources[i];
- memset(&args, 0, sizeof(args)); - - if (priv->resources[i].lockspace) - args.path = priv->resources[i].lockspace; - args.name = priv->resources[i].name; - args.flags = priv->resources[i].flags; - - args.flags &= - ~(VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_SHARED | - VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_AUTOCREATE | - VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_METADATA); - - if (virNetClientProgramCall(program, - client, - counter++, - VIR_LOCK_SPACE_PROTOCOL_PROC_RELEASE_RESOURCE, - 0, NULL, NULL, NULL, - (xdrproc_t)xdr_virLockSpaceProtocolReleaseResourceArgs, &args, - (xdrproc_t)xdr_void, NULL) < 0) + if (virLockManagerLockDaemonReleaseImpl(client, program, + &counter, res) < 0) goto cleanup;
[1] if not in the predicate of the for loop, then right here with an autoincr.
}
I'm not insistent about the @lastGood stuff - go whichever way you want in the cleanup code; however, I would much prefer the counter increment to happen in this context not in the called helper context for the Reviewed-by: John Ferlan <jferlan@redhat.com> John I you have a reason to keep it as it, then convince me ;-) especially since by the end there is no extra change in either place.

[...]
cleanup: - if (rv != 0 && fd) - VIR_FORCE_CLOSE(*fd); + if (rv < 0) { + int saved_errno = errno; + virErrorPtr origerr; + + virErrorPreserveLast(&origerr); + if (fd) + VIR_FORCE_CLOSE(*fd); + + if (client && program && + flags & VIR_LOCK_MANAGER_ACQUIRE_ROLLBACK && + !(flags & VIR_LOCK_MANAGER_ACQUIRE_REGISTER_ONLY)) {
Not sure any of the above 3 mean anything since lastGood is only set > 0 in one place so I would think the subsequent loop is good alone. I haven't looked ahead though ;-)
+ for (i = lastGood; i >= 0; i--) {
BTW: It just dawned on me while looking at patch 16 - this loop goes backwards and the ReleaseImpl call was *incrementing* counter. I wonder if that's what Bjoern ran into during testing. Even more reason to not pass &counter! John
+ virLockManagerLockDaemonResourcePtr res = &priv->resources[i]; + + if (virLockManagerLockDaemonReleaseImpl(client, program, + &counter, res) < 0) + VIR_WARN("Unable to release resource lockspace=%s name=%s", + res->lockspace, res->name); + } + } + + virErrorRestore(&origerr); + errno = saved_errno; + }

At the beginning of each dispatch function we check if owner attributes were registered (these consist of ID, UUID, PID and name). The check then consists of checking if ID is not zero. This is not going to work with VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON which doesn't set ID. Switch to setting PID which is available for both cases. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/locking/lock_daemon_dispatch.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/src/locking/lock_daemon_dispatch.c b/src/locking/lock_daemon_dispatch.c index a683ad3d6b..36a2462592 100644 --- a/src/locking/lock_daemon_dispatch.c +++ b/src/locking/lock_daemon_dispatch.c @@ -68,7 +68,7 @@ virLockSpaceProtocolDispatchAcquireResource(virNetServerPtr server ATTRIBUTE_UNU goto cleanup; } - if (!priv->ownerId) { + if (!priv->ownerPid) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("lock owner details have not been registered")); goto cleanup; @@ -129,7 +129,7 @@ virLockSpaceProtocolDispatchCreateResource(virNetServerPtr server ATTRIBUTE_UNUS goto cleanup; } - if (!priv->ownerId) { + if (!priv->ownerPid) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("lock owner details have not been registered")); goto cleanup; @@ -178,7 +178,7 @@ virLockSpaceProtocolDispatchDeleteResource(virNetServerPtr server ATTRIBUTE_UNUS goto cleanup; } - if (!priv->ownerId) { + if (!priv->ownerPid) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("lock owner details have not been registered")); goto cleanup; @@ -227,7 +227,7 @@ virLockSpaceProtocolDispatchNew(virNetServerPtr server ATTRIBUTE_UNUSED, goto cleanup; } - if (!priv->ownerId) { + if (!priv->ownerPid) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("lock owner details have not been registered")); goto cleanup; @@ -282,7 +282,7 @@ virLockSpaceProtocolDispatchRegister(virNetServerPtr server ATTRIBUTE_UNUSED, goto cleanup; } - if (!args->owner.id) { + if (!args->owner.pid) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("lock owner details have not been registered")); goto cleanup; @@ -329,7 +329,7 @@ virLockSpaceProtocolDispatchReleaseResource(virNetServerPtr server ATTRIBUTE_UNU goto cleanup; } - if (!priv->ownerId) { + if (!priv->ownerPid) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("lock owner details have not been registered")); goto cleanup; @@ -379,7 +379,7 @@ virLockSpaceProtocolDispatchRestrict(virNetServerPtr server ATTRIBUTE_UNUSED, goto cleanup; } - if (!priv->ownerId) { + if (!priv->ownerPid) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("lock owner details have not been registered")); goto cleanup; -- 2.16.4

On 09/10/2018 05:36 AM, Michal Privoznik wrote:
At the beginning of each dispatch function we check if owner attributes were registered (these consist of ID, UUID, PID and name). The check then consists of checking if ID is not zero. This is not going to work with VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON which doesn't set ID. Switch to setting PID which is available for both cases.
It would if daemon used @pid for ID ;-)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/locking/lock_daemon_dispatch.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
Looking through a bit of history and came across an oh shinola moment, see: commit id ee3eddb332 You may need to implement what I suggest above and move on from there. John (shinola is a reference to a movie called "The Jerk" and the main character not knowing sh!t from shinola)
diff --git a/src/locking/lock_daemon_dispatch.c b/src/locking/lock_daemon_dispatch.c index a683ad3d6b..36a2462592 100644 --- a/src/locking/lock_daemon_dispatch.c +++ b/src/locking/lock_daemon_dispatch.c @@ -68,7 +68,7 @@ virLockSpaceProtocolDispatchAcquireResource(virNetServerPtr server ATTRIBUTE_UNU goto cleanup; }
- if (!priv->ownerId) { + if (!priv->ownerPid) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("lock owner details have not been registered")); goto cleanup; @@ -129,7 +129,7 @@ virLockSpaceProtocolDispatchCreateResource(virNetServerPtr server ATTRIBUTE_UNUS goto cleanup; }
- if (!priv->ownerId) { + if (!priv->ownerPid) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("lock owner details have not been registered")); goto cleanup; @@ -178,7 +178,7 @@ virLockSpaceProtocolDispatchDeleteResource(virNetServerPtr server ATTRIBUTE_UNUS goto cleanup; }
- if (!priv->ownerId) { + if (!priv->ownerPid) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("lock owner details have not been registered")); goto cleanup; @@ -227,7 +227,7 @@ virLockSpaceProtocolDispatchNew(virNetServerPtr server ATTRIBUTE_UNUSED, goto cleanup; }
- if (!priv->ownerId) { + if (!priv->ownerPid) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("lock owner details have not been registered")); goto cleanup; @@ -282,7 +282,7 @@ virLockSpaceProtocolDispatchRegister(virNetServerPtr server ATTRIBUTE_UNUSED, goto cleanup; }
- if (!args->owner.id) { + if (!args->owner.pid) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("lock owner details have not been registered")); goto cleanup; @@ -329,7 +329,7 @@ virLockSpaceProtocolDispatchReleaseResource(virNetServerPtr server ATTRIBUTE_UNU goto cleanup; }
- if (!priv->ownerId) { + if (!priv->ownerPid) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("lock owner details have not been registered")); goto cleanup; @@ -379,7 +379,7 @@ virLockSpaceProtocolDispatchRestrict(virNetServerPtr server ATTRIBUTE_UNUSED, goto cleanup; }
- if (!priv->ownerId) { + if (!priv->ownerPid) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("lock owner details have not been registered")); goto cleanup;

On 09/17/2018 08:40 PM, John Ferlan wrote:
On 09/10/2018 05:36 AM, Michal Privoznik wrote:
At the beginning of each dispatch function we check if owner attributes were registered (these consist of ID, UUID, PID and name). The check then consists of checking if ID is not zero. This is not going to work with VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON which doesn't set ID. Switch to setting PID which is available for both cases.
It would if daemon used @pid for ID ;-)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/locking/lock_daemon_dispatch.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
Looking through a bit of history and came across an oh shinola moment, see:
commit id ee3eddb332
You may need to implement what I suggest above and move on from there.
Ah, this is a very good point. I'll have VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON set id to PID. This one should be dropped then. Michal

In some cases we might want to not load the lock driver config. Alter virLockManagerPluginNew() and the lock drivers to cope with this fact. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com> --- src/locking/lock_driver.h | 4 ++++ src/locking/lock_driver_lockd.c | 4 +++- src/locking/lock_driver_sanlock.c | 4 +++- src/locking/lock_manager.c | 10 +++++++--- 4 files changed, 17 insertions(+), 5 deletions(-) diff --git a/src/locking/lock_driver.h b/src/locking/lock_driver.h index 8d236471d4..f938b1df2a 100644 --- a/src/locking/lock_driver.h +++ b/src/locking/lock_driver.h @@ -124,6 +124,7 @@ struct _virLockManagerParam { /** * virLockDriverInit: * @version: the libvirt requested plugin ABI version + * @configFile: path to config file * @flags: the libvirt requested plugin optional extras * * Allow the plugin to validate the libvirt requested @@ -131,6 +132,9 @@ struct _virLockManagerParam { * to block its use in versions of libvirtd which are * too old to support key features. * + * The @configFile variable points to config file that the driver + * should load. If NULL, no config file should be loaded. + * * NB: A plugin may be loaded multiple times, for different * libvirt drivers (eg QEMU, LXC, UML) * diff --git a/src/locking/lock_driver_lockd.c b/src/locking/lock_driver_lockd.c index 3068a72507..7566c4abe1 100644 --- a/src/locking/lock_driver_lockd.c +++ b/src/locking/lock_driver_lockd.c @@ -365,8 +365,10 @@ static int virLockManagerLockDaemonInit(unsigned int version, driver->requireLeaseForDisks = true; driver->autoDiskLease = true; - if (virLockManagerLockDaemonLoadConfig(configFile) < 0) + if (configFile && + virLockManagerLockDaemonLoadConfig(configFile) < 0) { goto error; + } if (driver->autoDiskLease) { if (driver->fileLockSpaceDir && diff --git a/src/locking/lock_driver_sanlock.c b/src/locking/lock_driver_sanlock.c index 9393e7d9a2..66953c70d5 100644 --- a/src/locking/lock_driver_sanlock.c +++ b/src/locking/lock_driver_sanlock.c @@ -450,8 +450,10 @@ static int virLockManagerSanlockInit(unsigned int version, goto error; } - if (virLockManagerSanlockLoadConfig(driver, configFile) < 0) + if (configFile && + virLockManagerSanlockLoadConfig(driver, configFile) < 0) { goto error; + } if (driver->autoDiskLease && !driver->hostID) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", diff --git a/src/locking/lock_manager.c b/src/locking/lock_manager.c index 4ef9f9e692..84d0c30d37 100644 --- a/src/locking/lock_manager.c +++ b/src/locking/lock_manager.c @@ -105,6 +105,8 @@ static void virLockManagerLogParams(size_t nparams, /** * virLockManagerPluginNew: * @name: the name of the plugin + * @driverName: the hypervisor driver that loads the plugin + * @configDir: path to dir where config files are stored * @flag: optional plugin flags * * Attempt to load the plugin $(libdir)/libvirt/lock-driver/@name.so @@ -130,11 +132,13 @@ virLockManagerPluginPtr virLockManagerPluginNew(const char *name, char *configFile = NULL; VIR_DEBUG("name=%s driverName=%s configDir=%s flags=0x%x", - name, driverName, configDir, flags); + name, NULLSTR(driverName), NULLSTR(configDir), flags); - if (virAsprintf(&configFile, "%s/%s-%s.conf", - configDir, driverName, name) < 0) + if (driverName && configDir && + virAsprintf(&configFile, "%s/%s-%s.conf", + configDir, driverName, name) < 0) { return NULL; + } if (STREQ(name, "nop")) { driver = &virLockDriverNop; -- 2.16.4

This config option allows users to set and enable lock manager for domain metadata. The lock manager is going to be used by security drivers to serialize each other when changing a file ownership or changing the SELinux label. The only supported lock manager is 'lockd' for now. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com> --- src/qemu/libvirtd_qemu.aug | 1 + src/qemu/qemu.conf | 8 ++++++++ src/qemu/qemu_conf.c | 13 +++++++++++++ src/qemu/qemu_conf.h | 1 + src/qemu/test_libvirtd_qemu.aug.in | 1 + 5 files changed, 24 insertions(+) diff --git a/src/qemu/libvirtd_qemu.aug b/src/qemu/libvirtd_qemu.aug index ddc4bbfd1d..42e325d4fb 100644 --- a/src/qemu/libvirtd_qemu.aug +++ b/src/qemu/libvirtd_qemu.aug @@ -98,6 +98,7 @@ module Libvirtd_qemu = | bool_entry "relaxed_acs_check" | bool_entry "allow_disk_format_probing" | str_entry "lock_manager" + | str_entry "metadata_lock_manager" let rpc_entry = int_entry "max_queued" | int_entry "keepalive_interval" diff --git a/src/qemu/qemu.conf b/src/qemu/qemu.conf index cd57b3cc69..84492719c4 100644 --- a/src/qemu/qemu.conf +++ b/src/qemu/qemu.conf @@ -659,6 +659,14 @@ #lock_manager = "lockd" +# To serialize two or more daemons trying to change metadata on a +# file (e.g. a file on NFS share), libvirt offers a locking +# mechanism. Currently, only "lockd" is supported (or no locking +# at all if unset). Note that this is independent of lock_manager +# described above. +# +#metadata_lock_manager = "lockd" + # Set limit of maximum APIs queued on one domain. All other APIs # over this threshold will fail on acquiring job lock. Specially, diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c index a4f545ef92..46318b7b2a 100644 --- a/src/qemu/qemu_conf.c +++ b/src/qemu/qemu_conf.c @@ -428,6 +428,7 @@ static void virQEMUDriverConfigDispose(void *obj) virStringListFree(cfg->securityDriverNames); VIR_FREE(cfg->lockManagerName); + VIR_FREE(cfg->metadataLockManagerName); virFirmwareFreeList(cfg->firmwares, cfg->nfirmwares); @@ -838,6 +839,18 @@ int virQEMUDriverConfigLoadFile(virQEMUDriverConfigPtr cfg, if (virConfGetValueString(conf, "lock_manager", &cfg->lockManagerName) < 0) goto cleanup; + + if (virConfGetValueString(conf, "metadata_lock_manager", + &cfg->metadataLockManagerName) < 0) + goto cleanup; + if (cfg->metadataLockManagerName && + STRNEQ(cfg->metadataLockManagerName, "lockd")) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("unknown metadata lock manager name %s"), + cfg->metadataLockManagerName); + goto cleanup; + } + if (virConfGetValueString(conf, "stdio_handler", &stdioHandler) < 0) goto cleanup; if (stdioHandler) { diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h index a8d84efea2..c227ac72cc 100644 --- a/src/qemu/qemu_conf.h +++ b/src/qemu/qemu_conf.h @@ -186,6 +186,7 @@ struct _virQEMUDriverConfig { bool autoStartBypassCache; char *lockManagerName; + char *metadataLockManagerName; int keepAliveInterval; unsigned int keepAliveCount; diff --git a/src/qemu/test_libvirtd_qemu.aug.in b/src/qemu/test_libvirtd_qemu.aug.in index f1e8806ad2..451e73126e 100644 --- a/src/qemu/test_libvirtd_qemu.aug.in +++ b/src/qemu/test_libvirtd_qemu.aug.in @@ -81,6 +81,7 @@ module Test_libvirtd_qemu = { "mac_filter" = "1" } { "relaxed_acs_check" = "1" } { "lock_manager" = "lockd" } +{ "metadata_lock_manager" = "lockd" } { "max_queued" = "0" } { "keepalive_interval" = "5" } { "keepalive_count" = "5" } -- 2.16.4

Now that we know what metadata lock manager user wishes to use we can load it when initializing security driver. This is achieved by adding new argument to virSecurityManagerNewDriver() and subsequently to all functions that end up calling it. The cfg.mk change is needed in order to allow lock_manager.h inclusion in security driver without 'syntax-check' complaining. This is safe thing to do as locking APIs will always exist (it's only backend implementation that changes). However, instead of allowing the include for all other drivers (like cpu, network, and so on) allow it only for security driver. This will still trigger the error if including from other drivers. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- cfg.mk | 4 +++- src/lxc/lxc_controller.c | 3 ++- src/lxc/lxc_driver.c | 2 +- src/qemu/qemu_driver.c | 3 +++ src/security/security_manager.c | 22 ++++++++++++++++++++++ src/security/security_manager.h | 2 ++ tests/seclabeltest.c | 2 +- tests/securityselinuxlabeltest.c | 2 +- tests/securityselinuxtest.c | 2 +- tests/testutilsqemu.c | 2 +- 10 files changed, 37 insertions(+), 7 deletions(-) diff --git a/cfg.mk b/cfg.mk index 609ae869c2..e0a7b5105a 100644 --- a/cfg.mk +++ b/cfg.mk @@ -787,8 +787,10 @@ sc_prohibit_cross_inclusion: case $$dir in \ util/) safe="util";; \ access/ | conf/) safe="($$dir|conf|util)";; \ - cpu/| network/| node_device/| rpc/| security/| storage/) \ + cpu/| network/| node_device/| rpc/| storage/) \ safe="($$dir|util|conf|storage)";; \ + security/) \ + safe="($$dir|util|conf|storage|locking)";; \ xenapi/ | xenconfig/ ) safe="($$dir|util|conf|xen|cpu)";; \ *) safe="($$dir|$(mid_dirs)|util)";; \ esac; \ diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index 4e84391bf5..5f957eb1f8 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -2625,7 +2625,8 @@ int main(int argc, char *argv[]) ctrl->handshakeFd = handshakeFd; if (!(ctrl->securityManager = virSecurityManagerNew(securityDriver, - LXC_DRIVER_NAME, 0))) + LXC_DRIVER_NAME, + "nop", 0))) goto cleanup; if (ctrl->def->seclabels) { diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c index 8867645cdc..93aa25e9e6 100644 --- a/src/lxc/lxc_driver.c +++ b/src/lxc/lxc_driver.c @@ -1532,7 +1532,7 @@ lxcSecurityInit(virLXCDriverConfigPtr cfg) flags |= VIR_SECURITY_MANAGER_REQUIRE_CONFINED; virSecurityManagerPtr mgr = virSecurityManagerNew(cfg->securityDriverName, - LXC_DRIVER_NAME, flags); + LXC_DRIVER_NAME, "nop", flags); if (!mgr) goto error; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 6763c8cddc..4ac0d86803 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -355,6 +355,7 @@ qemuSecurityInit(virQEMUDriverPtr driver) while (names && *names) { if (!(mgr = qemuSecurityNew(*names, QEMU_DRIVER_NAME, + cfg->metadataLockManagerName, flags))) goto error; if (!stack) { @@ -370,6 +371,7 @@ qemuSecurityInit(virQEMUDriverPtr driver) } else { if (!(mgr = qemuSecurityNew(NULL, QEMU_DRIVER_NAME, + cfg->metadataLockManagerName, flags))) goto error; if (!(stack = qemuSecurityNewStack(mgr))) @@ -386,6 +388,7 @@ qemuSecurityInit(virQEMUDriverPtr driver) cfg->user, cfg->group, flags, + cfg->metadataLockManagerName, qemuSecurityChownCallback))) goto error; if (!stack) { diff --git a/src/security/security_manager.c b/src/security/security_manager.c index 9f770d8c53..5c8370c159 100644 --- a/src/security/security_manager.c +++ b/src/security/security_manager.c @@ -28,6 +28,7 @@ #include "viralloc.h" #include "virobject.h" #include "virlog.h" +#include "locking/lock_manager.h" #define VIR_FROM_THIS VIR_FROM_SECURITY @@ -40,6 +41,8 @@ struct _virSecurityManager { unsigned int flags; const char *virtDriver; void *privateData; + + virLockManagerPluginPtr lockPlugin; }; static virClassPtr virSecurityManagerClass; @@ -52,6 +55,9 @@ void virSecurityManagerDispose(void *obj) if (mgr->drv->close) mgr->drv->close(mgr); + + virObjectUnref(mgr->lockPlugin); + VIR_FREE(mgr->privateData); } @@ -71,6 +77,7 @@ VIR_ONCE_GLOBAL_INIT(virSecurityManager); static virSecurityManagerPtr virSecurityManagerNewDriver(virSecurityDriverPtr drv, const char *virtDriver, + const char *lockManagerPluginName, unsigned int flags) { virSecurityManagerPtr mgr = NULL; @@ -90,6 +97,14 @@ virSecurityManagerNewDriver(virSecurityDriverPtr drv, if (!(mgr = virObjectLockableNew(virSecurityManagerClass))) goto error; + if (!lockManagerPluginName) + lockManagerPluginName = "nop"; + + if (!(mgr->lockPlugin = virLockManagerPluginNew(lockManagerPluginName, + NULL, NULL, 0))) { + goto error; + } + mgr->drv = drv; mgr->flags = flags; mgr->virtDriver = virtDriver; @@ -112,6 +127,7 @@ virSecurityManagerNewStack(virSecurityManagerPtr primary) virSecurityManagerPtr mgr = virSecurityManagerNewDriver(&virSecurityDriverStack, virSecurityManagerGetDriver(primary), + NULL, primary->flags); if (!mgr) @@ -120,6 +136,8 @@ virSecurityManagerNewStack(virSecurityManagerPtr primary) if (virSecurityStackAddNested(mgr, primary) < 0) goto error; + mgr->lockPlugin = virObjectRef(mgr->lockPlugin); + return mgr; error: virObjectUnref(mgr); @@ -142,6 +160,7 @@ virSecurityManagerNewDAC(const char *virtDriver, uid_t user, gid_t group, unsigned int flags, + const char *lockManagerPluginName, virSecurityManagerDACChownCallback chownCallback) { virSecurityManagerPtr mgr; @@ -152,6 +171,7 @@ virSecurityManagerNewDAC(const char *virtDriver, mgr = virSecurityManagerNewDriver(&virSecurityDriverDAC, virtDriver, + lockManagerPluginName, flags & VIR_SECURITY_MANAGER_NEW_MASK); if (!mgr) @@ -173,6 +193,7 @@ virSecurityManagerNewDAC(const char *virtDriver, virSecurityManagerPtr virSecurityManagerNew(const char *name, const char *virtDriver, + const char *lockManagerPluginName, unsigned int flags) { virSecurityDriverPtr drv = virSecurityDriverLookup(name, virtDriver); @@ -201,6 +222,7 @@ virSecurityManagerNew(const char *name, return virSecurityManagerNewDriver(drv, virtDriver, + lockManagerPluginName, flags); } diff --git a/src/security/security_manager.h b/src/security/security_manager.h index 1ead369e82..c537e1c994 100644 --- a/src/security/security_manager.h +++ b/src/security/security_manager.h @@ -45,6 +45,7 @@ typedef enum { virSecurityManagerPtr virSecurityManagerNew(const char *name, const char *virtDriver, + const char *lockManagerPluginName, unsigned int flags); virSecurityManagerPtr virSecurityManagerNewStack(virSecurityManagerPtr primary); @@ -70,6 +71,7 @@ virSecurityManagerPtr virSecurityManagerNewDAC(const char *virtDriver, uid_t user, gid_t group, unsigned int flags, + const char *lockManagerPluginName, virSecurityManagerDACChownCallback chownCallback); int virSecurityManagerPreFork(virSecurityManagerPtr mgr); diff --git a/tests/seclabeltest.c b/tests/seclabeltest.c index 4cda80cec2..6aafc45e64 100644 --- a/tests/seclabeltest.c +++ b/tests/seclabeltest.c @@ -18,7 +18,7 @@ mymain(void) if (virThreadInitialize() < 0) return EXIT_FAILURE; - mgr = virSecurityManagerNew(NULL, "QEMU", VIR_SECURITY_MANAGER_DEFAULT_CONFINED); + mgr = virSecurityManagerNew(NULL, "QEMU", "nop", VIR_SECURITY_MANAGER_DEFAULT_CONFINED); if (mgr == NULL) { fprintf(stderr, "Failed to start security driver"); return EXIT_FAILURE; diff --git a/tests/securityselinuxlabeltest.c b/tests/securityselinuxlabeltest.c index 48fee7cd28..85797411eb 100644 --- a/tests/securityselinuxlabeltest.c +++ b/tests/securityselinuxlabeltest.c @@ -349,7 +349,7 @@ mymain(void) if (!rc) return EXIT_AM_SKIP; - if (!(mgr = virSecurityManagerNew("selinux", "QEMU", + if (!(mgr = virSecurityManagerNew("selinux", "QEMU", "nop", VIR_SECURITY_MANAGER_DEFAULT_CONFINED | VIR_SECURITY_MANAGER_PRIVILEGED))) { VIR_TEST_VERBOSE("Unable to initialize security driver: %s\n", diff --git a/tests/securityselinuxtest.c b/tests/securityselinuxtest.c index a785e9a7da..652981c895 100644 --- a/tests/securityselinuxtest.c +++ b/tests/securityselinuxtest.c @@ -275,7 +275,7 @@ mymain(void) int ret = 0; virSecurityManagerPtr mgr; - if (!(mgr = virSecurityManagerNew("selinux", "QEMU", + if (!(mgr = virSecurityManagerNew("selinux", "QEMU", "nop", VIR_SECURITY_MANAGER_DEFAULT_CONFINED | VIR_SECURITY_MANAGER_PRIVILEGED))) { fprintf(stderr, "Unable to initialize security driver: %s\n", diff --git a/tests/testutilsqemu.c b/tests/testutilsqemu.c index 70bed461b5..f9972f7adc 100644 --- a/tests/testutilsqemu.c +++ b/tests/testutilsqemu.c @@ -717,7 +717,7 @@ int qemuTestDriverInit(virQEMUDriver *driver) if (qemuTestCapsCacheInsert(driver->qemuCapsCache, NULL) < 0) goto error; - if (!(mgr = virSecurityManagerNew("none", "qemu", + if (!(mgr = virSecurityManagerNew("none", "qemu", "nop", VIR_SECURITY_MANAGER_PRIVILEGED))) goto error; if (!(driver->securityManager = virSecurityManagerNewStack(mgr))) -- 2.16.4

On 09/10/2018 05:36 AM, Michal Privoznik wrote:
Now that we know what metadata lock manager user wishes to use we can load it when initializing security driver. This is achieved by adding new argument to virSecurityManagerNewDriver() and subsequently to all functions that end up calling it.
The cfg.mk change is needed in order to allow lock_manager.h inclusion in security driver without 'syntax-check' complaining. This is safe thing to do as locking APIs will always exist (it's only backend implementation that changes). However, instead of allowing the include for all other drivers (like cpu, network, and so on) allow it only for security driver. This will still trigger the error if including from other drivers.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- cfg.mk | 4 +++- src/lxc/lxc_controller.c | 3 ++- src/lxc/lxc_driver.c | 2 +- src/qemu/qemu_driver.c | 3 +++ src/security/security_manager.c | 22 ++++++++++++++++++++++ src/security/security_manager.h | 2 ++ tests/seclabeltest.c | 2 +- tests/securityselinuxlabeltest.c | 2 +- tests/securityselinuxtest.c | 2 +- tests/testutilsqemu.c | 2 +- 10 files changed, 37 insertions(+), 7 deletions(-)
diff --git a/cfg.mk b/cfg.mk index 609ae869c2..e0a7b5105a 100644 --- a/cfg.mk +++ b/cfg.mk @@ -787,8 +787,10 @@ sc_prohibit_cross_inclusion: case $$dir in \ util/) safe="util";; \ access/ | conf/) safe="($$dir|conf|util)";; \ - cpu/| network/| node_device/| rpc/| security/| storage/) \ + cpu/| network/| node_device/| rpc/| storage/) \ safe="($$dir|util|conf|storage)";; \ + security/) \ + safe="($$dir|util|conf|storage|locking)";; \ xenapi/ | xenconfig/ ) safe="($$dir|util|conf|xen|cpu)";; \ *) safe="($$dir|$(mid_dirs)|util)";; \ esac; \ diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index 4e84391bf5..5f957eb1f8 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -2625,7 +2625,8 @@ int main(int argc, char *argv[]) ctrl->handshakeFd = handshakeFd;
if (!(ctrl->securityManager = virSecurityManagerNew(securityDriver, - LXC_DRIVER_NAME, 0))) + LXC_DRIVER_NAME, + "nop", 0))) goto cleanup;
if (ctrl->def->seclabels) { diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c index 8867645cdc..93aa25e9e6 100644 --- a/src/lxc/lxc_driver.c +++ b/src/lxc/lxc_driver.c @@ -1532,7 +1532,7 @@ lxcSecurityInit(virLXCDriverConfigPtr cfg) flags |= VIR_SECURITY_MANAGER_REQUIRE_CONFINED;
virSecurityManagerPtr mgr = virSecurityManagerNew(cfg->securityDriverName, - LXC_DRIVER_NAME, flags); + LXC_DRIVER_NAME, "nop", flags);
I think we should use NULL instead of "nop"? Keeping the default of "nop" under the covers (so to speak). Similar for other callers...
if (!mgr) goto error;
[...]
diff --git a/src/security/security_manager.c b/src/security/security_manager.c index 9f770d8c53..5c8370c159 100644 --- a/src/security/security_manager.c +++ b/src/security/security_manager.c
[...]
static virClassPtr virSecurityManagerClass; @@ -52,6 +55,9 @@ void virSecurityManagerDispose(void *obj)
if (mgr->drv->close) mgr->drv->close(mgr);
Order/setup issue here mgr->drv-> cannot be assumed right now!
+ + virObjectUnref(mgr->lockPlugin); + VIR_FREE(mgr->privateData); }
@@ -71,6 +77,7 @@ VIR_ONCE_GLOBAL_INIT(virSecurityManager); static virSecurityManagerPtr virSecurityManagerNewDriver(virSecurityDriverPtr drv, const char *virtDriver, + const char *lockManagerPluginName, unsigned int flags) { virSecurityManagerPtr mgr = NULL; @@ -90,6 +97,14 @@ virSecurityManagerNewDriver(virSecurityDriverPtr drv, if (!(mgr = virObjectLockableNew(virSecurityManagerClass))) goto error;
+ if (!lockManagerPluginName) + lockManagerPluginName = "nop"; + + if (!(mgr->lockPlugin = virLockManagerPluginNew(lockManagerPluginName, + NULL, NULL, 0))) { + goto error; + } +
Want to watching things die spectacularly? Don't pass NULL, "nop", or "lockd" here... For example, change securityselinuxlabeltest.c to pass "foobar" and enjoy. Problem is that mgr->drv-> and eventually mgr->fd == -1 is assumed in virSecurityManagerDispose.
mgr->drv = drv; mgr->flags = flags; mgr->virtDriver = virtDriver;
[...]
diff --git a/tests/securityselinuxlabeltest.c b/tests/securityselinuxlabeltest.c index 48fee7cd28..85797411eb 100644 --- a/tests/securityselinuxlabeltest.c +++ b/tests/securityselinuxlabeltest.c @@ -349,7 +349,7 @@ mymain(void) if (!rc) return EXIT_AM_SKIP;
- if (!(mgr = virSecurityManagerNew("selinux", "QEMU", + if (!(mgr = virSecurityManagerNew("selinux", "QEMU", "nop",
Just for "completeness", why not pass "lockd" here? Wonder if it's worth creating a false test that passes a non existent image (foobar) just to ensure it fails properly so that someone doesn't inadvertently undo what you'll need to change...
VIR_SECURITY_MANAGER_DEFAULT_CONFINED | VIR_SECURITY_MANAGER_PRIVILEGED))) { VIR_TEST_VERBOSE("Unable to initialize security driver: %s\n",
With the callers passing NULL instead of "nop" and cleaning up the New/Dispose code properly - Reviewed-by: John Ferlan <jferlan@redhat.com> John [...]

On 09/17/2018 09:56 PM, John Ferlan wrote:
On 09/10/2018 05:36 AM, Michal Privoznik wrote:
Now that we know what metadata lock manager user wishes to use we can load it when initializing security driver. This is achieved by adding new argument to virSecurityManagerNewDriver() and subsequently to all functions that end up calling it.
The cfg.mk change is needed in order to allow lock_manager.h inclusion in security driver without 'syntax-check' complaining. This is safe thing to do as locking APIs will always exist (it's only backend implementation that changes). However, instead of allowing the include for all other drivers (like cpu, network, and so on) allow it only for security driver. This will still trigger the error if including from other drivers.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- cfg.mk | 4 +++- src/lxc/lxc_controller.c | 3 ++- src/lxc/lxc_driver.c | 2 +- src/qemu/qemu_driver.c | 3 +++ src/security/security_manager.c | 22 ++++++++++++++++++++++ src/security/security_manager.h | 2 ++ tests/seclabeltest.c | 2 +- tests/securityselinuxlabeltest.c | 2 +- tests/securityselinuxtest.c | 2 +- tests/testutilsqemu.c | 2 +- 10 files changed, 37 insertions(+), 7 deletions(-)
diff --git a/cfg.mk b/cfg.mk index 609ae869c2..e0a7b5105a 100644 --- a/cfg.mk +++ b/cfg.mk @@ -787,8 +787,10 @@ sc_prohibit_cross_inclusion: case $$dir in \ util/) safe="util";; \ access/ | conf/) safe="($$dir|conf|util)";; \ - cpu/| network/| node_device/| rpc/| security/| storage/) \ + cpu/| network/| node_device/| rpc/| storage/) \ safe="($$dir|util|conf|storage)";; \ + security/) \ + safe="($$dir|util|conf|storage|locking)";; \ xenapi/ | xenconfig/ ) safe="($$dir|util|conf|xen|cpu)";; \ *) safe="($$dir|$(mid_dirs)|util)";; \ esac; \ diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index 4e84391bf5..5f957eb1f8 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -2625,7 +2625,8 @@ int main(int argc, char *argv[]) ctrl->handshakeFd = handshakeFd;
if (!(ctrl->securityManager = virSecurityManagerNew(securityDriver, - LXC_DRIVER_NAME, 0))) + LXC_DRIVER_NAME, + "nop", 0))) goto cleanup;
if (ctrl->def->seclabels) { diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c index 8867645cdc..93aa25e9e6 100644 --- a/src/lxc/lxc_driver.c +++ b/src/lxc/lxc_driver.c @@ -1532,7 +1532,7 @@ lxcSecurityInit(virLXCDriverConfigPtr cfg) flags |= VIR_SECURITY_MANAGER_REQUIRE_CONFINED;
virSecurityManagerPtr mgr = virSecurityManagerNew(cfg->securityDriverName, - LXC_DRIVER_NAME, flags); + LXC_DRIVER_NAME, "nop", flags);
I think we should use NULL instead of "nop"? Keeping the default of "nop" under the covers (so to speak). Similar for other callers...
if (!mgr) goto error;
[...]
diff --git a/src/security/security_manager.c b/src/security/security_manager.c index 9f770d8c53..5c8370c159 100644 --- a/src/security/security_manager.c +++ b/src/security/security_manager.c
[...]
static virClassPtr virSecurityManagerClass; @@ -52,6 +55,9 @@ void virSecurityManagerDispose(void *obj)
if (mgr->drv->close) mgr->drv->close(mgr);
Order/setup issue here mgr->drv-> cannot be assumed right now!
+ + virObjectUnref(mgr->lockPlugin); + VIR_FREE(mgr->privateData); }
@@ -71,6 +77,7 @@ VIR_ONCE_GLOBAL_INIT(virSecurityManager); static virSecurityManagerPtr virSecurityManagerNewDriver(virSecurityDriverPtr drv, const char *virtDriver, + const char *lockManagerPluginName, unsigned int flags) { virSecurityManagerPtr mgr = NULL; @@ -90,6 +97,14 @@ virSecurityManagerNewDriver(virSecurityDriverPtr drv, if (!(mgr = virObjectLockableNew(virSecurityManagerClass))) goto error;
+ if (!lockManagerPluginName) + lockManagerPluginName = "nop"; + + if (!(mgr->lockPlugin = virLockManagerPluginNew(lockManagerPluginName, + NULL, NULL, 0))) { + goto error; + } +
Want to watching things die spectacularly? Don't pass NULL, "nop", or "lockd" here... For example, change securityselinuxlabeltest.c to pass "foobar" and enjoy.
Problem is that mgr->drv-> and eventually mgr->fd == -1 is assumed in virSecurityManagerDispose.
mgr->drv = drv; mgr->flags = flags; mgr->virtDriver = virtDriver;
[...]
diff --git a/tests/securityselinuxlabeltest.c b/tests/securityselinuxlabeltest.c index 48fee7cd28..85797411eb 100644 --- a/tests/securityselinuxlabeltest.c +++ b/tests/securityselinuxlabeltest.c @@ -349,7 +349,7 @@ mymain(void) if (!rc) return EXIT_AM_SKIP;
- if (!(mgr = virSecurityManagerNew("selinux", "QEMU", + if (!(mgr = virSecurityManagerNew("selinux", "QEMU", "nop",
Just for "completeness", why not pass "lockd" here?
Because that would require virtlockd running every time the test is run. Well, not right now but after the last patch in the series.
Wonder if it's worth creating a false test that passes a non existent image (foobar) just to ensure it fails properly so that someone doesn't inadvertently undo what you'll need to change...
Maybe. But I'll save that for a follow up patch. Michal

Two new APIs are added so that security driver can lock and unlock paths it wishes to touch. These APIs are not for other drivers to call but security drivers (DAC and SELinux). That is the reason these APIs are not exposed through our libvirt_private.syms file. Three interesting things happen in this commit. The first is the global @lockManagerMutex. Unfortunately, this has to exist so that there is only on thread talking to virtlockd at a time. If there were more threads and one of them closed the connection prematurely, it would cause virtlockd killing libvirtd. Instead of complicated code that would handle that, let's have a mutex and keep the code simple. The second interesting thing is keeping connection open between lock and unlock API calls. This is achieved by duplicating client FD and keeping it open until unlock is called. This trick is used by regular disk content locking code when the FD is leaked to qemu. Finally, the third thing is polling implemented at client side. Since virtlockd has only one thread that handles locking requests, all it can do is either acquire lock or error out. Therefore, the polling has to be implemented in client. The polling is capped at 60 second timeout, which should be plenty since the metadata lock is held only for a fraction of a second. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_manager.c | 135 ++++++++++++++++++++++++++++++++++++++++ src/security/security_manager.h | 7 +++ 2 files changed, 142 insertions(+) diff --git a/src/security/security_manager.c b/src/security/security_manager.c index 5c8370c159..dd5c3ac7e5 100644 --- a/src/security/security_manager.c +++ b/src/security/security_manager.c @@ -29,11 +29,15 @@ #include "virobject.h" #include "virlog.h" #include "locking/lock_manager.h" +#include "virfile.h" +#include "virtime.h" #define VIR_FROM_THIS VIR_FROM_SECURITY VIR_LOG_INIT("security.security_manager"); +virMutex lockManagerMutex = VIR_MUTEX_INITIALIZER; + struct _virSecurityManager { virObjectLockable parent; @@ -43,6 +47,7 @@ struct _virSecurityManager { void *privateData; virLockManagerPluginPtr lockPlugin; + int fd; }; static virClassPtr virSecurityManagerClass; @@ -57,6 +62,7 @@ void virSecurityManagerDispose(void *obj) mgr->drv->close(mgr); virObjectUnref(mgr->lockPlugin); + VIR_FORCE_CLOSE(mgr->fd); VIR_FREE(mgr->privateData); } @@ -109,6 +115,7 @@ virSecurityManagerNewDriver(virSecurityDriverPtr drv, mgr->flags = flags; mgr->virtDriver = virtDriver; VIR_STEAL_PTR(mgr->privateData, privateData); + mgr->fd = -1; if (drv->open(mgr) < 0) goto error; @@ -1263,3 +1270,131 @@ virSecurityManagerRestoreTPMLabels(virSecurityManagerPtr mgr, return 0; } + + +static virLockManagerPtr +virSecurityManagerNewLockManager(virSecurityManagerPtr mgr, + const char * const *paths, + size_t npaths) +{ + virLockManagerPtr lock; + virLockManagerParam params[] = { + { .type = VIR_LOCK_MANAGER_PARAM_TYPE_UUID, + .key = "uuid", + }, + { .type = VIR_LOCK_MANAGER_PARAM_TYPE_STRING, + .key = "name", + .value = { .cstr = "libvirtd-sec" }, + }, + { .type = VIR_LOCK_MANAGER_PARAM_TYPE_UINT, + .key = "pid", + .value = { .iv = getpid() }, + }, + }; + const unsigned int flags = 0; + size_t i; + + if (virGetHostUUID(params[0].value.uuid) < 0) + return NULL; + + if (!(lock = virLockManagerNew(virLockManagerPluginGetDriver(mgr->lockPlugin), + VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON, + ARRAY_CARDINALITY(params), + params, + flags))) + return NULL; + + for (i = 0; i < npaths; i++) { + if (virLockManagerAddResource(lock, + VIR_LOCK_MANAGER_RESOURCE_TYPE_METADATA, + paths[i], 0, NULL, 0) < 0) + goto error; + } + + return lock; + error: + virLockManagerFree(lock); + return NULL; +} + + +/* How many seconds should we try to acquire the lock before + * giving up. */ +#define LOCK_ACQUIRE_TIMEOUT 60 + +int +virSecurityManagerMetadataLock(virSecurityManagerPtr mgr, + const char * const *paths, + size_t npaths) +{ + virLockManagerPtr lock; + virTimeBackOffVar timebackoff; + int fd = -1; + int rv; + int ret = -1; + + virMutexLock(&lockManagerMutex); + + if (!(lock = virSecurityManagerNewLockManager(mgr, paths, npaths))) + goto cleanup; + + if (virTimeBackOffStart(&timebackoff, 1, LOCK_ACQUIRE_TIMEOUT * 1000) < 0) + goto cleanup; + while (virTimeBackOffWait(&timebackoff)) { + rv = virLockManagerAcquire(lock, NULL, + VIR_LOCK_MANAGER_ACQUIRE_ROLLBACK, + VIR_DOMAIN_LOCK_FAILURE_DEFAULT, &fd); + + if (rv >= 0) + break; + + if (virGetLastErrorCode() == VIR_ERR_RESOURCE_BUSY) + continue; + + goto cleanup; + } + + if (rv < 0) + goto cleanup; + + mgr->fd = fd; + fd = -1; + + ret = 0; + cleanup: + virLockManagerFree(lock); + VIR_FORCE_CLOSE(fd); + if (ret < 0) + virMutexUnlock(&lockManagerMutex); + return ret; +} + + +int +virSecurityManagerMetadataUnlock(virSecurityManagerPtr mgr, + const char * const *paths, + size_t npaths) +{ + virLockManagerPtr lock; + int fd; + int ret = -1; + + /* lockManagerMutex acquired from previous + * virSecurityManagerMetadataLock() call. */ + + fd = mgr->fd; + mgr->fd = -1; + + if (!(lock = virSecurityManagerNewLockManager(mgr, paths, npaths))) + goto cleanup; + + if (virLockManagerRelease(lock, NULL, 0) < 0) + goto cleanup; + + ret = 0; + cleanup: + virLockManagerFree(lock); + VIR_FORCE_CLOSE(fd); + virMutexUnlock(&lockManagerMutex); + return ret; +} diff --git a/src/security/security_manager.h b/src/security/security_manager.h index c537e1c994..10ebe5cc29 100644 --- a/src/security/security_manager.h +++ b/src/security/security_manager.h @@ -199,4 +199,11 @@ int virSecurityManagerSetTPMLabels(virSecurityManagerPtr mgr, int virSecurityManagerRestoreTPMLabels(virSecurityManagerPtr mgr, virDomainDefPtr vm); +int virSecurityManagerMetadataLock(virSecurityManagerPtr mgr, + const char * const *paths, + size_t npaths); +int virSecurityManagerMetadataUnlock(virSecurityManagerPtr mgr, + const char * const *paths, + size_t npaths); + #endif /* VIR_SECURITY_MANAGER_H__ */ -- 2.16.4

Michal Privoznik <mprivozn@redhat.com> [2018-09-10, 11:36AM +0200]:
+int +virSecurityManagerMetadataLock(virSecurityManagerPtr mgr, + const char * const *paths, + size_t npaths) +{ + virLockManagerPtr lock; + virTimeBackOffVar timebackoff; + int fd = -1; + int rv;
gcc complains that rv might be uninitialized.
+ int ret = -1; + + virMutexLock(&lockManagerMutex); + + if (!(lock = virSecurityManagerNewLockManager(mgr, paths, npaths))) + goto cleanup; + + if (virTimeBackOffStart(&timebackoff, 1, LOCK_ACQUIRE_TIMEOUT * 1000) < 0) + goto cleanup; + while (virTimeBackOffWait(&timebackoff)) { + rv = virLockManagerAcquire(lock, NULL, + VIR_LOCK_MANAGER_ACQUIRE_ROLLBACK, + VIR_DOMAIN_LOCK_FAILURE_DEFAULT, &fd); + + if (rv >= 0) + break; + + if (virGetLastErrorCode() == VIR_ERR_RESOURCE_BUSY) + continue; + + goto cleanup; + } + + if (rv < 0) + goto cleanup; + + mgr->fd = fd; + fd = -1; + + ret = 0; + cleanup: + virLockManagerFree(lock); + VIR_FORCE_CLOSE(fd); + if (ret < 0) + virMutexUnlock(&lockManagerMutex); + return ret; +}
-- IBM Systems Linux on Z & Virtualization Development ------------------------------------------------------------------------ IBM Deutschland Research & Development GmbH Schönaicher Str. 220, 71032 Böblingen Phone: +49 7031 16 1819 ------------------------------------------------------------------------ Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On 09/10/2018 02:19 PM, Bjoern Walk wrote:
Michal Privoznik <mprivozn@redhat.com> [2018-09-10, 11:36AM +0200]:
+int +virSecurityManagerMetadataLock(virSecurityManagerPtr mgr, + const char * const *paths, + size_t npaths) +{ + virLockManagerPtr lock; + virTimeBackOffVar timebackoff; + int fd = -1; + int rv;
gcc complains that rv might be uninitialized.
Right, if ..
+ int ret = -1; + + virMutexLock(&lockManagerMutex); + + if (!(lock = virSecurityManagerNewLockManager(mgr, paths, npaths))) + goto cleanup; + + if (virTimeBackOffStart(&timebackoff, 1, LOCK_ACQUIRE_TIMEOUT * 1000) < 0) + goto cleanup; + while (virTimeBackOffWait(&timebackoff)) {
.. this is never true (which is impossible).
+ rv = virLockManagerAcquire(lock, NULL, + VIR_LOCK_MANAGER_ACQUIRE_ROLLBACK, + VIR_DOMAIN_LOCK_FAILURE_DEFAULT, &fd); + + if (rv >= 0) + break; + + if (virGetLastErrorCode() == VIR_ERR_RESOURCE_BUSY) + continue; + + goto cleanup; + } + + if (rv < 0) + goto cleanup;
Okay, I'll fix this is my local branch. Thanks, Michal

On 09/10/2018 05:36 AM, Michal Privoznik wrote:
Two new APIs are added so that security driver can lock and unlock paths it wishes to touch. These APIs are not for other drivers to call but security drivers (DAC and SELinux). That is the reason these APIs are not exposed through our libvirt_private.syms file.
Three interesting things happen in this commit. The first is the global @lockManagerMutex. Unfortunately, this has to exist so that there is only on thread talking to virtlockd at a time. If there
s/on/one
were more threads and one of them closed the connection prematurely, it would cause virtlockd killing libvirtd. Instead of complicated code that would handle that, let's have a mutex and keep the code simple.
The second interesting thing is keeping connection open between lock and unlock API calls. This is achieved by duplicating client FD and keeping it open until unlock is called. This trick is used by regular disk content locking code when the FD is leaked to qemu.
Finally, the third thing is polling implemented at client side. Since virtlockd has only one thread that handles locking requests, all it can do is either acquire lock or error out. Therefore, the polling has to be implemented in client. The polling is capped at 60 second timeout, which should be plenty since the metadata lock is held only for a fraction of a second.
I smell a configurable timeout instead of 60 seconds, but that's mainly because my senses are more acutely aware of such configurable timeout changes given some recent on list patches for client lock timeouts when (as I assume) there is a client gathering stats that takes a long time.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_manager.c | 135 ++++++++++++++++++++++++++++++++++++++++ src/security/security_manager.h | 7 +++ 2 files changed, 142 insertions(+)
diff --git a/src/security/security_manager.c b/src/security/security_manager.c index 5c8370c159..dd5c3ac7e5 100644 --- a/src/security/security_manager.c +++ b/src/security/security_manager.c @@ -29,11 +29,15 @@ #include "virobject.h" #include "virlog.h" #include "locking/lock_manager.h" +#include "virfile.h" +#include "virtime.h"
#define VIR_FROM_THIS VIR_FROM_SECURITY
VIR_LOG_INIT("security.security_manager");
+virMutex lockManagerMutex = VIR_MUTEX_INITIALIZER; + struct _virSecurityManager { virObjectLockable parent;
@@ -43,6 +47,7 @@ struct _virSecurityManager { void *privateData;
virLockManagerPluginPtr lockPlugin; + int fd;
Would you mind renaming to "clientfd" or at least "noting" in comments here where this fd is duplicated so that future me doesn't need to go back and read the commit where this was added to refresh my memory. Seeing just fd and searching on that is like finding a needle in the haystack of fd's.
};
[...]
+ +/* How many seconds should we try to acquire the lock before + * giving up. */
How much wood could a woodchuck chuck wood if a woodchuck could chuck wood?
+#define LOCK_ACQUIRE_TIMEOUT 60 + +int +virSecurityManagerMetadataLock(virSecurityManagerPtr mgr, + const char * const *paths, + size_t npaths) +{ + virLockManagerPtr lock; + virTimeBackOffVar timebackoff; + int fd = -1; + int rv;
I know you know already by rv = -1
+ int ret = -1; + + virMutexLock(&lockManagerMutex); + + if (!(lock = virSecurityManagerNewLockManager(mgr, paths, npaths))) + goto cleanup; +
Just a couple of minor things, Reviewed-by: John Ferlan <jferlan@redhat.com> John

On 09/17/2018 11:14 PM, John Ferlan wrote:
On 09/10/2018 05:36 AM, Michal Privoznik wrote:
Two new APIs are added so that security driver can lock and unlock paths it wishes to touch. These APIs are not for other drivers to call but security drivers (DAC and SELinux). That is the reason these APIs are not exposed through our libvirt_private.syms file.
Three interesting things happen in this commit. The first is the global @lockManagerMutex. Unfortunately, this has to exist so that there is only on thread talking to virtlockd at a time. If there
s/on/one
were more threads and one of them closed the connection prematurely, it would cause virtlockd killing libvirtd. Instead of complicated code that would handle that, let's have a mutex and keep the code simple.
The second interesting thing is keeping connection open between lock and unlock API calls. This is achieved by duplicating client FD and keeping it open until unlock is called. This trick is used by regular disk content locking code when the FD is leaked to qemu.
Finally, the third thing is polling implemented at client side. Since virtlockd has only one thread that handles locking requests, all it can do is either acquire lock or error out. Therefore, the polling has to be implemented in client. The polling is capped at 60 second timeout, which should be plenty since the metadata lock is held only for a fraction of a second.
I smell a configurable timeout instead of 60 seconds, but that's mainly because my senses are more acutely aware of such configurable timeout changes given some recent on list patches for client lock timeouts when (as I assume) there is a client gathering stats that takes a long time.
Hopefully not. This indeed is a lock that guards chown()/setfilecon_raw(). I wouldn't expect them to took very long even if there were dozens of daemons fighting over the lock. Michal

[...] VIR_FROM_THIS VIR_FROM_SECURITY
VIR_LOG_INIT("security.security_manager");
+virMutex lockManagerMutex = VIR_MUTEX_INITIALIZER; + struct _virSecurityManager { virObjectLockable parent;
@@ -43,6 +47,7 @@ struct _virSecurityManager { void *privateData;
virLockManagerPluginPtr lockPlugin; + int fd; };
static virClassPtr virSecurityManagerClass; @@ -57,6 +62,7 @@ void virSecurityManagerDispose(void *obj) mgr->drv->close(mgr);
virObjectUnref(mgr->lockPlugin); + VIR_FORCE_CLOSE(mgr->fd);
VIR_FREE(mgr->privateData); } @@ -109,6 +115,7 @@ virSecurityManagerNewDriver(virSecurityDriverPtr drv, mgr->flags = flags; mgr->virtDriver = virtDriver; VIR_STEAL_PTR(mgr->privateData, privateData); + mgr->fd = -1;
if (drv->open(mgr) < 0) goto error; @@ -1263,3 +1270,131 @@ virSecurityManagerRestoreTPMLabels(virSecurityManagerPtr mgr,
return 0; } + + +static virLockManagerPtr +virSecurityManagerNewLockManager(virSecurityManagerPtr mgr, + const char * const *paths, + size_t npaths) +{ + virLockManagerPtr lock; + virLockManagerParam params[] = { + { .type = VIR_LOCK_MANAGER_PARAM_TYPE_UUID, + .key = "uuid", + }, + { .type = VIR_LOCK_MANAGER_PARAM_TYPE_STRING, + .key = "name", + .value = { .cstr = "libvirtd-sec" }, + }, + { .type = VIR_LOCK_MANAGER_PARAM_TYPE_UINT, + .key = "pid", + .value = { .iv = getpid() }, + }, + }; + const unsigned int flags = 0; + size_t i; + + if (virGetHostUUID(params[0].value.uuid) < 0) + return NULL; + + if (!(lock = virLockManagerNew(virLockManagerPluginGetDriver(mgr->lockPlugin), + VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON, + ARRAY_CARDINALITY(params), + params, + flags))) + return NULL; + + for (i = 0; i < npaths; i++) { + if (virLockManagerAddResource(lock, + VIR_LOCK_MANAGER_RESOURCE_TYPE_METADATA, + paths[i], 0, NULL, 0) < 0) + goto error; + } + + return lock; + error: + virLockManagerFree(lock); + return NULL; +} + + +/* How many seconds should we try to acquire the lock before + * giving up. */ +#define LOCK_ACQUIRE_TIMEOUT 60 + +int +virSecurityManagerMetadataLock(virSecurityManagerPtr mgr, + const char * const *paths, + size_t npaths) +{ + virLockManagerPtr lock; + virTimeBackOffVar timebackoff; + int fd = -1; + int rv; + int ret = -1; + + virMutexLock(&lockManagerMutex); + + if (!(lock = virSecurityManagerNewLockManager(mgr, paths, npaths))) + goto cleanup; +
After seeing it in use in patch 19 and thinking about it for a very short period of time, would it make more sense to store @lock somewhere so that virSecurityManagerMetadataUnlock doesn't fail because the virSecurityManagerNewLockManager fails? The @mgr is mutex locked so that nothing can change @mgr while this Meta Lock/Unlock is occurring. It'd be a shame to not call virLockManagerRelease just because we didn't save @lock John
+ if (virTimeBackOffStart(&timebackoff, 1, LOCK_ACQUIRE_TIMEOUT * 1000) < 0) + goto cleanup; + while (virTimeBackOffWait(&timebackoff)) { + rv = virLockManagerAcquire(lock, NULL, + VIR_LOCK_MANAGER_ACQUIRE_ROLLBACK, + VIR_DOMAIN_LOCK_FAILURE_DEFAULT, &fd); + + if (rv >= 0) + break; + + if (virGetLastErrorCode() == VIR_ERR_RESOURCE_BUSY) + continue; + + goto cleanup; + } + + if (rv < 0) + goto cleanup; + + mgr->fd = fd; + fd = -1; + + ret = 0; + cleanup: + virLockManagerFree(lock); + VIR_FORCE_CLOSE(fd); + if (ret < 0) + virMutexUnlock(&lockManagerMutex); + return ret; +} + + +int +virSecurityManagerMetadataUnlock(virSecurityManagerPtr mgr, + const char * const *paths, + size_t npaths) +{ + virLockManagerPtr lock; + int fd; + int ret = -1; + + /* lockManagerMutex acquired from previous + * virSecurityManagerMetadataLock() call. */ + + fd = mgr->fd; + mgr->fd = -1; + + if (!(lock = virSecurityManagerNewLockManager(mgr, paths, npaths))) + goto cleanup; + + if (virLockManagerRelease(lock, NULL, 0) < 0) + goto cleanup; + + ret = 0; + cleanup: + virLockManagerFree(lock); + VIR_FORCE_CLOSE(fd); + virMutexUnlock(&lockManagerMutex); + return ret; +}

On 09/18/2018 12:12 AM, John Ferlan wrote:
[...]
+
After seeing it in use in patch 19 and thinking about it for a very short period of time, would it make more sense to store @lock somewhere so that virSecurityManagerMetadataUnlock doesn't fail because the virSecurityManagerNewLockManager fails?
The @mgr is mutex locked so that nothing can change @mgr while this Meta Lock/Unlock is occurring. It'd be a shame to not call virLockManagerRelease just because we didn't save @lock
I will look into this, but quick glance over virSecurityManagerNewLockManager() tells me that it can fail only in OOM case (which means you're in much bigger trouble anyway). (some time passes) So I did take a look and we would need to save both FD and @lock. However, saving those two would not save us from allocating memory. Even then the virLockManagerLockDaemonRelease() allocates memory (for client, connection, etc.). Long story short, I don't think it is worth the effort. Michal

So far the whole transaction handling is done virSecurityDACSetOwnershipInternal(). This needs to change for the sake of security label remembering and locking. Otherwise we would be locking a path when only appending it to transaction list and not when actually relabelling it. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_dac.c | 65 ++++++++++++++++++++++++++++++--------------- 1 file changed, 44 insertions(+), 21 deletions(-) diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 926c9a33c1..52e28b5fda 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -77,12 +77,13 @@ struct _virSecurityDACChownItem { const virStorageSource *src; uid_t uid; gid_t gid; + bool restore; }; typedef struct _virSecurityDACChownList virSecurityDACChownList; typedef virSecurityDACChownList *virSecurityDACChownListPtr; struct _virSecurityDACChownList { - virSecurityDACDataPtr priv; + virSecurityManagerPtr manager; virSecurityDACChownItemPtr *items; size_t nItems; }; @@ -95,7 +96,8 @@ virSecurityDACChownListAppend(virSecurityDACChownListPtr list, const char *path, const virStorageSource *src, uid_t uid, - gid_t gid) + gid_t gid, + bool restore) { int ret = -1; char *tmp = NULL; @@ -111,6 +113,7 @@ virSecurityDACChownListAppend(virSecurityDACChownListPtr list, item->src = src; item->uid = uid; item->gid = gid; + item->restore = restore; if (VIR_APPEND_ELEMENT(list->items, list->nItems, item) < 0) goto cleanup; @@ -159,25 +162,29 @@ static int virSecurityDACTransactionAppend(const char *path, const virStorageSource *src, uid_t uid, - gid_t gid) + gid_t gid, + bool restore) { virSecurityDACChownListPtr list = virThreadLocalGet(&chownList); if (!list) return 0; - if (virSecurityDACChownListAppend(list, path, src, uid, gid) < 0) + if (virSecurityDACChownListAppend(list, path, src, uid, gid, restore) < 0) return -1; return 1; } -static int virSecurityDACSetOwnershipInternal(const virSecurityDACData *priv, - const virStorageSource *src, - const char *path, - uid_t uid, - gid_t gid); +static int virSecurityDACSetOwnership(virSecurityManagerPtr mgr, + const virStorageSource *src, + const char *path, + uid_t uid, + gid_t gid); +static int virSecurityDACRestoreFileLabelInternal(virSecurityManagerPtr mgr, + const virStorageSource *src, + const char *path); /** * virSecurityDACTransactionRun: * @pid: process pid @@ -201,11 +208,16 @@ virSecurityDACTransactionRun(pid_t pid ATTRIBUTE_UNUSED, virSecurityDACChownItemPtr item = list->items[i]; /* TODO Implement rollback */ - if (virSecurityDACSetOwnershipInternal(list->priv, - item->src, - item->path, - item->uid, - item->gid) < 0) + if ((!item->restore && + virSecurityDACSetOwnership(list->manager, + item->src, + item->path, + item->uid, + item->gid) < 0) || + (item->restore && + virSecurityDACRestoreFileLabelInternal(list->manager, + item->src, + item->path) < 0)) return -1; } @@ -455,7 +467,6 @@ virSecurityDACPreFork(virSecurityManagerPtr mgr) static int virSecurityDACTransactionStart(virSecurityManagerPtr mgr) { - virSecurityDACDataPtr priv = virSecurityManagerGetPrivateData(mgr); virSecurityDACChownListPtr list; list = virThreadLocalGet(&chownList); @@ -468,7 +479,7 @@ virSecurityDACTransactionStart(virSecurityManagerPtr mgr) if (VIR_ALLOC(list) < 0) return -1; - list->priv = priv; + list->manager = mgr; if (virThreadLocalSet(&chownList, list) < 0) { virReportSystemError(errno, "%s", @@ -564,11 +575,6 @@ virSecurityDACSetOwnershipInternal(const virSecurityDACData *priv, /* Be aware that this function might run in a separate process. * Therefore, any driver state changes would be thrown away. */ - if ((rc = virSecurityDACTransactionAppend(path, src, uid, gid)) < 0) - return -1; - else if (rc > 0) - return 0; - VIR_INFO("Setting DAC user and group on '%s' to '%ld:%ld'", NULLSTR(src ? src->path : path), (long)uid, (long)gid); @@ -640,11 +646,20 @@ virSecurityDACSetOwnership(virSecurityManagerPtr mgr, { virSecurityDACDataPtr priv = virSecurityManagerGetPrivateData(mgr); struct stat sb; + int rc; if (!path && src && src->path && virStorageSourceIsLocalStorage(src)) path = src->path; + /* Be aware that this function might run in a separate process. + * Therefore, any driver state changes would be thrown away. */ + + if ((rc = virSecurityDACTransactionAppend(path, src, uid, gid, false)) < 0) + return -1; + else if (rc > 0) + return 0; + if (path) { if (stat(path, &sb) < 0) { virReportSystemError(errno, _("unable to stat: %s"), path); @@ -676,6 +691,14 @@ virSecurityDACRestoreFileLabelInternal(virSecurityManagerPtr mgr, virStorageSourceIsLocalStorage(src)) path = src->path; + /* Be aware that this function might run in a separate process. + * Therefore, any driver state changes would be thrown away. */ + + if ((rv = virSecurityDACTransactionAppend(path, src, uid, gid, true)) < 0) + return -1; + else if (rv > 0) + return 0; + if (path) { rv = virSecurityDACRecallLabel(priv, path, &uid, &gid); if (rv < 0) -- 2.16.4

On 09/10/2018 05:36 AM, Michal Privoznik wrote:
So far the whole transaction handling is done virSecurityDACSetOwnershipInternal(). This needs to change for the sake of security label remembering and locking. Otherwise we would be locking a path when only appending it to transaction list and not when actually relabelling it.
relabeling according to my spell checker...
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_dac.c | 65 ++++++++++++++++++++++++++++++--------------- 1 file changed, 44 insertions(+), 21 deletions(-)
diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 926c9a33c1..52e28b5fda 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c
[...]
@@ -201,11 +208,16 @@ virSecurityDACTransactionRun(pid_t pid ATTRIBUTE_UNUSED, virSecurityDACChownItemPtr item = list->items[i];
/* TODO Implement rollback */ - if (virSecurityDACSetOwnershipInternal(list->priv, - item->src, - item->path, - item->uid, - item->gid) < 0) + if ((!item->restore && + virSecurityDACSetOwnership(list->manager, + item->src, + item->path, + item->uid, + item->gid) < 0) ||
yuck - one of more least favorite constructs.
+ (item->restore && + virSecurityDACRestoreFileLabelInternal(list->manager, + item->src, + item->path) < 0)) return -1; }
@@ -455,7 +467,6 @@ virSecurityDACPreFork(virSecurityManagerPtr mgr) static int virSecurityDACTransactionStart(virSecurityManagerPtr mgr) { - virSecurityDACDataPtr priv = virSecurityManagerGetPrivateData(mgr); virSecurityDACChownListPtr list;
list = virThreadLocalGet(&chownList); @@ -468,7 +479,7 @@ virSecurityDACTransactionStart(virSecurityManagerPtr mgr) if (VIR_ALLOC(list) < 0) return -1;
- list->priv = priv; + list->manager = mgr;
Should this be a virObjectRef(mgr) with the corresponding virObjectUnref at the appropriate moment in time? I don't think there's a chance @mgr could be Unref'd otherwise, but every time I see this type construct for an object I want to be 'safe'.
if (virThreadLocalSet(&chownList, list) < 0) { virReportSystemError(errno, "%s", @@ -564,11 +575,6 @@ virSecurityDACSetOwnershipInternal(const virSecurityDACData *priv, /* Be aware that this function might run in a separate process. * Therefore, any driver state changes would be thrown away. */
^^ The above comment is repeated below looks strange "naked" here without the virSecurityDACTransactionAppend... Theoretically though not really true since the TransactionRun will call *SetOwnership or *RestoreFileLabelInternal... IDC, but figured I'd just note it anyway The Ref/Unref is up to you - fix a couple of nits... Reviewed-by: John Ferlan <jferlan@redhat.com> John

On 09/17/2018 11:47 PM, John Ferlan wrote:
On 09/10/2018 05:36 AM, Michal Privoznik wrote:
So far the whole transaction handling is done virSecurityDACSetOwnershipInternal(). This needs to change for the sake of security label remembering and locking. Otherwise we would be locking a path when only appending it to transaction list and not when actually relabelling it.
relabeling according to my spell checker...
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_dac.c | 65 ++++++++++++++++++++++++++++++--------------- 1 file changed, 44 insertions(+), 21 deletions(-)
diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 926c9a33c1..52e28b5fda 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c
[...]
@@ -201,11 +208,16 @@ virSecurityDACTransactionRun(pid_t pid ATTRIBUTE_UNUSED, virSecurityDACChownItemPtr item = list->items[i];
/* TODO Implement rollback */ - if (virSecurityDACSetOwnershipInternal(list->priv, - item->src, - item->path, - item->uid, - item->gid) < 0) + if ((!item->restore && + virSecurityDACSetOwnership(list->manager, + item->src, + item->path, + item->uid, + item->gid) < 0) ||
yuck - one of more least favorite constructs.
+ (item->restore && + virSecurityDACRestoreFileLabelInternal(list->manager, + item->src, + item->path) < 0)) return -1; }
@@ -455,7 +467,6 @@ virSecurityDACPreFork(virSecurityManagerPtr mgr) static int virSecurityDACTransactionStart(virSecurityManagerPtr mgr) { - virSecurityDACDataPtr priv = virSecurityManagerGetPrivateData(mgr); virSecurityDACChownListPtr list;
list = virThreadLocalGet(&chownList); @@ -468,7 +479,7 @@ virSecurityDACTransactionStart(virSecurityManagerPtr mgr) if (VIR_ALLOC(list) < 0) return -1;
- list->priv = priv; + list->manager = mgr;
Should this be a virObjectRef(mgr) with the corresponding virObjectUnref at the appropriate moment in time? I don't think there's a chance @mgr could be Unref'd otherwise, but every time I see this type construct for an object I want to be 'safe'.
I hear you, but I don't think that ref/unref is necessary here. It's actually @mgr who called these driver APIs. Michal

Firstly, the message that says we're setting uid:gid shouldn't be called from virSecurityDACSetOwnershipInternal() because virSecurityDACRestoreFileLabelInternal() is calling it too. Secondly, there are places between us reporting label restore and us actually doing it where we can quit. Don't say we're doing something until we are actually about to do it. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_dac.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 52e28b5fda..414e226f0f 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -575,9 +575,6 @@ virSecurityDACSetOwnershipInternal(const virSecurityDACData *priv, /* Be aware that this function might run in a separate process. * Therefore, any driver state changes would be thrown away. */ - VIR_INFO("Setting DAC user and group on '%s' to '%ld:%ld'", - NULLSTR(src ? src->path : path), (long)uid, (long)gid); - if (priv && src && priv->chownCallback) { rc = priv->chownCallback(src, uid, gid); /* here path is used only for error messages */ @@ -670,6 +667,9 @@ virSecurityDACSetOwnership(virSecurityManagerPtr mgr, return -1; } + VIR_INFO("Setting DAC user and group on '%s' to '%ld:%ld'", + NULLSTR(src ? src->path : path), (long)uid, (long)gid); + return virSecurityDACSetOwnershipInternal(priv, src, path, uid, gid); } @@ -684,9 +684,6 @@ virSecurityDACRestoreFileLabelInternal(virSecurityManagerPtr mgr, uid_t uid = 0; /* By default return to root:root */ gid_t gid = 0; - VIR_INFO("Restoring DAC user and group on '%s'", - NULLSTR(src ? src->path : path)); - if (!path && src && src->path && virStorageSourceIsLocalStorage(src)) path = src->path; @@ -707,6 +704,9 @@ virSecurityDACRestoreFileLabelInternal(virSecurityManagerPtr mgr, return 0; } + VIR_INFO("Restoring DAC user and group on '%s' to %ld:%ld", + NULLSTR(src ? src->path : path), (long)uid, (long)gid); + return virSecurityDACSetOwnershipInternal(priv, src, path, uid, gid); } -- 2.16.4

On 09/10/2018 05:36 AM, Michal Privoznik wrote:
Firstly, the message that says we're setting uid:gid shouldn't be called from virSecurityDACSetOwnershipInternal() because virSecurityDACRestoreFileLabelInternal() is calling it too. Secondly, there are places between us reporting label restore and us actually doing it where we can quit. Don't say we're doing something until we are actually about to do it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_dac.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
Of course both of this adjustments bring to the forefront the odditity of using virSecurityDACTransactionAppend when/if @path == NULL - I mean, what's the purpose then? *Especially* if there's more than one! Dang - so does this mean the previous patch needs adjustment? For this though, Reviewed-by: John Ferlan <jferlan@redhat.com> John

Lock all the paths we want to relabel to mutually exclude other libvirt daemons. The only culprit here hitch here is that directories can't be locked. Therefore, when relabeling a directory do not lock it (this happens only when setting up some domain private paths anyway, e.g. huge pages directory). Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_dac.c | 37 ++++++++++++++++++++++++++++++++++--- 1 file changed, 34 insertions(+), 3 deletions(-) diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 414e226f0f..e8fd4a9132 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -202,8 +202,28 @@ virSecurityDACTransactionRun(pid_t pid ATTRIBUTE_UNUSED, void *opaque) { virSecurityDACChownListPtr list = opaque; + const char **paths = NULL; + size_t npaths = 0; size_t i; + int rv; + int ret = -1; + if (VIR_ALLOC_N(paths, list->nItems) < 0) + return -1; + + for (i = 0; i < list->nItems; i++) { + const char *p = list->items[i]->path; + + if (virFileIsDir(p)) + continue; + + VIR_APPEND_ELEMENT_COPY_INPLACE(paths, npaths, p); + } + + if (virSecurityManagerMetadataLock(list->manager, paths, npaths) < 0) + goto cleanup; + + rv = 0; for (i = 0; i < list->nItems; i++) { virSecurityDACChownItemPtr item = list->items[i]; @@ -217,11 +237,22 @@ virSecurityDACTransactionRun(pid_t pid ATTRIBUTE_UNUSED, (item->restore && virSecurityDACRestoreFileLabelInternal(list->manager, item->src, - item->path) < 0)) - return -1; + item->path) < 0)) { + rv = -1; + break; + } } - return 0; + if (virSecurityManagerMetadataUnlock(list->manager, paths, npaths) < 0) + goto cleanup; + + if (rv < 0) + goto cleanup; + + ret = 0; + cleanup: + VIR_FREE(paths); + return ret; } -- 2.16.4

On 09/10/2018 05:36 AM, Michal Privoznik wrote:
Lock all the paths we want to relabel to mutually exclude other libvirt daemons.
The only culprit here hitch here is that directories can't be
reread the above and fix and fix ;-)
locked. Therefore, when relabeling a directory do not lock it (this happens only when setting up some domain private paths anyway, e.g. huge pages directory).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_dac.c | 37 ++++++++++++++++++++++++++++++++++--- 1 file changed, 34 insertions(+), 3 deletions(-)
diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 414e226f0f..e8fd4a9132 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -202,8 +202,28 @@ virSecurityDACTransactionRun(pid_t pid ATTRIBUTE_UNUSED,
The description for this function needs some updating to describe this extra step and what/how/why it's done this way. Yeah, I know go back to the commit message and find it...
void *opaque) { virSecurityDACChownListPtr list = opaque; + const char **paths = NULL; + size_t npaths = 0; size_t i; + int rv; + int ret = -1;
+ if (VIR_ALLOC_N(paths, list->nItems) < 0) + return -1; + + for (i = 0; i < list->nItems; i++) { + const char *p = list->items[i]->path; + + if (virFileIsDir(p)) + continue; + + VIR_APPEND_ELEMENT_COPY_INPLACE(paths, npaths, p); + } + + if (virSecurityManagerMetadataLock(list->manager, paths, npaths) < 0) + goto cleanup; + + rv = 0; for (i = 0; i < list->nItems; i++) { virSecurityDACChownItemPtr item = list->items[i];
@@ -217,11 +237,22 @@ virSecurityDACTransactionRun(pid_t pid ATTRIBUTE_UNUSED, (item->restore && virSecurityDACRestoreFileLabelInternal(list->manager, item->src, - item->path) < 0)) - return -1; + item->path) < 0)) { + rv = -1; + break;
If you'd used the non || construct, I think this would look cleaner: if (!item->restore) rv = virSecurityDACSetOwnership else rv = virSecurityDACRestoreFileLabelInternal if (rv < 0) break; So much easier to read.
+ } }
- return 0; + if (virSecurityManagerMetadataUnlock(list->manager, paths, npaths) < 0) + goto cleanup;
See my second note in patch 16 - Perhaps it's better to not repeat the same sequence since paths/npaths doesn't change. In any case, for this code... With a couple of adjustments, Reviewed-by: John Ferlan <jferlan@redhat.com> John
+ + if (rv < 0) + goto cleanup; + + ret = 0; + cleanup: + VIR_FREE(paths); + return ret; }

This label is used in both successful and error paths. Therefore it should be named 'cleanup' and not 'err'. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_selinux.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index 288f3628f7..35f18e1738 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -1306,13 +1306,13 @@ virSecuritySELinuxRestoreFileLabel(virSecurityManagerPtr mgr, if (virFileResolveLink(path, &newpath) < 0) { VIR_WARN("cannot resolve symlink %s: %s", path, virStrerror(errno, ebuf, sizeof(ebuf))); - goto err; + goto cleanup; } if (stat(newpath, &buf) != 0) { VIR_WARN("cannot stat %s: %s", newpath, virStrerror(errno, ebuf, sizeof(ebuf))); - goto err; + goto cleanup; } if (getContext(mgr, newpath, buf.st_mode, &fcon) < 0) { @@ -1325,7 +1325,7 @@ virSecuritySELinuxRestoreFileLabel(virSecurityManagerPtr mgr, rc = virSecuritySELinuxSetFilecon(mgr, newpath, fcon); } - err: + cleanup: freecon(fcon); VIR_FREE(newpath); return rc; -- 2.16.4

On 09/10/2018 05:36 AM, Michal Privoznik wrote:
This label is used in both successful and error paths. Therefore it should be named 'cleanup' and not 'err'.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_selinux.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com> John (I'm smelling dinner ... so hopefully I can complete this today ;-))

Firstly, the following code pattern is harder to follow: if (func() < 0) { error(); } else { /* success */ } We should put 'goto cleanup' into the error branch and move the else branch one level up. Secondly, 'rc' should really be named 'ret' because it holds return value of the function. Not some intermediate value. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_selinux.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index 35f18e1738..72d12c9df1 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -1291,9 +1291,9 @@ virSecuritySELinuxRestoreFileLabel(virSecurityManagerPtr mgr, { struct stat buf; security_context_t fcon = NULL; - int rc = -1; char *newpath = NULL; char ebuf[1024]; + int ret = -1; /* Some paths are auto-generated, so let's be safe here and do * nothing if nothing is needed. @@ -1320,15 +1320,18 @@ virSecuritySELinuxRestoreFileLabel(virSecurityManagerPtr mgr, * which makes this an expected non error */ VIR_WARN("cannot lookup default selinux label for %s", newpath); - rc = 0; - } else { - rc = virSecuritySELinuxSetFilecon(mgr, newpath, fcon); + ret = 0; + goto cleanup; } + if (virSecuritySELinuxSetFilecon(mgr, newpath, fcon) < 0) + goto cleanup; + + ret = 0; cleanup: freecon(fcon); VIR_FREE(newpath); - return rc; + return ret; } -- 2.16.4

On 09/10/2018 05:36 AM, Michal Privoznik wrote:
Firstly, the following code pattern is harder to follow:
if (func() < 0) { error(); } else { /* success */ }
We should put 'goto cleanup' into the error branch and move the else branch one level up. Secondly, 'rc' should really be named 'ret' because it holds return value of the function. Not some intermediate value.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_selinux.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com> John can't wait to see the selinux patches for this list stuff... Oh boy! Getting punchy, must be closer to dinner...

So far the whole transaction handling is done virSecuritySELinuxSetFileconHelper(). This needs to change for the sake of security label remembering and locking. Otherwise we would be locking a path when only appending it to transaction list and not when actually relabelling it. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_selinux.c | 35 ++++++++++++++++++++++++++--------- 1 file changed, 26 insertions(+), 9 deletions(-) diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index 72d12c9df1..f6416010f9 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -1146,20 +1146,14 @@ virSecuritySELinuxGetProcessLabel(virSecurityManagerPtr mgr ATTRIBUTE_UNUSED, * return 1 if labelling was not possible. Otherwise, require a label * change, and return 0 for success, -1 for failure. */ static int -virSecuritySELinuxSetFileconHelper(const char *path, const char *tcon, - bool optional, bool privileged) +virSecuritySELinuxSetFileconImpl(const char *path, const char *tcon, + bool optional, bool privileged) { security_context_t econ; - int rc; /* Be aware that this function might run in a separate process. * Therefore, any driver state changes would be thrown away. */ - if ((rc = virSecuritySELinuxTransactionAppend(path, tcon, optional)) < 0) - return -1; - else if (rc > 0) - return 0; - VIR_INFO("Setting SELinux context on '%s' to '%s'", path, tcon); if (setfilecon_raw(path, (VIR_SELINUX_CTX_CONST char *)tcon) < 0) { @@ -1213,6 +1207,22 @@ virSecuritySELinuxSetFileconHelper(const char *path, const char *tcon, return 0; } + +static int +virSecuritySELinuxSetFileconHelper(const char *path, const char *tcon, + bool optional, bool privileged) +{ + int rc; + + if ((rc = virSecuritySELinuxTransactionAppend(path, tcon, optional)) < 0) + return -1; + else if (rc > 0) + return 0; + + return virSecuritySELinuxSetFileconImpl(path, tcon, optional, privileged); +} + + static int virSecuritySELinuxSetFileconOptional(virSecurityManagerPtr mgr, const char *path, const char *tcon) @@ -1289,10 +1299,12 @@ static int virSecuritySELinuxRestoreFileLabel(virSecurityManagerPtr mgr, const char *path) { + bool privileged = virSecurityManagerGetPrivileged(mgr); struct stat buf; security_context_t fcon = NULL; char *newpath = NULL; char ebuf[1024]; + int rc; int ret = -1; /* Some paths are auto-generated, so let's be safe here and do @@ -1324,7 +1336,12 @@ virSecuritySELinuxRestoreFileLabel(virSecurityManagerPtr mgr, goto cleanup; } - if (virSecuritySELinuxSetFilecon(mgr, newpath, fcon) < 0) + if ((rc = virSecuritySELinuxTransactionAppend(path, fcon, false)) < 0) + return -1; + else if (rc > 0) + return 0; + + if (virSecuritySELinuxSetFileconImpl(newpath, fcon, false, privileged) < 0) goto cleanup; ret = 0; -- 2.16.4

On 09/10/2018 05:36 AM, Michal Privoznik wrote:
So far the whole transaction handling is done virSecuritySELinuxSetFileconHelper(). This needs to change for the sake of security label remembering and locking. Otherwise we would be locking a path when only appending it to transaction list and not when actually relabelling it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_selinux.c | 35 ++++++++++++++++++++++++++--------- 1 file changed, 26 insertions(+), 9 deletions(-)
I shall note only that you didn't follow what you did for DAC with regard to copying around the comment: /* Be aware that this function might run in a separate process. * Therefore, any driver state changes would be thrown away. */ Beyond that - this is light years cleaner than DAC, thankfully because my wife will be not be happy if I go on much longer ;-) I trust you can move comments appropriately... Reviewed-by: John Ferlan <jferlan@redhat.com> John

On 09/18/2018 12:25 AM, John Ferlan wrote:
On 09/10/2018 05:36 AM, Michal Privoznik wrote:
So far the whole transaction handling is done virSecuritySELinuxSetFileconHelper(). This needs to change for the sake of security label remembering and locking. Otherwise we would be locking a path when only appending it to transaction list and not when actually relabelling it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_selinux.c | 35 ++++++++++++++++++++++++++--------- 1 file changed, 26 insertions(+), 9 deletions(-)
I shall note only that you didn't follow what you did for DAC with regard to copying around the comment:
/* Be aware that this function might run in a separate process. * Therefore, any driver state changes would be thrown away. */
That is because in selinux driver only virSecuritySELinuxSetFileconImpl() is called from transactionCommit callback. Once I implement label remembering SELinux driver will become more complicated too. Don't worry ;-) IOW, DAC differentiates set/restore, SELinux doesn't (in transaction code). Michal

Lock all the paths we want to relabel to mutually exclude other libvirt daemons. The only culprit here hitch here is that directories can't be locked. Therefore, when relabeling a directory do not lock it (this happens only when setting up some domain private paths anyway, e.g. huge pages directory). Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_selinux.c | 43 +++++++++++++++++++++++++++++++++++------ 1 file changed, 37 insertions(+), 6 deletions(-) diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index f6416010f9..056637e4cb 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -90,7 +90,7 @@ struct _virSecuritySELinuxContextItem { typedef struct _virSecuritySELinuxContextList virSecuritySELinuxContextList; typedef virSecuritySELinuxContextList *virSecuritySELinuxContextListPtr; struct _virSecuritySELinuxContextList { - bool privileged; + virSecurityManagerPtr manager; virSecuritySELinuxContextItemPtr *items; size_t nItems; }; @@ -212,8 +212,29 @@ virSecuritySELinuxTransactionRun(pid_t pid ATTRIBUTE_UNUSED, void *opaque) { virSecuritySELinuxContextListPtr list = opaque; + bool privileged = virSecurityManagerGetPrivileged(list->manager); + const char **paths = NULL; + size_t npaths = 0; size_t i; + int rv; + int ret = -1; + if (VIR_ALLOC_N(paths, list->nItems) < 0) + return -1; + + for (i = 0; i < list->nItems; i++) { + const char *p = list->items[i]->path; + + if (virFileIsDir(p)) + continue; + + VIR_APPEND_ELEMENT_COPY_INPLACE(paths, npaths, p); + } + + if (virSecurityManagerMetadataLock(list->manager, paths, npaths) < 0) + goto cleanup; + + rv = 0; for (i = 0; i < list->nItems; i++) { virSecuritySELinuxContextItemPtr item = list->items[i]; @@ -221,11 +242,22 @@ virSecuritySELinuxTransactionRun(pid_t pid ATTRIBUTE_UNUSED, if (virSecuritySELinuxSetFileconHelper(item->path, item->tcon, item->optional, - list->privileged) < 0) - return -1; + privileged) < 0) { + rv = -1; + break; + } } - return 0; + if (virSecurityManagerMetadataUnlock(list->manager, paths, npaths) < 0) + goto cleanup; + + if (rv < 0) + goto cleanup; + + ret = 0; + cleanup: + VIR_FREE(paths); + return ret; } @@ -1010,7 +1042,6 @@ virSecuritySELinuxGetDOI(virSecurityManagerPtr mgr ATTRIBUTE_UNUSED) static int virSecuritySELinuxTransactionStart(virSecurityManagerPtr mgr) { - bool privileged = virSecurityManagerGetPrivileged(mgr); virSecuritySELinuxContextListPtr list; list = virThreadLocalGet(&contextList); @@ -1023,7 +1054,7 @@ virSecuritySELinuxTransactionStart(virSecurityManagerPtr mgr) if (VIR_ALLOC(list) < 0) return -1; - list->privileged = privileged; + list->manager = mgr; if (virThreadLocalSet(&contextList, list) < 0) { virReportSystemError(errno, "%s", -- 2.16.4

$SUBJ s/dac/selinux On 09/10/2018 05:36 AM, Michal Privoznik wrote:
Lock all the paths we want to relabel to mutually exclude other libvirt daemons.
The only culprit here hitch here is that directories can't be
Where have I seen this before?
locked. Therefore, when relabeling a directory do not lock it (this happens only when setting up some domain private paths anyway, e.g. huge pages directory).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/security/security_selinux.c | 43 +++++++++++++++++++++++++++++++++++------ 1 file changed, 37 insertions(+), 6 deletions(-)
I shall say "similar comments to my DAC review" (ref/unref, more comments in TransactionRun, and if you want use rv = *SetFilecon* and if (rv < 0) break... And, then you can apply the Reviewed-by: John Ferlan <jferlan@redhat.com> John

Michal Privoznik <mprivozn@redhat.com> [2018-09-10, 11:36AM +0200]:
Technically, this is v4 of:
https://www.redhat.com/archives/libvir-list/2018-August/msg01627.html
However, this is implementing different approach than any of the previous versions.
One of the problems with previous version was that it was too complicated. The main reason for that was that we could not close the connection whilst there was a file locked. So we had to invent a mechanism that would prevent that (on the client side).
These patches implement different approach. They rely on secdriver's transactions which bring all the paths we want to label into one place so that they can be relabelled within different namespace. I'm extending this idea so that transactions run all the time (regardless of domain namespacing) and only at the very last moment is decided which namespace would the relabeling run in.
Metadata locking is then as easy as putting lock/unlock calls around one function.
You can find the patches at my github too:
https://github.com/zippy2/libvirt/tree/disk_metadata_lock_v4_alt
Hey Michal, is was running a quick test with this patch series with two domains sharing a disk image without <shareable/> and SELinux enabled. When starting the second domain, the whole libvirtd daemon hangs for almost a minute until giving the error that the image is locked. I haven't debugged it yet to figure out what happens. Otherwise it's looking good, relabeling is prevented as expected. Bjoern

On 09/12/2018 07:19 AM, Bjoern Walk wrote:
Michal Privoznik <mprivozn@redhat.com> [2018-09-10, 11:36AM +0200]:
Technically, this is v4 of:
https://www.redhat.com/archives/libvir-list/2018-August/msg01627.html
However, this is implementing different approach than any of the previous versions.
One of the problems with previous version was that it was too complicated. The main reason for that was that we could not close the connection whilst there was a file locked. So we had to invent a mechanism that would prevent that (on the client side).
These patches implement different approach. They rely on secdriver's transactions which bring all the paths we want to label into one place so that they can be relabelled within different namespace. I'm extending this idea so that transactions run all the time (regardless of domain namespacing) and only at the very last moment is decided which namespace would the relabeling run in.
Metadata locking is then as easy as putting lock/unlock calls around one function.
You can find the patches at my github too:
https://github.com/zippy2/libvirt/tree/disk_metadata_lock_v4_alt
Hey Michal,
is was running a quick test with this patch series with two domains sharing a disk image without <shareable/> and SELinux enabled. When starting the second domain, the whole libvirtd daemon hangs for almost a minute until giving the error that the image is locked. I haven't debugged it yet to figure out what happens.
Is this on two different hosts or one? I'm unable to reproduce, so can you please attach debugger and share 't a a bt' output with me? When I try to reproduce I always get one domain running and the other failing to start because qemu is unable to do its locking. When I put <shareable/> in both, they start successfully. TBH, I don't run SELinux enabled host so I'm testing DAC only, but the changes to the code are merely the same. So I'm wondering if this is really an issue in my SELinux impl or somewhere else. BTW: The one minute timeout comes from patch 16/23: #define LOCK_ACQUIRE_TIMEOUT 60 Michal

Michal Privoznik <mprivozn@redhat.com> [2018-09-12, 11:32AM +0200]:
On 09/12/2018 07:19 AM, Bjoern Walk wrote:
Michal Privoznik <mprivozn@redhat.com> [2018-09-10, 11:36AM +0200]:
Technically, this is v4 of:
https://www.redhat.com/archives/libvir-list/2018-August/msg01627.html
However, this is implementing different approach than any of the previous versions.
One of the problems with previous version was that it was too complicated. The main reason for that was that we could not close the connection whilst there was a file locked. So we had to invent a mechanism that would prevent that (on the client side).
These patches implement different approach. They rely on secdriver's transactions which bring all the paths we want to label into one place so that they can be relabelled within different namespace. I'm extending this idea so that transactions run all the time (regardless of domain namespacing) and only at the very last moment is decided which namespace would the relabeling run in.
Metadata locking is then as easy as putting lock/unlock calls around one function.
You can find the patches at my github too:
https://github.com/zippy2/libvirt/tree/disk_metadata_lock_v4_alt
Hey Michal,
is was running a quick test with this patch series with two domains sharing a disk image without <shareable/> and SELinux enabled. When starting the second domain, the whole libvirtd daemon hangs for almost a minute until giving the error that the image is locked. I haven't debugged it yet to figure out what happens.
Is this on two different hosts or one?
On the same host.
I'm unable to reproduce, so can you please attach debugger and share 't a a bt' output with me?
(gdb) thread apply all bt Thread 17 (Thread 0x3ff48dff910 (LWP 193353)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff6678c5a8 in udevEventHandleThread (opaque=<optimized out>) at node_device/node_device_udev.c:1603 #3 0x000003ff8c6bad16 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6 Thread 16 (Thread 0x3ff4acfe910 (LWP 193312)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbbe0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6 Thread 15 (Thread 0x3ff4b4ff910 (LWP 193311)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbbe0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6 Thread 14 (Thread 0x3ff64efd910 (LWP 193310)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbbe0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6 Thread 13 (Thread 0x3ff656fe910 (LWP 193309)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbbe0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6 Thread 12 (Thread 0x3ff65eff910 (LWP 193308)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbbe0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6 Thread 11 (Thread 0x3ff67fff910 (LWP 193307)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbba0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6 Thread 10 (Thread 0x3ff84bf7910 (LWP 193306)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbba0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6 Thread 9 (Thread 0x3ff853f8910 (LWP 193305)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbba0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6 Thread 8 (Thread 0x3ff85bf9910 (LWP 193304)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbba0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6 Thread 7 (Thread 0x3ff863fa910 (LWP 193303)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbba0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6 Thread 6 (Thread 0x3ff86bfb910 (LWP 193302)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbbe0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6 Thread 5 (Thread 0x3ff873fc910 (LWP 193301)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbbe0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6 Thread 4 (Thread 0x3ff87bfd910 (LWP 193300)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbbe0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6 Thread 3 (Thread 0x3ff883fe910 (LWP 193299)): #0 0x000003ff8b7113d4 in read () from /lib64/libpthread.so.0 #1 0x000003ff8c65a648 in saferead () from /lib64/libvirt.so.0 #2 0x000003ff8c65a718 in saferead_lim () from /lib64/libvirt.so.0 #3 0x000003ff8c65abe4 in virFileReadHeaderFD () from /lib64/libvirt.so.0 #4 0x000003ff8c69cb5a in virProcessRunInMountNamespace () from /lib64/libvirt.so.0 #5 0x000003ff8c767bd6 in virSecuritySELinuxTransactionCommit () from /lib64/libvirt.so.0 #6 0x000003ff8c75fb1c in virSecurityManagerTransactionCommit () from /lib64/libvirt.so.0 #7 0x000003ff8c75bb70 in virSecurityStackTransactionCommit () from /lib64/libvirt.so.0 #8 0x000003ff8c75fb1c in virSecurityManagerTransactionCommit () from /lib64/libvirt.so.0 #9 0x000003ff6612c492 in qemuSecuritySetAllLabel () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so #10 0x000003ff660bcfd6 in qemuProcessLaunch () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so #11 0x000003ff660c0916 in qemuProcessStart () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so #12 0x000003ff66124a00 in qemuDomainObjStart.constprop.52 () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so #13 0x000003ff66125030 in qemuDomainCreateWithFlags () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so #14 0x000003ff8c87bfca in virDomainCreate () from /lib64/libvirt.so.0 #15 0x000002aa1f2d516c in remoteDispatchDomainCreate (server=<optimized out>, msg=0x2aa4c488b00, args=<optimized out>, rerr=0x3ff883fdae8, client=<optimized out>) at remote/remote_daemon_dispatch_stubs.h:4434 #16 remoteDispatchDomainCreateHelper (server=<optimized out>, client=<optimized out>, msg=0x2aa4c488b00, rerr=0x3ff883fdae8, args=<optimized out>, ret=0x3ff78004c80) at remote/remote_daemon_dispatch_stubs.h:4410 #17 0x000003ff8c78bc80 in virNetServerProgramDispatch () from /lib64/libvirt.so.0 #18 0x000003ff8c792aba in virNetServerHandleJob () from /lib64/libvirt.so.0 #19 0x000003ff8c6bbb16 in virThreadPoolWorker () from /lib64/libvirt.so.0 #20 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #21 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #22 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6 Thread 2 (Thread 0x3ff88bff910 (LWP 193298)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbbe0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6 Thread 1 (Thread 0x3ff8cbd7970 (LWP 193297)): #0 0x000003ff8b5ee502 in poll () from /lib64/libc.so.6 #1 0x000003ff8c65629c in virEventPollRunOnce () from /lib64/libvirt.so.0 #2 0x000003ff8c654caa in virEventRunDefaultImpl () from /lib64/libvirt.so.0 #3 0x000003ff8c7921de in virNetDaemonRun () from /lib64/libvirt.so.0 #4 0x000002aa1f2a4c12 in main (argc=<optimized out>, argv=<optimized out>) at remote/remote_daemon.c:1461
When I try to reproduce I always get one domain running and the other failing to start because qemu is unable to do its locking. When I put <shareable/> in both, they start successfully.
Yes, I get the same (expected) behaviour, just with the timeout.
TBH, I don't run SELinux enabled host so I'm testing DAC only, but the changes to the code are merely the same. So I'm wondering if this is really an issue in my SELinux impl or somewhere else.
BTW: The one minute timeout comes from patch 16/23:
#define LOCK_ACQUIRE_TIMEOUT 60
Michal
It's entirely possible that my setup is broken, this was just a quick test. If you can't reproduce it or make sense out of the backtrace then it's fine, I will try to put some time to debug it next week. I was just interested if this would fix a long-standing bug where this scenario would crash the first domain because the label changed. Bjoern -- IBM Systems Linux on Z & Virtualization Development ------------------------------------------------------------------------ IBM Deutschland Research & Development GmbH Schönaicher Str. 220, 71032 Böblingen Phone: +49 7031 16 1819 ------------------------------------------------------------------------ Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

Bjoern Walk <bwalk@linux.ibm.com> [2018-09-12, 01:17PM +0200]:
Michal Privoznik <mprivozn@redhat.com> [2018-09-12, 11:32AM +0200]:
On 09/12/2018 07:19 AM, Bjoern Walk wrote:
Michal Privoznik <mprivozn@redhat.com> [2018-09-10, 11:36AM +0200]:
Technically, this is v4 of:
https://www.redhat.com/archives/libvir-list/2018-August/msg01627.html
However, this is implementing different approach than any of the previous versions.
One of the problems with previous version was that it was too complicated. The main reason for that was that we could not close the connection whilst there was a file locked. So we had to invent a mechanism that would prevent that (on the client side).
These patches implement different approach. They rely on secdriver's transactions which bring all the paths we want to label into one place so that they can be relabelled within different namespace. I'm extending this idea so that transactions run all the time (regardless of domain namespacing) and only at the very last moment is decided which namespace would the relabeling run in.
Metadata locking is then as easy as putting lock/unlock calls around one function.
You can find the patches at my github too:
https://github.com/zippy2/libvirt/tree/disk_metadata_lock_v4_alt
Hey Michal,
is was running a quick test with this patch series with two domains sharing a disk image without <shareable/> and SELinux enabled. When starting the second domain, the whole libvirtd daemon hangs for almost a minute until giving the error that the image is locked. I haven't debugged it yet to figure out what happens.
Is this on two different hosts or one?
On the same host.
I'm unable to reproduce, so can you please attach debugger and share 't a a bt' output with me?
(gdb) thread apply all bt
Thread 17 (Thread 0x3ff48dff910 (LWP 193353)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff6678c5a8 in udevEventHandleThread (opaque=<optimized out>) at node_device/node_device_udev.c:1603 #3 0x000003ff8c6bad16 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6
Thread 16 (Thread 0x3ff4acfe910 (LWP 193312)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbbe0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6
Thread 15 (Thread 0x3ff4b4ff910 (LWP 193311)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbbe0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6
Thread 14 (Thread 0x3ff64efd910 (LWP 193310)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbbe0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6
Thread 13 (Thread 0x3ff656fe910 (LWP 193309)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbbe0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6
Thread 12 (Thread 0x3ff65eff910 (LWP 193308)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbbe0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6
Thread 11 (Thread 0x3ff67fff910 (LWP 193307)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbba0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6
Thread 10 (Thread 0x3ff84bf7910 (LWP 193306)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbba0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6
Thread 9 (Thread 0x3ff853f8910 (LWP 193305)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbba0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6
Thread 8 (Thread 0x3ff85bf9910 (LWP 193304)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbba0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6
Thread 7 (Thread 0x3ff863fa910 (LWP 193303)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbba0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6
Thread 6 (Thread 0x3ff86bfb910 (LWP 193302)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbbe0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6
Thread 5 (Thread 0x3ff873fc910 (LWP 193301)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbbe0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6
Thread 4 (Thread 0x3ff87bfd910 (LWP 193300)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbbe0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6
Thread 3 (Thread 0x3ff883fe910 (LWP 193299)): #0 0x000003ff8b7113d4 in read () from /lib64/libpthread.so.0 #1 0x000003ff8c65a648 in saferead () from /lib64/libvirt.so.0 #2 0x000003ff8c65a718 in saferead_lim () from /lib64/libvirt.so.0 #3 0x000003ff8c65abe4 in virFileReadHeaderFD () from /lib64/libvirt.so.0 #4 0x000003ff8c69cb5a in virProcessRunInMountNamespace () from /lib64/libvirt.so.0 #5 0x000003ff8c767bd6 in virSecuritySELinuxTransactionCommit () from /lib64/libvirt.so.0 #6 0x000003ff8c75fb1c in virSecurityManagerTransactionCommit () from /lib64/libvirt.so.0 #7 0x000003ff8c75bb70 in virSecurityStackTransactionCommit () from /lib64/libvirt.so.0 #8 0x000003ff8c75fb1c in virSecurityManagerTransactionCommit () from /lib64/libvirt.so.0 #9 0x000003ff6612c492 in qemuSecuritySetAllLabel () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so #10 0x000003ff660bcfd6 in qemuProcessLaunch () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so #11 0x000003ff660c0916 in qemuProcessStart () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so #12 0x000003ff66124a00 in qemuDomainObjStart.constprop.52 () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so #13 0x000003ff66125030 in qemuDomainCreateWithFlags () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so #14 0x000003ff8c87bfca in virDomainCreate () from /lib64/libvirt.so.0 #15 0x000002aa1f2d516c in remoteDispatchDomainCreate (server=<optimized out>, msg=0x2aa4c488b00, args=<optimized out>, rerr=0x3ff883fdae8, client=<optimized out>) at remote/remote_daemon_dispatch_stubs.h:4434 #16 remoteDispatchDomainCreateHelper (server=<optimized out>, client=<optimized out>, msg=0x2aa4c488b00, rerr=0x3ff883fdae8, args=<optimized out>, ret=0x3ff78004c80) at remote/remote_daemon_dispatch_stubs.h:4410 #17 0x000003ff8c78bc80 in virNetServerProgramDispatch () from /lib64/libvirt.so.0 #18 0x000003ff8c792aba in virNetServerHandleJob () from /lib64/libvirt.so.0 #19 0x000003ff8c6bbb16 in virThreadPoolWorker () from /lib64/libvirt.so.0 #20 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #21 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #22 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6
Thread 2 (Thread 0x3ff88bff910 (LWP 193298)): #0 0x000003ff8b70d7b8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000003ff8c6baf92 in virCondWait () from /lib64/libvirt.so.0 #2 0x000003ff8c6bbbe0 in virThreadPoolWorker () from /lib64/libvirt.so.0 #3 0x000003ff8c6bace6 in virThreadHelper () from /lib64/libvirt.so.0 #4 0x000003ff8b7079a8 in start_thread () from /lib64/libpthread.so.0 #5 0x000003ff8b5f9706 in thread_start () from /lib64/libc.so.6
Thread 1 (Thread 0x3ff8cbd7970 (LWP 193297)): #0 0x000003ff8b5ee502 in poll () from /lib64/libc.so.6 #1 0x000003ff8c65629c in virEventPollRunOnce () from /lib64/libvirt.so.0 #2 0x000003ff8c654caa in virEventRunDefaultImpl () from /lib64/libvirt.so.0 #3 0x000003ff8c7921de in virNetDaemonRun () from /lib64/libvirt.so.0 #4 0x000002aa1f2a4c12 in main (argc=<optimized out>, argv=<optimized out>) at remote/remote_daemon.c:1461
When I try to reproduce I always get one domain running and the other failing to start because qemu is unable to do its locking. When I put <shareable/> in both, they start successfully.
Yes, I get the same (expected) behaviour, just with the timeout.
TBH, I don't run SELinux enabled host so I'm testing DAC only, but the changes to the code are merely the same. So I'm wondering if this is really an issue in my SELinux impl or somewhere else.
BTW: The one minute timeout comes from patch 16/23:
#define LOCK_ACQUIRE_TIMEOUT 60
Michal
It's entirely possible that my setup is broken, this was just a quick test. If you can't reproduce it or make sense out of the backtrace then it's fine, I will try to put some time to debug it next week. I was just interested if this would fix a long-standing bug where this scenario would crash the first domain because the label changed.
Bjoern
Still seeing the same timeout. Is this expected behaviour? -- IBM Systems Linux on Z & Virtualization Development ------------------------------------------------------------------------ IBM Deutschland Research & Development GmbH Schönaicher Str. 220, 71032 Böblingen Phone: +49 7031 16 1819 ------------------------------------------------------------------------ Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On 09/19/2018 11:17 AM, Bjoern Walk wrote:
Bjoern Walk <bwalk@linux.ibm.com> [2018-09-12, 01:17PM +0200]:
Michal Privoznik <mprivozn@redhat.com> [2018-09-12, 11:32AM +0200]:
Still seeing the same timeout. Is this expected behaviour?
Nope. I wonder if something has locked the path and forgot to unlock it (however, virtlockd should have unlocked all the paths owned by PID on connection close), or something is still holding the lock and connection opened. Can you see the timeout even when you turn off the selinux driver (security_driver="none' in qemu.conf)? I tried to reproduce the issue yesterday and was unsuccessful. Do you have any steps to reproduce? Michal

Michal Privoznik <mprivozn@redhat.com> [2018-09-19, 11:45AM +0200]:
On 09/19/2018 11:17 AM, Bjoern Walk wrote:
Bjoern Walk <bwalk@linux.ibm.com> [2018-09-12, 01:17PM +0200]:
Michal Privoznik <mprivozn@redhat.com> [2018-09-12, 11:32AM +0200]:
Still seeing the same timeout. Is this expected behaviour?
Nope. I wonder if something has locked the path and forgot to unlock it (however, virtlockd should have unlocked all the paths owned by PID on connection close), or something is still holding the lock and connection opened.
I can reproduce on a freshly booted machine. There should be no rouge lock held anywhere.
Can you see the timeout even when you turn off the selinux driver (security_driver="none' in qemu.conf)? I tried to reproduce the issue
Yes, same issue.
yesterday and was unsuccessful. Do you have any steps to reproduce?
0. Host setup: # /usr/sbin/sestatus SELinux status: enabled SELinuxfs mount: /sys/fs/selinux SELinux root directory: /etc/selinux Loaded policy name: targeted Current mode: enforcing Mode from config file: enforcing Policy MLS status: enabled Policy deny_unknown status: allowed Memory protection checking: actual (secure) Max kernel policy version: 31 # grep -E "^[^#]" /etc/libvirt/qemu.conf lock_manager = "lockd" metadata_lock_manager = "lockd" 1. Define two domains, u1604-1 and u1604-2, using the same image, not shared: <domain type='kvm'> <name>u1604-1</name> <uuid>e49679de-eca9-4a72-8d1a-56f437541157</uuid> <memory unit='KiB'>524288</memory> <currentMemory unit='KiB'>524288</currentMemory> <vcpu placement='static'>2</vcpu> <os> <type arch='s390x' machine='s390-ccw-virtio-2.12'>hvm</type> <boot dev='hd'/> </os> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> <on_crash>preserve</on_crash> <devices> <emulator>/usr/bin/qemu-system-s390x</emulator> <disk type='file' device='disk'> <driver name='qemu' type='qcow2'/> <source file='/var/lib/libvirt/images/u1604.qcow2'/> <target dev='vda' bus='virtio'/> <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0000'/> </disk> <console type='pty'> <target type='sclp' port='0'/> </console> <memballoon model='virtio'> <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0001'/> </memballoon> <panic model='s390'/> </devices> </domain> 2. Start domain u1604-1: # ls -Z /var/lib/libvirt/images/u1604.qcow2 system_u:object_r:svirt_image_t:s0:c387,c937 /var/lib/libvirt/images/u1604.qcow2 3. Start domain u1604-2. 4. Result: the libvirtd hangs for 60 seconds after which the expected locking message appears. Security labels of the image are not changed: # virsh start u1604-2 error: Failed to start domain u1604-2 error: internal error: child reported: resource busy: Lockspace resource '/var/lib/libvirt/images/u1604.qcow2' is locked # ls -Z /var/lib/libvirt/images/u1604.qcow2 system_u:object_r:svirt_image_t:s0:c387,c937 /var/lib/libvirt/images/u1604.qcow2 Backtrace is the same I sent earlier. Let me know if you need more info or if anything is borked in my setup.
Michal

Michal Privoznik <mprivozn@redhat.com> [2018-09-19, 11:45AM +0200]:
On 09/19/2018 11:17 AM, Bjoern Walk wrote:
Bjoern Walk <bwalk@linux.ibm.com> [2018-09-12, 01:17PM +0200]:
Michal Privoznik <mprivozn@redhat.com> [2018-09-12, 11:32AM +0200]:
Still seeing the same timeout. Is this expected behaviour?
Nope. I wonder if something has locked the path and forgot to unlock it (however, virtlockd should have unlocked all the paths owned by PID on connection close), or something is still holding the lock and connection opened.
Can you see the timeout even when you turn off the selinux driver (security_driver="none' in qemu.conf)? I tried to reproduce the issue yesterday and was unsuccessful. Do you have any steps to reproduce?
So, I haven't been able to actually dig into the debugging but we have been able to reproduce this behaviour on x86 (both with SELinux and DAC). Looks like a general problem, if a problem at all, because from what I can see in the code, the 60 seconds timeout is actually intended, or not? The security manager does try for 60 seconds to acquire the lock and only then bails out. Why is this? However, an actual bug is that while the security manager waits for the lock acquire the whole libvirtd hangs, but from what I understood and Marc explained to me, this is more of a pathological error in libvirt behaviour with various domain locks. -- IBM Systems Linux on Z & Virtualization Development -------------------------------------------------- IBM Deutschland Research & Development GmbH Schönaicher Str. 220, 71032 Böblingen Phone: +49 7031 16 1819 -------------------------------------------------- Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On Thu, Sep 27, 2018 at 09:01 AM +0200, Bjoern Walk <bwalk@linux.ibm.com> wrote:
Michal Privoznik <mprivozn@redhat.com> [2018-09-19, 11:45AM +0200]:
On 09/19/2018 11:17 AM, Bjoern Walk wrote:
Bjoern Walk <bwalk@linux.ibm.com> [2018-09-12, 01:17PM +0200]:
Michal Privoznik <mprivozn@redhat.com> [2018-09-12, 11:32AM +0200]:
Still seeing the same timeout. Is this expected behaviour?
Nope. I wonder if something has locked the path and forgot to unlock it (however, virtlockd should have unlocked all the paths owned by PID on connection close), or something is still holding the lock and connection opened.
Can you see the timeout even when you turn off the selinux driver (security_driver="none' in qemu.conf)? I tried to reproduce the issue yesterday and was unsuccessful. Do you have any steps to reproduce?
So, I haven't been able to actually dig into the debugging but we have been able to reproduce this behaviour on x86 (both with SELinux and DAC). Looks like a general problem, if a problem at all, because from what I can see in the code, the 60 seconds timeout is actually intended, or not? The security manager does try for 60 seconds to acquire the lock and only then bails out. Why is this?
Backtrace of libvirtd where it’s hanging (in thread A) (gdb) bt #0 read () from /lib64/libpthread.so.0 #1 read (__nbytes=1024, __buf=0x7f84e401afd0, __fd=31) at /usr/include/bits/unistd.h:44 #2 saferead (fd=fd@entry=31, buf=0x7f84e401afd0, count=count@entry=1024) at util/virfile.c:1061 #3 saferead_lim (fd=31, max_len=max_len@entry=1024, length=length@entry=0x7f84f3fa8240) at util/virfile.c:1345 #4 virFileReadHeaderFD (fd=<optimized out>, maxlen=maxlen@entry=1024, buf=buf@entry=0x7f84f3fa8278) at util/virfile.c:1376 #5 virProcessRunInMountNamespace () at util/virprocess.c:1149 #6 virSecuritySELinuxTransactionCommit (mgr=<optimized out>, pid=23977) at security/security_selinux.c:1106 #7 virSecurityManagerTransactionCommit (mgr=0x7f848c0ffd60, pid=pid@entry=23977) at security/security_manager.c:324 #8 virSecurityStackTransactionCommit (mgr=<optimized out>, pid=23977) at security/security_stack.c:166 #9 virSecurityManagerTransactionCommit (mgr=0x7f848c183760, pid=pid@entry=23977) at security/security_manager.c:324 #10 qemuSecuritySetAllLabel (driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, stdin_path=stdin_path@entry=0x0) at _security.c:56 #11 in qemuProcessLaunch (conn=conn@entry=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, asyncJob=a=QEMU_ASYNC_JOB_START, incoming=incoming@entry=0x0, snapshot=snapshot@entry=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=1 qemu/qemu_process.c:6329 #12 in qemuProcessStart (conn=conn@entry=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, updatedCPU==0x0, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, migrateFrom=migrateFrom@entry=0x0, migrateFd=-1, migratePath=0x0, snapshot=0x=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at qemu/qemu_process.c:6816 … #25 in start_thread () from /lib64/libpthread.so.0 #26 in clone () from /lib64/libc.so.6 Backtrace of the forked process where the process is trying to get the meta data lock for 60 seconds. #0 0x00007f8508572730 in nanosleep () from target:/lib64/libc.so.6 #1 0x00007f850859dff4 in usleep () from target:/lib64/libc.so.6 #2 0x00007f850c26efea in virTimeBackOffWait (var=var@entry=0x7f84f3fa81a0) at util/virtime.c:453 #3 0x00007f850c3108d8 in virSecurityManagerMetadataLock (mgr=0x7f848c183850, paths=<optimized out>, npaths=<optimized out>) at security/security_manager.c:1345 #4 0x00007f850c30e44b in virSecurityDACTransactionRun (pid=pid@entry=23977, opaque=opaque@entry=0x7f84e4021450) at security/security_dac.c:226 #5 0x00007f850c250021 in virProcessNamespaceHelper (opaque=0x7f84e4021450, cb=0x7f850c30e330 <virSecurityDACTransactionRun>, pid=23977, errfd=33) at util/virprocess.c:1100 #6 virProcessRunInMountNamespace () at util/virprocess.c:1140 #7 0x00007f850c30e55c in virSecurityDACTransactionCommit (mgr=<optimized out>, pid=23977) at security/security_dac.c:565 #8 0x00007f850c30eeca in virSecurityManagerTransactionCommit (mgr=0x7f848c183850, pid=pid@entry=23977) at security/security_manager.c:324 #9 0x00007f850c30b36b in virSecurityStackTransactionCommit (mgr=<optimized out>, pid=23977) at security/security_stack.c:166 #10 0x00007f850c30eeca in virSecurityManagerTransactionCommit (mgr=0x7f848c183760, pid=pid@entry=23977) at security/security_manager.c:324 #11 0x00007f84e0586bf2 in qemuSecuritySetAllLabel (driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, stdin_path=stdin_path@entry=0x0) at qemu/qemu_security.c:56 #12 0x00007f84e051b7fd in qemuProcessLaunch (conn=conn@entry=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, incoming=incoming@entry=0x0, snapshot=snapshot@entry=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at qemu/qemu_process.c:6329 #13 0x00007f84e051ee2e in qemuProcessStart (conn=conn@entry=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, updatedCPU=updatedCPU@entry=0x0, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, migrateFrom=migrateFrom@entry=0x0, migrateFd=-1, migratePath=0x0, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at qemu/qemu_process.c:6816 #14 0x00007f84e057f5cd in qemuDomainObjStart (conn=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=0x7f848c1c3a50, flags=flags@entry=0, asyncJob=QEMU_ASYNC_JOB_START) at qemu/qemu_driver.c:7296 #15 0x00007f84e057fc19 in qemuDomainCreateWithFlags (dom=0x7f84e4001620, flags=0) at qemu/qemu_driver.c:7349 #16 0x00007f850c426d57 in virDomainCreate (domain=domain@entry=0x7f84e4001620) at libvirt-domain.c:6534 #17 0x000055a82b2f1cae in remoteDispatchDomainCreate (server=0x55a82c9e16d0, msg=0x55a82ca53e00, args=<optimized out>, rerr=0x7f84f3fa8960, client=0x55a82ca5d180) at remote/remote_daemon_dispatch_stubs.h:4434 #18 remoteDispatchDomainCreateHelper (server=0x55a82c9e16d0, client=0x55a82ca5d180, msg=0x55a82ca53e00, rerr=0x7f84f3fa8960, args=<optimized out>, ret=0x7f84e4003800) at remote/remote_daemon_dispatch_stubs.h:4410 #19 0x00007f850c338734 in virNetServerProgramDispatchCall (msg=0x55a82ca53e00, client=0x55a82ca5d180, server=0x55a82c9e16d0, prog=0x55a82ca4be70) at rpc/virnetserverprogram.c:437 #20 virNetServerProgramDispatch (prog=0x55a82ca4be70, server=server@entry=0x55a82c9e16d0, client=0x55a82ca5d180, msg=0x55a82ca53e00) at rpc/virnetserverprogram.c:304 #21 0x00007f850c33ec3c in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0x55a82c9e16d0) at rpc/virnetserver.c:144 #22 virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x55a82c9e16d0) at rpc/virnetserver.c:165 #23 0x00007f850c26db20 in virThreadPoolWorker (opaque=opaque@entry=0x55a82c9ebe30) at util/virthreadpool.c:167 #24 0x00007f850c26ce2c in virThreadHelper (data=<optimized out>) at util/virthread.c:206 #25 0x00007f8508872594 in start_thread () from target:/lib64/libpthread.so.0 #26 0x00007f85085a5e6f in clone () from target:/lib64/libc.so.6
However, an actual bug is that while the security manager waits for the lock acquire the whole libvirtd hangs, but from what I understood and Marc explained to me, this is more of a pathological error in libvirt behaviour with various domain locks.
Since thread A owns the lock for the virDomainObjList other operations like 'virsh list' won’t return for about 60s. Hope this information will help :)
-- IBM Systems Linux on Z & Virtualization Development -------------------------------------------------- IBM Deutschland Research & Development GmbH Schönaicher Str. 220, 71032 Böblingen Phone: +49 7031 16 1819 -------------------------------------------------- Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294
-- Kind regards / Beste Grüße Marc Hartmayer IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On 09/27/2018 09:57 AM, Marc Hartmayer wrote:
On Thu, Sep 27, 2018 at 09:01 AM +0200, Bjoern Walk <bwalk@linux.ibm.com> wrote:
Michal Privoznik <mprivozn@redhat.com> [2018-09-19, 11:45AM +0200]:
On 09/19/2018 11:17 AM, Bjoern Walk wrote:
Bjoern Walk <bwalk@linux.ibm.com> [2018-09-12, 01:17PM +0200]:
Michal Privoznik <mprivozn@redhat.com> [2018-09-12, 11:32AM +0200]:
Still seeing the same timeout. Is this expected behaviour?
Nope. I wonder if something has locked the path and forgot to unlock it (however, virtlockd should have unlocked all the paths owned by PID on connection close), or something is still holding the lock and connection opened.
Can you see the timeout even when you turn off the selinux driver (security_driver="none' in qemu.conf)? I tried to reproduce the issue yesterday and was unsuccessful. Do you have any steps to reproduce?
So, I haven't been able to actually dig into the debugging but we have been able to reproduce this behaviour on x86 (both with SELinux and DAC). Looks like a general problem, if a problem at all, because from what I can see in the code, the 60 seconds timeout is actually intended, or not? The security manager does try for 60 seconds to acquire the lock and only then bails out. Why is this?
Backtrace of libvirtd where it’s hanging (in thread A)
(gdb) bt #0 read () from /lib64/libpthread.so.0 #1 read (__nbytes=1024, __buf=0x7f84e401afd0, __fd=31) at /usr/include/bits/unistd.h:44 #2 saferead (fd=fd@entry=31, buf=0x7f84e401afd0, count=count@entry=1024) at util/virfile.c:1061 #3 saferead_lim (fd=31, max_len=max_len@entry=1024, length=length@entry=0x7f84f3fa8240) at util/virfile.c:1345 #4 virFileReadHeaderFD (fd=<optimized out>, maxlen=maxlen@entry=1024, buf=buf@entry=0x7f84f3fa8278) at util/virfile.c:1376 #5 virProcessRunInMountNamespace () at util/virprocess.c:1149 #6 virSecuritySELinuxTransactionCommit (mgr=<optimized out>, pid=23977) at security/security_selinux.c:1106 #7 virSecurityManagerTransactionCommit (mgr=0x7f848c0ffd60, pid=pid@entry=23977) at security/security_manager.c:324 #8 virSecurityStackTransactionCommit (mgr=<optimized out>, pid=23977) at security/security_stack.c:166 #9 virSecurityManagerTransactionCommit (mgr=0x7f848c183760, pid=pid@entry=23977) at security/security_manager.c:324 #10 qemuSecuritySetAllLabel (driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, stdin_path=stdin_path@entry=0x0) at _security.c:56 #11 in qemuProcessLaunch (conn=conn@entry=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, asyncJob=a=QEMU_ASYNC_JOB_START, incoming=incoming@entry=0x0, snapshot=snapshot@entry=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=1 qemu/qemu_process.c:6329 #12 in qemuProcessStart (conn=conn@entry=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, updatedCPU==0x0, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, migrateFrom=migrateFrom@entry=0x0, migrateFd=-1, migratePath=0x0, snapshot=0x=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at qemu/qemu_process.c:6816
This is just waiting for forked child to finish. Nothing suspicious here.
…
#25 in start_thread () from /lib64/libpthread.so.0 #26 in clone () from /lib64/libc.so.6
Backtrace of the forked process where the process is trying to get the meta data lock for 60 seconds.> #0 0x00007f8508572730 in nanosleep () from target:/lib64/libc.so.6 #1 0x00007f850859dff4 in usleep () from target:/lib64/libc.so.6 #2 0x00007f850c26efea in virTimeBackOffWait (var=var@entry=0x7f84f3fa81a0) at util/virtime.c:453 #3 0x00007f850c3108d8 in virSecurityManagerMetadataLock (mgr=0x7f848c183850, paths=<optimized out>, npaths=<optimized out>) at security/security_manager.c:1345 #4 0x00007f850c30e44b in virSecurityDACTransactionRun (pid=pid@entry=23977, opaque=opaque@entry=0x7f84e4021450) at security/security_dac.c:226 #5 0x00007f850c250021 in virProcessNamespaceHelper (opaque=0x7f84e4021450, cb=0x7f850c30e330 <virSecurityDACTransactionRun>, pid=23977, errfd=33) at util/virprocess.c:1100 #6 virProcessRunInMountNamespace () at util/virprocess.c:1140 #7 0x00007f850c30e55c in virSecurityDACTransactionCommit (mgr=<optimized out>, pid=23977) at security/security_dac.c:565 #8 0x00007f850c30eeca in virSecurityManagerTransactionCommit (mgr=0x7f848c183850, pid=pid@entry=23977) at security/security_manager.c:324 #9 0x00007f850c30b36b in virSecurityStackTransactionCommit (mgr=<optimized out>, pid=23977) at security/security_stack.c:166 #10 0x00007f850c30eeca in virSecurityManagerTransactionCommit (mgr=0x7f848c183760, pid=pid@entry=23977) at security/security_manager.c:324 #11 0x00007f84e0586bf2 in qemuSecuritySetAllLabel (driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, stdin_path=stdin_path@entry=0x0) at qemu/qemu_security.c:56 #12 0x00007f84e051b7fd in qemuProcessLaunch (conn=conn@entry=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, incoming=incoming@entry=0x0, snapshot=snapshot@entry=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at qemu/qemu_process.c:6329 #13 0x00007f84e051ee2e in qemuProcessStart (conn=conn@entry=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, updatedCPU=updatedCPU@entry=0x0, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, migrateFrom=migrateFrom@entry=0x0, migrateFd=-1, migratePath=0x0, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at qemu/qemu_process.c:6816 #14 0x00007f84e057f5cd in qemuDomainObjStart (conn=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=0x7f848c1c3a50, flags=flags@entry=0, asyncJob=QEMU_ASYNC_JOB_START) at qemu/qemu_driver.c:7296 #15 0x00007f84e057fc19 in qemuDomainCreateWithFlags (dom=0x7f84e4001620, flags=0) at qemu/qemu_driver.c:7349 #16 0x00007f850c426d57 in virDomainCreate (domain=domain@entry=0x7f84e4001620) at libvirt-domain.c:6534 #17 0x000055a82b2f1cae in remoteDispatchDomainCreate (server=0x55a82c9e16d0, msg=0x55a82ca53e00, args=<optimized out>, rerr=0x7f84f3fa8960, client=0x55a82ca5d180) at remote/remote_daemon_dispatch_stubs.h:4434 #18 remoteDispatchDomainCreateHelper (server=0x55a82c9e16d0, client=0x55a82ca5d180, msg=0x55a82ca53e00, rerr=0x7f84f3fa8960, args=<optimized out>, ret=0x7f84e4003800) at remote/remote_daemon_dispatch_stubs.h:4410 #19 0x00007f850c338734 in virNetServerProgramDispatchCall (msg=0x55a82ca53e00, client=0x55a82ca5d180, server=0x55a82c9e16d0, prog=0x55a82ca4be70) at rpc/virnetserverprogram.c:437 #20 virNetServerProgramDispatch (prog=0x55a82ca4be70, server=server@entry=0x55a82c9e16d0, client=0x55a82ca5d180, msg=0x55a82ca53e00) at rpc/virnetserverprogram.c:304 #21 0x00007f850c33ec3c in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0x55a82c9e16d0) at rpc/virnetserver.c:144 #22 virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x55a82c9e16d0) at rpc/virnetserver.c:165 #23 0x00007f850c26db20 in virThreadPoolWorker (opaque=opaque@entry=0x55a82c9ebe30) at util/virthreadpool.c:167 #24 0x00007f850c26ce2c in virThreadHelper (data=<optimized out>) at util/virthread.c:206 #25 0x00007f8508872594 in start_thread () from target:/lib64/libpthread.so.0 #26 0x00007f85085a5e6f in clone () from target:/lib64/libc.so.6
Right, so this is just waiting for virtlockd to lock the paths. virtlockd is obviously unable to do that (as I suggested in my previous e-mail - is perhaps some other process holding the lock?). Can you please enable debug logs for virtlockd, reproduce and share the log with me? Setting this in /etc/libvirt/virtlockd.conf should be enough: log_level=1 log_filters="4:event 3:rpc" log_outputs="1:file:/var/log/virtlockd.lod" Thanks, Michal

On Thu, Sep 27, 2018 at 10:14 AM +0200, Michal Privoznik <mprivozn@redhat.com> wrote:
On 09/27/2018 09:57 AM, Marc Hartmayer wrote:
On Thu, Sep 27, 2018 at 09:01 AM +0200, Bjoern Walk <bwalk@linux.ibm.com> wrote:
Michal Privoznik <mprivozn@redhat.com> [2018-09-19, 11:45AM +0200]:
On 09/19/2018 11:17 AM, Bjoern Walk wrote:
Bjoern Walk <bwalk@linux.ibm.com> [2018-09-12, 01:17PM +0200]:
Michal Privoznik <mprivozn@redhat.com> [2018-09-12, 11:32AM +0200]:
Still seeing the same timeout. Is this expected behaviour?
Nope. I wonder if something has locked the path and forgot to unlock it (however, virtlockd should have unlocked all the paths owned by PID on connection close), or something is still holding the lock and connection opened.
Can you see the timeout even when you turn off the selinux driver (security_driver="none' in qemu.conf)? I tried to reproduce the issue yesterday and was unsuccessful. Do you have any steps to reproduce?
So, I haven't been able to actually dig into the debugging but we have been able to reproduce this behaviour on x86 (both with SELinux and DAC). Looks like a general problem, if a problem at all, because from what I can see in the code, the 60 seconds timeout is actually intended, or not? The security manager does try for 60 seconds to acquire the lock and only then bails out. Why is this?
Backtrace of libvirtd where it’s hanging (in thread A)
(gdb) bt #0 read () from /lib64/libpthread.so.0 #1 read (__nbytes=1024, __buf=0x7f84e401afd0, __fd=31) at /usr/include/bits/unistd.h:44 #2 saferead (fd=fd@entry=31, buf=0x7f84e401afd0, count=count@entry=1024) at util/virfile.c:1061 #3 saferead_lim (fd=31, max_len=max_len@entry=1024, length=length@entry=0x7f84f3fa8240) at util/virfile.c:1345 #4 virFileReadHeaderFD (fd=<optimized out>, maxlen=maxlen@entry=1024, buf=buf@entry=0x7f84f3fa8278) at util/virfile.c:1376 #5 virProcessRunInMountNamespace () at util/virprocess.c:1149 #6 virSecuritySELinuxTransactionCommit (mgr=<optimized out>, pid=23977) at security/security_selinux.c:1106 #7 virSecurityManagerTransactionCommit (mgr=0x7f848c0ffd60, pid=pid@entry=23977) at security/security_manager.c:324 #8 virSecurityStackTransactionCommit (mgr=<optimized out>, pid=23977) at security/security_stack.c:166 #9 virSecurityManagerTransactionCommit (mgr=0x7f848c183760, pid=pid@entry=23977) at security/security_manager.c:324 #10 qemuSecuritySetAllLabel (driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, stdin_path=stdin_path@entry=0x0) at _security.c:56 #11 in qemuProcessLaunch (conn=conn@entry=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, asyncJob=a=QEMU_ASYNC_JOB_START, incoming=incoming@entry=0x0, snapshot=snapshot@entry=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=1 qemu/qemu_process.c:6329 #12 in qemuProcessStart (conn=conn@entry=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, updatedCPU==0x0, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, migrateFrom=migrateFrom@entry=0x0, migrateFd=-1, migratePath=0x0, snapshot=0x=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at qemu/qemu_process.c:6816
This is just waiting for forked child to finish. Nothing suspicious here.
…
#25 in start_thread () from /lib64/libpthread.so.0 #26 in clone () from /lib64/libc.so.6
Backtrace of the forked process where the process is trying to get the meta data lock for 60 seconds.> #0 0x00007f8508572730 in nanosleep () from target:/lib64/libc.so.6 #1 0x00007f850859dff4 in usleep () from target:/lib64/libc.so.6 #2 0x00007f850c26efea in virTimeBackOffWait (var=var@entry=0x7f84f3fa81a0) at util/virtime.c:453 #3 0x00007f850c3108d8 in virSecurityManagerMetadataLock (mgr=0x7f848c183850, paths=<optimized out>, npaths=<optimized out>) at security/security_manager.c:1345 #4 0x00007f850c30e44b in virSecurityDACTransactionRun (pid=pid@entry=23977, opaque=opaque@entry=0x7f84e4021450) at security/security_dac.c:226 #5 0x00007f850c250021 in virProcessNamespaceHelper (opaque=0x7f84e4021450, cb=0x7f850c30e330 <virSecurityDACTransactionRun>, pid=23977, errfd=33) at util/virprocess.c:1100 #6 virProcessRunInMountNamespace () at util/virprocess.c:1140 #7 0x00007f850c30e55c in virSecurityDACTransactionCommit (mgr=<optimized out>, pid=23977) at security/security_dac.c:565 #8 0x00007f850c30eeca in virSecurityManagerTransactionCommit (mgr=0x7f848c183850, pid=pid@entry=23977) at security/security_manager.c:324 #9 0x00007f850c30b36b in virSecurityStackTransactionCommit (mgr=<optimized out>, pid=23977) at security/security_stack.c:166 #10 0x00007f850c30eeca in virSecurityManagerTransactionCommit (mgr=0x7f848c183760, pid=pid@entry=23977) at security/security_manager.c:324 #11 0x00007f84e0586bf2 in qemuSecuritySetAllLabel (driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, stdin_path=stdin_path@entry=0x0) at qemu/qemu_security.c:56 #12 0x00007f84e051b7fd in qemuProcessLaunch (conn=conn@entry=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, incoming=incoming@entry=0x0, snapshot=snapshot@entry=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at qemu/qemu_process.c:6329 #13 0x00007f84e051ee2e in qemuProcessStart (conn=conn@entry=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, updatedCPU=updatedCPU@entry=0x0, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, migrateFrom=migrateFrom@entry=0x0, migrateFd=-1, migratePath=0x0, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at qemu/qemu_process.c:6816 #14 0x00007f84e057f5cd in qemuDomainObjStart (conn=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=0x7f848c1c3a50, flags=flags@entry=0, asyncJob=QEMU_ASYNC_JOB_START) at qemu/qemu_driver.c:7296 #15 0x00007f84e057fc19 in qemuDomainCreateWithFlags (dom=0x7f84e4001620, flags=0) at qemu/qemu_driver.c:7349 #16 0x00007f850c426d57 in virDomainCreate (domain=domain@entry=0x7f84e4001620) at libvirt-domain.c:6534 #17 0x000055a82b2f1cae in remoteDispatchDomainCreate (server=0x55a82c9e16d0, msg=0x55a82ca53e00, args=<optimized out>, rerr=0x7f84f3fa8960, client=0x55a82ca5d180) at remote/remote_daemon_dispatch_stubs.h:4434 #18 remoteDispatchDomainCreateHelper (server=0x55a82c9e16d0, client=0x55a82ca5d180, msg=0x55a82ca53e00, rerr=0x7f84f3fa8960, args=<optimized out>, ret=0x7f84e4003800) at remote/remote_daemon_dispatch_stubs.h:4410 #19 0x00007f850c338734 in virNetServerProgramDispatchCall (msg=0x55a82ca53e00, client=0x55a82ca5d180, server=0x55a82c9e16d0, prog=0x55a82ca4be70) at rpc/virnetserverprogram.c:437 #20 virNetServerProgramDispatch (prog=0x55a82ca4be70, server=server@entry=0x55a82c9e16d0, client=0x55a82ca5d180, msg=0x55a82ca53e00) at rpc/virnetserverprogram.c:304 #21 0x00007f850c33ec3c in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0x55a82c9e16d0) at rpc/virnetserver.c:144 #22 virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x55a82c9e16d0) at rpc/virnetserver.c:165 #23 0x00007f850c26db20 in virThreadPoolWorker (opaque=opaque@entry=0x55a82c9ebe30) at util/virthreadpool.c:167 #24 0x00007f850c26ce2c in virThreadHelper (data=<optimized out>) at util/virthread.c:206 #25 0x00007f8508872594 in start_thread () from target:/lib64/libpthread.so.0 #26 0x00007f85085a5e6f in clone () from target:/lib64/libc.so.6
Right, so this is just waiting for virtlockd to lock the paths. virtlockd is obviously unable to do that (as I suggested in my previous e-mail - is perhaps some other process holding the lock?).
The other domain is holding the lock (obviously).
Can you please enable debug logs for virtlockd, reproduce and share the log with me? Setting this in /etc/libvirt/virtlockd.conf should be enough:
Sure, but I don’t think this is needed. We’re trying to start two domains which have disks backed on the same image. So the current behavior is probably intended (apart from the problem with the later API calls). In my opinion, the timeout is just way too high.
log_level=1 log_filters="4:event 3:rpc" log_outputs="1:file:/var/log/virtlockd.lod"
Thanks, Michal
-- Kind regards / Beste Grüße Marc Hartmayer IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

Michal Privoznik <mprivozn@redhat.com> [2018-09-27, 10:14AM +0200]:
Right, so this is just waiting for virtlockd to lock the paths. virtlockd is obviously unable to do that (as I suggested in my previous e-mail - is perhaps some other process holding the lock?).
It can not lock the paths, because those paths are already locked by the first domain. This is intentional. But it should take 60 seconds to resolve this and block the whole libvirt daemon in the process.
Can you please enable debug logs for virtlockd, reproduce and share the log with me? Setting this in /etc/libvirt/virtlockd.conf should be enough:
log_level=1 log_filters="4:event 3:rpc" log_outputs="1:file:/var/log/virtlockd.lod"
See attached.

On 09/27/2018 09:01 AM, Bjoern Walk wrote:
Michal Privoznik <mprivozn@redhat.com> [2018-09-19, 11:45AM +0200]:
On 09/19/2018 11:17 AM, Bjoern Walk wrote:
Bjoern Walk <bwalk@linux.ibm.com> [2018-09-12, 01:17PM +0200]:
Michal Privoznik <mprivozn@redhat.com> [2018-09-12, 11:32AM +0200]:
Still seeing the same timeout. Is this expected behaviour?
Nope. I wonder if something has locked the path and forgot to unlock it (however, virtlockd should have unlocked all the paths owned by PID on connection close), or something is still holding the lock and connection opened.
Can you see the timeout even when you turn off the selinux driver (security_driver="none' in qemu.conf)? I tried to reproduce the issue yesterday and was unsuccessful. Do you have any steps to reproduce?
So, I haven't been able to actually dig into the debugging but we have been able to reproduce this behaviour on x86 (both with SELinux and DAC). Looks like a general problem, if a problem at all, because from what I can see in the code, the 60 seconds timeout is actually intended, or not? The security manager does try for 60 seconds to acquire the lock and only then bails out. Why is this?
The ideal solution would be to just tell virlockd "these are the paths I want you to lock on my behalf" and virtlockd would use F_SETLKW so that the moment all paths are unlocked virtlockd will lock them and libvirtd can continue its execution (i.e. chown() and setfcon()). However, we can't do this because virtlockd runs single threaded [1] and therefore if one thread is waiting for lock to be acquired there is no other thread to unlock the path. Therefore I had to move the logic into libvirtd which tries repeatedly to lock all the paths needed. And this is where the timeout steps in - the lock acquiring attempts are capped at 60 seconds. This is an arbitrary chosen timeout. We can make it smaller, but that will not solve your problem - only mask it.
However, an actual bug is that while the security manager waits for the lock acquire the whole libvirtd hangs, but from what I understood and Marc explained to me, this is more of a pathological error in libvirt behaviour with various domain locks.
Whole libvirtd shouldn't hang. Only those threads which try to acquire domain object lock. IOW it should be possible to run APIs over different domains (or other objects for that matter). For instance dumpxml over different domain works just fine. Anyway, we need to get to the bottom of this. Looks like something keeps the file locked and then when libvirt wants to lock if for metadata the timeout is hit and whole operation fails. Do you mind running 'lslocks -u' when starting a domain and before the timeout is hit? Michal 1: The reason that virtlockd has to run single threaded is stupidity of POSIX file locks. Imagine one thread doing: open() + fcntl(fd, F_SETLKW, ...) and entering critical section. If another thread does open() + close() on the same file the file is unlocked. Because we can't guarantee this will not happen in multithreaded libvirt we had to introduce a separate process to take care of that. And this process has to be single threaded so there won't ever be the second thread to call close() and unintentionally release the lock. Perhaps we could use OFD locks but those are not part of POSIX and are available on Linux only.

On Thu, Sep 27, 2018 at 10:15 AM +0200, Michal Privoznik <mprivozn@redhat.com> wrote:
On 09/27/2018 09:01 AM, Bjoern Walk wrote:
Michal Privoznik <mprivozn@redhat.com> [2018-09-19, 11:45AM +0200]:
On 09/19/2018 11:17 AM, Bjoern Walk wrote:
Bjoern Walk <bwalk@linux.ibm.com> [2018-09-12, 01:17PM +0200]:
Michal Privoznik <mprivozn@redhat.com> [2018-09-12, 11:32AM +0200]:
Still seeing the same timeout. Is this expected behaviour?
Nope. I wonder if something has locked the path and forgot to unlock it (however, virtlockd should have unlocked all the paths owned by PID on connection close), or something is still holding the lock and connection opened.
Can you see the timeout even when you turn off the selinux driver (security_driver="none' in qemu.conf)? I tried to reproduce the issue yesterday and was unsuccessful. Do you have any steps to reproduce?
So, I haven't been able to actually dig into the debugging but we have been able to reproduce this behaviour on x86 (both with SELinux and DAC). Looks like a general problem, if a problem at all, because from what I can see in the code, the 60 seconds timeout is actually intended, or not? The security manager does try for 60 seconds to acquire the lock and only then bails out. Why is this?
The ideal solution would be to just tell virlockd "these are the paths I want you to lock on my behalf" and virtlockd would use F_SETLKW so that the moment all paths are unlocked virtlockd will lock them and libvirtd can continue its execution (i.e. chown() and setfcon()). However, we can't do this because virtlockd runs single threaded [1] and therefore if one thread is waiting for lock to be acquired there is no other thread to unlock the path.
Therefore I had to move the logic into libvirtd which tries repeatedly to lock all the paths needed. And this is where the timeout steps in - the lock acquiring attempts are capped at 60 seconds. This is an arbitrary chosen timeout. We can make it smaller, but that will not solve your problem - only mask it.
However, an actual bug is that while the security manager waits for the lock acquire the whole libvirtd hangs, but from what I understood and Marc explained to me, this is more of a pathological error in libvirt behaviour with various domain locks.
Whole libvirtd shouldn't hang.
The main loop doesn’t hang.
Only those threads which try to acquire domain object lock. IOW it should be possible to run APIs over different domains (or other objects for that matter). For instance dumpxml over different domain works just fine.
Oh sry, my assumption that the thread A is also holding the virDomainObjList lock is wrong! Here is the actual backtrace of a waiting thread (in this case the “worker thread for the "virsh list" command”) (gdb) bt #0 0x00007f850887b7ed in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x00007f8508874cf4 in pthread_mutex_lock () from /lib64/libpthread.so.0 #2 0x00007f850c26cfd9 in virMutexLock (m=<optimized out>) at util/virthread.c:89 #3 0x00007f850c2455df in virObjectLock (anyobj=<optimized out>) at util/virobject.c:429 #4 0x00007f850c2ceaba in virDomainObjListFilter (list=list@entry=0x7f84f47a97c0, nvms=nvms@entry=0x7f84f47a97c8, conn=conn@entry=0x7f84dc01ad00, filter=filter@entry=0x7f850c318c70 <virConnectListAllDomainsCheckACL>, flags=flags@entry=1) at conf/virdomainobjlist.c:919 #5 0x00007f850c2cfef2 in virDomainObjListCollect (domlist=0x7f848c10e290, conn=conn@entry=0x7f84dc01ad00, vms=vms@entry=0x7f84f47a9820, nvms=nvms@entry=0x7f84f47a9830, filter=0x7f850c318c70 <virConnectListAllDomainsCheckACL>, flags=1) at conf/virdomainobjlist.c:961 #6 0x00007f850c2d01a6 in virDomainObjListExport (domlist=<optimized out>, conn=0x7f84dc01ad00, domains=0x7f84f47a9890, filter=<optimized out>, flags=<optimized out>) at conf/virdomainobjlist.c:1042 #7 0x00007f850c426be9 in virConnectListAllDomains (conn=0x7f84dc01ad00, domains=0x7f84f47a9890, flags=1) at libvirt-domain.c:6493 #8 0x000055a82b2d905d in remoteDispatchConnectListAllDomains (server=0x55a82c9e16d0, msg=0x55a82ca53380, args=0x7f84dc01ae30, args=0x7f84dc01ae30, ret=0x7f84dc005bb0, rerr=0x7f84f47a9960, client=<optimized out>) at remote/remote_daemon_dispatch_stubs.h:1372 #9 remoteDispatchConnectListAllDomainsHelper (server=0x55a82c9e16d0, client=<optimized out>, msg=0x55a82ca53380, rerr=0x7f84f47a9960, args=0x7f84dc01ae30, ret=0x7f84dc005bb0) at remote/remote_daemon_dispatch_stubs.h:1348 #10 0x00007f850c338734 in virNetServerProgramDispatchCall (msg=0x55a82ca53380, client=0x55a82ca52a40, server=0x55a82c9e16d0, prog=0x55a82ca4be70) at rpc/virnetserverprogram.c:437 … virDomainObjListFilter needs the lock of every domain… That’s the actual problem.
Anyway, we need to get to the bottom of this. Looks like something keeps the file locked and then when libvirt wants to lock if for metadata the timeout is hit and whole operation fails. Do you mind running 'lslocks -u' when starting a domain and before the timeout is hit?
Michal
1: The reason that virtlockd has to run single threaded is stupidity of POSIX file locks. Imagine one thread doing: open() + fcntl(fd, F_SETLKW, ...) and entering critical section. If another thread does open() + close() on the same file the file is unlocked. Because we can't guarantee this will not happen in multithreaded libvirt we had to introduce a separate process to take care of that. And this process has to be single threaded so there won't ever be the second thread to call close() and unintentionally release the lock.
Perhaps we could use OFD locks but those are not part of POSIX and are available on Linux only.
-- Kind regards / Beste Grüße Marc Hartmayer IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

Michal Privoznik <mprivozn@redhat.com> [2018-09-27, 10:15AM +0200]:
On 09/27/2018 09:01 AM, Bjoern Walk wrote:
Michal Privoznik <mprivozn@redhat.com> [2018-09-19, 11:45AM +0200]:
On 09/19/2018 11:17 AM, Bjoern Walk wrote:
Bjoern Walk <bwalk@linux.ibm.com> [2018-09-12, 01:17PM +0200]:
Michal Privoznik <mprivozn@redhat.com> [2018-09-12, 11:32AM +0200]:
Still seeing the same timeout. Is this expected behaviour?
Nope. I wonder if something has locked the path and forgot to unlock it (however, virtlockd should have unlocked all the paths owned by PID on connection close), or something is still holding the lock and connection opened.
Can you see the timeout even when you turn off the selinux driver (security_driver="none' in qemu.conf)? I tried to reproduce the issue yesterday and was unsuccessful. Do you have any steps to reproduce?
So, I haven't been able to actually dig into the debugging but we have been able to reproduce this behaviour on x86 (both with SELinux and DAC). Looks like a general problem, if a problem at all, because from what I can see in the code, the 60 seconds timeout is actually intended, or not? The security manager does try for 60 seconds to acquire the lock and only then bails out. Why is this?
The ideal solution would be to just tell virlockd "these are the paths I want you to lock on my behalf" and virtlockd would use F_SETLKW so that the moment all paths are unlocked virtlockd will lock them and libvirtd can continue its execution (i.e. chown() and setfcon()). However, we can't do this because virtlockd runs single threaded [1] and therefore if one thread is waiting for lock to be acquired there is no other thread to unlock the path.
Therefore I had to move the logic into libvirtd which tries repeatedly to lock all the paths needed. And this is where the timeout steps in - the lock acquiring attempts are capped at 60 seconds. This is an arbitrary chosen timeout. We can make it smaller, but that will not solve your problem - only mask it.
I still don't understand why we need a timeout at all. If virtlockd is unable to get the lock, just bail and continue with what you did after the timeout runs out. Is this some kind of safety-measure? Against what?
However, an actual bug is that while the security manager waits for the lock acquire the whole libvirtd hangs, but from what I understood and Marc explained to me, this is more of a pathological error in libvirt behaviour with various domain locks.
Whole libvirtd shouldn't hang. Only those threads which try to acquire domain object lock. IOW it should be possible to run APIs over different domains (or other objects for that matter). For instance dumpxml over different domain works just fine.
Yes, but from a user perspective, this is still pretty bad and unexpected. libvirt should continue to operate as usual while virtlockd is figuring out the locking.
Anyway, we need to get to the bottom of this. Looks like something keeps the file locked and then when libvirt wants to lock if for metadata the timeout is hit and whole operation fails. Do you mind running 'lslocks -u' when starting a domain and before the timeout is hit?
There IS a lock held on the image, from the FIRST domain that we started. The second domain, which is using the SAME image, unshared, runs into the locking timeout. Sorry if I failed to describe this setup appropriately. I am starting to believe that this is expected behaviour, although it is not intuitive. # lslocks -u COMMAND PID TYPE SIZE MODE M START END PATH ... virtlockd 199062 POSIX 1.5G WRITE 0 0 0 /var/lib/libvirt/images/u1604.qcow2 ...
Michal
1: The reason that virtlockd has to run single threaded is stupidity of POSIX file locks. Imagine one thread doing: open() + fcntl(fd, F_SETLKW, ...) and entering critical section. If another thread does open() + close() on the same file the file is unlocked. Because we can't guarantee this will not happen in multithreaded libvirt we had to introduce a separate process to take care of that. And this process has to be single threaded so there won't ever be the second thread to call close() and unintentionally release the lock.
Thanks for the explanation, I will have some reading to do to get a better overview of the locking process in Linux.
Perhaps we could use OFD locks but those are not part of POSIX and are available on Linux only.
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
-- IBM Systems Linux on Z & Virtualization Development -------------------------------------------------- IBM Deutschland Research & Development GmbH Schönaicher Str. 220, 71032 Böblingen Phone: +49 7031 16 1819 -------------------------------------------------- Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On 09/27/2018 11:11 AM, Bjoern Walk wrote:
Michal Privoznik <mprivozn@redhat.com> [2018-09-27, 10:15AM +0200]:
On 09/27/2018 09:01 AM, Bjoern Walk wrote:
Michal Privoznik <mprivozn@redhat.com> [2018-09-19, 11:45AM +0200]:
On 09/19/2018 11:17 AM, Bjoern Walk wrote:
Bjoern Walk <bwalk@linux.ibm.com> [2018-09-12, 01:17PM +0200]:
Michal Privoznik <mprivozn@redhat.com> [2018-09-12, 11:32AM +0200]:
Still seeing the same timeout. Is this expected behaviour?
Nope. I wonder if something has locked the path and forgot to unlock it (however, virtlockd should have unlocked all the paths owned by PID on connection close), or something is still holding the lock and connection opened.
Can you see the timeout even when you turn off the selinux driver (security_driver="none' in qemu.conf)? I tried to reproduce the issue yesterday and was unsuccessful. Do you have any steps to reproduce?
So, I haven't been able to actually dig into the debugging but we have been able to reproduce this behaviour on x86 (both with SELinux and DAC). Looks like a general problem, if a problem at all, because from what I can see in the code, the 60 seconds timeout is actually intended, or not? The security manager does try for 60 seconds to acquire the lock and only then bails out. Why is this?
The ideal solution would be to just tell virlockd "these are the paths I want you to lock on my behalf" and virtlockd would use F_SETLKW so that the moment all paths are unlocked virtlockd will lock them and libvirtd can continue its execution (i.e. chown() and setfcon()). However, we can't do this because virtlockd runs single threaded [1] and therefore if one thread is waiting for lock to be acquired there is no other thread to unlock the path.
Therefore I had to move the logic into libvirtd which tries repeatedly to lock all the paths needed. And this is where the timeout steps in - the lock acquiring attempts are capped at 60 seconds. This is an arbitrary chosen timeout. We can make it smaller, but that will not solve your problem - only mask it.
I still don't understand why we need a timeout at all. If virtlockd is unable to get the lock, just bail and continue with what you did after the timeout runs out. Is this some kind of safety-measure? Against what?
When there are two threads fighting over the lock. Imagine two threads trying to set security label over the same file. Or imagine two daemons on two distant hosts trying to set label on the same file on NFS. virtlockd implements try-or-fail approach. So we need to call lock() repeatedly until we succeed (or timeout).
However, an actual bug is that while the security manager waits for the lock acquire the whole libvirtd hangs, but from what I understood and Marc explained to me, this is more of a pathological error in libvirt behaviour with various domain locks.
Whole libvirtd shouldn't hang. Only those threads which try to acquire domain object lock. IOW it should be possible to run APIs over different domains (or other objects for that matter). For instance dumpxml over different domain works just fine.
Yes, but from a user perspective, this is still pretty bad and unexpected. libvirt should continue to operate as usual while virtlockd is figuring out the locking.
Agreed. I will look into this.
Anyway, we need to get to the bottom of this. Looks like something keeps the file locked and then when libvirt wants to lock if for metadata the timeout is hit and whole operation fails. Do you mind running 'lslocks -u' when starting a domain and before the timeout is hit?
There IS a lock held on the image, from the FIRST domain that we started. The second domain, which is using the SAME image, unshared, runs into the locking timeout. Sorry if I failed to describe this setup appropriately. I am starting to believe that this is expected behaviour, although it is not intuitive.
# lslocks -u COMMAND PID TYPE SIZE MODE M START END PATH ... virtlockd 199062 POSIX 1.5G WRITE 0 0 0 /var/lib/libvirt/images/u1604.qcow2
But this should not clash, because metadata lock use different bytes: the regular locking uses offset=0, metadata lock uses offset=1. Both locks lock just one byte in length. However, from the logs it looks like virtlockd doesn't try to actually acquire the lock because it found in internal records that the path is locked (even though it is locked with different offset). Anyway, now I am finally able to reproduce the issue and I'll look into this too. Quick and dirty patch might look something like this: diff --git i/src/util/virlockspace.c w/src/util/virlockspace.c index 79fa48d365..3c51d7926b 100644 --- i/src/util/virlockspace.c +++ w/src/util/virlockspace.c @@ -630,8 +630,9 @@ int virLockSpaceAcquireResource(virLockSpacePtr lockspace, virMutexLock(&lockspace->lock); if ((res = virHashLookup(lockspace->resources, resname))) { - if ((res->flags & VIR_LOCK_SPACE_ACQUIRE_SHARED) && - (flags & VIR_LOCK_SPACE_ACQUIRE_SHARED)) { + if ((res->flags & VIR_LOCK_SPACE_ACQUIRE_SHARED && + flags & VIR_LOCK_SPACE_ACQUIRE_SHARED) || + start == 1) { if (VIR_EXPAND_N(res->owners, res->nOwners, 1) < 0) goto cleanup; But as I say, this is very dirty hack. I need to find a better solution. At least you might have something to test. Michal

Michal Privoznik <mprivozn@redhat.com> [2018-09-27, 12:07PM +0200]:
On 09/27/2018 11:11 AM, Bjoern Walk wrote:
I still don't understand why we need a timeout at all. If virtlockd is unable to get the lock, just bail and continue with what you did after the timeout runs out. Is this some kind of safety-measure? Against what?
When there are two threads fighting over the lock. Imagine two threads trying to set security label over the same file. Or imagine two daemons on two distant hosts trying to set label on the same file on NFS. virtlockd implements try-or-fail approach. So we need to call lock() repeatedly until we succeed (or timeout).
One thread wins and the other fails with lock contention? I don't see how repeatedly trying to acquire the lock will improve things, but maybe I am not getting it right now.
There IS a lock held on the image, from the FIRST domain that we started. The second domain, which is using the SAME image, unshared, runs into the locking timeout. Sorry if I failed to describe this setup appropriately. I am starting to believe that this is expected behaviour, although it is not intuitive.
# lslocks -u COMMAND PID TYPE SIZE MODE M START END PATH ... virtlockd 199062 POSIX 1.5G WRITE 0 0 0 /var/lib/libvirt/images/u1604.qcow2
But this should not clash, because metadata lock use different bytes: the regular locking uses offset=0, metadata lock uses offset=1. Both locks lock just one byte in length.
It's the only lock in the output for the image. Should I see the metadata lock at start=1 as well?
However, from the logs it looks like virtlockd doesn't try to actually acquire the lock because it found in internal records that the path is locked (even though it is locked with different offset). Anyway, now I am finally able to reproduce the issue and I'll look into this too.
Good.
Quick and dirty patch might look something like this:
diff --git i/src/util/virlockspace.c w/src/util/virlockspace.c index 79fa48d365..3c51d7926b 100644 --- i/src/util/virlockspace.c +++ w/src/util/virlockspace.c @@ -630,8 +630,9 @@ int virLockSpaceAcquireResource(virLockSpacePtr lockspace, virMutexLock(&lockspace->lock);
if ((res = virHashLookup(lockspace->resources, resname))) { - if ((res->flags & VIR_LOCK_SPACE_ACQUIRE_SHARED) && - (flags & VIR_LOCK_SPACE_ACQUIRE_SHARED)) { + if ((res->flags & VIR_LOCK_SPACE_ACQUIRE_SHARED && + flags & VIR_LOCK_SPACE_ACQUIRE_SHARED) || + start == 1) {
if (VIR_EXPAND_N(res->owners, res->nOwners, 1) < 0) goto cleanup;
Had a quick test with this, but now it seems like that the whole metadata locking does nothing. When starting the second domain, I get an internal error from QEMU, failing to get the write lock. The SELinux labels of the image are changed as well. This is the same behaviour as with metadata locking disabled. Not entirely sure what should happen or if this is a error in my setup or not. I will have to think about this.
But as I say, this is very dirty hack. I need to find a better solution. At least you might have something to test.
Michal
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
-- IBM Systems Linux on Z & Virtualization Development -------------------------------------------------- IBM Deutschland Research & Development GmbH Schönaicher Str. 220, 71032 Böblingen Phone: +49 7031 16 1819 -------------------------------------------------- Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On 09/27/2018 01:16 PM, Bjoern Walk wrote:
Michal Privoznik <mprivozn@redhat.com> [2018-09-27, 12:07PM +0200]:
On 09/27/2018 11:11 AM, Bjoern Walk wrote:
I still don't understand why we need a timeout at all. If virtlockd is unable to get the lock, just bail and continue with what you did after the timeout runs out. Is this some kind of safety-measure? Against what?
When there are two threads fighting over the lock. Imagine two threads trying to set security label over the same file. Or imagine two daemons on two distant hosts trying to set label on the same file on NFS. virtlockd implements try-or-fail approach. So we need to call lock() repeatedly until we succeed (or timeout).
One thread wins and the other fails with lock contention? I don't see how repeatedly trying to acquire the lock will improve things, but maybe I am not getting it right now.
The point of metadata locking is not to prevent metadata change, but to be able to atomically change it. As I said in other thread, there is not much point in this feature alone since chown() and setfcon() are atomic themselves. But this is preparing the code for original label remembering which will be stored in XATTRs at which point the whole operation is not going to be atomic. https://www.redhat.com/archives/libvir-list/2018-September/msg01349.html Long story short, at the first chown() libvirt will record the original owner of the file into XATTRs and then on the last restore it will use that owner to return the file to instead of root:root. The locks are there because reading XATTR, parsing it, increasing/decreasing refcounter and possibly doing chown() is not atomic. But with locks it is.
There IS a lock held on the image, from the FIRST domain that we started. The second domain, which is using the SAME image, unshared, runs into the locking timeout. Sorry if I failed to describe this setup appropriately. I am starting to believe that this is expected behaviour, although it is not intuitive.
# lslocks -u COMMAND PID TYPE SIZE MODE M START END PATH ... virtlockd 199062 POSIX 1.5G WRITE 0 0 0 /var/lib/libvirt/images/u1604.qcow2
But this should not clash, because metadata lock use different bytes: the regular locking uses offset=0, metadata lock uses offset=1. Both locks lock just one byte in length.
It's the only lock in the output for the image. Should I see the metadata lock at start=1 as well?
You should see it if you run lslocks at the right moment. But since metadata locking happens only for a fraction of a second, it is very unlikely that you'll be able to run it at the right moment.
However, from the logs it looks like virtlockd doesn't try to actually acquire the lock because it found in internal records that the path is locked (even though it is locked with different offset). Anyway, now I am finally able to reproduce the issue and I'll look into this too.
Good.
Quick and dirty patch might look something like this:
diff --git i/src/util/virlockspace.c w/src/util/virlockspace.c index 79fa48d365..3c51d7926b 100644 --- i/src/util/virlockspace.c +++ w/src/util/virlockspace.c @@ -630,8 +630,9 @@ int virLockSpaceAcquireResource(virLockSpacePtr lockspace, virMutexLock(&lockspace->lock);
if ((res = virHashLookup(lockspace->resources, resname))) { - if ((res->flags & VIR_LOCK_SPACE_ACQUIRE_SHARED) && - (flags & VIR_LOCK_SPACE_ACQUIRE_SHARED)) { + if ((res->flags & VIR_LOCK_SPACE_ACQUIRE_SHARED && + flags & VIR_LOCK_SPACE_ACQUIRE_SHARED) || + start == 1) {
if (VIR_EXPAND_N(res->owners, res->nOwners, 1) < 0) goto cleanup;
Had a quick test with this, but now it seems like that the whole metadata locking does nothing. When starting the second domain, I get an internal error from QEMU, failing to get the write lock. The SELinux labels of the image are changed as well. This is the same behaviour as with metadata locking disabled. Not entirely sure what should happen or if this is a error in my setup or not. I will have to think about this.
Metadata locking is not supposed to prevent you from running two domains with the same disk. Nor to prevent chown() on the file. But since you are using virtlockd to lock the disk contents, it should prevent running such domains, not qemu. I wonder if this is a misconfiguration or something. Perhaps one domain has disk RW and the other has it RO + shared? Michal
participants (4)
-
Bjoern Walk
-
John Ferlan
-
Marc Hartmayer
-
Michal Privoznik