The latest events (and bugs reported on the list) got me thinking. I
went back to the drawing board and relized we don't need virtlockd if we
fork() for every transaction.
The whole reason for offloading file locking to virtlockd is that there
is no good version of file locking in POSIX. flock() locks entire file
which would clash with qemu and therefore we can't use it. Then,
fcntl(F_SETLK) (which we expose as virFileLock()) allows us to lock only
some bytes, but it has few critical problems:
a) the lock is not shared across fork() (only the parent has all the
locks, not the child),
b) the lock is not owned by file descriptor, but (pid,inode) pair.
Therefore, if one thread holds the lock and the other close()-s any FD
referring to the inode, it releases all the locks on that inode [1].
Since we can't guarantee this won't happen in multithreaded app like
libvirtd we created a separate, single threaded binary (virtlockd) to
work around those problems [2].
However, since namespaces are on by default (and therefore arguably more
of our users use them than don't), we are fork()-ing anyway when setting
up security labels. But the child runs single threaded, so it can do the
locking instead of offloading it to a separate process. All that we need
to do is:
1) make sure to fork() every time (even when namespaces are disabled),
2) make sure the child won't create any threads (trivial).
I tried to write patches to make virtlockd work, but turns out it would
be another 10+ patches which looks like worthless work to me given that
we have cleaner solution. Having said that, I'm also reverting some
patches that modified virtlockd because we will not need it after all.
1: Worse, imagine two file names for the same inode, aka file2 is a
hardlink to file1. Locking file1 and closing file2 results in releasing
the lock.
2: There are sane locks, so called Open File Description locks, which
work well even in aforementioned cases. The lock is associated with the
FD and not (pid,inode) pair. But they are Linux only.
Michal Prívozník (11):
security: Always spawn process for transactions
security_manager: Rework metadata locking
Revert "security_manager: Load lock plugin on init"
Revert "qemu_conf: Introduce metadata_lock_manager"
Revert "lock_manager: Allow disabling configFile for
virLockManagerPluginNew"
Revert "lock_driver: Introduce VIR_LOCK_MANAGER_ACQUIRE_ROLLBACK"
Revert "lock_driver: Introduce
VIR_LOCK_MANAGER_RESOURCE_TYPE_METADATA"
Revert "_virLockManagerLockDaemonPrivate: Move @hasRWDisks into dom
union"
Revert "lock_driver: Introduce new
VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON"
Revert "lock_driver_lockd: Introduce
VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_METADATA flag"
Revert "virlockspace: Allow caller to specify start and length offset
in virLockSpaceAcquireResource"
cfg.mk | 4 +-
src/locking/lock_daemon_dispatch.c | 11 +-
src/locking/lock_driver.h | 12 --
src/locking/lock_driver_lockd.c | 421 ++++++++++++-------------------------
src/locking/lock_driver_lockd.h | 1 -
src/locking/lock_driver_sanlock.c | 44 ++--
src/locking/lock_manager.c | 10 +-
src/lxc/lxc_controller.c | 3 +-
src/lxc/lxc_driver.c | 2 +-
src/qemu/qemu_conf.c | 1 -
src/qemu/qemu_conf.h | 1 -
src/qemu/qemu_driver.c | 3 -
src/security/security_dac.c | 18 +-
src/security/security_manager.c | 218 ++++++++-----------
src/security/security_manager.h | 19 +-
src/security/security_selinux.c | 18 +-
src/util/virlockspace.c | 15 +-
src/util/virlockspace.h | 4 -
tests/seclabeltest.c | 2 +-
tests/securityselinuxlabeltest.c | 2 +-
tests/securityselinuxtest.c | 2 +-
tests/testutilsqemu.c | 2 +-
tests/virlockspacetest.c | 29 +--
23 files changed, 287 insertions(+), 555 deletions(-)
--
2.16.4