On a Wednesday in 2020, Michal Privoznik wrote:
Okay, here is the deal. Currently, the way we build namespace is
very fragile. It is done from pre-exec hook when starting a
domain, after we mass closed all FDs and before we drop
privileges and exec() QEMU. This fact poses some limitations onto
the namespace build code, e.g. it has to make sure not to keep
any FD opened (not even through a library call), because it would
be leaked to QEMU. Also, it has to call only async signal safe
functions. These requirements are hard to meet - in fact as of my
commit v6.2.0-rc1~235 we are leaking a FD into QEMU by calling
libdevmapper functions.
To solve this issue and avoid similar problems in the future, we
should change our paradigm. We already have functions which can
populate domain's namespace with nodes from the daemon context.
If we use them to populate the namespace and keep only the bare
minimum in the pre-exec hook, we've mitigated the risk.
Therefore, the old qemuDomainBuildNamespace() is renamed to
qemuDomainUnshareNamespace() and new qemuDomainBuildNamespace()
function is introduced. So far, the new function is basically a
NOP and domain's namespace is still populated from the pre-exec
hook - next patches will fix it.
Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com>
---
src/qemu/qemu_domain_namespace.c | 23 ++++++++++++++++++++---
src/qemu/qemu_domain_namespace.h | 8 +++++---
src/qemu/qemu_process.c | 6 +++++-
3 files changed, 30 insertions(+), 7 deletions(-)
Reviewed-by: Ján Tomko <jtomko(a)redhat.com>
Jano