[PATCH 0/4] Couple of qemu:///embed related fixes

*** BLURB HERE *** Michal Prívozník (4): qemu_shim: Don't hang if failed to start domain docs: Lift restriction on running API from the event loop thread virConnectOpen: Require root dir to be absolute path qemu_shim: Always pre-create root dir docs/drvqemu.html.in | 8 ++++++-- src/libvirt.c | 6 ++++++ src/qemu/qemu_shim.c | 20 ++++++++++++++------ 3 files changed, 26 insertions(+), 8 deletions(-) -- 2.26.2

The qemu shim spawns a separate thread in which the event loop is ran. The virEventRunDefaultImpl() call is wrapped in a while() loop, just like it should. There are few lines of code around which try to ensure that domain is destroyed (when quitting) and that the last round of event loop is ran after the virDomainDestroy() call. Only after that the loop is quit from and the thread quits. Well, if creating the domain failed then the thread is stuck inside event loop forever. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1920337 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_shim.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_shim.c b/src/qemu/qemu_shim.c index c10598df4b..21a24abade 100644 --- a/src/qemu/qemu_shim.c +++ b/src/qemu/qemu_shim.c @@ -45,9 +45,12 @@ qemuShimEventLoop(void *opaque G_GNUC_UNUSED) while (!quit) { g_mutex_lock(&eventLock); if (eventQuitFlag && !eventPreventQuitFlag) { + quit = true; if (dom) { virDomainDestroy(dom); - quit = true; + } else { + g_mutex_unlock(&eventLock); + break; } } g_mutex_unlock(&eventLock); -- 2.26.2

On Mon, 2021-03-01 at 12:49 +0100, Michal Privoznik wrote:
+++ b/src/qemu/qemu_shim.c @@ -45,9 +45,12 @@ qemuShimEventLoop(void *opaque G_GNUC_UNUSED) while (!quit) { g_mutex_lock(&eventLock); if (eventQuitFlag && !eventPreventQuitFlag) { + quit = true; if (dom) { virDomainDestroy(dom); - quit = true; + } else { + g_mutex_unlock(&eventLock); + break; } } g_mutex_unlock(&eventLock);
I'm probably missing something obvious, but I thought this could be simply while (!quit) { g_mutex_lock(&eventLock); if (eventQuitFlag && !eventPreventQuitFlag) { quit = true; if (dom) { virDomainDestroy(dom); } } g_mutex_unlock(&eventLock); virEventRunDefaultImpl(); } Do we specifically want to avoid call virEventRunDefaultImpl() one last time if the domain has failed to start? -- Andrea Bolognani / Red Hat / Virtualization

On 3/12/21 11:42 AM, Andrea Bolognani wrote:
On Mon, 2021-03-01 at 12:49 +0100, Michal Privoznik wrote:
+++ b/src/qemu/qemu_shim.c @@ -45,9 +45,12 @@ qemuShimEventLoop(void *opaque G_GNUC_UNUSED) while (!quit) { g_mutex_lock(&eventLock); if (eventQuitFlag && !eventPreventQuitFlag) { + quit = true; if (dom) { virDomainDestroy(dom); - quit = true; + } else { + g_mutex_unlock(&eventLock); + break; } } g_mutex_unlock(&eventLock);
I'm probably missing something obvious, but I thought this could be simply
while (!quit) { g_mutex_lock(&eventLock); if (eventQuitFlag && !eventPreventQuitFlag) { quit = true; if (dom) { virDomainDestroy(dom); } } g_mutex_unlock(&eventLock); virEventRunDefaultImpl(); }
Do we specifically want to avoid call virEventRunDefaultImpl() one last time if the domain has failed to start?
Huh, riddle me this - how come now this is sufficient, but back then when I was writing this it wasn't? diff --git i/src/qemu/qemu_shim.c w/src/qemu/qemu_shim.c index c10598df4b..5a9ffdd416 100644 --- i/src/qemu/qemu_shim.c +++ w/src/qemu/qemu_shim.c @@ -45,9 +45,9 @@ qemuShimEventLoop(void *opaque G_GNUC_UNUSED) while (!quit) { g_mutex_lock(&eventLock); if (eventQuitFlag && !eventPreventQuitFlag) { + quit = true; if (dom) { virDomainDestroy(dom); - quit = true; } } g_mutex_unlock(&eventLock); Will post it as v2, soon. Michal

On Fri, 2021-03-12 at 15:24 +0100, Michal Privoznik wrote:
On 3/12/21 11:42 AM, Andrea Bolognani wrote:
On Mon, 2021-03-01 at 12:49 +0100, Michal Privoznik wrote:
+++ b/src/qemu/qemu_shim.c @@ -45,9 +45,12 @@ qemuShimEventLoop(void *opaque G_GNUC_UNUSED) while (!quit) { g_mutex_lock(&eventLock); if (eventQuitFlag && !eventPreventQuitFlag) { + quit = true; if (dom) { virDomainDestroy(dom); - quit = true; + } else { + g_mutex_unlock(&eventLock); + break; } } g_mutex_unlock(&eventLock);
I'm probably missing something obvious, but I thought this could be simply
while (!quit) { g_mutex_lock(&eventLock); if (eventQuitFlag && !eventPreventQuitFlag) { quit = true; if (dom) { virDomainDestroy(dom); } } g_mutex_unlock(&eventLock); virEventRunDefaultImpl(); }
Do we specifically want to avoid call virEventRunDefaultImpl() one last time if the domain has failed to start?
Huh, riddle me this - how come now this is sufficient, but back then when I was writing this it wasn't?
diff --git i/src/qemu/qemu_shim.c w/src/qemu/qemu_shim.c index c10598df4b..5a9ffdd416 100644 --- i/src/qemu/qemu_shim.c +++ w/src/qemu/qemu_shim.c @@ -45,9 +45,9 @@ qemuShimEventLoop(void *opaque G_GNUC_UNUSED) while (!quit) { g_mutex_lock(&eventLock); if (eventQuitFlag && !eventPreventQuitFlag) { + quit = true; if (dom) { virDomainDestroy(dom); - quit = true; } } g_mutex_unlock(&eventLock);
Will post it as v2, soon.
I see v2 it in the archives[1], but it doesn't seem to have any intention of reaching my inbox (is libvir-list broken again?) so I'll just comment here. Assuming you have verified that the rewritten code still addresses the bug, Reviewed-by: Andrea Bolognani <abologna@redhat.com> to v2. [1] https://listman.redhat.com/archives/libvir-list/2021-March/msg00607.html -- Andrea Bolognani / Red Hat / Virtualization

Since v6.2.0-rc1~238 (and friends) QMP processing was moved to a per-domain thread. Therefore, it is now safe to call APIs from the event loop thread (e.g. just like qemu shim is doing in qemuShimEventLoop(). However, it is still important to let the event loop run after each API call (obviously). Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/drvqemu.html.in | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/docs/drvqemu.html.in b/docs/drvqemu.html.in index 31d3fee213..3cdd04aa1e 100644 --- a/docs/drvqemu.html.in +++ b/docs/drvqemu.html.in @@ -158,8 +158,10 @@ qemu+ssh://root@example.com/system (remote access, SSH tunnelled) in mind, applications must <strong>NEVER</strong> invoke API calls from the event loop thread itself, only other threads. Not following this rule will lead to deadlocks in the API. - This restriction is intended to be lifted in a future release - of libvirt, once QMP processing moves to a dedicated thread. + This restriction was lifted starting from 6.2.0 release, when + QMP processing moved to a dedicated thread. However, it is + important to let the event loop run after each API call, even + the ones made from the vent loop thread itself. </p> <h2><a id="security">Driver security architecture</a></h2> -- 2.26.2

On 3/1/21 5:49 AM, Michal Privoznik wrote:
Since v6.2.0-rc1~238 (and friends) QMP processing was moved to a per-domain thread. Therefore, it is now safe to call APIs from the event loop thread (e.g. just like qemu shim is doing in qemuShimEventLoop(). However, it is still important to let the event loop run after each API call (obviously).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/drvqemu.html.in | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/docs/drvqemu.html.in b/docs/drvqemu.html.in index 31d3fee213..3cdd04aa1e 100644 --- a/docs/drvqemu.html.in +++ b/docs/drvqemu.html.in @@ -158,8 +158,10 @@ qemu+ssh://root@example.com/system (remote access, SSH tunnelled) in mind, applications must <strong>NEVER</strong> invoke API calls from the event loop thread itself, only other threads. Not following this rule will lead to deadlocks in the API. - This restriction is intended to be lifted in a future release - of libvirt, once QMP processing moves to a dedicated thread. + This restriction was lifted starting from 6.2.0 release, when + QMP processing moved to a dedicated thread. However, it is + important to let the event loop run after each API call, even + the ones made from the vent loop thread itself.
event loop
</p>
<h2><a id="security">Driver security architecture</a></h2>
-- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org

On Mon, 2021-03-01 at 12:49 +0100, Michal Privoznik wrote:
+++ b/docs/drvqemu.html.in @@ -158,8 +158,10 @@ qemu+ssh://root@example.com/system (remote access, SSH tunnelled) in mind, applications must <strong>NEVER</strong> invoke API calls from the event loop thread itself, only other threads. Not following this rule will lead to deadlocks in the API. - This restriction is intended to be lifted in a future release - of libvirt, once QMP processing moves to a dedicated thread. + This restriction was lifted starting from 6.2.0 release, when + QMP processing moved to a dedicated thread. However, it is + important to let the event loop run after each API call, even + the ones made from the vent loop thread itself.
s/ vent/ event/g Reviewed-by: Andrea Bolognani <abologna@redhat.com> -- Andrea Bolognani / Red Hat / Virtualization

In theory, users might want to use a relative path as a root directory for embed drivers. But in practice, nothing in driver initialization (specifically QEMU driver since it's the only one that supports embedding now), is prepared for that. Document and enforce absolute paths. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1883725 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/drvqemu.html.in | 2 ++ src/libvirt.c | 6 ++++++ 2 files changed, 8 insertions(+) diff --git a/docs/drvqemu.html.in b/docs/drvqemu.html.in index 3cdd04aa1e..494dda56ef 100644 --- a/docs/drvqemu.html.in +++ b/docs/drvqemu.html.in @@ -73,6 +73,8 @@ qemu+ssh://root@example.com/system (remote access, SSH tunnelled) registered & be running an instance of the event loop. To open the driver in embedded mode the app use the new URI path and specify a virtual root directory under which the driver will create content. + The path to the root directory must be absolute. Passing a relative + path results in an error. </p> <pre> diff --git a/src/libvirt.c b/src/libvirt.c index 5778b5daee..f2f0efa0cb 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -996,6 +996,12 @@ virConnectOpenInternal(const char *name, if (!root) goto failed; + if (!g_path_is_absolute(root)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("root path must be absolute")); + goto failed; + } + if (virEventRequireImpl() < 0) goto failed; -- 2.26.2

On Mon, 2021-03-01 at 12:49 +0100, Michal Privoznik wrote:
In theory, users might want to use a relative path as a root directory for embed drivers. But in practice, nothing in driver initialization (specifically QEMU driver since it's the only one that supports embedding now), is prepared for that. Document and enforce absolute paths.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1883725 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/drvqemu.html.in | 2 ++ src/libvirt.c | 6 ++++++ 2 files changed, 8 insertions(+)
Reviewed-by: Andrea Bolognani <abologna@redhat.com> -- Andrea Bolognani / Red Hat / Virtualization

This problem is reproducible only with secret driver. When starting a domain via virt-qemu-run and both secret and (nonexistent) root directory specified this is what happens: 1) virt-qemu-run opens "secret:///embed?root=$rootdir" connection, which results in the secret driver initialization (done in secretStateInitialize()). During this process, the driver creates it's own configDir (derived from $rootdir) including those parents which don't exists yet. This is all done with the mode S_IRWXU and thus results in the $rootdir being created with very restrictive mode (specifically, +x is missing for group and others). 2) now, virt-qemu-run-opens "qemu:///embed?root=$rootdir" and calls virDomainCreateXML(). This results in the master-key.aes being written somewhere under the $rootdir and telling qemu where to find it. But because the secret driver created $rootdir with too restrictive mode, qemu can't access the file (even though it knows the full path) and fails to start. It looks like the best solution is to pre-create the root directory before opening any connection (letting any driver initialize itself) and set its mode to something less restrictive. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1859873 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_shim.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/src/qemu/qemu_shim.c b/src/qemu/qemu_shim.c index 21a24abade..a08bdcac6a 100644 --- a/src/qemu/qemu_shim.c +++ b/src/qemu/qemu_shim.c @@ -213,11 +213,16 @@ int main(int argc, char **argv) } tmproot = true; - if (chmod(root, 0755) < 0) { - g_printerr("%s: cannot chown temporary dir: %s\n", - argv[0], g_strerror(errno)); - goto cleanup; - } + } else if (g_mkdir_with_parents(root, 0755) < 0) { + g_printerr("%s: cannot create dir: %s\n", + argv[0], g_strerror(errno)); + goto cleanup; + } + + if (chmod(root, 0755) < 0) { + g_printerr("%s: cannot chmod temporary dir: %s\n", + argv[0], g_strerror(errno)); + goto cleanup; } escaped = g_uri_escape_string(root, NULL, true); -- 2.26.2

On Mon, 2021-03-01 at 12:49 +0100, Michal Privoznik wrote:
This problem is reproducible only with secret driver. When starting a domain via virt-qemu-run and both secret and (nonexistent) root directory specified this is what happens:
1) virt-qemu-run opens "secret:///embed?root=$rootdir" connection, which results in the secret driver initialization (done in secretStateInitialize()). During this process, the driver creates it's own configDir (derived from $rootdir)
s/it's own/its own/
including those parents which don't exists yet. This is all done with the mode S_IRWXU and thus results in the $rootdir being created with very restrictive mode (specifically, +x is missing for group and others).
2) now, virt-qemu-run-opens "qemu:///embed?root=$rootdir" and
s/run-opens/run opens/
+++ b/src/qemu/qemu_shim.c @@ -213,11 +213,16 @@ int main(int argc, char **argv) } tmproot = true;
- if (chmod(root, 0755) < 0) { - g_printerr("%s: cannot chown temporary dir: %s\n", - argv[0], g_strerror(errno)); - goto cleanup; - } + } else if (g_mkdir_with_parents(root, 0755) < 0) { + g_printerr("%s: cannot create dir: %s\n", + argv[0], g_strerror(errno)); + goto cleanup; + } + + if (chmod(root, 0755) < 0) { + g_printerr("%s: cannot chmod temporary dir: %s\n", + argv[0], g_strerror(errno)); + goto cleanup; }
Wouldn't it make sense to leave the chmod() bit where it was? g_mkdir_with_parents() already accepts the mode as a parameter, so calling chmod() again seems unnecessary. With that changed and the commit message fixed, Reviewed-by: Andrea Bolognani <abologna@redhat.com> -- Andrea Bolognani / Red Hat / Virtualization

On 3/12/21 11:51 AM, Andrea Bolognani wrote:
On Mon, 2021-03-01 at 12:49 +0100, Michal Privoznik wrote:
This problem is reproducible only with secret driver. When starting a domain via virt-qemu-run and both secret and (nonexistent) root directory specified this is what happens:
1) virt-qemu-run opens "secret:///embed?root=$rootdir" connection, which results in the secret driver initialization (done in secretStateInitialize()). During this process, the driver creates it's own configDir (derived from $rootdir)
s/it's own/its own/
including those parents which don't exists yet. This is all done with the mode S_IRWXU and thus results in the $rootdir being created with very restrictive mode (specifically, +x is missing for group and others).
2) now, virt-qemu-run-opens "qemu:///embed?root=$rootdir" and
s/run-opens/run opens/
+++ b/src/qemu/qemu_shim.c @@ -213,11 +213,16 @@ int main(int argc, char **argv) } tmproot = true;
- if (chmod(root, 0755) < 0) { - g_printerr("%s: cannot chown temporary dir: %s\n", - argv[0], g_strerror(errno)); - goto cleanup; - } + } else if (g_mkdir_with_parents(root, 0755) < 0) { + g_printerr("%s: cannot create dir: %s\n", + argv[0], g_strerror(errno)); + goto cleanup; + } + + if (chmod(root, 0755) < 0) { + g_printerr("%s: cannot chmod temporary dir: %s\n", + argv[0], g_strerror(errno)); + goto cleanup; }
Wouldn't it make sense to leave the chmod() bit where it was? g_mkdir_with_parents() already accepts the mode as a parameter, so calling chmod() again seems unnecessary.
Well, if the dir exists but doesn't have right perms then g_mkdir_with_parents() does nothing and we need that explicit chmod(). Michal

On Fri, 2021-03-12 at 15:38 +0100, Michal Privoznik wrote:
On 3/12/21 11:51 AM, Andrea Bolognani wrote:
On Mon, 2021-03-01 at 12:49 +0100, Michal Privoznik wrote:
@@ -213,11 +213,16 @@ int main(int argc, char **argv) } tmproot = true;
- if (chmod(root, 0755) < 0) { - g_printerr("%s: cannot chown temporary dir: %s\n", - argv[0], g_strerror(errno)); - goto cleanup; - } + } else if (g_mkdir_with_parents(root, 0755) < 0) { + g_printerr("%s: cannot create dir: %s\n", + argv[0], g_strerror(errno)); + goto cleanup; + } + + if (chmod(root, 0755) < 0) { + g_printerr("%s: cannot chmod temporary dir: %s\n", + argv[0], g_strerror(errno)); + goto cleanup; }
Wouldn't it make sense to leave the chmod() bit where it was? g_mkdir_with_parents() already accepts the mode as a parameter, so calling chmod() again seems unnecessary.
Well, if the dir exists but doesn't have right perms then g_mkdir_with_parents() does nothing and we need that explicit chmod().
Fair point, I hadn't considered that :) -- Andrea Bolognani / Red Hat / Virtualization
participants (3)
-
Andrea Bolognani
-
Eric Blake
-
Michal Privoznik