[libvirt] [PATCH 0/4] Couple of metadata locking fixes

[libvirt] [RFC PATCH v1 1/1] Add...

[libvirt] [PATCH 0/2] A few spec...

Michal Privoznik

Friday, 21 September 2018 Fri, 21 Sep '18

4:29 a.m.

Strictly speaking, only the last patch actually fixes a real problem. The first three are more of a cleanup than anything. However, I am still sending them because they make the code better. Anyway, if my explanation in 4/4 is not clear enough, here it is in simpler terms: It's important to bear in mind that at any point when connection to virtlockd is closed, virtlockd goes through its hash table and if it finds any resources acquired for the process that has closed the connection, it will kill the process. In other words, resources are per PID not per connection. For instance, if libvirtd would open two connections, acquired two different resources through each of them and then closed one it would get killed instantly because it still owns some resources. So now that this virtlockd behaviour is clear (I must say intended and desired behaviour), we can start looking into how libvirtd runs qemu. Basically, problem is that instead of opening connection to virtlockd once and keeping it open through whole run of libvirtd I used this shortcut: open the connection in virSecurityManagerMetadataLock(), dup() the connection FD, and close it in virSecurityManagerMetadataUnlock(). In theory, this works pretty well - when the duplicated FD is closed, libvirtd doesn't own any locks anymore so virtlockd doesn't kill it. What my design did not count with is fork(). On fork() the duplicated FD is cloned into the child and thus connection is closed nobody knows when. Meantime, libvirtd might continue its execution (e.g. starting another domain from another thread) and call virSecurityManagerMetadataLock(). However, before it gets to call Unlock() the child process decides to close the FD (either via exec() or exit()). Given virtlockd behaviour described couple of paragraphs above I guess it is clear to see now that virtlockd kills libvirtd. My solution to this is to always run secdriver transactions from a separate child process because on fork() only the thread that calls it is cloned. So only the thread that runs the transaction is cloned, not the one that is starting a different domain. Therefore no other fork() can occur here and therefore we are safe. I know, I know, it's complicated. But it always is around fork() and IPC. If you want to see my fix in action, just enable metadata locking, and try to start two or more domains in a loop: for ((i=0;i<1000;i++)); do \ virsh start u1604-1 & virsh start u1604-2 &\ sleep 3;\ virsh destroy u1604-1; virsh destroy u1604-2;\ done At some point you'll see virsh reporting I/O error (this is because virtlockd killed libvirtd). With my patch, I run the test multiple times without any hiccup. @Bjoern: I'm still unable to reproduce the issue you reported. However, whilst trying to do so I've came across this bug. However, my gut feeling is that this might help you. Michal Prívozník (4): security: Grab a reference to virSecurityManager for transactions virNetSocket: Be more safe with fork() around virNetSocketDupFD() virLockManagerLockDaemonAcquire: Duplicate client FD with CLOEXEC flag security: Always spawn process for transactions src/libxl/libxl_migration.c | 4 ++-- src/locking/domain_lock.c | 18 ++++++++++++++++++ src/locking/lock_driver_lockd.c | 2 +- src/qemu/qemu_migration.c | 2 +- src/rpc/virnetclient.c | 5 +++-- src/rpc/virnetclient.h | 2 +- src/rpc/virnetsocket.c | 7 ++----- src/rpc/virnetsocket.h | 2 +- src/security/security_dac.c | 17 +++++++++-------- src/security/security_selinux.c | 17 +++++++++-------- 10 files changed, 47 insertions(+), 29 deletions(-) -- 2.16.4

Show replies by date

Michal Privoznik

Friday, 21 September Fri, 21 Sep

4:29 a.m.

New subject: [libvirt] [PATCH 1/4] security: Grab a reference to virSecurityManager for transactions

This shouldn't be needed per-se. Security manager shouldn't disappear during transactions - it's immutable. However, it doesn't hurt to grab a reference either - transaction code uses it after all. Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> --- src/security/security_dac.c | 5 +++-- src/security/security_selinux.c | 5 +++-- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 2dbaf29ff5..5aea386e7c 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -141,6 +141,7 @@ virSecurityDACChownListFree(void *opaque) VIR_FREE(list->items[i]); } VIR_FREE(list->items); + virObjectUnref(list->manager); VIR_FREE(list); } @@ -511,12 +512,12 @@ virSecurityDACTransactionStart(virSecurityManagerPtr mgr) if (VIR_ALLOC(list) < 0) return -1; - list->manager = mgr; + list->manager = virObjectRef(mgr); if (virThreadLocalSet(&chownList, list) < 0) { virReportSystemError(errno, "%s", _("Unable to set thread local variable")); - VIR_FREE(list); + virSecurityDACChownListFree(list); return -1; } diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index 056637e4cb..31e42afee7 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -156,6 +156,7 @@ virSecuritySELinuxContextListFree(void *opaque) for (i = 0; i < list->nItems; i++) virSecuritySELinuxContextItemFree(list->items[i]); + virObjectUnref(list->manager); VIR_FREE(list); } @@ -1054,12 +1055,12 @@ virSecuritySELinuxTransactionStart(virSecurityManagerPtr mgr) if (VIR_ALLOC(list) < 0) return -1; - list->manager = mgr; + list->manager = virObjectRef(mgr); if (virThreadLocalSet(&contextList, list) < 0) { virReportSystemError(errno, "%s", _("Unable to set thread local variable")); - VIR_FREE(list); + virSecuritySELinuxContextListFree(list); return -1; } -- 2.16.4

Marc Hartmayer

9:07 a.m.

New subject: [libvirt] [PATCH 1/4] security: Grab a reference to virSecurityManager for transactions

On Fri, Sep 21, 2018 at 11:29 AM +0200, Michal Privoznik <mprivozn(a)redhat.com> wrote:

...

[…snip…]

...

diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index 056637e4cb..31e42afee7 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -156,6 +156,7 @@ virSecuritySELinuxContextListFree(void *opaque) for (i = 0; i < list->nItems; i++) virSecuritySELinuxContextItemFree(list->items[i]);

BTW, isn’t here VIR_FREE(list->items); missing? (see your commit ca250269b0af24b319989807ee50f2a87b421922).

...

+ virObjectUnref(list->manager); VIR_FREE(list); } @@ -1054,12 +1055,12 @@ virSecuritySELinuxTransactionStart(virSecurityManagerPtr mgr) if (VIR_ALLOC(list) < 0) return -1; - list->manager = mgr; + list->manager = virObjectRef(mgr); if (virThreadLocalSet(&contextList, list) < 0) { virReportSystemError(errno, "%s", _("Unable to set thread local variable")); - VIR_FREE(list); + virSecuritySELinuxContextListFree(list); return -1; } -- 2.16.4 -- libvir-list mailing list libvir-list(a)redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

Reviewed-by: Marc Hartmayer <mhartmay(a)linux.ibm.com> -- Kind regards / Beste Grüße Marc Hartmayer IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

John Ferlan

Tuesday, 25 September Tue, 25 Sep

10:25 a.m.

New subject: [libvirt] [PATCH 1/4] security: Grab a reference to virSecurityManager for transactions

On 9/21/18 5:29 AM, Michal Privoznik wrote:

...

FWIW: I agree w/ Marc's assessment. You need a patch 0.5 ;-) to add the VIR_FREE(list->items) for selinux. It should reference commit ca25026

...

diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 2dbaf29ff5..5aea386e7c 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -141,6 +141,7 @@ virSecurityDACChownListFree(void *opaque) VIR_FREE(list->items[i]); } VIR_FREE(list->items); + virObjectUnref(list->manager); VIR_FREE(list); } @@ -511,12 +512,12 @@ virSecurityDACTransactionStart(virSecurityManagerPtr mgr) if (VIR_ALLOC(list) < 0) return -1; - list->manager = mgr; + list->manager = virObjectRef(mgr);

If you move ^^^ to below vvv, then I think the VIR_FREE could still apply. Realistically all that's happening by calling ListFree is the Unref of list->manager. Same for _selinux. In fact, that'd probably be the more proper order with the Ref(mgr) being done. With all that, Reviewed-by: John Ferlan <jferlan(a)redhat.com> John I also assume you can add the VIR_FREE(list->items) to _selinux with an appropriate commit message as a pre-patch and that too has my R-By.

...

if (virThreadLocalSet(&chownList, list) < 0) { virReportSystemError(errno, "%s", _("Unable to set thread local variable")); - VIR_FREE(list); + virSecurityDACChownListFree(list); return -1; } diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index 056637e4cb..31e42afee7 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -156,6 +156,7 @@ virSecuritySELinuxContextListFree(void *opaque) for (i = 0; i < list->nItems; i++) virSecuritySELinuxContextItemFree(list->items[i]); + virObjectUnref(list->manager); VIR_FREE(list); } @@ -1054,12 +1055,12 @@ virSecuritySELinuxTransactionStart(virSecurityManagerPtr mgr) if (VIR_ALLOC(list) < 0) return -1; - list->manager = mgr; + list->manager = virObjectRef(mgr); if (virThreadLocalSet(&contextList, list) < 0) { virReportSystemError(errno, "%s", _("Unable to set thread local variable")); - VIR_FREE(list); + virSecuritySELinuxContextListFree(list); return -1; }

Michal Privoznik

Friday, 21 September Fri, 21 Sep

4:29 a.m.

New subject: [libvirt] [PATCH 2/4] virNetSocket: Be more safe with fork() around virNetSocketDupFD()

If there was a caller which would dup the client FD without CLOEXEC flag and later decided to change the flag it wouldn't be safe to do because fork() might have had occurred meantime. Switch to the other pattern - always dup FD with the flag set and let callers clear the flag if they need to do so. Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> --- src/libxl/libxl_migration.c | 4 ++-- src/qemu/qemu_migration.c | 2 +- src/rpc/virnetclient.c | 10 +++++++++- src/rpc/virnetsocket.c | 7 ++----- src/rpc/virnetsocket.h | 2 +- 5 files changed, 15 insertions(+), 10 deletions(-) diff --git a/src/libxl/libxl_migration.c b/src/libxl/libxl_migration.c index fc7ccb53d0..5eb8eb1203 100644 --- a/src/libxl/libxl_migration.c +++ b/src/libxl/libxl_migration.c @@ -310,7 +310,7 @@ libxlMigrateDstReceive(virNetSocketPtr sock, } VIR_DEBUG("Accepted migration connection." " Spawning thread to process migration data"); - recvfd = virNetSocketDupFD(client_sock, true); + recvfd = virNetSocketDupFD(client_sock); virObjectUnref(client_sock); /* @@ -1254,7 +1254,7 @@ libxlDomainMigrationSrcPerform(libxlDriverPrivatePtr driver, goto cleanup; } - sockfd = virNetSocketDupFD(sock, true); + sockfd = virNetSocketDupFD(sock); virObjectUnref(sock); if (virDomainLockProcessPause(driver->lockManager, vm, &priv->lockState) < 0) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 825a9d399b..129be0a11a 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -3303,7 +3303,7 @@ qemuMigrationSrcConnect(virQEMUDriverPtr driver, if (virNetSocketNewConnectTCP(host, port, AF_UNSPEC, &sock) == 0) { - spec->dest.fd.qemu = virNetSocketDupFD(sock, true); + spec->dest.fd.qemu = virNetSocketDupFD(sock); virObjectUnref(sock); } if (qemuSecurityClearSocketLabel(driver->securityManager, vm->def) < 0 || diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c index b4d8fb2187..40ed3fde6d 100644 --- a/src/rpc/virnetclient.c +++ b/src/rpc/virnetclient.c @@ -715,8 +715,16 @@ int virNetClientGetFD(virNetClientPtr client) int virNetClientDupFD(virNetClientPtr client, bool cloexec) { int fd; + virObjectLock(client); - fd = virNetSocketDupFD(client->sock, cloexec); + + fd = virNetSocketDupFD(client->sock); + if (!cloexec && fd >= 0 && virSetInherit(fd, true)) { + virReportSystemError(errno, "%s", + _("Cannot disable close-on-exec flag")); + VIR_FORCE_CLOSE(fd); + } + virObjectUnlock(client); return fd; } diff --git a/src/rpc/virnetsocket.c b/src/rpc/virnetsocket.c index 55de3b2aad..27ffa23bcd 100644 --- a/src/rpc/virnetsocket.c +++ b/src/rpc/virnetsocket.c @@ -1372,14 +1372,11 @@ int virNetSocketGetFD(virNetSocketPtr sock) } -int virNetSocketDupFD(virNetSocketPtr sock, bool cloexec) +int virNetSocketDupFD(virNetSocketPtr sock) { int fd; - if (cloexec) - fd = fcntl(sock->fd, F_DUPFD_CLOEXEC, 0); - else - fd = dup(sock->fd); + fd = fcntl(sock->fd, F_DUPFD_CLOEXEC, 0); if (fd < 0) { virReportSystemError(errno, "%s", _("Unable to copy socket file handle")); diff --git a/src/rpc/virnetsocket.h b/src/rpc/virnetsocket.h index de795af917..e6bac27566 100644 --- a/src/rpc/virnetsocket.h +++ b/src/rpc/virnetsocket.h @@ -124,7 +124,7 @@ virNetSocketPtr virNetSocketNewPostExecRestart(virJSONValuePtr object); virJSONValuePtr virNetSocketPreExecRestart(virNetSocketPtr sock); int virNetSocketGetFD(virNetSocketPtr sock); -int virNetSocketDupFD(virNetSocketPtr sock, bool cloexec); +int virNetSocketDupFD(virNetSocketPtr sock); bool virNetSocketIsLocal(virNetSocketPtr sock); -- 2.16.4

Marc Hartmayer

9:15 a.m.

New subject: [libvirt] [PATCH 2/4] virNetSocket: Be more safe with fork() around virNetSocketDupFD()

On Fri, Sep 21, 2018 at 11:29 AM +0200, Michal Privoznik <mprivozn(a)redhat.com> wrote:

...

[…snip…]

...

-int virNetSocketDupFD(virNetSocketPtr sock, bool cloexec) +int virNetSocketDupFD(virNetSocketPtr sock)

Might be useful to document this function.

...

{ int fd; - if (cloexec) - fd = fcntl(sock->fd, F_DUPFD_CLOEXEC, 0); - else - fd = dup(sock->fd); + fd = fcntl(sock->fd, F_DUPFD_CLOEXEC, 0); if (fd < 0) { virReportSystemError(errno, "%s", _("Unable to copy socket file handle")); diff --git a/src/rpc/virnetsocket.h b/src/rpc/virnetsocket.h index de795af917..e6bac27566 100644 --- a/src/rpc/virnetsocket.h +++ b/src/rpc/virnetsocket.h @@ -124,7 +124,7 @@ virNetSocketPtr virNetSocketNewPostExecRestart(virJSONValuePtr object); virJSONValuePtr virNetSocketPreExecRestart(virNetSocketPtr sock); int virNetSocketGetFD(virNetSocketPtr sock); -int virNetSocketDupFD(virNetSocketPtr sock, bool cloexec); +int virNetSocketDupFD(virNetSocketPtr sock); bool virNetSocketIsLocal(virNetSocketPtr sock); -- 2.16.4 -- libvir-list mailing list libvir-list(a)redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

Anyway, Reviewed-by: Marc Hartmayer <mhartmay(a)linux.ibm.com> -- Kind regards / Beste Grüße Marc Hartmayer IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

John Ferlan

Tuesday, 25 September Tue, 25 Sep

11:25 a.m.

New subject: [libvirt] [PATCH 2/4] virNetSocket: Be more safe with fork() around virNetSocketDupFD()

On 9/21/18 5:29 AM, Michal Privoznik wrote:

...

If there was a caller which would dup the client FD without CLOEXEC flag and later decided to change the flag it wouldn't be safe to do because fork() might have had occurred meantime.

blank line

...

Switch to the other pattern - always dup FD with the flag set and let callers clear the flag if they need to do so.

I don't see this last sentence as happening in this patch...

...

Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> --- src/libxl/libxl_migration.c | 4 ++-- src/qemu/qemu_migration.c | 2 +- src/rpc/virnetclient.c | 10 +++++++++- src/rpc/virnetsocket.c | 7 ++----- src/rpc/virnetsocket.h | 2 +- 5 files changed, 15 insertions(+), 10 deletions(-)

I'm with Marc on this - documenting virNetSocketDupFD is a good thing. Although it could be considered out of the scope of what you're doing, since you are changing the semantics describing the nuance that fork brings about w/r/t fd duplication in the code rather than the commit message doesn't make someone have to go back and look at the commit message to gain more insight. It also should provide the reader with the knowledge of what virSetInherit would provide to remove if so desired. Of course while you're at it, documenting virNetClientDupFD similarly would be quite beneficial.

...

diff --git a/src/libxl/libxl_migration.c b/src/libxl/libxl_migration.c index fc7ccb53d0..5eb8eb1203 100644 --- a/src/libxl/libxl_migration.c +++ b/src/libxl/libxl_migration.c @@ -310,7 +310,7 @@ libxlMigrateDstReceive(virNetSocketPtr sock, } VIR_DEBUG("Accepted migration connection." " Spawning thread to process migration data"); - recvfd = virNetSocketDupFD(client_sock, true); + recvfd = virNetSocketDupFD(client_sock); virObjectUnref(client_sock); /* @@ -1254,7 +1254,7 @@ libxlDomainMigrationSrcPerform(libxlDriverPrivatePtr driver, goto cleanup; } - sockfd = virNetSocketDupFD(sock, true); + sockfd = virNetSocketDupFD(sock); virObjectUnref(sock); if (virDomainLockProcessPause(driver->lockManager, vm, &priv->lockState) < 0) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 825a9d399b..129be0a11a 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -3303,7 +3303,7 @@ qemuMigrationSrcConnect(virQEMUDriverPtr driver, if (virNetSocketNewConnectTCP(host, port, AF_UNSPEC, &sock) == 0) { - spec->dest.fd.qemu = virNetSocketDupFD(sock, true); + spec->dest.fd.qemu = virNetSocketDupFD(sock); virObjectUnref(sock); } if (qemuSecurityClearSocketLabel(driver->securityManager, vm->def) < 0 || diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c index b4d8fb2187..40ed3fde6d 100644 --- a/src/rpc/virnetclient.c +++ b/src/rpc/virnetclient.c @@ -715,8 +715,16 @@ int virNetClientGetFD(virNetClientPtr client) int virNetClientDupFD(virNetClientPtr client, bool cloexec) { int fd; + virObjectLock(client); - fd = virNetSocketDupFD(client->sock, cloexec); + + fd = virNetSocketDupFD(client->sock); + if (!cloexec && fd >= 0 && virSetInherit(fd, true)) { + virReportSystemError(errno, "%s", + _("Cannot disable close-on-exec flag")); + VIR_FORCE_CLOSE(fd); + } +

Caveat: my sock knowledge of setting or duplication of the socket fd isn't that deep, especially as it relates to fork "mechanics". So prior to this change, the call to virNetSocketDupFD(fd, false) would: fd = dup(sock->fd); Or essentially create a new socket fd from the existing one copying everything it already has... With this change you have (discounting some new error paths) fd = fcntl(sock->fd, F_DUPFD_CLOEXEC, 0); fcntl(fd, F_SETFD, ~FD_CLOEXEC) which seems to be a nop and/or unnecessary to me. Beyond that if the sock->fd didn't have CLOEXEC already, then what impact does briefly changing CLOEXEC have on something that arrives while set? And of course if sock->fd had CLOEXEC and then you clear it on the 'shared' socket, that doesn't seem right either. Seems to me the dup() does the trick and then lets the caller decide what CLOEXEC semantics they want with the duplicated fd without perhaps changing the sock->fd semantics.

...

virObjectUnlock(client); return fd; } diff --git a/src/rpc/virnetsocket.c b/src/rpc/virnetsocket.c index 55de3b2aad..27ffa23bcd 100644 --- a/src/rpc/virnetsocket.c +++ b/src/rpc/virnetsocket.c @@ -1372,14 +1372,11 @@ int virNetSocketGetFD(virNetSocketPtr sock) } -int virNetSocketDupFD(virNetSocketPtr sock, bool cloexec) +int virNetSocketDupFD(virNetSocketPtr sock) { int fd; - if (cloexec) - fd = fcntl(sock->fd, F_DUPFD_CLOEXEC, 0); - else - fd = dup(sock->fd); + fd = fcntl(sock->fd, F_DUPFD_CLOEXEC, 0);

So removing this "instance" of fork and client usage - is there no belief that some day it may be a desirable option. I think the end result you have in patch3 do the SetInherit call is probably still proper since at that point you know by calling w/ false all you'd be getting is a straight duplicate of what exists and the decision at that point is to "alter" the view of the socket you have received to change the CLOEXEC value John hopefully I'm not too far off into the weeds on this, but I'm not convinced this patch is right even though I do think documenting the semantics would be a really good thing to happen because I cannot imagine the nuance(s) is(are) all that well understood.

...

if (fd < 0) { virReportSystemError(errno, "%s", _("Unable to copy socket file handle")); diff --git a/src/rpc/virnetsocket.h b/src/rpc/virnetsocket.h index de795af917..e6bac27566 100644 --- a/src/rpc/virnetsocket.h +++ b/src/rpc/virnetsocket.h @@ -124,7 +124,7 @@ virNetSocketPtr virNetSocketNewPostExecRestart(virJSONValuePtr object); virJSONValuePtr virNetSocketPreExecRestart(virNetSocketPtr sock); int virNetSocketGetFD(virNetSocketPtr sock); -int virNetSocketDupFD(virNetSocketPtr sock, bool cloexec); +int virNetSocketDupFD(virNetSocketPtr sock); bool virNetSocketIsLocal(virNetSocketPtr sock);

Michal Privoznik

Wednesday, 26 September Wed, 26 Sep

4:46 a.m.

New subject: [libvirt] [PATCH 2/4] virNetSocket: Be more safe with fork() around virNetSocketDupFD()

On 09/25/2018 06:25 PM, John Ferlan wrote:

...

On 9/21/18 5:29 AM, Michal Privoznik wrote: > If there was a caller which would dup the client FD without > CLOEXEC flag and later decided to change the flag it wouldn't be > safe to do because fork() might have had occurred meantime. blank line > Switch to the other pattern - always dup FD with the flag set and > let callers clear the flag if they need to do so. I don't see this last sentence as happening in this patch... > > Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> > --- > src/libxl/libxl_migration.c | 4 ++-- > src/qemu/qemu_migration.c | 2 +- > src/rpc/virnetclient.c | 10 +++++++++- > src/rpc/virnetsocket.c | 7 ++----- > src/rpc/virnetsocket.h | 2 +- > 5 files changed, 15 insertions(+), 10 deletions(-) > I'm with Marc on this - documenting virNetSocketDupFD is a good thing. Although it could be considered out of the scope of what you're doing, since you are changing the semantics describing the nuance that fork brings about w/r/t fd duplication in the code rather than the commit message doesn't make someone have to go back and look at the commit message to gain more insight. It also should provide the reader with the knowledge of what virSetInherit would provide to remove if so desired. Of course while you're at it, documenting virNetClientDupFD similarly would be quite beneficial. > diff --git a/src/libxl/libxl_migration.c b/src/libxl/libxl_migration.c > index fc7ccb53d0..5eb8eb1203 100644 > --- a/src/libxl/libxl_migration.c > +++ b/src/libxl/libxl_migration.c > @@ -310,7 +310,7 @@ libxlMigrateDstReceive(virNetSocketPtr sock, > } > VIR_DEBUG("Accepted migration connection." > " Spawning thread to process migration data"); > - recvfd = virNetSocketDupFD(client_sock, true); > + recvfd = virNetSocketDupFD(client_sock); > virObjectUnref(client_sock); > > /* > @@ -1254,7 +1254,7 @@ libxlDomainMigrationSrcPerform(libxlDriverPrivatePtr driver, > goto cleanup; > } > > - sockfd = virNetSocketDupFD(sock, true); > + sockfd = virNetSocketDupFD(sock); > virObjectUnref(sock); > > if (virDomainLockProcessPause(driver->lockManager, vm, &priv->lockState) < 0) > diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c > index 825a9d399b..129be0a11a 100644 > --- a/src/qemu/qemu_migration.c > +++ b/src/qemu/qemu_migration.c > @@ -3303,7 +3303,7 @@ qemuMigrationSrcConnect(virQEMUDriverPtr driver, > if (virNetSocketNewConnectTCP(host, port, > AF_UNSPEC, > &sock) == 0) { > - spec->dest.fd.qemu = virNetSocketDupFD(sock, true); > + spec->dest.fd.qemu = virNetSocketDupFD(sock); > virObjectUnref(sock); > } > if (qemuSecurityClearSocketLabel(driver->securityManager, vm->def) < 0 || > diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c > index b4d8fb2187..40ed3fde6d 100644 > --- a/src/rpc/virnetclient.c > +++ b/src/rpc/virnetclient.c > @@ -715,8 +715,16 @@ int virNetClientGetFD(virNetClientPtr client) > int virNetClientDupFD(virNetClientPtr client, bool cloexec) > { > int fd; > + > virObjectLock(client); > - fd = virNetSocketDupFD(client->sock, cloexec); > + > + fd = virNetSocketDupFD(client->sock); > + if (!cloexec && fd >= 0 && virSetInherit(fd, true)) { > + virReportSystemError(errno, "%s", > + _("Cannot disable close-on-exec flag")); > + VIR_FORCE_CLOSE(fd); > + } > + Caveat: my sock knowledge of setting or duplication of the socket fd isn't that deep, especially as it relates to fork "mechanics". So prior to this change, the call to virNetSocketDupFD(fd, false) would: fd = dup(sock->fd); Or essentially create a new socket fd from the existing one copying everything it already has... With this change you have (discounting some new error paths) fd = fcntl(sock->fd, F_DUPFD_CLOEXEC, 0); fcntl(fd, F_SETFD, ~FD_CLOEXEC) which seems to be a nop and/or unnecessary to me. Beyond that if the sock->fd didn't have CLOEXEC already, then what impact does briefly changing CLOEXEC have on something that arrives while set?

So imagine the following happening. There are two threads A,B and they do the following: thread A | thread B | child ----------------+-------------+-------- copy = dup(fd); | | | fork(); | fcntl(copy); | | | | exec(); (the fcntl() call is there to set CLOEXEC flag) This will obviously not work as expected because when B does fork() both @copy and @fd descriptors are copied to child. So even when A then tries to set CLOEXEC flag it won't matter as it doesn't affect FDs inside child. In the end, @copy is leaked to a process that child exec()-s. My patches make this safer: thread A | thread B | child ------------------+-------------+-------- copy = fcntl(fd); | | | XXX | | fork(); | | | exec(); (here, the fcntl() call is to atomically duplicate the @fd and set CLOEXEC flag) It's clean to see that no matter where fork() is placed, @copy is not leaked into the new process. Either @copy exists in child but it's closed on exec() (because of the flag), or fork() happened before fcntl() call and there is no @copy in child at all. The XXX can be replaced with virSetInherit() if the code intentionally wants to leak @copy into exec().

...

And of course if sock->fd had CLOEXEC and then you clear it on the 'shared' socket, that doesn't seem right either.

In fact it does. When starting a domain, we fork() plenty of times. Also, fork() happens in secdriver transactions, when fetching qemu caps, on every virCommandRun(), etc. What I'm trying to say is that fork() + exec() happen pretty asynchronously and therefore we can't do dup() + fcntl() because that is not atomic (as shown above). In fact, it's fairly easy to see this problem in action. What was sufficient for me was to start two domains at once from a loop. As I say in the cover letter. While one thread was doing the metada lock (where connection FD to virtlockd is duplicated), the other did fork() to start qemu and the connection FD got leaked there.

...

Seems to me the dup() does the trick and then lets the caller decide what CLOEXEC semantics they want with the duplicated fd without perhaps changing the sock->fd semantics.

No, this is not atomic and therefore it's plain wrong. Michal

John Ferlan

9:20 a.m.

New subject: [libvirt] [PATCH 2/4] virNetSocket: Be more safe with fork() around virNetSocketDupFD()

On 9/26/18 5:46 AM, Michal Privoznik wrote:

...

On 09/25/2018 06:25 PM, John Ferlan wrote: > > > On 9/21/18 5:29 AM, Michal Privoznik wrote: >> If there was a caller which would dup the client FD without >> CLOEXEC flag and later decided to change the flag it wouldn't be >> safe to do because fork() might have had occurred meantime. > > blank line > >> Switch to the other pattern - always dup FD with the flag set and >> let callers clear the flag if they need to do so. > > I don't see this last sentence as happening in this patch... > >> >> Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> >> --- >> src/libxl/libxl_migration.c | 4 ++-- >> src/qemu/qemu_migration.c | 2 +- >> src/rpc/virnetclient.c | 10 +++++++++- >> src/rpc/virnetsocket.c | 7 ++----- >> src/rpc/virnetsocket.h | 2 +- >> 5 files changed, 15 insertions(+), 10 deletions(-) >> > > I'm with Marc on this - documenting virNetSocketDupFD is a good thing. > Although it could be considered out of the scope of what you're doing, > since you are changing the semantics describing the nuance that fork > brings about w/r/t fd duplication in the code rather than the commit > message doesn't make someone have to go back and look at the commit > message to gain more insight. It also should provide the reader with > the knowledge of what virSetInherit would provide to remove if so desired. > > Of course while you're at it, documenting virNetClientDupFD similarly > would be quite beneficial. > >> diff --git a/src/libxl/libxl_migration.c b/src/libxl/libxl_migration.c >> index fc7ccb53d0..5eb8eb1203 100644 >> --- a/src/libxl/libxl_migration.c >> +++ b/src/libxl/libxl_migration.c >> @@ -310,7 +310,7 @@ libxlMigrateDstReceive(virNetSocketPtr sock, >> } >> VIR_DEBUG("Accepted migration connection." >> " Spawning thread to process migration data"); >> - recvfd = virNetSocketDupFD(client_sock, true); >> + recvfd = virNetSocketDupFD(client_sock); >> virObjectUnref(client_sock); >> >> /* >> @@ -1254,7 +1254,7 @@ libxlDomainMigrationSrcPerform(libxlDriverPrivatePtr driver, >> goto cleanup; >> } >> >> - sockfd = virNetSocketDupFD(sock, true); >> + sockfd = virNetSocketDupFD(sock); >> virObjectUnref(sock); >> >> if (virDomainLockProcessPause(driver->lockManager, vm, &priv->lockState) < 0) >> diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c >> index 825a9d399b..129be0a11a 100644 >> --- a/src/qemu/qemu_migration.c >> +++ b/src/qemu/qemu_migration.c >> @@ -3303,7 +3303,7 @@ qemuMigrationSrcConnect(virQEMUDriverPtr driver, >> if (virNetSocketNewConnectTCP(host, port, >> AF_UNSPEC, >> &sock) == 0) { >> - spec->dest.fd.qemu = virNetSocketDupFD(sock, true); >> + spec->dest.fd.qemu = virNetSocketDupFD(sock); >> virObjectUnref(sock); >> } >> if (qemuSecurityClearSocketLabel(driver->securityManager, vm->def) < 0 || >> diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c >> index b4d8fb2187..40ed3fde6d 100644 >> --- a/src/rpc/virnetclient.c >> +++ b/src/rpc/virnetclient.c >> @@ -715,8 +715,16 @@ int virNetClientGetFD(virNetClientPtr client) >> int virNetClientDupFD(virNetClientPtr client, bool cloexec) >> { >> int fd; >> + >> virObjectLock(client); >> - fd = virNetSocketDupFD(client->sock, cloexec); >> + >> + fd = virNetSocketDupFD(client->sock); >> + if (!cloexec && fd >= 0 && virSetInherit(fd, true)) { >> + virReportSystemError(errno, "%s", >> + _("Cannot disable close-on-exec flag")); >> + VIR_FORCE_CLOSE(fd); >> + } >> + > > Caveat: my sock knowledge of setting or duplication of the socket fd > isn't that deep, especially as it relates to fork "mechanics". > > So prior to this change, the call to virNetSocketDupFD(fd, false) would: > > fd = dup(sock->fd); > > Or essentially create a new socket fd from the existing one copying > everything it already has... > > With this change you have (discounting some new error paths) > > fd = fcntl(sock->fd, F_DUPFD_CLOEXEC, 0); > fcntl(fd, F_SETFD, ~FD_CLOEXEC) > > which seems to be a nop and/or unnecessary to me. > > Beyond that if the sock->fd didn't have CLOEXEC already, then what > impact does briefly changing CLOEXEC have on something that arrives > while set? So imagine the following happening. There are two threads A,B and they do the following: thread A | thread B | child ----------------+-------------+-------- copy = dup(fd); | | | fork(); | fcntl(copy); | | | | exec(); (the fcntl() call is there to set CLOEXEC flag) This will obviously not work as expected because when B does fork() both @copy and @fd descriptors are copied to child. So even when A then tries to set CLOEXEC flag it won't matter as it doesn't affect FDs inside child. In the end, @copy is leaked to a process that child exec()-s.

So I'm going to have to read more docs then, <sigh>. Is there any question why documenting what these functions do and what edge conditions are being avoided is so important? IOW: Which way to call the function based on what the caller really wants. I see that the dup() man page says it's a copy of the original except that the FD_CLOEXEC is not set and fcntl(F_DUPFD_CLOEXEC) is a copy of the original except that FD_CLOEXEC is set. That would also seem to imply to me that fcntl(F_DUPFD) would be the same as dup(), but who knows at this point. So let's go back to your example - "thread A" calls virNetSocketDupFD: if (cloexec) fd = fcntl(sock->fd, F_DUPFD_CLOEXEC, 0); else fd = dup(sock->fd); So using your logic, that means @cloexec == false meaning the caller wants a duplicate socket without the FD_CLOEXEC. Then you note that "the fcntl() call is there to set CLOEXEC flag" - so that means to me someone called the function with @false and then set the flag afterwards. Which as you note is a problem, which I agree; however, isn't that a problem with the caller? If the caller called w/ @cloexec == true and then disabled - they'd have the same issue. That's why the code is written as it is, isn't it? I'm trying to consider this patch on it's own merits and now you're requiring every fd to be duplicated with the CLOEXEC flag and then afterwards it's cleared if the virNetClientDupFD consumer wanted it cleared. It seems to me that this patch is adding and unnecessary step that causes the threadB/child processing problem. Maybe I just don't know enough about this socket duplication processing, so hopefully you'll find someone that does. Because I'm totally missing the point. John

...

My patches make this safer: thread A | thread B | child ------------------+-------------+-------- copy = fcntl(fd); | | | XXX | | fork(); | | | exec(); (here, the fcntl() call is to atomically duplicate the @fd and set CLOEXEC flag) It's clean to see that no matter where fork() is placed, @copy is not leaked into the new process. Either @copy exists in child but it's closed on exec() (because of the flag), or fork() happened before fcntl() call and there is no @copy in child at all. The XXX can be replaced with virSetInherit() if the code intentionally wants to leak @copy into exec(). > > And of course if sock->fd had CLOEXEC and then you clear it on the > 'shared' socket, that doesn't seem right either. In fact it does. When starting a domain, we fork() plenty of times. Also, fork() happens in secdriver transactions, when fetching qemu caps, on every virCommandRun(), etc. What I'm trying to say is that fork() + exec() happen pretty asynchronously and therefore we can't do dup() + fcntl() because that is not atomic (as shown above). In fact, it's fairly easy to see this problem in action. What was sufficient for me was to start two domains at once from a loop. As I say in the cover letter. While one thread was doing the metada lock (where connection FD to virtlockd is duplicated), the other did fork() to start qemu and the connection FD got leaked there. > > Seems to me the dup() does the trick and then lets the caller decide > what CLOEXEC semantics they want with the duplicated fd without perhaps > changing the sock->fd semantics. No, this is not atomic and therefore it's plain wrong. Michal

Michal Privoznik

Friday, 21 September Fri, 21 Sep

4:29 a.m.

New subject: [libvirt] [PATCH 3/4] virLockManagerLockDaemonAcquire: Duplicate client FD with CLOEXEC flag

There is one caller (virSecurityManagerMetadataLock) which duplicates the connection FD and wants to have the flag set. However, trying to set the flag after dup() is not safe as another thread might fork() meanwhile. Therefore, switch to duplicating with the flag set and only let callers refine this later. Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> --- src/locking/domain_lock.c | 18 ++++++++++++++++++ src/locking/lock_driver_lockd.c | 2 +- src/rpc/virnetclient.c | 9 +-------- src/rpc/virnetclient.h | 2 +- 4 files changed, 21 insertions(+), 10 deletions(-) diff --git a/src/locking/domain_lock.c b/src/locking/domain_lock.c index 705b132457..db20fa86a3 100644 --- a/src/locking/domain_lock.c +++ b/src/locking/domain_lock.c @@ -188,6 +188,24 @@ int virDomainLockProcessStart(virLockManagerPluginPtr plugin, ret = virLockManagerAcquire(lock, NULL, flags, dom->def->onLockFailure, fd); + if (ret >= 0 && fd && *fd >= 0 && virSetInherit(*fd, true) < 0) { + int saved_errno = errno; + virErrorPtr origerr; + + virErrorPreserveLast(&origerr); + + if (virLockManagerRelease(lock, NULL, 0) < 0) + VIR_WARN("Unable to release acquired resourced in cleanup path"); + + virErrorRestore(&origerr); + errno = saved_errno; + + virReportSystemError(errno, "%s", + _("Cannot disable close-on-exec flag")); + + ret = -1; + } + virLockManagerFree(lock); return ret; diff --git a/src/locking/lock_driver_lockd.c b/src/locking/lock_driver_lockd.c index 0c672b05b1..9b1943daa6 100644 --- a/src/locking/lock_driver_lockd.c +++ b/src/locking/lock_driver_lockd.c @@ -796,7 +796,7 @@ static int virLockManagerLockDaemonAcquire(virLockManagerPtr lock, goto cleanup; if (fd && - (*fd = virNetClientDupFD(client, false)) < 0) + (*fd = virNetClientDupFD(client)) < 0) goto cleanup; if (!(flags & VIR_LOCK_MANAGER_ACQUIRE_REGISTER_ONLY)) { diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c index 40ed3fde6d..6b0ddbeaad 100644 --- a/src/rpc/virnetclient.c +++ b/src/rpc/virnetclient.c @@ -712,19 +712,12 @@ int virNetClientGetFD(virNetClientPtr client) } -int virNetClientDupFD(virNetClientPtr client, bool cloexec) +int virNetClientDupFD(virNetClientPtr client) { int fd; virObjectLock(client); - fd = virNetSocketDupFD(client->sock); - if (!cloexec && fd >= 0 && virSetInherit(fd, true)) { - virReportSystemError(errno, "%s", - _("Cannot disable close-on-exec flag")); - VIR_FORCE_CLOSE(fd); - } - virObjectUnlock(client); return fd; } diff --git a/src/rpc/virnetclient.h b/src/rpc/virnetclient.h index 9cf32091f5..3702f7fe5a 100644 --- a/src/rpc/virnetclient.h +++ b/src/rpc/virnetclient.h @@ -95,7 +95,7 @@ void virNetClientSetCloseCallback(virNetClientPtr client, virFreeCallback ff); int virNetClientGetFD(virNetClientPtr client); -int virNetClientDupFD(virNetClientPtr client, bool cloexec); +int virNetClientDupFD(virNetClientPtr client); bool virNetClientHasPassFD(virNetClientPtr client); -- 2.16.4

John Ferlan

Tuesday, 25 September Tue, 25 Sep

11:29 a.m.

New subject: [libvirt] [PATCH 3/4] virLockManagerLockDaemonAcquire: Duplicate client FD with CLOEXEC flag

On 9/21/18 5:29 AM, Michal Privoznik wrote:

...

diff --git a/src/locking/domain_lock.c b/src/locking/domain_lock.c index 705b132457..db20fa86a3 100644 --- a/src/locking/domain_lock.c +++ b/src/locking/domain_lock.c @@ -188,6 +188,24 @@ int virDomainLockProcessStart(virLockManagerPluginPtr plugin, ret = virLockManagerAcquire(lock, NULL, flags, dom->def->onLockFailure, fd); + if (ret >= 0 && fd && *fd >= 0 && virSetInherit(*fd, true) < 0) {

Not quite sure 'how' ret > 0, but I'll go with it considering other consumers perform the same check.

...

+ int saved_errno = errno; + virErrorPtr origerr; + + virErrorPreserveLast(&origerr); + + if (virLockManagerRelease(lock, NULL, 0) < 0) + VIR_WARN("Unable to release acquired resourced in cleanup path");

*resource(s)

...

+ + virErrorRestore(&origerr); + errno = saved_errno; + + virReportSystemError(errno, "%s", + _("Cannot disable close-on-exec flag")); + + ret = -1; + } +

OK so this perhaps *is* the only thing you're after. Discounting patch2, you get a dup()'d socket and you then remove the CLOEXEC from it. The rest of the patch doesn't seem to matter. Perhaps some day there is a virNetClientDupFD consumer that wants to pass @true, then they have to add back all that you removed. John

...

virLockManagerFree(lock); return ret; diff --git a/src/locking/lock_driver_lockd.c b/src/locking/lock_driver_lockd.c index 0c672b05b1..9b1943daa6 100644 --- a/src/locking/lock_driver_lockd.c +++ b/src/locking/lock_driver_lockd.c @@ -796,7 +796,7 @@ static int virLockManagerLockDaemonAcquire(virLockManagerPtr lock, goto cleanup; if (fd && - (*fd = virNetClientDupFD(client, false)) < 0) + (*fd = virNetClientDupFD(client)) < 0) goto cleanup; if (!(flags & VIR_LOCK_MANAGER_ACQUIRE_REGISTER_ONLY)) { diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c index 40ed3fde6d..6b0ddbeaad 100644 --- a/src/rpc/virnetclient.c +++ b/src/rpc/virnetclient.c @@ -712,19 +712,12 @@ int virNetClientGetFD(virNetClientPtr client) } -int virNetClientDupFD(virNetClientPtr client, bool cloexec) +int virNetClientDupFD(virNetClientPtr client) { int fd; virObjectLock(client); - fd = virNetSocketDupFD(client->sock); - if (!cloexec && fd >= 0 && virSetInherit(fd, true)) { - virReportSystemError(errno, "%s", - _("Cannot disable close-on-exec flag")); - VIR_FORCE_CLOSE(fd); - } - virObjectUnlock(client); return fd; } diff --git a/src/rpc/virnetclient.h b/src/rpc/virnetclient.h index 9cf32091f5..3702f7fe5a 100644 --- a/src/rpc/virnetclient.h +++ b/src/rpc/virnetclient.h @@ -95,7 +95,7 @@ void virNetClientSetCloseCallback(virNetClientPtr client, virFreeCallback ff); int virNetClientGetFD(virNetClientPtr client); -int virNetClientDupFD(virNetClientPtr client, bool cloexec); +int virNetClientDupFD(virNetClientPtr client); bool virNetClientHasPassFD(virNetClientPtr client);

Michal Privoznik

Wednesday, 26 September Wed, 26 Sep

4:46 a.m.

New subject: [libvirt] [PATCH 3/4] virLockManagerLockDaemonAcquire: Duplicate client FD with CLOEXEC flag

On 09/25/2018 06:29 PM, John Ferlan wrote:

...

On 9/21/18 5:29 AM, Michal Privoznik wrote: > There is one caller (virSecurityManagerMetadataLock) which > duplicates the connection FD and wants to have the flag set. > However, trying to set the flag after dup() is not safe as > another thread might fork() meanwhile. Therefore, switch to > duplicating with the flag set and only let callers refine this > later. > > Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> > --- > src/locking/domain_lock.c | 18 ++++++++++++++++++ > src/locking/lock_driver_lockd.c | 2 +- > src/rpc/virnetclient.c | 9 +-------- > src/rpc/virnetclient.h | 2 +- > 4 files changed, 21 insertions(+), 10 deletions(-) > > diff --git a/src/locking/domain_lock.c b/src/locking/domain_lock.c > index 705b132457..db20fa86a3 100644 > --- a/src/locking/domain_lock.c > +++ b/src/locking/domain_lock.c > @@ -188,6 +188,24 @@ int virDomainLockProcessStart(virLockManagerPluginPtr plugin, > ret = virLockManagerAcquire(lock, NULL, flags, > dom->def->onLockFailure, fd); > > + if (ret >= 0 && fd && *fd >= 0 && virSetInherit(*fd, true) < 0) { Not quite sure 'how' ret > 0, but I'll go with it considering other consumers perform the same check.

Yeah, maybe there's no way for ret > 0 right now. I am just trying to be more future proof and follow our (perhaps unwritten) rule on return values: a negative value means error, zero or positive value means success.

...

> + int saved_errno = errno; > + virErrorPtr origerr; > + > + virErrorPreserveLast(&origerr); > + > + if (virLockManagerRelease(lock, NULL, 0) < 0) > + VIR_WARN("Unable to release acquired resourced in cleanup path"); *resource(s) > + > + virErrorRestore(&origerr); > + errno = saved_errno; > + > + virReportSystemError(errno, "%s", > + _("Cannot disable close-on-exec flag")); > + > + ret = -1; > + } > + OK so this perhaps *is* the only thing you're after. Discounting patch2, you get a dup()'d socket and you then remove the CLOEXEC from it.

Actually, I am trying to fix leaking FD when doing metadata locking.

...

The rest of the patch doesn't seem to matter. Perhaps some day there is a virNetClientDupFD consumer that wants to pass @true, then they have to add back all that you removed.

That should never happen (TM). If somebody will try to revert these two patches they will have to deal with FD leaks in some way. I'd be interested to see how they do that. Michal

Michal Privoznik

Friday, 21 September Fri, 21 Sep

4:29 a.m.

New subject: [libvirt] [PATCH 4/4] security: Always spawn process for transactions

There is this latent bug which can result in virtlockd killing libvirtd. The problem is when the following steps occur: Parent | Child ------------------------------------------------------+---------------- 1) virSecurityManagerMetadataLock(path); | This results in connection FD to virtlockd to be | dup()-ed and saved for later use. | | 2) Another thread calling fork(); | This results in the FD from step 1) to be cloned | | 3) virSecurityManagerMetadataUnlock(path); | Here, the FD is closed, but the connection is | still kept open because child process has | duplicated FD | | 4) virSecurityManagerMetadataLock(differentPath); | Again, new connection is opened, FD is dup()-ed | | 5) | exec() or exit() Step 5) results in closing the connection from 1) to be closed, finally. However, at this point virtlockd looks at its internal records if PID from 1) does not still own any resources. Well it does - in step 4) it locked differentPath. This results in virtlockd killing libvirtd. Note that this happens because we do some relabelling even before fork() at which point vm->pid is still -1. When this value is passed down to transaction code it simply runs the transaction in parent process (=libvirtd), which means the owner of metadata locks is the parent process. Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> --- src/security/security_dac.c | 12 ++++++------ src/security/security_selinux.c | 12 ++++++------ 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 5aea386e7c..876cca0f2f 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -561,12 +561,12 @@ virSecurityDACTransactionCommit(virSecurityManagerPtr mgr ATTRIBUTE_UNUSED, goto cleanup; } - if ((pid == -1 && - virSecurityDACTransactionRun(pid, list) < 0) || - (pid != -1 && - virProcessRunInMountNamespace(pid, - virSecurityDACTransactionRun, - list) < 0)) + if (pid == -1) + pid = getpid(); + + if (virProcessRunInMountNamespace(pid, + virSecurityDACTransactionRun, + list) < 0) goto cleanup; ret = 0; diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index 31e42afee7..1e23d776ff 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -1102,12 +1102,12 @@ virSecuritySELinuxTransactionCommit(virSecurityManagerPtr mgr ATTRIBUTE_UNUSED, goto cleanup; } - if ((pid == -1 && - virSecuritySELinuxTransactionRun(pid, list) < 0) || - (pid != -1 && - virProcessRunInMountNamespace(pid, - virSecuritySELinuxTransactionRun, - list) < 0)) + if (pid == -1) + pid = getpid(); + + if (virProcessRunInMountNamespace(pid, + virSecuritySELinuxTransactionRun, + list) < 0) goto cleanup; ret = 0; -- 2.16.4

John Ferlan

Tuesday, 25 September Tue, 25 Sep

3:52 p.m.

New subject: [libvirt] [PATCH 4/4] security: Always spawn process for transactions

On 9/21/18 5:29 AM, Michal Privoznik wrote:

...

Always good to double up on the S-o-b's when there's locks and forks involved. That makes sure the fork gets one too ;-)

...

--- src/security/security_dac.c | 12 ++++++------ src/security/security_selinux.c | 12 ++++++------ 2 files changed, 12 insertions(+), 12 deletions(-)

I read the cover, it's not simpler than the above. It also indicates that patches 1-3 don't have any bearing. So to me that says - 2 series, one the artsy/fartsy cleanup variety and the second this edge/corner condition chase based on timing and/or other interactions. My first question is - have you bisected your changes? Since this only seems to matter for metadata locking, you don't necessarily have to start too early, but I would assume that prior to commit 8b8aefb3 there is no issue. My second question is - given my analysis in patch2, was testing of this patch done with or without patch 2 and 3? Hard to review this without having my patch2 analysis rattling around in short term memory that works. At this point, I'm thinking they were in place and may have had an impact especially since that dup() call was removed, but it's mentioned during this commit message, which is really confusing. Prior to patch2/3, virNetSocketDupFD has 4 callers of which only one calls with 'false' and that's the path having problems. libxlMigrateDstReceive (using true) libxlDomainMigrationSrcPerform (using true) qemuMigrationSrcConnect (using true) virNetClientDupFD (using false) Since your testing certainly doesn't include the first 3, I'll focus on the last one which has but one caller: virLockManagerLockDaemonAcquire meaning without patches 2-3, we got a "dup()" socket back instead of changing the existing socket to F_DUPFD_CLOEXEC. I think I just talked myself in a circle, but perhaps gives you an understanding of the struggle with all the patches in one series.

...

diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 5aea386e7c..876cca0f2f 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -561,12 +561,12 @@ virSecurityDACTransactionCommit(virSecurityManagerPtr mgr ATTRIBUTE_UNUSED, goto cleanup; } - if ((pid == -1 && - virSecurityDACTransactionRun(pid, list) < 0) || - (pid != -1 && - virProcessRunInMountNamespace(pid, - virSecurityDACTransactionRun, - list) < 0)) + if (pid == -1) + pid = getpid(); + + if (virProcessRunInMountNamespace(pid, + virSecurityDACTransactionRun, + list) < 0)

Going back to commit d41c1621, it essentially states: "To differentiate whether transaction code should fork() or not the @pid argument now accepts -1 (which means do not fork)." This commit then backtracks or undoes that by using a getpid() causing every transaction to fork. The "logic" for the decision to pass pid == -1 or not is introduced in the subsequent commit 37131ada based on qemuDomainNamespaceEnabled. Thus with this patch every domain w/ or w/o namespace and daemon commit path will fork. So if this logic is to stay, then there's no reason to have the namespace check either, is there? Is the vm->pid different than getpid()? IOW: I think this patch is too big of a leap. However, the contention from the cover is that this issue only happens when metadata locking is enabled. Thus, rather than getpid() for everyone, maybe it's a getpid() for metadata lock consumers. Seems for pid == -1 we could query @mgr for what it knows, like virSecurityManagerGetPrivateData and then if priv->type == VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON, use getpid(). Of course then @mgr loses its ATTRIBUTE_UNUSED. Furthermore, if virLockManagerLockDaemonAcquire knows that we have a daemon type lock, would that virNetClientDupFD need to be altered to use @true instead of @false? Maybe I'm over thinking this part - but with patch2 and patch3 adjustments, it's just not clear in my head anymore... John

...

goto cleanup; ret = 0; diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index 31e42afee7..1e23d776ff 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -1102,12 +1102,12 @@ virSecuritySELinuxTransactionCommit(virSecurityManagerPtr mgr ATTRIBUTE_UNUSED, goto cleanup; } - if ((pid == -1 && - virSecuritySELinuxTransactionRun(pid, list) < 0) || - (pid != -1 && - virProcessRunInMountNamespace(pid, - virSecuritySELinuxTransactionRun, - list) < 0)) + if (pid == -1) + pid = getpid(); + + if (virProcessRunInMountNamespace(pid, + virSecuritySELinuxTransactionRun, + list) < 0) goto cleanup; ret = 0;

Michal Privoznik

Wednesday, 26 September Wed, 26 Sep

4:46 a.m.

New subject: [libvirt] [PATCH 4/4] security: Always spawn process for transactions

On 09/25/2018 10:52 PM, John Ferlan wrote:

...

On 9/21/18 5:29 AM, Michal Privoznik wrote: > There is this latent bug which can result in virtlockd killing > libvirtd. The problem is when the following steps occur: > > Parent | Child > ------------------------------------------------------+---------------- > 1) virSecurityManagerMetadataLock(path); | > This results in connection FD to virtlockd to be | > dup()-ed and saved for later use. | > | > 2) Another thread calling fork(); | > This results in the FD from step 1) to be cloned | > | > 3) virSecurityManagerMetadataUnlock(path); | > Here, the FD is closed, but the connection is | > still kept open because child process has | > duplicated FD | > | > 4) virSecurityManagerMetadataLock(differentPath); | > Again, new connection is opened, FD is dup()-ed | > | > 5) | exec() or exit() > > Step 5) results in closing the connection from 1) to be closed, > finally. However, at this point virtlockd looks at its internal > records if PID from 1) does not still own any resources. Well it > does - in step 4) it locked differentPath. This results in > virtlockd killing libvirtd. > > Note that this happens because we do some relabelling even before > fork() at which point vm->pid is still -1. When this value is > passed down to transaction code it simply runs the transaction > in parent process (=libvirtd), which means the owner of metadata > locks is the parent process. > > Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> > > Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> Always good to double up on the S-o-b's when there's locks and forks involved. That makes sure the fork gets one too ;-) > --- > src/security/security_dac.c | 12 ++++++------ > src/security/security_selinux.c | 12 ++++++------ > 2 files changed, 12 insertions(+), 12 deletions(-) > I read the cover, it's not simpler than the above.

Yeah, this is not that simple bug. I'll try to explain it differently: On fork(), all FDs of the parent are duplicated to child. So even if parent closes one of them, the connection is not closed until child closes the same FD. And since metadata locking is very fragile when it comes to connection to virtlockd, this fork() behaviour actually plays against us because it violates our assumption that connection is closed at the end of metadataUnlock(). In fact it is not - child has it still open. If there was a flag like DONT_DUP_ON_FORK I could just set it on virtlockd connection in metadataLock() and be done. Perhaps I could use pthread_atfork() but that might be very tricky - I am not sure if some library that we are linking with does not use it and it pthreads are capable of dealing with us and the library registering their own callbacks.

...

It also indicates that patches 1-3 don't have any bearing. So to me that says - 2 series, one the artsy/fartsy cleanup variety and the second this edge/corner condition chase based on timing and/or other interactions. My first question is - have you bisected your changes? Since this only seems to matter for metadata locking, you don't necessarily have to start too early, but I would assume that prior to commit 8b8aefb3 there is no issue.

I haven't. I'm not quite sure why would I do that. I already know the problem and yes, it's in metadata locking. From quick skim through git log: 7a9ca0fa and 6d855abc14338 look like the result of bisect.

...

My second question is - given my analysis in patch2, was testing of this patch done with or without patch 2 and 3? Hard to review this without having my patch2 analysis rattling around in short term memory that works. At this point, I'm thinking they were in place and may have had an impact especially since that dup() call was removed, but it's mentioned during this commit message, which is really confusing.

I've tested the whole series not individual patches. So this patch was tested with 2 and 3 applied. However, it's true that since we will spawn a separate process for every transaction (and dup() is called only from there) it is not possible to actually have FD leak (because the transaction code does not fork() further). Having said that, there are other callers that duplicate socket FD and the problem is real (even though after this patch metadata locking is not affected).

...

Prior to patch2/3, virNetSocketDupFD has 4 callers of which only one calls with 'false' and that's the path having problems. libxlMigrateDstReceive (using true) libxlDomainMigrationSrcPerform (using true) qemuMigrationSrcConnect (using true)

Since these were calling virNetSocketDupFD(cloexec=true) anyway, nothing changed for them.

...

virNetClientDupFD (using false) Since your testing certainly doesn't include the first 3, I'll focus on the last one which has but one caller: virLockManagerLockDaemonAcquire meaning without patches 2-3, we got a "dup()" socket back instead of changing the existing socket to F_DUPFD_CLOEXEC. I think I just talked myself in a circle, but perhaps gives you an understanding of the struggle with all the patches in one series. > diff --git a/src/security/security_dac.c b/src/security/security_dac.c > index 5aea386e7c..876cca0f2f 100644 > --- a/src/security/security_dac.c > +++ b/src/security/security_dac.c > @@ -561,12 +561,12 @@ virSecurityDACTransactionCommit(virSecurityManagerPtr mgr ATTRIBUTE_UNUSED, > goto cleanup; > } > > - if ((pid == -1 && > - virSecurityDACTransactionRun(pid, list) < 0) || > - (pid != -1 && > - virProcessRunInMountNamespace(pid, > - virSecurityDACTransactionRun, > - list) < 0)) > + if (pid == -1) > + pid = getpid(); > + > + if (virProcessRunInMountNamespace(pid, > + virSecurityDACTransactionRun, > + list) < 0) Going back to commit d41c1621, it essentially states: "To differentiate whether transaction code should fork() or not the @pid argument now accepts -1 (which means do not fork)." This commit then backtracks or undoes that by using a getpid() causing every transaction to fork. The "logic" for the decision to pass pid == -1 or not is introduced in the subsequent commit 37131ada based on qemuDomainNamespaceEnabled. Thus with this patch every domain w/ or w/o namespace and daemon commit path will fork. So if this logic is to stay, then there's no reason to have the namespace check either, is there? Is the vm->pid different than getpid()? IOW: I think this patch is too big of a leap.

Yes, vm->pid is different to getpid(). If there is a domain running with namespaces enabled then we definitely wants to use vm->pid to enter the qemu's namespace and relabel /dev/blah there instead of the namespace that libvirtd (=getpid()) is running. However, some labelling is done prior to running qemu (and setting up its namespace). For instance: qemuDomainWriteMasterKeyFile, qemuProcessStartManagedPRDaemon, qemuProcessMakeDir - they all call qemuSecurityDomainSetPathLabel() at the time when no namespace was built yet. But since we initialize vm->pid to -1 then the namspace check feels redundant. I agree with that. In virSecurityDACTransactionCommit() it does not matter if pid = -1 because vm->pid is -1 or because no namespaces are used and therefore -1 comes from variable initialization in qemuSecurityDomainSetPathLabel().

...

However, the contention from the cover is that this issue only happens when metadata locking is enabled. Thus, rather than getpid() for everyone, maybe it's a getpid() for metadata lock consumers. Seems for pid == -1 we could query @mgr for what it knows, like virSecurityManagerGetPrivateData and then if priv->type == VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON, use getpid(). Of course then @mgr loses its ATTRIBUTE_UNUSED.

The whole point of always fork()-ing is to avoid leaking FD even into child process. Patches 2 and 3 are trying to avoid leaking FD into exec(), this one is trying to avoid leaking it into fork(). I mean, if the connection FD is cloned into child, we can't be sure when child decides to close it. It may close it at the worst possible moment - when libvirtd holds some locks (in virtlockd). And because virtlockd locks are not associated with connection (only with PID) from virtlockd perspective it looks like libvirtd has closed connection and yet is still holding some locks. BANG! that's where virtlockd kills libvirtd. I don't think there is any other way to prevent fork() cloning an FD. My solution is to fork() in virSecurityDACTransactionCommit() (or SELinux counterpart) which duplicates only the thread that is executing the transaction and runs only virSecurityDACTransactionRun() which is fork() free.

...

Furthermore, if virLockManagerLockDaemonAcquire knows that we have a daemon type lock, would that virNetClientDupFD need to be altered to use @true instead of @false? Maybe I'm over thinking this part - but with patch2 and patch3 adjustments, it's just not clear in my head anymore...

Patches 2 and 3 are not related to metadata locking anymore. However, they started as such (when I was writing them). Anyway, that fact does not reduce their importance. Michal

John Ferlan

4:14 p.m.

New subject: [libvirt] [PATCH 4/4] security: Always spawn process for transactions

On 9/26/18 5:46 AM, Michal Privoznik wrote:

...

On 09/25/2018 10:52 PM, John Ferlan wrote: > > > On 9/21/18 5:29 AM, Michal Privoznik wrote: >> There is this latent bug which can result in virtlockd killing >> libvirtd. The problem is when the following steps occur: >> >> Parent | Child >> ------------------------------------------------------+---------------- >> 1) virSecurityManagerMetadataLock(path); | >> This results in connection FD to virtlockd to be | >> dup()-ed and saved for later use. | >> | >> 2) Another thread calling fork(); | >> This results in the FD from step 1) to be cloned | >> | >> 3) virSecurityManagerMetadataUnlock(path); | >> Here, the FD is closed, but the connection is | >> still kept open because child process has | >> duplicated FD | >> | >> 4) virSecurityManagerMetadataLock(differentPath); | >> Again, new connection is opened, FD is dup()-ed | >> | >> 5) | exec() or exit() >> >> Step 5) results in closing the connection from 1) to be closed, >> finally. However, at this point virtlockd looks at its internal >> records if PID from 1) does not still own any resources. Well it >> does - in step 4) it locked differentPath. This results in >> virtlockd killing libvirtd. >> >> Note that this happens because we do some relabelling even before >> fork() at which point vm->pid is still -1. When this value is >> passed down to transaction code it simply runs the transaction >> in parent process (=libvirtd), which means the owner of metadata >> locks is the parent process. >> >> Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> >> >> Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> > > Always good to double up on the S-o-b's when there's locks and forks > involved. That makes sure the fork gets one too ;-) > > >> --- >> src/security/security_dac.c | 12 ++++++------ >> src/security/security_selinux.c | 12 ++++++------ >> 2 files changed, 12 insertions(+), 12 deletions(-) >> > > I read the cover, it's not simpler than the above. Yeah, this is not that simple bug. I'll try to explain it differently: On fork(), all FDs of the parent are duplicated to child. So even if parent closes one of them, the connection is not closed until child closes the same FD. And since metadata locking is very fragile when it comes to connection to virtlockd, this fork() behaviour actually plays against us because it violates our assumption that connection is closed at the end of metadataUnlock(). In fact it is not - child has it still open. If there was a flag like DONT_DUP_ON_FORK I could just set it on virtlockd connection in metadataLock() and be done.

Avoid the forking exec's works too ;-} OK - instead of talking in generalities - which forking dup'd fd is in play here? I don't want to assume or guess any more.

...

Perhaps I could use pthread_atfork() but that might be very tricky - I am not sure if some library that we are linking with does not use it and it pthreads are capable of dealing with us and the library registering their own callbacks. > It also indicates > that patches 1-3 don't have any bearing. So to me that says - 2 series, > one the artsy/fartsy cleanup variety and the second this edge/corner > condition chase based on timing and/or other interactions. > > My first question is - have you bisected your changes? Since this only > seems to matter for metadata locking, you don't necessarily have to > start too early, but I would assume that prior to commit 8b8aefb3 there > is no issue. I haven't. I'm not quite sure why would I do that. I already know the problem and yes, it's in metadata locking. From quick skim through git log: 7a9ca0fa and 6d855abc14338 look like the result of bisect.

Um, well I'd include c34f11998e into that list since that's where metadata locking API's were introduced. The problem space described there is I think exactly what you've been describing as the problem you were trying to avoid.

...

> > My second question is - given my analysis in patch2, was testing of this > patch done with or without patch 2 and 3? Hard to review this without > having my patch2 analysis rattling around in short term memory that > works. At this point, I'm thinking they were in place and may have had > an impact especially since that dup() call was removed, but it's > mentioned during this commit message, which is really confusing. I've tested the whole series not individual patches. So this patch was tested with 2 and 3 applied. However, it's true that since we will spawn a separate process for every transaction (and dup() is called only from there) it is not possible to actually have FD leak (because the transaction code does not fork() further). Having said that, there are other callers that duplicate socket FD and the problem is real (even though after this patch metadata locking is not affected). > > Prior to patch2/3, virNetSocketDupFD has 4 callers of which only one > calls with 'false' and that's the path having problems. > > libxlMigrateDstReceive (using true) > libxlDomainMigrationSrcPerform (using true) > qemuMigrationSrcConnect (using true) Since these were calling virNetSocketDupFD(cloexec=true) anyway, nothing changed for them. > virNetClientDupFD (using false) > > Since your testing certainly doesn't include the first 3, I'll focus on > the last one which has but one caller: > > virLockManagerLockDaemonAcquire > > meaning without patches 2-3, we got a "dup()" socket back instead of > changing the existing socket to F_DUPFD_CLOEXEC. I think I just talked > myself in a circle, but perhaps gives you an understanding of the > struggle with all the patches in one series. > >> diff --git a/src/security/security_dac.c b/src/security/security_dac.c >> index 5aea386e7c..876cca0f2f 100644 >> --- a/src/security/security_dac.c >> +++ b/src/security/security_dac.c >> @@ -561,12 +561,12 @@ virSecurityDACTransactionCommit(virSecurityManagerPtr mgr ATTRIBUTE_UNUSED, >> goto cleanup; >> } >> >> - if ((pid == -1 && >> - virSecurityDACTransactionRun(pid, list) < 0) || >> - (pid != -1 && >> - virProcessRunInMountNamespace(pid, >> - virSecurityDACTransactionRun, >> - list) < 0)) >> + if (pid == -1) >> + pid = getpid(); >> + >> + if (virProcessRunInMountNamespace(pid, >> + virSecurityDACTransactionRun, >> + list) < 0) > > Going back to commit d41c1621, it essentially states: > > "To differentiate whether transaction code should fork() or not the @pid > argument now accepts -1 (which means do not fork)." > > This commit then backtracks or undoes that by using a getpid() causing > every transaction to fork. > > The "logic" for the decision to pass pid == -1 or not is introduced in > the subsequent commit 37131ada based on qemuDomainNamespaceEnabled. > > Thus with this patch every domain w/ or w/o namespace and daemon commit > path will fork. So if this logic is to stay, then there's no reason to > have the namespace check either, is there? Is the vm->pid different than > getpid()? IOW: I think this patch is too big of a leap. Yes, vm->pid is different to getpid(). If there is a domain running with namespaces enabled then we definitely wants to use vm->pid to enter the qemu's namespace and relabel /dev/blah there instead of the namespace that libvirtd (=getpid()) is running. However, some labelling is done prior to running qemu (and setting up its namespace). For instance: qemuDomainWriteMasterKeyFile, qemuProcessStartManagedPRDaemon, qemuProcessMakeDir - they all call qemuSecurityDomainSetPathLabel() at the time when no namespace was built yet. But since we initialize vm->pid to -1 then the namspace check feels redundant. I agree with that. In virSecurityDACTransactionCommit() it does not matter if pid = -1 because vm->pid is -1 or because no namespaces are used and therefore -1 comes from variable initialization in qemuSecurityDomainSetPathLabel().

My point here was setup towards the subsequent point of rather than doing the forks for domain's always just because daemon's now need to do that. Why can't we figure a way to get that pid for the subsequent fork for the narrower case which is the metadata lock or daemon lock.

...

> > However, the contention from the cover is that this issue only happens > when metadata locking is enabled. > > Thus, rather than getpid() for everyone, maybe it's a getpid() for > metadata lock consumers. Seems for pid == -1 we could query @mgr for > what it knows, like virSecurityManagerGetPrivateData and then if > priv->type == VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON, use getpid(). Of > course then @mgr loses its ATTRIBUTE_UNUSED. The whole point of always fork()-ing is to avoid leaking FD even into child process. Patches 2 and 3 are trying to avoid leaking FD into exec(), this one is trying to avoid leaking it into fork(). I mean, if the connection FD is cloned into child, we can't be sure when child decides to close it. It may close it at the worst possible moment - when libvirtd holds some locks (in virtlockd). And because virtlockd locks are not associated with connection (only with PID) from virtlockd perspective it looks like libvirtd has closed connection and yet is still holding some locks. BANG! that's where virtlockd kills libvirtd.

This still says to me not using the getpid() for domain locks when namespaces aren't in play isn't a problem. IOW: If !namespace, then pid == -1 is perfectly fine for domain locks. It's just this new daemon path where we need to fork, IIUC.

...

I don't think there is any other way to prevent fork() cloning an FD.

OpenVMS (hahaha) Actually the OpenVMS Distributed Lock Manager would more than likely handle things, but it's a totally different model. There once was someone looking to add dlm as one of the lock_manager plugins, but it was a one pass and quit type effort a few months back: https://www.redhat.com/archives/libvir-list/2017-December/msg00689.html https://www.redhat.com/archives/libvir-list/2017-December/msg00795.html then https://www.redhat.com/archives/libvir-list/2018-February/msg00185.html Outside of that for some reason I thought there was some very odd/strange trick that might achieve the result you want, I cannot remember though - as it was always deemed black magic voodoo to me.

...

My solution is to fork() in virSecurityDACTransactionCommit() (or SELinux counterpart) which duplicates only the thread that is executing the transaction and runs only virSecurityDACTransactionRun() which is fork() free.

OK - fine, but let's limit that to the path that needs it - and maybe create a helper method to return the pid we want rather than copy pasta.

...

> > Furthermore, if virLockManagerLockDaemonAcquire knows that we have a > daemon type lock, would that virNetClientDupFD need to be altered to use > @true instead of @false? Maybe I'm over thinking this part - but with > patch2 and patch3 adjustments, it's just not clear in my head anymore... > Patches 2 and 3 are not related to metadata locking anymore. However, they started as such (when I was writing them). Anyway, that fact does not reduce their importance.

If those aren't related, then let's wait for Dan to provide some feedback on them. I'm just not confident enough to say, yeah, sounds like the right thing to do. The one thing I learned many years ago when transitioning from the OpenVMS OS into a U*x world was the fork'ing and exec'ing do some really strange wonderful and awful things all at the same time. They are a lot easier that what OpenVMS had to do. John

Michal Privoznik

Thursday, 27 September Thu, 27 Sep

3:51 a.m.

New subject: [libvirt] [PATCH 4/4] security: Always spawn process for transactions

On 09/26/2018 11:14 PM, John Ferlan wrote:

...

On 9/26/18 5:46 AM, Michal Privoznik wrote: > On 09/25/2018 10:52 PM, John Ferlan wrote: >> >> >> On 9/21/18 5:29 AM, Michal Privoznik wrote: >>> There is this latent bug which can result in virtlockd killing >>> libvirtd. The problem is when the following steps occur: >>> >>> Parent | Child >>> ------------------------------------------------------+---------------- >>> 1) virSecurityManagerMetadataLock(path); | >>> This results in connection FD to virtlockd to be | >>> dup()-ed and saved for later use. | >>> | >>> 2) Another thread calling fork(); | >>> This results in the FD from step 1) to be cloned | >>> | >>> 3) virSecurityManagerMetadataUnlock(path); | >>> Here, the FD is closed, but the connection is | >>> still kept open because child process has | >>> duplicated FD | >>> | >>> 4) virSecurityManagerMetadataLock(differentPath); | >>> Again, new connection is opened, FD is dup()-ed | >>> | >>> 5) | exec() or exit() >>> >>> Step 5) results in closing the connection from 1) to be closed, >>> finally. However, at this point virtlockd looks at its internal >>> records if PID from 1) does not still own any resources. Well it >>> does - in step 4) it locked differentPath. This results in >>> virtlockd killing libvirtd. >>> >>> Note that this happens because we do some relabelling even before >>> fork() at which point vm->pid is still -1. When this value is >>> passed down to transaction code it simply runs the transaction >>> in parent process (=libvirtd), which means the owner of metadata >>> locks is the parent process. >>> >>> Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> >>> >>> Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> >> >> Always good to double up on the S-o-b's when there's locks and forks >> involved. That makes sure the fork gets one too ;-) >> >> >>> --- >>> src/security/security_dac.c | 12 ++++++------ >>> src/security/security_selinux.c | 12 ++++++------ >>> 2 files changed, 12 insertions(+), 12 deletions(-) >>> >> >> I read the cover, it's not simpler than the above. > > Yeah, this is not that simple bug. I'll try to explain it differently: > > On fork(), all FDs of the parent are duplicated to child. So even if > parent closes one of them, the connection is not closed until child > closes the same FD. And since metadata locking is very fragile when it > comes to connection to virtlockd, this fork() behaviour actually plays > against us because it violates our assumption that connection is closed > at the end of metadataUnlock(). In fact it is not - child has it still open. > > If there was a flag like DONT_DUP_ON_FORK I could just set it on > virtlockd connection in metadataLock() and be done. Avoid the forking exec's works too ;-} OK - instead of talking in generalities - which forking dup'd fd is in play here? I don't want to assume or guess any more.

The connection FD. Maybe I'm misreading your comments or the other way around. I'll try to explain what is the problem one last time, in the simplest way I can. 1) Thread A wants to start a domain so it calls virSecurityManagerMetadataLock(). 2) The function opens a connection to virtlockd (represented by FD) and duplicates the FD so that connection is kept open until unlock. 3) Now that locks are acquired, the thread does chown() and setfcon(). 4) the virSecurityManagerMetadataUnlock() function is called, which causes virtlockd to release all the locks. Subsequently to that, the function closes the duplicated FD obtained in step 2). The duplication is done because both virLockManagerAcquire() and virLockManagerRelease() open new connection, do the RPC talk and close the connection. That's just the way the functions are written. Now, because of FD duplication happening in step 2) the connection from virLockManagerAcquire() is not really closed until step 4) where the duplicated FD is closed. This is important to realize: the connection is closed only when the last FD representing it is closed. If I were to duplicate the connection FD a hundred times and then close 99 of those duplicates the connection would still be kept open. Now, lets put some threads into the picture. Imagine thread B which called fork() just in between steps 2) and 3). On fork, ALL partent FDs are duplicated. So the connection FD is duplicated into the child process too. Therefore, in step 4) where we think the connection is closed - well it is not really because there is still one FD around - in the child. And this is huge problem, because we do not know when child is going to close it. It may be (unintentionally) an evil child and close the FD at the worst time possible - when libvirtd holds some locks. Important thing to realize #2: whenever a connection closes, virtlockd looks into its internal records and if PID on the other side held some locks then: a) the locks are releasd, b) the PID is killed. I'm sorry, I can't explain this any simpler. I'll try to find somebody who understands this and have them review the patch. As an alternative approach, if we did not close the connection in 2) and 4) we don't have to care when the child closes the FD because it will not cause the connection to actually close. I wrote a patch like that for previous versions but it was disliked. Maybe I can write it from scratch again and have that one ACKed instead.

...

> > Perhaps I could use pthread_atfork() but that might be very tricky - I > am not sure if some library that we are linking with does not use it and > it pthreads are capable of dealing with us and the library registering > their own callbacks. > >> It also indicates >> that patches 1-3 don't have any bearing. So to me that says - 2 series, >> one the artsy/fartsy cleanup variety and the second this edge/corner >> condition chase based on timing and/or other interactions. >> >> My first question is - have you bisected your changes? Since this only >> seems to matter for metadata locking, you don't necessarily have to >> start too early, but I would assume that prior to commit 8b8aefb3 there >> is no issue. > > I haven't. I'm not quite sure why would I do that. I already know the > problem and yes, it's in metadata locking. From quick skim through git > log: 7a9ca0fa and 6d855abc14338 look like the result of bisect. > Um, well I'd include c34f11998e into that list since that's where metadata locking API's were introduced. The problem space described there is I think exactly what you've been describing as the problem you were trying to avoid. >> >> My second question is - given my analysis in patch2, was testing of this >> patch done with or without patch 2 and 3? Hard to review this without >> having my patch2 analysis rattling around in short term memory that >> works. At this point, I'm thinking they were in place and may have had >> an impact especially since that dup() call was removed, but it's >> mentioned during this commit message, which is really confusing. > > I've tested the whole series not individual patches. So this patch was > tested with 2 and 3 applied. > > However, it's true that since we will spawn a separate process for every > transaction (and dup() is called only from there) it is not possible to > actually have FD leak (because the transaction code does not fork() > further). Having said that, there are other callers that duplicate > socket FD and the problem is real (even though after this patch metadata > locking is not affected). > >> >> Prior to patch2/3, virNetSocketDupFD has 4 callers of which only one >> calls with 'false' and that's the path having problems. >> >> libxlMigrateDstReceive (using true) >> libxlDomainMigrationSrcPerform (using true) >> qemuMigrationSrcConnect (using true) > > Since these were calling virNetSocketDupFD(cloexec=true) anyway, nothing > changed for them. > >> virNetClientDupFD (using false) >> >> Since your testing certainly doesn't include the first 3, I'll focus on >> the last one which has but one caller: >> >> virLockManagerLockDaemonAcquire >> >> meaning without patches 2-3, we got a "dup()" socket back instead of >> changing the existing socket to F_DUPFD_CLOEXEC. I think I just talked >> myself in a circle, but perhaps gives you an understanding of the >> struggle with all the patches in one series. >> >>> diff --git a/src/security/security_dac.c b/src/security/security_dac.c >>> index 5aea386e7c..876cca0f2f 100644 >>> --- a/src/security/security_dac.c >>> +++ b/src/security/security_dac.c >>> @@ -561,12 +561,12 @@ virSecurityDACTransactionCommit(virSecurityManagerPtr mgr ATTRIBUTE_UNUSED, >>> goto cleanup; >>> } >>> >>> - if ((pid == -1 && >>> - virSecurityDACTransactionRun(pid, list) < 0) || >>> - (pid != -1 && >>> - virProcessRunInMountNamespace(pid, >>> - virSecurityDACTransactionRun, >>> - list) < 0)) >>> + if (pid == -1) >>> + pid = getpid(); >>> + >>> + if (virProcessRunInMountNamespace(pid, >>> + virSecurityDACTransactionRun, >>> + list) < 0) >> >> Going back to commit d41c1621, it essentially states: >> >> "To differentiate whether transaction code should fork() or not the @pid >> argument now accepts -1 (which means do not fork)." >> >> This commit then backtracks or undoes that by using a getpid() causing >> every transaction to fork. >> >> The "logic" for the decision to pass pid == -1 or not is introduced in >> the subsequent commit 37131ada based on qemuDomainNamespaceEnabled. >> >> Thus with this patch every domain w/ or w/o namespace and daemon commit >> path will fork. So if this logic is to stay, then there's no reason to >> have the namespace check either, is there? Is the vm->pid different than >> getpid()? IOW: I think this patch is too big of a leap. > > Yes, vm->pid is different to getpid(). If there is a domain running with > namespaces enabled then we definitely wants to use vm->pid to enter the > qemu's namespace and relabel /dev/blah there instead of the namespace > that libvirtd (=getpid()) is running. However, some labelling is done > prior to running qemu (and setting up its namespace). For instance: > qemuDomainWriteMasterKeyFile, qemuProcessStartManagedPRDaemon, > qemuProcessMakeDir - they all call qemuSecurityDomainSetPathLabel() at > the time when no namespace was built yet. > > But since we initialize vm->pid to -1 then the namspace check feels > redundant. I agree with that. In virSecurityDACTransactionCommit() it > does not matter if pid = -1 because vm->pid is -1 or because no > namespaces are used and therefore -1 comes from variable initialization > in qemuSecurityDomainSetPathLabel(). > My point here was setup towards the subsequent point of rather than doing the forks for domain's always just because daemon's now need to do that. Why can't we figure a way to get that pid for the subsequent fork for the narrower case which is the metadata lock or daemon lock. >> >> However, the contention from the cover is that this issue only happens >> when metadata locking is enabled. >> >> Thus, rather than getpid() for everyone, maybe it's a getpid() for >> metadata lock consumers. Seems for pid == -1 we could query @mgr for >> what it knows, like virSecurityManagerGetPrivateData and then if >> priv->type == VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON, use getpid(). Of >> course then @mgr loses its ATTRIBUTE_UNUSED. > > The whole point of always fork()-ing is to avoid leaking FD even into > child process. Patches 2 and 3 are trying to avoid leaking FD into > exec(), this one is trying to avoid leaking it into fork(). I mean, if > the connection FD is cloned into child, we can't be sure when child > decides to close it. It may close it at the worst possible moment - when > libvirtd holds some locks (in virtlockd). And because virtlockd locks > are not associated with connection (only with PID) from virtlockd > perspective it looks like libvirtd has closed connection and yet is > still holding some locks. BANG! that's where virtlockd kills libvirtd. This still says to me not using the getpid() for domain locks when namespaces aren't in play isn't a problem. IOW: If !namespace, then pid == -1 is perfectly fine for domain locks. It's just this new daemon path where we need to fork, IIUC. > > I don't think there is any other way to prevent fork() cloning an FD. > OpenVMS (hahaha)

I don't think we want to revert all metadata locking patches only to implement support for different beast. Also, we don't need distributed locking. Michal

John Ferlan

8:33 a.m.

New subject: [libvirt] [PATCH 4/4] security: Always spawn process for transactions

On 9/27/18 4:51 AM, Michal Privoznik wrote:

...

On 09/26/2018 11:14 PM, John Ferlan wrote: >

[...]

...

>>> I read the cover, it's not simpler than the above. >> >> Yeah, this is not that simple bug. I'll try to explain it differently: >> >> On fork(), all FDs of the parent are duplicated to child. So even if >> parent closes one of them, the connection is not closed until child >> closes the same FD. And since metadata locking is very fragile when it >> comes to connection to virtlockd, this fork() behaviour actually plays >> against us because it violates our assumption that connection is closed >> at the end of metadataUnlock(). In fact it is not - child has it still open. >> >> If there was a flag like DONT_DUP_ON_FORK I could just set it on >> virtlockd connection in metadataLock() and be done. > > Avoid the forking exec's works too ;-} > > OK - instead of talking in generalities - which forking dup'd fd is in > play here? I don't want to assume or guess any more. The connection FD. Maybe I'm misreading your comments or the other way around. I'll try to explain what is the problem one last time, in the simplest way I can. 1) Thread A wants to start a domain so it calls virSecurityManagerMetadataLock(). 2) The function opens a connection to virtlockd (represented by FD) and duplicates the FD so that connection is kept open until unlock.

Yep, virLockManagerLockDaemonAcquire calls virNetClientDupFD(false) meaning I don't want one with the "FD_CLOEXEC" flag set. While normally that would be OK - is it OK for this path? Does anything change if the argument was true here (e.g. cloexec = priv->type == VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON)? It seems to me that the 3 patches together dance around two things - that the CLOEXEC is set and that TransactionCommit does the fork to avoid the parent/child duplication problem.

...

3) Now that locks are acquired, the thread does chown() and setfcon(). 4) the virSecurityManagerMetadataUnlock() function is called, which causes virtlockd to release all the locks. Subsequently to that, the function closes the duplicated FD obtained in step 2). The duplication is done because both virLockManagerAcquire() and virLockManagerRelease() open new connection, do the RPC talk and close the connection. That's just the way the functions are written. Now, because of FD duplication happening in step 2) the connection from virLockManagerAcquire() is not really closed until step 4) where the duplicated FD is closed. This is important to realize: the connection is closed only when the last FD representing it is closed. If I were to duplicate the connection FD a hundred times and then close 99 of those duplicates the connection would still be kept open.

I've understood all this.

...

Now, lets put some threads into the picture. Imagine thread B which called fork() just in between steps 2) and 3). On fork, ALL partent FDs are duplicated. So the connection FD is duplicated into the child process too. Therefore, in step 4) where we think the connection is closed - well it is not really because there is still one FD around - in the child.

This is exactly where the story gets cloudy for me. Is ThreadB related to ThreadA? It must be, because how would it get the duplicated fd, right? But ironically we save priv->clientfd, so if we know we've forked a child can we immediately close it? Knowing that our parent still has it open? Off in left field again?

...

And this is huge problem, because we do not know when child is going to close it. It may be (unintentionally) an evil child and close the FD at the worst time possible - when libvirtd holds some locks. Important thing to realize #2: whenever a connection closes, virtlockd looks into its internal records and if PID on the other side held some locks then: a) the locks are releasd, b) the PID is killed. I'm sorry, I can't explain this any simpler. I'll try to find somebody who understands this and have them review the patch.

I think sometimes it's better to see actual calls and traces rather than using "imagine" if type scenarios. I'm sure this is *a lot* easier when you have a couple of debug sessions running and see things happening in real time. If someone else reviews this it doesn't matter to me, but I think this patch is wrong for a couple of reasons. That reasoning doesn't seem to be well received. If this patch was accepted as is, then you'd be calling virProcessRunInMountNamespace even when namespaces weren't being used, even for domain locks. Something I saw in the traces that Marc posted this morning and starting thinking, hey wait this code is only supposed to be run when namespaces are enabled. So I have to assume that's being used in his environment. Would the same problem exist if namespaces weren't being used?

...

As an alternative approach, if we did not close the connection in 2) and

Did you mean clone or close?

...

4) we don't have to care when the child closes the FD because it will not cause the connection to actually close. I wrote a patch like that for previous versions but it was disliked. Maybe I can write it from scratch again and have that one ACKed instead. >

[...]

...

>> >> I don't think there is any other way to prevent fork() cloning an FD. >> > > OpenVMS (hahaha) I don't think we want to revert all metadata locking patches only to implement support for different beast. Also, we don't need distributed locking. Michal

I wasn't serious hence the (hahaha), but quite frankly what I know about dlm I believe it would certainly help, but that's not the point here. I do recall though always having some issue with fd locking in various projects I've been involved with. John

Michal Privoznik

12:59 p.m.

New subject: [libvirt] [PATCH 4/4] security: Always spawn process for transactions

On 09/27/2018 03:33 PM, John Ferlan wrote:

...

On 9/27/18 4:51 AM, Michal Privoznik wrote: > On 09/26/2018 11:14 PM, John Ferlan wrote: >> [...] >>>> I read the cover, it's not simpler than the above. >>> >>> Yeah, this is not that simple bug. I'll try to explain it differently: >>> >>> On fork(), all FDs of the parent are duplicated to child. So even if >>> parent closes one of them, the connection is not closed until child >>> closes the same FD. And since metadata locking is very fragile when it >>> comes to connection to virtlockd, this fork() behaviour actually plays >>> against us because it violates our assumption that connection is closed >>> at the end of metadataUnlock(). In fact it is not - child has it still open. >>> >>> If there was a flag like DONT_DUP_ON_FORK I could just set it on >>> virtlockd connection in metadataLock() and be done. >> >> Avoid the forking exec's works too ;-} >> >> OK - instead of talking in generalities - which forking dup'd fd is in >> play here? I don't want to assume or guess any more. > > The connection FD. > > Maybe I'm misreading your comments or the other way around. I'll try to > explain what is the problem one last time, in the simplest way I can. > > 1) Thread A wants to start a domain so it calls > virSecurityManagerMetadataLock(). > > 2) The function opens a connection to virtlockd (represented by FD) and > duplicates the FD so that connection is kept open until unlock. Yep, virLockManagerLockDaemonAcquire calls virNetClientDupFD(false) meaning I don't want one with the "FD_CLOEXEC" flag set. While normally that would be OK - is it OK for this path? Does anything change if the argument was true here (e.g. cloexec = priv->type == VIR_LOCK_MANAGER_OBJECT_TYPE_DAEMON)? It seems to me that the 3 patches together dance around two things - that the CLOEXEC is set and that TransactionCommit does the fork to avoid the parent/child duplication problem.

Let's forget about CLOEXEC for a moment. I'll wait for Dan to return (or find somebody else) for those patches. This is the only patch left for consideration.

...

> > 3) Now that locks are acquired, the thread does chown() and setfcon(). > > 4) the virSecurityManagerMetadataUnlock() function is called, which > causes virtlockd to release all the locks. Subsequently to that, the > function closes the duplicated FD obtained in step 2). > > The duplication is done because both virLockManagerAcquire() and > virLockManagerRelease() open new connection, do the RPC talk and close > the connection. That's just the way the functions are written. Now, > because of FD duplication happening in step 2) the connection from > virLockManagerAcquire() is not really closed until step 4) where the > duplicated FD is closed. > > This is important to realize: the connection is closed only when the > last FD representing it is closed. If I were to duplicate the connection > FD a hundred times and then close 99 of those duplicates the connection > would still be kept open. I've understood all this. > > Now, lets put some threads into the picture. Imagine thread B which > called fork() just in between steps 2) and 3). On fork, ALL partent FDs > are duplicated. So the connection FD is duplicated into the child > process too. Therefore, in step 4) where we think the connection is > closed - well it is not really because there is still one FD around - in > the child. This is exactly where the story gets cloudy for me. Is ThreadB related to ThreadA? It must be, because how would it get the duplicated fd, right?

They are related in a sense that they run in the same process. But they don't have to be related at all for fork() to duplicate all the FDs. fork() does not ask if two threads are related, it just duplicates ALL opened FDs into the child.

...

But ironically we save priv->clientfd, so if we know we've forked a child can we immediately close it? Knowing that our parent still has it open? Off in left field again?

I'm failing to see how to achieve that. fork() is occurring more than you might think - every time virCommandRun() is called, every time we try to open a file on NFS, and million of other places. Now for every place we would have to put close() as the first thing that child does (and even then that would be unsafe because you are not guaranteed when the child is scheduled to run), and there might be places where we are unable to put close(), e.g. if some library we link with fork()-s too. Also, requiring close() would mean making the FD globally available which goes against all our code separation efforts.

...

> > And this is huge problem, because we do not know when child is going to > close it. It may be (unintentionally) an evil child and close the FD at > the worst time possible - when libvirtd holds some locks. > > Important thing to realize #2: whenever a connection closes, virtlockd > looks into its internal records and if PID on the other side held some > locks then: a) the locks are releasd, b) the PID is killed. > > I'm sorry, I can't explain this any simpler. I'll try to find somebody > who understands this and have them review the patch. I think sometimes it's better to see actual calls and traces rather than using "imagine" if type scenarios. I'm sure this is *a lot* easier when you have a couple of debug sessions running and see things happening in real time. If someone else reviews this it doesn't matter to me, but I think this patch is wrong for a couple of reasons. That reasoning doesn't seem to be well received. If this patch was accepted as is, then you'd be calling virProcessRunInMountNamespace even when namespaces weren't being used, even for domain locks. Something I saw in the traces that Marc posted this morning and starting thinking, hey wait this code is only supposed to be run when namespaces are enabled. So I have to assume that's being used in his environment. Would the same problem exist if namespaces weren't being used?

Yes. Namespaces have nothing to do with this.

...

> > As an alternative approach, if we did not close the connection in 2) and Did you mean clone or close?

I meant close. However, turns out it's easier said than done. virtlockd is not quite happy about some remote procedures being called multiple times over one connection. And yet I have to call them. I'll look deeper but so far this is the only fix we have. Michal

Bjoern Walk

Friday, 21 September Fri, 21 Sep

9:40 a.m.

Michal Privoznik <mprivozn(a)redhat.com> [2018-09-21, 11:29AM +0200]:

...

@Bjoern: I'm still unable to reproduce the issue you reported. However, whilst trying to do so I've came across this bug. However, my gut feeling is that this might help you.

Quick check, same issue. I will start debugging next week to get at least an idea what might happen. -- IBM Systems Linux on Z & Virtualization Development ------------------------------------------------------------------------ IBM Deutschland Research & Development GmbH Schönaicher Str. 220, 71032 Böblingen Phone: +49 7031 16 1819 ------------------------------------------------------------------------ Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

Michal Privoznik

Tuesday, 25 September Tue, 25 Sep

2:34 a.m.

New subject: [libvirt] [PATCH 5/4] security: Don't try to lock NULL paths

It may happen that in the list of paths/disk sources to relabel there is a disk source. If that is the case, the path is NULL. In that case, we shouldn't try to lock the path. It's likely a network disk anyway and therefore there is nothing to lock. Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> --- src/security/security_dac.c | 3 ++- src/security/security_selinux.c | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 876cca0f2f..04168feb3d 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -216,7 +216,8 @@ virSecurityDACTransactionRun(pid_t pid ATTRIBUTE_UNUSED, for (i = 0; i < list->nItems; i++) { const char *p = list->items[i]->path; - if (virFileIsDir(p)) + if (!p || + virFileIsDir(p)) continue; VIR_APPEND_ELEMENT_COPY_INPLACE(paths, npaths, p); diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index 3c847d8dcb..3adbeada14 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -227,7 +227,8 @@ virSecuritySELinuxTransactionRun(pid_t pid ATTRIBUTE_UNUSED, for (i = 0; i < list->nItems; i++) { const char *p = list->items[i]->path; - if (virFileIsDir(p)) + if (!p || + virFileIsDir(p)) continue; VIR_APPEND_ELEMENT_COPY_INPLACE(paths, npaths, p); -- 2.16.4

Erik Skultety

5:47 a.m.

New subject: [libvirt] [PATCH 5/4] security: Don't try to lock NULL paths

On Tue, Sep 25, 2018 at 09:34:50AM +0200, Michal Privoznik wrote:

...

It may happen that in the list of paths/disk sources to relabel there is a disk source. If that is the case, the path is NULL. In

Is there a way to prevent to such a source would not make it into the list in the first place?

...

that case, we shouldn't try to lock the path. It's likely a network disk anyway and therefore there is nothing to lock. Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> --- src/security/security_dac.c | 3 ++- src/security/security_selinux.c | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 876cca0f2f..04168feb3d 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -216,7 +216,8 @@ virSecurityDACTransactionRun(pid_t pid ATTRIBUTE_UNUSED, for (i = 0; i < list->nItems; i++) { const char *p = list->items[i]->path; - if (virFileIsDir(p)) + if (!p || + virFileIsDir(p))

^this easily fits on the same line... Erik

Michal Privoznik

6:01 a.m.

New subject: [libvirt] [PATCH 5/4] security: Don't try to lock NULL paths

On 09/25/2018 12:47 PM, Erik Skultety wrote:

...

On Tue, Sep 25, 2018 at 09:34:50AM +0200, Michal Privoznik wrote: > It may happen that in the list of paths/disk sources to relabel > there is a disk source. If that is the case, the path is NULL. In Is there a way to prevent to such a source would not make it into the list in the first place?

Not at all. Disk source may consists of variety of strings. A path is just one of the options. The other is host FQN for iscsi, rbd, ceph and all sorts of other beasts. And yet, we might want to relabel them.

...

> that case, we shouldn't try to lock the path. It's likely a > network disk anyway and therefore there is nothing to lock. > > Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> > --- > src/security/security_dac.c | 3 ++- > src/security/security_selinux.c | 3 ++- > 2 files changed, 4 insertions(+), 2 deletions(-) > > diff --git a/src/security/security_dac.c b/src/security/security_dac.c > index 876cca0f2f..04168feb3d 100644 > --- a/src/security/security_dac.c > +++ b/src/security/security_dac.c > @@ -216,7 +216,8 @@ virSecurityDACTransactionRun(pid_t pid ATTRIBUTE_UNUSED, > for (i = 0; i < list->nItems; i++) { > const char *p = list->items[i]->path; > > - if (virFileIsDir(p)) > + if (!p || > + virFileIsDir(p)) ^this easily fits on the same line...

Yep, I just wanted to separate these conditions as they both serve different purposes. But I can move it onto a single line if desired. Michal

Erik Skultety

8:18 a.m.

New subject: [libvirt] [PATCH 5/4] security: Don't try to lock NULL paths

On Tue, Sep 25, 2018 at 01:01:59PM +0200, Michal Privoznik wrote:

...

On 09/25/2018 12:47 PM, Erik Skultety wrote: > On Tue, Sep 25, 2018 at 09:34:50AM +0200, Michal Privoznik wrote: >> It may happen that in the list of paths/disk sources to relabel >> there is a disk source. If that is the case, the path is NULL. In > > Is there a way to prevent to such a source would not make it into the list in > the first place? Not at all. Disk source may consists of variety of strings. A path is just one of the options. The other is host FQN for iscsi, rbd, ceph and all sorts of other beasts. And yet, we might want to relabel them.

Fair enough. Reviewed-by: Erik Skultety <eskultet(a)redhat.com>

John Ferlan

5:57 p.m.

New subject: [libvirt] [PATCH 5/4] security: Don't try to lock NULL paths

On 9/25/18 3:34 AM, Michal Privoznik wrote:

...

I think this needs a tweak to reference commit 6d855abc1 which only filtered if the provided @p was a directory. This adds another filter when @p is NULL such as would be the case with networked storage. NB: The storage source is only passed for DAC and not selinux. The DAC code makes a some valiant attempts at src->path if not Local too. The selinux code has lots of branches and callers which seem to validly pass a path, but I could have missed a path or some nuance. The "key" is chasing virSecurityDACChownListAppend and virSecuritySELinuxContextListAppend where the list->[n]items is populated via VIR_APPEND_ELEMENT. Expect to spend some time on the chase! You already have an R-by and I don't have anything else to provide on this particular one other than yeah, better safe than sorry and passing NULL. Although I have to imagine the stat(NULL, &s) in virFileIsDir wouldn't have been pleased. John

...

Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> --- src/security/security_dac.c | 3 ++- src/security/security_selinux.c | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 876cca0f2f..04168feb3d 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -216,7 +216,8 @@ virSecurityDACTransactionRun(pid_t pid ATTRIBUTE_UNUSED, for (i = 0; i < list->nItems; i++) { const char *p = list->items[i]->path; - if (virFileIsDir(p)) + if (!p || + virFileIsDir(p)) continue; VIR_APPEND_ELEMENT_COPY_INPLACE(paths, npaths, p); diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index 3c847d8dcb..3adbeada14 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -227,7 +227,8 @@ virSecuritySELinuxTransactionRun(pid_t pid ATTRIBUTE_UNUSED, for (i = 0; i < list->nItems; i++) { const char *p = list->items[i]->path; - if (virFileIsDir(p)) + if (!p || + virFileIsDir(p)) continue; VIR_APPEND_ELEMENT_COPY_INPLACE(paths, npaths, p);

2470

days inactive

2476

days old

devel@lists.libvirt.org

Manage subscription

24 comments

5 participants

tags (0)

participants (5)

Bjoern Walk
Erik Skultety
John Ferlan
Marc Hartmayer
Michal Privoznik

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

[libvirt] [PATCH 0/4] Couple of metadata locking fixes