[libvirt] [PATCH 0/5] Couple of LXC fixes

*** BLURB HERE ***

Michal Prívozník (5):
  lxc: Use correct job type for destroying a domain
  vircgroup: Try harder to kill cgroup
  lxc: Restore seclabels after the container is killed
  virinitctl: Expose fifo paths and allow caller to chose one
  lxc: Don't reboot host on virDomainReboot

 src/libvirt_private.syms |  1 +
 src/lxc/lxc_domain.c     | 77 ++++++++++++++++++++++++++++++++++++++++
 src/lxc/lxc_domain.h     |  4 +++
 src/lxc/lxc_driver.c     | 19 ++--------
 src/lxc/lxc_process.c    | 22 ++++++------
 src/util/vircgroup.c     | 15 ++++----
 src/util/virinitctl.c    | 66 +++++++++++++++++++++-------------
 src/util/virinitctl.h    |  6 +++-
 8 files changed, 152 insertions(+), 58 deletions(-)

--
2.19.2

Not that it would matter because LXC driver doesn't differentiate the
job types so far, but nevertheless the Destroy() should grab
LXC_JOB_DESTROY.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
 src/lxc/lxc_driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c
index df15a0da50..6c0f9b57db 100644
--- a/src/lxc/lxc_driver.c
+++ b/src/lxc/lxc_driver.c
@@ -1462,7 +1462,7 @@ lxcDomainDestroyFlags(virDomainPtr dom,
     if (virDomainDestroyFlagsEnsureACL(dom->conn, vm->def) < 0)
         goto cleanup;

-    if (virLXCDomainObjBeginJob(driver, vm, LXC_JOB_MODIFY) < 0)
+    if (virLXCDomainObjBeginJob(driver, vm, LXC_JOB_DESTROY) < 0)
         goto cleanup;

     if (virDomainObjCheckActive(vm) < 0)
--
2.19.2

On Fri, Jan 25, 2019 at 02:31:45PM +0100, Michal Privoznik wrote:
Not that it would matter because LXC driver doesn't differentiate the job types so far, but nevertheless the Destroy() should grab LXC_JOB_DESTROY.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
Reviewed-by: Erik Skultety <eskultet@redhat.com>

Prior to rewrite of cgroup code we only had one backend to try. After
the rewrite the virCgroupBackendGetAll() returns both backends (for v1
and v2). However, not both have to really be present on the system
which results in killRecursive callback failing which in turn might
mean we won't try the other backend.

At the same time, this function reports no error as it should.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
 src/util/vircgroup.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c
index 3ebb3b0a0f..b8b56f1263 100644
--- a/src/util/vircgroup.c
+++ b/src/util/vircgroup.c
@@ -2606,6 +2606,7 @@ virCgroupKillRecursive(virCgroupPtr group, int signum)
     int ret = 0;
     int rc;
     size_t i;
+    bool backendAvailable = false;
     virCgroupBackendPtr *backends = virCgroupBackendGetAll();
     virHashTablePtr pids = virHashCreateFull(100,
                                              NULL,
@@ -2616,13 +2617,9 @@ virCgroupKillRecursive(virCgroupPtr group, int signum)

     VIR_DEBUG("group=%p path=%s signum=%d", group, group->path, signum);

-    if (!backends) {
-        ret = -1;
-        goto cleanup;
-    }
-
     for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) {
-        if (backends[i]) {
+        if (backends && backends[i] && backends[i]->available()) {
+            backendAvailable = true;
             rc = backends[i]->killRecursive(group, signum, pids);
             if (rc < 0) {
                 ret = -1;
@@ -2633,6 +2630,12 @@ virCgroupKillRecursive(virCgroupPtr group, int signum)
         }
     }

+    if (!backends || !backendAvailable) {
+        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                       _("no cgroup backend available"));
+        goto cleanup;
+    }
+
 cleanup:
    virHashFree(pids);
    return ret;
--
2.19.2

On Fri, Jan 25, 2019 at 02:31:46PM +0100, Michal Privoznik wrote:
Prior to rewrite of cgroup code we only had one backend to try. After the rewrite the virCgroupBackendGetAll() returns both backends (for v1 and v2). However, not both have to really be present on the system which results in killRecursive callback failing which in turn might mean we won't try the other backend.
At the same time, this function reports no error as it should.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Ahhh, this explains the bizarre unidentified error I was getting from
Destroy() when testing the Go bindings in 5.0.0 libvirt on a host that
only supports v1

Regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
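For context on why only one of the two backends may actually be usable: a host running the unified (v2) hierarchy exposes /sys/fs/cgroup/cgroup.controllers, whereas a host that only supports v1 does not. Below is a minimal standalone sketch of that kind of availability probe, assuming the conventional /sys/fs/cgroup mount point; it is an illustration only, not libvirt's own available() implementation.

#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/* Illustration only: the unified (v2) hierarchy, when mounted at the
 * conventional /sys/fs/cgroup, exposes a cgroup.controllers file there.
 * A pure v1 host lacks it; hybrid setups may have both hierarchies. */
static bool
demoCgroupV2Mounted(void)
{
    struct stat sb;
    return stat("/sys/fs/cgroup/cgroup.controllers", &sb) == 0;
}

int
main(void)
{
    printf("cgroup v2 unified hierarchy: %s\n",
           demoCgroupV2Mounted() ? "present" : "absent");
    return 0;
}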

On Fri, Jan 25, 2019 at 02:31:46PM +0100, Michal Privoznik wrote:
Prior to rewrite of cgroup code we only had one backend to try. After the rewrite the virCgroupBackendGetAll() returns both backends (for v1 and v2). However, not both have to really be present on the system which results in killRecursive callback failing which in turn might mean we won't try the other backend.
At the same time, this function reports no error as it should.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
 src/util/vircgroup.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c
index 3ebb3b0a0f..b8b56f1263 100644
--- a/src/util/vircgroup.c
+++ b/src/util/vircgroup.c
@@ -2606,6 +2606,7 @@ virCgroupKillRecursive(virCgroupPtr group, int signum)
     int ret = 0;
     int rc;
     size_t i;
+    bool backendAvailable = false;
     virCgroupBackendPtr *backends = virCgroupBackendGetAll();
     virHashTablePtr pids = virHashCreateFull(100,
                                              NULL,
@@ -2616,13 +2617,9 @@ virCgroupKillRecursive(virCgroupPtr group, int signum)
VIR_DEBUG("group=%p path=%s signum=%d", group, group->path, signum);
-    if (!backends) {
-        ret = -1;
-        goto cleanup;
-    }
-
     for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) {
-        if (backends[i]) {
+        if (backends && backends[i] && backends[i]->available()) {
+            backendAvailable = true;
             rc = backends[i]->killRecursive(group, signum, pids);
Assuming that it's not possible for v2 backend to succeed with ^this
and v1 to fail with an error:

Reviewed-by: Erik Skultety <eskultet@redhat.com>

Due to a bug the seclabels are restored before any PID in the
container is killed. This should be done afterwards in
virLXCProcessCleanup.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
 src/lxc/lxc_process.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c
index 33c806630b..a3481bfa08 100644
--- a/src/lxc/lxc_process.c
+++ b/src/lxc/lxc_process.c
@@ -180,6 +180,17 @@ static void virLXCProcessCleanup(virLXCDriverPtr driver,
         VIR_FREE(xml);
     }

+    virSecurityManagerRestoreAllLabel(driver->securityManager,
+                                      vm->def, false, false);
+    virSecurityManagerReleaseLabel(driver->securityManager, vm->def);
+    /* Clear out dynamically assigned labels */
+    if (vm->def->nseclabels &&
+        vm->def->seclabels[0]->type == VIR_DOMAIN_SECLABEL_DYNAMIC) {
+        VIR_FREE(vm->def->seclabels[0]->model);
+        VIR_FREE(vm->def->seclabels[0]->label);
+        VIR_FREE(vm->def->seclabels[0]->imagelabel);
+    }
+
     /* Stop autodestroy in case guest is restarted */
     virCloseCallbacksUnset(driver->closeCallbacks, vm,
                            lxcProcessAutoDestroy);
@@ -836,17 +847,6 @@ int virLXCProcessStop(virLXCDriverPtr driver,

     priv = vm->privateData;

-    virSecurityManagerRestoreAllLabel(driver->securityManager,
-                                      vm->def, false, false);
-    virSecurityManagerReleaseLabel(driver->securityManager, vm->def);
-    /* Clear out dynamically assigned labels */
-    if (vm->def->nseclabels &&
-        vm->def->seclabels[0]->type == VIR_DOMAIN_SECLABEL_DYNAMIC) {
-        VIR_FREE(vm->def->seclabels[0]->model);
-        VIR_FREE(vm->def->seclabels[0]->label);
-        VIR_FREE(vm->def->seclabels[0]->imagelabel);
-    }
-
     /* If the LXC domain is suspended we send all processes a SIGKILL
      * and thaw them. Upon wakeup the process sees the pending signal
      * and dies immediately. It is guaranteed that priv->cgroup != NULL
--
2.19.2

On Fri, Jan 25, 2019 at 02:31:47PM +0100, Michal Privoznik wrote:
Due to a bug the seclabels are restored before any PID in the container is killed. This should be done afterwards in virLXCProcessCleanup.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---

Reviewed-by: Erik Skultety <eskultet@redhat.com>

So far the virInitctlSetRunLevel() is fully automatic. It finds the
correct fifo to use to talk to the init and it will set the desired
runlevel. Well, callers (so far there is just one) will need to
inspect the fifo a bit just before the runlevel is set. Therefore,
expose the internal list of fifos and also allow caller to explicitly
use one.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
 src/libvirt_private.syms |  1 +
 src/lxc/lxc_driver.c     |  2 +-
 src/util/virinitctl.c    | 66 +++++++++++++++++++++++++---------------
 src/util/virinitctl.h    |  6 +++-
 4 files changed, 49 insertions(+), 26 deletions(-)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 89b8ca3b4f..af490be12c 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2049,6 +2049,7 @@ virIdentitySetX509DName;


 # util/virinitctl.h
+virInitctlFifos;
 virInitctlSetRunLevel;


diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c
index 6c0f9b57db..943d199616 100644
--- a/src/lxc/lxc_driver.c
+++ b/src/lxc/lxc_driver.c
@@ -3277,7 +3277,7 @@ lxcDomainInitctlCallback(pid_t pid ATTRIBUTE_UNUSED,
                          void *opaque)
 {
     int *command = opaque;
-    return virInitctlSetRunLevel(*command);
+    return virInitctlSetRunLevel(NULL, *command);
 }


diff --git a/src/util/virinitctl.c b/src/util/virinitctl.c
index 0b06743151..8f8bbae4bc 100644
--- a/src/util/virinitctl.c
+++ b/src/util/virinitctl.c
@@ -101,7 +101,20 @@ struct virInitctlRequest {
 verify(sizeof(struct virInitctlRequest) == 384);
 # endif

-/*
+
+/* List of fifos that inits are known to listen on */
+const char *virInitctlFifos[] = {
+    "/run/initctl",
+    "/dev/initctl",
+    "/etc/.initctl",
+};
+
+
+/**
+ * virInitctlSetRunLevel:
+ * @fifo: the path to fifo that init listens on (can be NULL for autodetection)
+ * @level: the desired runlevel
+ *
  * Send a message to init to change the runlevel. This function is
  * asynchronous-signal-safe (thus safe to use after fork of a
  * multithreaded parent) - which is good, because it should only be
@@ -110,18 +123,14 @@ struct virInitctlRequest {
  * Returns 1 on success, 0 if initctl does not exist, -1 on error
  */
 int
-virInitctlSetRunLevel(virInitctlRunLevel level)
+virInitctlSetRunLevel(const char *fifo,
+                      virInitctlRunLevel level)
 {
     struct virInitctlRequest req;
     int fd = -1;
     int ret = -1;
-    const char *initctl_fifo = NULL;
+    const int open_flags = O_WRONLY|O_NONBLOCK|O_CLOEXEC|O_NOCTTY;
     size_t i = 0;
-    const char *initctl_fifos[] = {
-        "/run/initctl",
-        "/dev/initctl",
-        "/etc/.initctl",
-    };

     memset(&req, 0, sizeof(req));

@@ -131,31 +140,39 @@ virInitctlSetRunLevel(virInitctlRunLevel level)
     /* Yes it is an 'int' field, but wants a numeric character. Go figure */
     req.runlevel = '0' + level;

-    for (i = 0; i < ARRAY_CARDINALITY(initctl_fifos); i++) {
-        initctl_fifo = initctl_fifos[i];
-
-        if ((fd = open(initctl_fifo,
-                       O_WRONLY|O_NONBLOCK|O_CLOEXEC|O_NOCTTY)) >= 0)
-            break;
-
-        if (errno != ENOENT) {
+    if (fifo) {
+        if ((fd = open(fifo, open_flags)) < 0) {
             virReportSystemError(errno,
                                  _("Cannot open init control %s"),
-                                 initctl_fifo);
+                                 fifo);
             goto cleanup;
         }
-    }
+    } else {
+        for (i = 0; i < ARRAY_CARDINALITY(virInitctlFifos); i++) {
+            fifo = virInitctlFifos[i];

-    /* Ensure we found a valid initctl fifo */
-    if (fd < 0) {
-        ret = 0;
-        goto cleanup;
+            if ((fd = open(fifo, open_flags)) >= 0)
+                break;
+
+            if (errno != ENOENT) {
+                virReportSystemError(errno,
+                                     _("Cannot open init control %s"),
+                                     fifo);
+                goto cleanup;
+            }
+        }
+
+        /* Ensure we found a valid initctl fifo */
+        if (fd < 0) {
+            ret = 0;
+            goto cleanup;
+        }
     }

     if (safewrite(fd, &req, sizeof(req)) != sizeof(req)) {
         virReportSystemError(errno,
                              _("Failed to send request to init control %s"),
-                             initctl_fifo);
+                             fifo);
         goto cleanup;
     }

@@ -166,7 +183,8 @@ virInitctlSetRunLevel(virInitctlRunLevel level)
     return ret;
 }
 #else
-int virInitctlSetRunLevel(virInitctlRunLevel level ATTRIBUTE_UNUSED)
+int virInitctlSetRunLevel(const char *fifo ATTRIBUTE_UNUSED,
+                          virInitctlRunLevel level ATTRIBUTE_UNUSED)
 {
     virReportUnsupportedError();
     return -1;
diff --git a/src/util/virinitctl.h b/src/util/virinitctl.h
index 7ac627883a..f12741c7c1 100644
--- a/src/util/virinitctl.h
+++ b/src/util/virinitctl.h
@@ -33,6 +33,10 @@ typedef enum {
     VIR_INITCTL_RUNLEVEL_LAST
 } virInitctlRunLevel;

-int virInitctlSetRunLevel(virInitctlRunLevel level);
+
+extern const char *virInitctlFifos[3];
+
+int virInitctlSetRunLevel(const char *fifo,
+                          virInitctlRunLevel level);

 #endif /* LIBVIRT_VIRINITCTL_H */
--
2.19.2
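As an aside on what actually travels over these fifos: the request is the fixed 384-byte structure from sysvinit's initreq protocol, which is what the verify(sizeof(struct virInitctlRequest) == 384) in this file guards. The sketch below sends a runlevel change to a given fifo; the magic and command constants are assumptions recalled from sysvinit's <initreq.h>, not taken from this patch, so verify them before reusing anything here.

#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Assumed values from sysvinit's <initreq.h>; double-check before use. */
#define DEMO_INIT_MAGIC      0x03091969
#define DEMO_INIT_CMD_RUNLVL 1

/* Same overall shape as struct virInitctlRequest: four ints plus
 * padding bringing the total size to 384 bytes. */
struct demo_init_request {
    int magic;
    int cmd;
    int runlevel;   /* a numeric character, e.g. '0' for poweroff */
    int sleeptime;
    char pad[368];
};

static int
demoSetRunLevel(const char *fifo, char runlevel)
{
    struct demo_init_request req;
    int fd;

    memset(&req, 0, sizeof(req));
    req.magic = DEMO_INIT_MAGIC;
    req.cmd = DEMO_INIT_CMD_RUNLVL;
    req.runlevel = runlevel;

    if ((fd = open(fifo, O_WRONLY | O_NONBLOCK | O_CLOEXEC | O_NOCTTY)) < 0)
        return -1;

    if (write(fd, &req, sizeof(req)) != sizeof(req)) {
        close(fd);
        return -1;
    }

    return close(fd);
}

int
main(void)
{
    /* Request poweroff via the first well-known path from the patch. */
    return demoSetRunLevel("/run/initctl", '0') < 0 ? 1 : 0;
}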

On Fri, Jan 25, 2019 at 02:31:48PM +0100, Michal Privoznik wrote:
So far the virInitctlSetRunLevel() is fully automatic. It finds the correct fifo to use to talk to the init and it will set the desired runlevel. Well, callers (so far there is just one) will need to inspect the fifo a bit just before the runlevel is set. Therefore, expose the internal list of fifos and also allow caller to explicitly use one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
...
-int virInitctlSetRunLevel(virInitctlRunLevel level);
+
+extern const char *virInitctlFifos[3];
I prefer [] to prevent the compiler from complaining if someone adds a new fifo
+
+int virInitctlSetRunLevel(const char *fifo,
+                          virInitctlRunLevel level);
#endif /* LIBVIRT_VIRINITCTL_H */
--

Reviewed-by: Erik Skultety <eskultet@redhat.com>

On 2/6/19 3:43 PM, Erik Skultety wrote:
On Fri, Jan 25, 2019 at 02:31:48PM +0100, Michal Privoznik wrote:
So far the virInitctlSetRunLevel() is fully automatic. It finds the correct fifo to use to talk to the init and it will set the desired runlevel. Well, callers (so far there is just one) will need to inspect the fifo a bit just before the runlevel is set. Therefore, expose the internal list of fifos and also allow caller to explicitly use one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
...
-int virInitctlSetRunLevel(virInitctlRunLevel level);
+
+extern const char *virInitctlFifos[3];
I prefer [] to prevent the compiler from complaining if someone adds a new fifo
The problem with this is that then I can't use the ARRAY_CARDINALITY() macro in 5/5. As we discussed in person, I'm turning this into a NULL-terminated list and adjusting 5/5 accordingly.
+
+int virInitctlSetRunLevel(const char *fifo,
+                          virInitctlRunLevel level);
#endif /* LIBVIRT_VIRINITCTL_H */
--

Reviewed-by: Erik Skultety <eskultet@redhat.com>
Michal
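For the record, a minimal sketch of the NULL-terminated variant being discussed, assuming the same three paths; the actual respin may differ in detail. With a sentinel entry, the header can declare the array as extern const char *virInitctlFifos[] without a size, and callers no longer need ARRAY_CARDINALITY():

#include <stdio.h>

/* Sketch of the NULL-terminated list; not the final patch. */
const char *virInitctlFifos[] = {
    "/run/initctl",
    "/dev/initctl",
    "/etc/.initctl",
    NULL,
};

int
main(void)
{
    size_t i;

    /* Iterate up to the sentinel instead of using a size macro. */
    for (i = 0; virInitctlFifos[i]; i++)
        puts(virInitctlFifos[i]);

    return 0;
}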

If the container is really a simple one (init is just bash and the
whole root is passed through) then virDomainReboot and
virDomainShutdown will talk to the actual init within the host.
Therefore, 'virsh shutdown $dom' will result in shutting down the
host. True, at that point the container is shut down too but looks a
bit harsh to me.

The solution is to check if the init inside the container is or is
not the same as the init running on the host.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
 src/lxc/lxc_domain.c | 77 ++++++++++++++++++++++++++++++++++++++++
 src/lxc/lxc_domain.h |  4 +++
 src/lxc/lxc_driver.c | 17 ++--------
 3 files changed, 83 insertions(+), 15 deletions(-)

diff --git a/src/lxc/lxc_domain.c b/src/lxc/lxc_domain.c
index b197f9dfc2..73023c0a57 100644
--- a/src/lxc/lxc_domain.c
+++ b/src/lxc/lxc_domain.c
@@ -32,6 +32,7 @@
 #include "virfile.h"
 #include "virtime.h"
 #include "virsystemd.h"
+#include "virinitctl.h"

 #define VIR_FROM_THIS VIR_FROM_LXC
 #define LXC_NAMESPACE_HREF "http://libvirt.org/schemas/domain/lxc/1.0"
@@ -416,3 +417,79 @@ virLXCDomainGetMachineName(virDomainDefPtr def, pid_t pid)

     return ret;
 }
+
+
+typedef struct _lxcDomainInitctlCallbackData lxcDomainInitctlCallbackData;
+struct _lxcDomainInitctlCallbackData {
+    int runlevel;
+    bool st_valid[ARRAY_CARDINALITY(virInitctlFifos)];
+    struct stat st[ARRAY_CARDINALITY(virInitctlFifos)];
+};
+
+
+static int
+lxcDomainInitctlCallback(pid_t pid ATTRIBUTE_UNUSED,
+                         void *opaque)
+{
+    lxcDomainInitctlCallbackData *data = opaque;
+    size_t i;
+
+    for (i = 0; i < ARRAY_CARDINALITY(virInitctlFifos); i++) {
+        const char *fifo = virInitctlFifos[i];
+        struct stat cont_sb;
+
+        if (stat(fifo, &cont_sb) < 0) {
+            if (errno == ENOENT)
+                continue;
+
+            virReportSystemError(errno, _("Unable to stat %s"), fifo);
+            return -1;
+        }
+
+        /* Check if the init fifo is not the very one that's on
+         * the host. We don't want to change the host's runlevel.
+         */
+        if (data->st_valid[i] &&
+            data->st[i].st_dev == cont_sb.st_dev &&
+            data->st[i].st_ino == cont_sb.st_ino)
+            continue;
+
+        return virInitctlSetRunLevel(fifo, data->runlevel);
+    }
+
+    /* If no usable fifo was found then declare success. Caller
+     * will try killing the domain with signal. */
+    return 0;
+}
+
+
+int
+virLXCDomainSetRunlevel(virDomainObjPtr vm,
+                        int runlevel)
+{
+    virLXCDomainObjPrivatePtr priv = vm->privateData;
+    lxcDomainInitctlCallbackData data;
+    size_t i;
+
+    memset(&data, 0, sizeof(data));
+
+    data.runlevel = runlevel;
+
+    for (i = 0; i < ARRAY_CARDINALITY(virInitctlFifos); i++) {
+        const char *fifo = virInitctlFifos[i];
+
+        if (stat(fifo, &(data.st[i])) < 0) {
+            if (errno == ENOENT)
+                continue;
+
+            virReportSystemError(errno, _("Unable to stat %s"), fifo);
+            return -1;
+        }
+
+        data.st_valid[i] = true;
+    }
+
+    return virProcessRunInMountNamespace(priv->initpid,
+                                         lxcDomainInitctlCallback,
+                                         &data);
+}
diff --git a/src/lxc/lxc_domain.h b/src/lxc/lxc_domain.h
index 364b8e5a44..c62d6d1362 100644
--- a/src/lxc/lxc_domain.h
+++ b/src/lxc/lxc_domain.h
@@ -109,4 +109,8 @@ virLXCDomainObjEndJob(virLXCDriverPtr driver,
 char *
 virLXCDomainGetMachineName(virDomainDefPtr def, pid_t pid);
+int
+virLXCDomainSetRunlevel(virDomainObjPtr vm,
+                        int runlevel);
+
 #endif /* LIBVIRT_LXC_DOMAIN_H */
diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c
index 943d199616..f0d72aa569 100644
--- a/src/lxc/lxc_driver.c
+++ b/src/lxc/lxc_driver.c
@@ -3272,15 +3272,6 @@ lxcConnectListAllDomains(virConnectPtr conn,
 }


-static int
-lxcDomainInitctlCallback(pid_t pid ATTRIBUTE_UNUSED,
-                         void *opaque)
-{
-    int *command = opaque;
-    return virInitctlSetRunLevel(NULL, *command);
-}
-
-
 static int
 lxcDomainShutdownFlags(virDomainPtr dom,
                        unsigned int flags)
@@ -3318,9 +3309,7 @@ lxcDomainShutdownFlags(virDomainPtr dom,
         (flags & VIR_DOMAIN_SHUTDOWN_INITCTL)) {
         int command = VIR_INITCTL_RUNLEVEL_POWEROFF;

-        if ((rc = virProcessRunInMountNamespace(priv->initpid,
-                                                lxcDomainInitctlCallback,
-                                                &command)) < 0)
+        if ((rc = virLXCDomainSetRunlevel(vm, command)) < 0)
             goto endjob;

         if (rc == 0 && flags != 0 &&
             ((flags & ~VIR_DOMAIN_SHUTDOWN_INITCTL) == 0)) {
@@ -3398,9 +3387,7 @@ lxcDomainReboot(virDomainPtr dom,
         (flags & VIR_DOMAIN_REBOOT_INITCTL)) {
         int command = VIR_INITCTL_RUNLEVEL_REBOOT;

-        if ((rc = virProcessRunInMountNamespace(priv->initpid,
-                                                lxcDomainInitctlCallback,
-                                                &command)) < 0)
+        if ((rc = virLXCDomainSetRunlevel(vm, command)) < 0)
             goto endjob;

         if (rc == 0 && flags != 0 &&
             ((flags & ~VIR_DOMAIN_SHUTDOWN_INITCTL) == 0)) {
--
2.19.2
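The core trick above is telling whether the fifo visible inside the container's mount namespace is the very same file as the one on the host, by comparing st_dev/st_ino from stat(). A minimal standalone sketch of that same-file test, with made-up names for illustration:

#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/* Two paths refer to the same file iff both the device and the inode
 * numbers match. In the patch the host-side stat() results are taken
 * before entering the container's mount namespace and compared against
 * the same paths as seen from inside it. */
static bool
demoSameFile(const struct stat *host_sb, const char *path)
{
    struct stat sb;

    if (stat(path, &sb) < 0)
        return false;

    return host_sb->st_dev == sb.st_dev && host_sb->st_ino == sb.st_ino;
}

int
main(void)
{
    struct stat host_sb;

    if (stat("/run/initctl", &host_sb) < 0)
        return 1;

    /* Comparing a path against itself trivially matches; inside a
     * container with its own /run it generally would not. */
    printf("same file: %d\n", demoSameFile(&host_sb, "/run/initctl"));
    return 0;
}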

On Fri, Jan 25, 2019 at 02:31:49PM +0100, Michal Privoznik wrote:
If the container is really a simple one (init is just bash and the whole root is passed through) then virDomainReboot and virDomainShutdown will talk to the actual init within the host. Therefore, 'virsh shutdown $dom' will result in shutting down the host. True, at that point the container is shut down too but
Nevertheless, it's hilarious :).
looks a bit harsh to me.
The solution is to check if the init inside the container is or is not the same as the init running on the host.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
I'll trust you tried it; code-wise, I don't see an issue, it's pretty
straightforward.

Reviewed-by: Erik Skultety <eskultet@redhat.com>