[libvirt] [PATCH 00/10] Another set of qemu namespace fixes

The major problem was with symlinks. Imagine the following chain of symlinks: /dev/my_awesome_disk -> /home/user/blaah -> /dev/disk/by-uuid/$uuid -> /dev/sda We really need to create all those /dev/* symlinks and /dev/sda device. Also, some other (less critical) bugs are fixed. Michal Privoznik (10): virProcessRunInMountNamespace: Report errors from child util: Introduce virFileReadLink qemuDomainPrepareDisk: Fix ordering qemuSecurityRestoreAllLabel: Don't use transactions qemu_security: Use more transactions qemuDomain{Attach,Detach}Device NS helpers: Don't relabel devices qemuDomainCreateDevice: Properly deal with symlinks qemuDomainCreateDevice: Don't loop endlessly qemuDomainAttachDeviceMknod: Deal with symlinks qemuDomainAttachDeviceMknod: Don't loop endlessly src/libvirt_private.syms | 1 + src/qemu/qemu_domain.c | 438 +++++++++++++++++++++++++++-------------------- src/qemu/qemu_hotplug.c | 20 +-- src/qemu/qemu_security.c | 137 +++++++++------ src/util/virfile.c | 12 ++ src/util/virfile.h | 2 + src/util/virprocess.c | 8 +- 7 files changed, 374 insertions(+), 244 deletions(-) -- 2.11.0

The comment to the function states that the errors from the child process are reported. Well, the error buffer is filled with possible error messages. But then it is thrown away. Among with important error message from the child process. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virprocess.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/src/util/virprocess.c b/src/util/virprocess.c index f5c7ebb96..16eb41212 100644 --- a/src/util/virprocess.c +++ b/src/util/virprocess.c @@ -1137,8 +1137,14 @@ virProcessRunInMountNamespace(pid_t pid, VIR_FORCE_CLOSE(errfd[1]); ignore_value(virFileReadHeaderFD(errfd[0], 1024, &buf)); ret = virProcessWait(child, &status, false); - if (!ret) + if (!ret) { ret = status == EXIT_CANCELED ? -1 : status; + if (ret) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("child reported: %s"), + NULLSTR(buf)); + } + } VIR_FREE(buf); } -- 2.11.0

On Fri, Jan 20, 2017 at 10:42:41AM +0100, Michal Privoznik wrote:
The comment to the function states that the errors from the child process are reported. Well, the error buffer is filled with possible error messages. But then it is thrown away. Among with important error message from the child process.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virprocess.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
ACK

We will need to traverse the symlinks one step at the time. Therefore we need to see where a symlink is pointing to. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virfile.c | 12 ++++++++++++ src/util/virfile.h | 2 ++ 3 files changed, 15 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index cfeb43cf0..7f7dcfe44 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1615,6 +1615,7 @@ virFileReadAllQuiet; virFileReadBufQuiet; virFileReadHeaderFD; virFileReadLimFD; +virFileReadLink; virFileRelLinkPointsTo; virFileRemove; virFileRemoveLastComponent; diff --git a/src/util/virfile.c b/src/util/virfile.c index bf8099e34..49ea1d1f0 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -76,6 +76,7 @@ #include "virutil.h" #include "c-ctype.h" +#include "areadlink.h" #define VIR_FROM_THIS VIR_FROM_NONE @@ -1614,6 +1615,17 @@ virFileIsLink(const char *linkpath) return S_ISLNK(st.st_mode) != 0; } +/* + * Read where symlink is pointing to. + * + * Returns 0 on success (@linkpath is a successfully read link), + * -1 with errno set upon error. + */ +int +virFileReadLink(const char *linkpath, char **resultpath) +{ + return (*resultpath = areadlink(linkpath)) ? 0 : -1; +} /* * Finds a requested executable file in the PATH env. e.g.: diff --git a/src/util/virfile.h b/src/util/virfile.h index 0343acd5b..981a9e07d 100644 --- a/src/util/virfile.h +++ b/src/util/virfile.h @@ -166,6 +166,8 @@ int virFileResolveAllLinks(const char *linkpath, int virFileIsLink(const char *linkpath) ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK; +int virFileReadLink(const char *linkpath, char **resultpath); + char *virFindFileInPath(const char *file); char *virFileFindResource(const char *filename, -- 2.11.0

On Fri, Jan 20, 2017 at 10:42:42AM +0100, Michal Privoznik wrote:
We will need to traverse the symlinks one step at the time. Therefore we need to see where a symlink is pointing to.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virfile.c | 12 ++++++++++++ src/util/virfile.h | 2 ++ 3 files changed, 15 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index cfeb43cf0..7f7dcfe44 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1615,6 +1615,7 @@ virFileReadAllQuiet; virFileReadBufQuiet; virFileReadHeaderFD; virFileReadLimFD; +virFileReadLink; virFileRelLinkPointsTo; virFileRemove; virFileRemoveLastComponent; diff --git a/src/util/virfile.c b/src/util/virfile.c index bf8099e34..49ea1d1f0 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -76,6 +76,7 @@ #include "virutil.h"
#include "c-ctype.h" +#include "areadlink.h"
#define VIR_FROM_THIS VIR_FROM_NONE
@@ -1614,6 +1615,17 @@ virFileIsLink(const char *linkpath) return S_ISLNK(st.st_mode) != 0; }
+/* + * Read where symlink is pointing to. + * + * Returns 0 on success (@linkpath is a successfully read link), + * -1 with errno set upon error. + */ +int +virFileReadLink(const char *linkpath, char **resultpath) +{ + return (*resultpath = areadlink(linkpath)) ? 0 : -1; +}
I don't really see the benefit of this function, are you trying to obfuscate it? You essentially double the return information for no reason and try to push it all into one line.
/* * Finds a requested executable file in the PATH env. e.g.: diff --git a/src/util/virfile.h b/src/util/virfile.h index 0343acd5b..981a9e07d 100644 --- a/src/util/virfile.h +++ b/src/util/virfile.h @@ -166,6 +166,8 @@ int virFileResolveAllLinks(const char *linkpath, int virFileIsLink(const char *linkpath) ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK;
+int virFileReadLink(const char *linkpath, char **resultpath); + char *virFindFileInPath(const char *file);
char *virFileFindResource(const char *filename, -- 2.11.0
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On Thu, Feb 02, 2017 at 03:11:56PM +0100, Martin Kletzander wrote:
On Fri, Jan 20, 2017 at 10:42:42AM +0100, Michal Privoznik wrote:
We will need to traverse the symlinks one step at the time. Therefore we need to see where a symlink is pointing to.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virfile.c | 12 ++++++++++++ src/util/virfile.h | 2 ++ 3 files changed, 15 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index cfeb43cf0..7f7dcfe44 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1615,6 +1615,7 @@ virFileReadAllQuiet; virFileReadBufQuiet; virFileReadHeaderFD; virFileReadLimFD; +virFileReadLink; virFileRelLinkPointsTo; virFileRemove; virFileRemoveLastComponent; diff --git a/src/util/virfile.c b/src/util/virfile.c index bf8099e34..49ea1d1f0 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -76,6 +76,7 @@ #include "virutil.h"
#include "c-ctype.h" +#include "areadlink.h"
#define VIR_FROM_THIS VIR_FROM_NONE
@@ -1614,6 +1615,17 @@ virFileIsLink(const char *linkpath) return S_ISLNK(st.st_mode) != 0; }
+/* + * Read where symlink is pointing to. + * + * Returns 0 on success (@linkpath is a successfully read link), + * -1 with errno set upon error. + */ +int +virFileReadLink(const char *linkpath, char **resultpath) +{ + return (*resultpath = areadlink(linkpath)) ? 0 : -1; +}
I don't really see the benefit of this function, are you trying to obfuscate it? You essentially double the return information for no reason and try to push it all into one line.
It would be useful if you have an ATTRIBUTE_RETURN_CHECK annotation on it, as it would allow compiler time verification of error checking, which is not possible when you overload the return value to contain both the data and error indicator.
/* * Finds a requested executable file in the PATH env. e.g.: diff --git a/src/util/virfile.h b/src/util/virfile.h index 0343acd5b..981a9e07d 100644 --- a/src/util/virfile.h +++ b/src/util/virfile.h @@ -166,6 +166,8 @@ int virFileResolveAllLinks(const char *linkpath, int virFileIsLink(const char *linkpath) ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK;
+int virFileReadLink(const char *linkpath, char **resultpath);
eg add ATTRIBUTE_RETURN_CHECK here
+ char *virFindFileInPath(const char *file);
char *virFileFindResource(const char *filename,
Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|

On Thu, Feb 02, 2017 at 03:09:15PM +0000, Daniel P. Berrange wrote:
On Thu, Feb 02, 2017 at 03:11:56PM +0100, Martin Kletzander wrote:
On Fri, Jan 20, 2017 at 10:42:42AM +0100, Michal Privoznik wrote:
We will need to traverse the symlinks one step at the time. Therefore we need to see where a symlink is pointing to.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virfile.c | 12 ++++++++++++ src/util/virfile.h | 2 ++ 3 files changed, 15 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index cfeb43cf0..7f7dcfe44 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1615,6 +1615,7 @@ virFileReadAllQuiet; virFileReadBufQuiet; virFileReadHeaderFD; virFileReadLimFD; +virFileReadLink; virFileRelLinkPointsTo; virFileRemove; virFileRemoveLastComponent; diff --git a/src/util/virfile.c b/src/util/virfile.c index bf8099e34..49ea1d1f0 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -76,6 +76,7 @@ #include "virutil.h"
#include "c-ctype.h" +#include "areadlink.h"
#define VIR_FROM_THIS VIR_FROM_NONE
@@ -1614,6 +1615,17 @@ virFileIsLink(const char *linkpath) return S_ISLNK(st.st_mode) != 0; }
+/* + * Read where symlink is pointing to. + * + * Returns 0 on success (@linkpath is a successfully read link), + * -1 with errno set upon error. + */ +int +virFileReadLink(const char *linkpath, char **resultpath) +{ + return (*resultpath = areadlink(linkpath)) ? 0 : -1; +}
I don't really see the benefit of this function, are you trying to obfuscate it? You essentially double the return information for no reason and try to push it all into one line.
It would be useful if you have an ATTRIBUTE_RETURN_CHECK annotation on it, as it would allow compiler time verification of error checking, which is not possible when you overload the return value to contain both the data and error indicator.
More like if there was another logic, e.g. checking that it is a link and if it is not, returning the same path or similar. ATTRIBUTE_RETURN_CHECK doesn't make much sense to me here since you can get the same information from the second parameter and you are basically making the caller use two values that have the same information. Moreover you can set ATTRIBUTE_RETURN_CHECK for the function and just directly return the output of areadlink(), essentially just renaming it. I don't see the point in that. It would be different if the function guaranteed non-modification of that parameter when it errors out, for example. Or set an call virReportSystemError(errno, ...); so that the error reporting is done consistently and in one place. I haven't checked yet how Michal uses it, though, so I can't say what's the best solution for now. But I now see where it comes from. It's the virFileResolveAllLinks() that does the same thing. That doesn't mean it's right, though. My question is this. If returning integer, which does not contain any other value then just the error, is desired, should we mandate that for non-simple functions and rewrite, for example, qemuBuildCommandLine()?
/* * Finds a requested executable file in the PATH env. e.g.: diff --git a/src/util/virfile.h b/src/util/virfile.h index 0343acd5b..981a9e07d 100644 --- a/src/util/virfile.h +++ b/src/util/virfile.h @@ -166,6 +166,8 @@ int virFileResolveAllLinks(const char *linkpath, int virFileIsLink(const char *linkpath) ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK;
+int virFileReadLink(const char *linkpath, char **resultpath);
eg add ATTRIBUTE_RETURN_CHECK here
+ char *virFindFileInPath(const char *file);
char *virFileFindResource(const char *filename,
Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|

On Thu, Feb 02, 2017 at 08:57:53PM +0100, Martin Kletzander wrote:
On Thu, Feb 02, 2017 at 03:09:15PM +0000, Daniel P. Berrange wrote:
On Thu, Feb 02, 2017 at 03:11:56PM +0100, Martin Kletzander wrote:
On Fri, Jan 20, 2017 at 10:42:42AM +0100, Michal Privoznik wrote:
We will need to traverse the symlinks one step at the time. Therefore we need to see where a symlink is pointing to.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virfile.c | 12 ++++++++++++ src/util/virfile.h | 2 ++ 3 files changed, 15 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index cfeb43cf0..7f7dcfe44 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1615,6 +1615,7 @@ virFileReadAllQuiet; virFileReadBufQuiet; virFileReadHeaderFD; virFileReadLimFD; +virFileReadLink; virFileRelLinkPointsTo; virFileRemove; virFileRemoveLastComponent; diff --git a/src/util/virfile.c b/src/util/virfile.c index bf8099e34..49ea1d1f0 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -76,6 +76,7 @@ #include "virutil.h"
#include "c-ctype.h" +#include "areadlink.h"
#define VIR_FROM_THIS VIR_FROM_NONE
@@ -1614,6 +1615,17 @@ virFileIsLink(const char *linkpath) return S_ISLNK(st.st_mode) != 0; }
+/* + * Read where symlink is pointing to. + * + * Returns 0 on success (@linkpath is a successfully read link), + * -1 with errno set upon error. + */ +int +virFileReadLink(const char *linkpath, char **resultpath) +{ + return (*resultpath = areadlink(linkpath)) ? 0 : -1; +}
I don't really see the benefit of this function, are you trying to obfuscate it? You essentially double the return information for no reason and try to push it all into one line.
It would be useful if you have an ATTRIBUTE_RETURN_CHECK annotation on it, as it would allow compiler time verification of error checking, which is not possible when you overload the return value to contain both the data and error indicator.
More like if there was another logic, e.g. checking that it is a link and if it is not, returning the same path or similar.
ATTRIBUTE_RETURN_CHECK doesn't make much sense to me here since you can get the same information from the second parameter and you are basically making the caller use two values that have the same information. Moreover you can set ATTRIBUTE_RETURN_CHECK for the function and just directly return the output of areadlink(), essentially just renaming it. I don't see the point in that.
It would be different if the function guaranteed non-modification of that parameter when it errors out, for example. Or set an call virReportSystemError(errno, ...); so that the error reporting is done consistently and in one place. I haven't checked yet how Michal uses it, though, so I can't say what's the best solution for now.
But I now see where it comes from. It's the virFileResolveAllLinks() that does the same thing. That doesn't mean it's right, though.
My question is this. If returning integer, which does not contain any other value then just the error, is desired, should we mandate that for non-simple functions and rewrite, for example, qemuBuildCommandLine()?
/* * Finds a requested executable file in the PATH env. e.g.: diff --git a/src/util/virfile.h b/src/util/virfile.h index 0343acd5b..981a9e07d 100644 --- a/src/util/virfile.h +++ b/src/util/virfile.h @@ -166,6 +166,8 @@ int virFileResolveAllLinks(const char *linkpath, int virFileIsLink(const char *linkpath) ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK;
+int virFileReadLink(const char *linkpath, char **resultpath);
eg add ATTRIBUTE_RETURN_CHECK here
So it seems like Michal is not participating in the discussion about his own patch. But I talked to him personally and he admitted that the function prototype was meant to look like it does simply to be just like the other functions, e.g. virFileResolveAllLinks(). So for now I'm OK with it, we can change how all the functions look later on, but that's a discussion to be had. So ACK with the ATTRIBUTE_RETURN_CHECK added here.
+ char *virFindFileInPath(const char *file);
char *virFileFindResource(const char *filename,
Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

The current ordering is as follows: 1) set label 2) create the device in namespace 3) allow device in the cgroup While this might work for now, it will definitely not work if the security driver would use transactions as in that case there would be no device to relabel in the domain namespace as the device is created in the second step. Swap steps 1) and 2) to allow security driver to use more transactions. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_hotplug.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 5b0a3f515..7493d525a 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -110,15 +110,15 @@ qemuDomainPrepareDisk(virQEMUDriverPtr driver, vm, disk) < 0) goto cleanup; - if (qemuSecuritySetDiskLabel(driver, vm, disk) < 0) - goto rollback_lock; - if (qemuDomainNamespaceSetupDisk(driver, vm, disk) < 0) - goto rollback_label; + goto rollback_lock; - if (qemuSetupDiskCgroup(vm, disk) < 0) + if (qemuSecuritySetDiskLabel(driver, vm, disk) < 0) goto rollback_namespace; + if (qemuSetupDiskCgroup(vm, disk) < 0) + goto rollback_label; + ret = 0; goto cleanup; @@ -126,16 +126,16 @@ qemuDomainPrepareDisk(virQEMUDriverPtr driver, if (qemuTeardownDiskCgroup(vm, disk) < 0) VIR_WARN("Unable to tear down cgroup access on %s", virDomainDiskGetSource(disk)); - rollback_namespace: - if (qemuDomainNamespaceTeardownDisk(driver, vm, disk) < 0) - VIR_WARN("Unable to remove /dev entry for %s", - virDomainDiskGetSource(disk)); - rollback_label: if (qemuSecurityRestoreDiskLabel(driver, vm, disk) < 0) VIR_WARN("Unable to restore security label on %s", virDomainDiskGetSource(disk)); + rollback_namespace: + if (qemuDomainNamespaceTeardownDisk(driver, vm, disk) < 0) + VIR_WARN("Unable to remove /dev entry for %s", + virDomainDiskGetSource(disk)); + rollback_lock: if (virDomainLockDiskDetach(driver->lockManager, vm, disk) < 0) VIR_WARN("Unable to release lock on %s", -- 2.11.0

On Fri, Jan 20, 2017 at 10:42:43AM +0100, Michal Privoznik wrote:
The current ordering is as follows: 1) set label 2) create the device in namespace 3) allow device in the cgroup
While this might work for now, it will definitely not work if the security driver would use transactions as in that case there would be no device to relabel in the domain namespace as the device is created in the second step. Swap steps 1) and 2) to allow security driver to use more transactions.
ACK, makes more sense if I don't read the message =)

Because of the nature of security driver transactions, it is impossible to use them properly. The thing is, transactions enter the domain namespace and commit all the seclabel changes. However, in RestoreAllLabel() this is impossible - the qemu process, the only process running in the namespace, is gone. And thus is the namespace. Therefore we shouldn't use the transactions as there is no namespace to enter. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_security.c | 25 +++++++++---------------- 1 file changed, 9 insertions(+), 16 deletions(-) diff --git a/src/qemu/qemu_security.c b/src/qemu/qemu_security.c index 544feeb4a..13d99cdbd 100644 --- a/src/qemu/qemu_security.c +++ b/src/qemu/qemu_security.c @@ -73,22 +73,15 @@ qemuSecurityRestoreAllLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, bool migrated) { - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) - goto cleanup; - - if (virSecurityManagerRestoreAllLabel(driver->securityManager, - vm->def, - migrated) < 0) - goto cleanup; - - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) - goto cleanup; - - cleanup: - virSecurityManagerTransactionAbort(driver->securityManager); + /* In contrast to qemuSecuritySetAllLabel, do not use + * secdriver transactions here. This function is called from + * qemuProcessStop() which is meant to do cleanup after qemu + * process died. If it did do, the namespace is gone as qemu + * was the only process running there. We would not succeed + * in entering the namespace then. */ + virSecurityManagerRestoreAllLabel(driver->securityManager, + vm->def, + migrated); } -- 2.11.0

On Fri, Jan 20, 2017 at 10:42:44AM +0100, Michal Privoznik wrote:
Because of the nature of security driver transactions, it is impossible to use them properly. The thing is, transactions enter the domain namespace and commit all the seclabel changes. However, in RestoreAllLabel() this is impossible - the qemu process, the only process running in the namespace, is gone. And thus is the namespace. Therefore we shouldn't use the transactions as there is no namespace to enter.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_security.c | 25 +++++++++---------------- 1 file changed, 9 insertions(+), 16 deletions(-)
diff --git a/src/qemu/qemu_security.c b/src/qemu/qemu_security.c index 544feeb4a..13d99cdbd 100644 --- a/src/qemu/qemu_security.c +++ b/src/qemu/qemu_security.c @@ -73,22 +73,15 @@ qemuSecurityRestoreAllLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, bool migrated) { - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) - goto cleanup; - - if (virSecurityManagerRestoreAllLabel(driver->securityManager, - vm->def, - migrated) < 0) - goto cleanup; - - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) - goto cleanup; - - cleanup: - virSecurityManagerTransactionAbort(driver->securityManager); + /* In contrast to qemuSecuritySetAllLabel, do not use + * secdriver transactions here. This function is called from + * qemuProcessStop() which is meant to do cleanup after qemu + * process died. If it did do, the namespace is gone as qemu + * was the only process running there. We would not succeed + * in entering the namespace then. */ + virSecurityManagerRestoreAllLabel(driver->securityManager, + vm->def, + migrated);
This means we'll be running restore on /dev/BLAH in the host namespace, which is a file we didn't set a label/permissions on originally. I think this is probably safe though, as we'd just be resetting it to a label that it should already have and thus be a no-op. I'm a little more concerned about file permiissions though, as we might end up changing group from "disk" to "root" for example. It feels like we need to explicitly skip restore for block devices if we had namespaces in use previously. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|

On 01/20/2017 11:18 AM, Daniel P. Berrange wrote:
On Fri, Jan 20, 2017 at 10:42:44AM +0100, Michal Privoznik wrote:
Because of the nature of security driver transactions, it is impossible to use them properly. The thing is, transactions enter the domain namespace and commit all the seclabel changes. However, in RestoreAllLabel() this is impossible - the qemu process, the only process running in the namespace, is gone. And thus is the namespace. Therefore we shouldn't use the transactions as there is no namespace to enter.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_security.c | 25 +++++++++---------------- 1 file changed, 9 insertions(+), 16 deletions(-)
diff --git a/src/qemu/qemu_security.c b/src/qemu/qemu_security.c index 544feeb4a..13d99cdbd 100644 --- a/src/qemu/qemu_security.c +++ b/src/qemu/qemu_security.c @@ -73,22 +73,15 @@ qemuSecurityRestoreAllLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, bool migrated) { - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) - goto cleanup; - - if (virSecurityManagerRestoreAllLabel(driver->securityManager, - vm->def, - migrated) < 0) - goto cleanup; - - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) - goto cleanup; - - cleanup: - virSecurityManagerTransactionAbort(driver->securityManager); + /* In contrast to qemuSecuritySetAllLabel, do not use + * secdriver transactions here. This function is called from + * qemuProcessStop() which is meant to do cleanup after qemu + * process died. If it did do, the namespace is gone as qemu + * was the only process running there. We would not succeed + * in entering the namespace then. */ + virSecurityManagerRestoreAllLabel(driver->securityManager, + vm->def, + migrated);
This means we'll be running restore on /dev/BLAH in the host namespace, which is a file we didn't set a label/permissions on originally. I think this is probably safe though, as we'd just be resetting it to a label that it should already have and thus be a no-op. I'm a little more concerned about file permiissions though, as we might end up changing group from "disk" to "root" for example.
Which is no different from current situation with namespaces disabled or using just any pre 3.0.0 release.
It feels like we need to explicitly skip restore for block devices if we had namespaces in use previously.
If we want to go this way, sure. But I'm not quite sure how to design this. E.g. have a filter callback that would return "yes restore label on this path", or "no, do not restore anything on this path". And obviously set the callback only for the RestoreAll() if the namespaces are enabled? Or what do you suggest? Michal

On Fri, Jan 20, 2017 at 11:34:06AM +0100, Michal Privoznik wrote:
On 01/20/2017 11:18 AM, Daniel P. Berrange wrote:
On Fri, Jan 20, 2017 at 10:42:44AM +0100, Michal Privoznik wrote:
Because of the nature of security driver transactions, it is impossible to use them properly. The thing is, transactions enter the domain namespace and commit all the seclabel changes. However, in RestoreAllLabel() this is impossible - the qemu process, the only process running in the namespace, is gone. And thus is the namespace. Therefore we shouldn't use the transactions as there is no namespace to enter.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_security.c | 25 +++++++++---------------- 1 file changed, 9 insertions(+), 16 deletions(-)
diff --git a/src/qemu/qemu_security.c b/src/qemu/qemu_security.c index 544feeb4a..13d99cdbd 100644 --- a/src/qemu/qemu_security.c +++ b/src/qemu/qemu_security.c @@ -73,22 +73,15 @@ qemuSecurityRestoreAllLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, bool migrated) { - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) - goto cleanup; - - if (virSecurityManagerRestoreAllLabel(driver->securityManager, - vm->def, - migrated) < 0) - goto cleanup; - - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) - goto cleanup; - - cleanup: - virSecurityManagerTransactionAbort(driver->securityManager); + /* In contrast to qemuSecuritySetAllLabel, do not use + * secdriver transactions here. This function is called from + * qemuProcessStop() which is meant to do cleanup after qemu + * process died. If it did do, the namespace is gone as qemu + * was the only process running there. We would not succeed + * in entering the namespace then. */ + virSecurityManagerRestoreAllLabel(driver->securityManager, + vm->def, + migrated);
This means we'll be running restore on /dev/BLAH in the host namespace, which is a file we didn't set a label/permissions on originally. I think this is probably safe though, as we'd just be resetting it to a label that it should already have and thus be a no-op. I'm a little more concerned about file permiissions though, as we might end up changing group from "disk" to "root" for example.
Which is no different from current situation with namespaces disabled or using just any pre 3.0.0 release.
Hmm, good point. No need to solve this right now then - it can just be a todo item
It feels like we need to explicitly skip restore for block devices if we had namespaces in use previously.
If we want to go this way, sure. But I'm not quite sure how to design this. E.g. have a filter callback that would return "yes restore label on this path", or "no, do not restore anything on this path". And obviously set the callback only for the RestoreAll() if the namespaces are enabled? Or what do you suggest?
I guess, lets not special case this right now - we have the long standing problem of (not) correctly restoring permissions for all paths, whether files or block devs. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|

On Fri, Jan 20, 2017 at 10:42:44AM +0100, Michal Privoznik wrote:
Because of the nature of security driver transactions, it is impossible to use them properly. The thing is, transactions enter the domain namespace and commit all the seclabel changes. However, in RestoreAllLabel() this is impossible - the qemu process, the only process running in the namespace, is gone. And thus is the namespace. Therefore we shouldn't use the transactions as there is no namespace to enter.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_security.c | 25 +++++++++---------------- 1 file changed, 9 insertions(+), 16 deletions(-)
ACK, no need to duplicate the same information in the commit message and the comment below IMHO.
diff --git a/src/qemu/qemu_security.c b/src/qemu/qemu_security.c index 544feeb4a..13d99cdbd 100644 --- a/src/qemu/qemu_security.c +++ b/src/qemu/qemu_security.c @@ -73,22 +73,15 @@ qemuSecurityRestoreAllLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, bool migrated) { - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionStart(driver->securityManager) < 0) - goto cleanup; - - if (virSecurityManagerRestoreAllLabel(driver->securityManager, - vm->def, - migrated) < 0) - goto cleanup; - - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && - virSecurityManagerTransactionCommit(driver->securityManager, - vm->pid) < 0) - goto cleanup; - - cleanup: - virSecurityManagerTransactionAbort(driver->securityManager); + /* In contrast to qemuSecuritySetAllLabel, do not use + * secdriver transactions here. This function is called from + * qemuProcessStop() which is meant to do cleanup after qemu + * process died. If it did do, the namespace is gone as qemu + * was the only process running there. We would not succeed + * in entering the namespace then. */ + virSecurityManagerRestoreAllLabel(driver->securityManager, + vm->def, + migrated); }
-- 2.11.0
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

The idea is to move all the seclabel setting to security driver. Having the relabel code spread all over the place looks very messy. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_security.c | 112 +++++++++++++++++++++++++++++++++-------------- 1 file changed, 80 insertions(+), 32 deletions(-) diff --git a/src/qemu/qemu_security.c b/src/qemu/qemu_security.c index 13d99cdbd..9d1a87971 100644 --- a/src/qemu/qemu_security.c +++ b/src/qemu/qemu_security.c @@ -90,14 +90,26 @@ qemuSecuritySetDiskLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainDiskDefPtr disk) { - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) { - /* Already handled by namespace code. */ - return 0; - } + int ret = -1; - return virSecurityManagerSetDiskLabel(driver->securityManager, + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionStart(driver->securityManager) < 0) + goto cleanup; + + if (virSecurityManagerSetDiskLabel(driver->securityManager, vm->def, - disk); + disk) < 0) + goto cleanup; + + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionCommit(driver->securityManager, + vm->pid) < 0) + goto cleanup; + + ret = 0; + cleanup: + virSecurityManagerTransactionAbort(driver->securityManager); + return ret; } @@ -106,14 +118,26 @@ qemuSecurityRestoreDiskLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainDiskDefPtr disk) { - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) { - /* Already handled by namespace code. */ - return 0; - } - - return virSecurityManagerRestoreDiskLabel(driver->securityManager, - vm->def, - disk); + int ret = -1; + + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionStart(driver->securityManager) < 0) + goto cleanup; + + if (virSecurityManagerRestoreDiskLabel(driver->securityManager, + vm->def, + disk) < 0) + goto cleanup; + + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionCommit(driver->securityManager, + vm->pid) < 0) + goto cleanup; + + ret = 0; + cleanup: + virSecurityManagerTransactionAbort(driver->securityManager); + return ret; } @@ -122,15 +146,27 @@ qemuSecuritySetHostdevLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainHostdevDefPtr hostdev) { - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) { - /* Already handled by namespace code. */ - return 0; - } - - return virSecurityManagerSetHostdevLabel(driver->securityManager, - vm->def, - hostdev, - NULL); + int ret = -1; + + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionStart(driver->securityManager) < 0) + goto cleanup; + + if (virSecurityManagerSetHostdevLabel(driver->securityManager, + vm->def, + hostdev, + NULL) < 0) + goto cleanup; + + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionCommit(driver->securityManager, + vm->pid) < 0) + goto cleanup; + + ret = 0; + cleanup: + virSecurityManagerTransactionAbort(driver->securityManager); + return ret; } @@ -139,13 +175,25 @@ qemuSecurityRestoreHostdevLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainHostdevDefPtr hostdev) { - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) { - /* Already handled by namespace code. */ - return 0; - } - - return virSecurityManagerRestoreHostdevLabel(driver->securityManager, - vm->def, - hostdev, - NULL); + int ret = -1; + + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionStart(driver->securityManager) < 0) + goto cleanup; + + if (virSecurityManagerRestoreHostdevLabel(driver->securityManager, + vm->def, + hostdev, + NULL) < 0) + goto cleanup; + + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionCommit(driver->securityManager, + vm->pid) < 0) + goto cleanup; + + ret = 0; + cleanup: + virSecurityManagerTransactionAbort(driver->securityManager); + return ret; } -- 2.11.0

On Fri, Jan 20, 2017 at 10:42:45AM +0100, Michal Privoznik wrote:
The idea is to move all the seclabel setting to security driver. Having the relabel code spread all over the place looks very messy.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_security.c | 112 +++++++++++++++++++++++++++++++++-------------- 1 file changed, 80 insertions(+), 32 deletions(-)
diff --git a/src/qemu/qemu_security.c b/src/qemu/qemu_security.c index 13d99cdbd..9d1a87971 100644 --- a/src/qemu/qemu_security.c +++ b/src/qemu/qemu_security.c @@ -90,14 +90,26 @@ qemuSecuritySetDiskLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainDiskDefPtr disk) { - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) { - /* Already handled by namespace code. */ - return 0; - } + int ret = -1;
- return virSecurityManagerSetDiskLabel(driver->securityManager, + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionStart(driver->securityManager) < 0) + goto cleanup; + + if (virSecurityManagerSetDiskLabel(driver->securityManager, vm->def, - disk); + disk) < 0)
indentation
+ goto cleanup; + + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionCommit(driver->securityManager, + vm->pid) < 0) + goto cleanup; + + ret = 0; + cleanup: + virSecurityManagerTransactionAbort(driver->securityManager); + return ret; }
So much code duplication. Wasn't there an idea to have a callback in the security manager that would be called before/after the labelling? Since this is only for a disk and hostdev, let's leave it like this, I guess, but I'm expecting this much code duplication to bite us back in a while. None of my ideas for how to make this better are finalized, but I have some, if you want to discuss. ACK for the sake of fixing stuff if you fix the indentation as well.

On 02/02/2017 04:03 PM, Martin Kletzander wrote:
On Fri, Jan 20, 2017 at 10:42:45AM +0100, Michal Privoznik wrote:
The idea is to move all the seclabel setting to security driver. Having the relabel code spread all over the place looks very messy.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_security.c | 112 +++++++++++++++++++++++++++++++++-------------- 1 file changed, 80 insertions(+), 32 deletions(-)
diff --git a/src/qemu/qemu_security.c b/src/qemu/qemu_security.c index 13d99cdbd..9d1a87971 100644 --- a/src/qemu/qemu_security.c +++ b/src/qemu/qemu_security.c @@ -90,14 +90,26 @@ qemuSecuritySetDiskLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainDiskDefPtr disk) { - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) { - /* Already handled by namespace code. */ - return 0; - } + int ret = -1;
- return virSecurityManagerSetDiskLabel(driver->securityManager, + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionStart(driver->securityManager) < 0) + goto cleanup; + + if (virSecurityManagerSetDiskLabel(driver->securityManager, vm->def, - disk); + disk) < 0)
indentation
+ goto cleanup; + + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionCommit(driver->securityManager, + vm->pid) < 0) + goto cleanup; + + ret = 0; + cleanup: + virSecurityManagerTransactionAbort(driver->securityManager); + return ret; }
So much code duplication.
I have a patch ready that kills that and replaces it with a macro.
Wasn't there an idea to have a callback in the security manager that would be called before/after the labelling?
The problem is that security driver sees just virDomainDef while all the namespace stuff (e.g. pid, enabled namespace bitmap) lives in a domain object. Therefore security driver can't make such decision on its own. Thus transactions were born. Having a chown callback that would enter the namespace and change ownership proved to be very suboptimal: entering a namespace requires a fork(). Therefore, instead of forking like crazy, a list of all the desired chown()-s is constructed, one fork() is called and then all the operations from the list are performed.
Since this is only for a disk and hostdev, let's leave it like this, I guess, but I'm expecting this much code duplication to bite us back in a while. None of my ideas for how to make this better are finalized, but I have some, if you want to discuss.
Sure. I'm up for making this better. Michal

On Tue, Feb 07, 2017 at 11:02:09AM +0100, Michal Privoznik wrote:
On 02/02/2017 04:03 PM, Martin Kletzander wrote:
On Fri, Jan 20, 2017 at 10:42:45AM +0100, Michal Privoznik wrote:
The idea is to move all the seclabel setting to security driver. Having the relabel code spread all over the place looks very messy.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_security.c | 112 +++++++++++++++++++++++++++++++++-------------- 1 file changed, 80 insertions(+), 32 deletions(-)
diff --git a/src/qemu/qemu_security.c b/src/qemu/qemu_security.c index 13d99cdbd..9d1a87971 100644 --- a/src/qemu/qemu_security.c +++ b/src/qemu/qemu_security.c @@ -90,14 +90,26 @@ qemuSecuritySetDiskLabel(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainDiskDefPtr disk) { - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) { - /* Already handled by namespace code. */ - return 0; - } + int ret = -1;
- return virSecurityManagerSetDiskLabel(driver->securityManager, + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionStart(driver->securityManager) < 0) + goto cleanup; + + if (virSecurityManagerSetDiskLabel(driver->securityManager, vm->def, - disk); + disk) < 0)
indentation
+ goto cleanup; + + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) && + virSecurityManagerTransactionCommit(driver->securityManager, + vm->pid) < 0) + goto cleanup; + + ret = 0; + cleanup: + virSecurityManagerTransactionAbort(driver->securityManager); + return ret; }
So much code duplication.
I have a patch ready that kills that and replaces it with a macro.
Wasn't there an idea to have a callback in the security manager that would be called before/after the labelling?
The problem is that security driver sees just virDomainDef while all the namespace stuff (e.g. pid, enabled namespace bitmap) lives in a domain object. Therefore security driver can't make such decision on its own. Thus transactions were born.
Having a chown callback that would enter the namespace and change ownership proved to be very suboptimal: entering a namespace requires a fork(). Therefore, instead of forking like crazy, a list of all the desired chown()-s is constructed, one fork() is called and then all the operations from the list are performed.
Since this is only for a disk and hostdev, let's leave it like this, I guess, but I'm expecting this much code duplication to bite us back in a while. None of my ideas for how to make this better are finalized, but I have some, if you want to discuss.
Sure. I'm up for making this better.
When thinking about them after a while none of them are very clean. Let's see how it looks with your macros for now.
Michal

After previous commit this has become redundant step. Also setting up devices in namespace and setting their label later on are two different steps and should be not done at once. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 112 +------------------------------------------------ 1 file changed, 2 insertions(+), 110 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index c67604222..0f45f753e 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -7565,58 +7565,6 @@ qemuDomainAttachDeviceMknodHelper(pid_t pid ATTRIBUTE_UNUSED, } #endif - switch ((virDomainDeviceType) data->devDef->type) { - case VIR_DOMAIN_DEVICE_DISK: { - virDomainDiskDefPtr def = data->devDef->data.disk; - char *tmpsrc = def->src->path; - def->src->path = (char *) data->file; - if (virSecurityManagerSetDiskLabel(data->driver->securityManager, - data->vm->def, def) < 0) { - def->src->path = tmpsrc; - goto cleanup; - } - def->src->path = tmpsrc; - } break; - - case VIR_DOMAIN_DEVICE_HOSTDEV: { - virDomainHostdevDefPtr def = data->devDef->data.hostdev; - if (virSecurityManagerSetHostdevLabel(data->driver->securityManager, - data->vm->def, def, NULL) < 0) - goto cleanup; - } break; - - case VIR_DOMAIN_DEVICE_CHR: - case VIR_DOMAIN_DEVICE_RNG: - /* No labelling. */ - break; - - case VIR_DOMAIN_DEVICE_NONE: - case VIR_DOMAIN_DEVICE_LEASE: - case VIR_DOMAIN_DEVICE_FS: - case VIR_DOMAIN_DEVICE_NET: - case VIR_DOMAIN_DEVICE_INPUT: - case VIR_DOMAIN_DEVICE_SOUND: - case VIR_DOMAIN_DEVICE_VIDEO: - case VIR_DOMAIN_DEVICE_WATCHDOG: - case VIR_DOMAIN_DEVICE_CONTROLLER: - case VIR_DOMAIN_DEVICE_GRAPHICS: - case VIR_DOMAIN_DEVICE_HUB: - case VIR_DOMAIN_DEVICE_REDIRDEV: - case VIR_DOMAIN_DEVICE_SMARTCARD: - case VIR_DOMAIN_DEVICE_MEMBALLOON: - case VIR_DOMAIN_DEVICE_NVRAM: - case VIR_DOMAIN_DEVICE_SHMEM: - case VIR_DOMAIN_DEVICE_TPM: - case VIR_DOMAIN_DEVICE_PANIC: - case VIR_DOMAIN_DEVICE_MEMORY: - case VIR_DOMAIN_DEVICE_IOMMU: - case VIR_DOMAIN_DEVICE_LAST: - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unexpected device type %d"), - data->devDef->type); - goto cleanup; - } - ret = 0; cleanup: if (ret < 0 && delDevice) @@ -7707,67 +7655,11 @@ qemuDomainDetachDeviceUnlinkHelper(pid_t pid ATTRIBUTE_UNUSED, static int -qemuDomainDetachDeviceUnlink(virQEMUDriverPtr driver, +qemuDomainDetachDeviceUnlink(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, virDomainObjPtr vm, - virDomainDeviceDefPtr dev, + virDomainDeviceDefPtr dev ATTRIBUTE_UNUSED, const char *file) { - /* Technically, this is not needed. Yet. But in the future - * security managers might do some reference counting over - * Set/Restore label and thus for every SetLabel() there - * should be corresponding RestoreLabel(). */ - switch ((virDomainDeviceType) dev->type) { - case VIR_DOMAIN_DEVICE_DISK: { - virDomainDiskDefPtr def = dev->data.disk; - char *tmpsrc = def->src->path; - def->src->path = (char *) file; - if (virSecurityManagerRestoreDiskLabel(driver->securityManager, - vm->def, def) < 0) { - def->src->path = tmpsrc; - return -1; - } - def->src->path = tmpsrc; - } break; - - case VIR_DOMAIN_DEVICE_HOSTDEV: { - virDomainHostdevDefPtr def = dev->data.hostdev; - if (virSecurityManagerRestoreHostdevLabel(driver->securityManager, - vm->def, def, NULL) < 0) - return -1; - } break; - - case VIR_DOMAIN_DEVICE_CHR: - case VIR_DOMAIN_DEVICE_RNG: - /* No labelling. */ - break; - - case VIR_DOMAIN_DEVICE_NONE: - case VIR_DOMAIN_DEVICE_LEASE: - case VIR_DOMAIN_DEVICE_FS: - case VIR_DOMAIN_DEVICE_NET: - case VIR_DOMAIN_DEVICE_INPUT: - case VIR_DOMAIN_DEVICE_SOUND: - case VIR_DOMAIN_DEVICE_VIDEO: - case VIR_DOMAIN_DEVICE_WATCHDOG: - case VIR_DOMAIN_DEVICE_CONTROLLER: - case VIR_DOMAIN_DEVICE_GRAPHICS: - case VIR_DOMAIN_DEVICE_HUB: - case VIR_DOMAIN_DEVICE_REDIRDEV: - case VIR_DOMAIN_DEVICE_SMARTCARD: - case VIR_DOMAIN_DEVICE_MEMBALLOON: - case VIR_DOMAIN_DEVICE_NVRAM: - case VIR_DOMAIN_DEVICE_SHMEM: - case VIR_DOMAIN_DEVICE_TPM: - case VIR_DOMAIN_DEVICE_PANIC: - case VIR_DOMAIN_DEVICE_MEMORY: - case VIR_DOMAIN_DEVICE_IOMMU: - case VIR_DOMAIN_DEVICE_LAST: - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unexpected device type %d"), - dev->type); - return -1; - } - if (virProcessRunInMountNamespace(vm->pid, qemuDomainDetachDeviceUnlinkHelper, (void *)file) < 0) -- 2.11.0

On Fri, Jan 20, 2017 at 10:42:46AM +0100, Michal Privoznik wrote:
After previous commit this has become redundant step. Also setting up devices in namespace and setting their label later on are two different steps and should be not done at once.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 112 +------------------------------------------------ 1 file changed, 2 insertions(+), 110 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index c67604222..0f45f753e 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -7707,67 +7655,11 @@ qemuDomainDetachDeviceUnlinkHelper(pid_t pid ATTRIBUTE_UNUSED,
static int -qemuDomainDetachDeviceUnlink(virQEMUDriverPtr driver, +qemuDomainDetachDeviceUnlink(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, virDomainObjPtr vm, - virDomainDeviceDefPtr dev, + virDomainDeviceDefPtr dev ATTRIBUTE_UNUSED,
Any reason for keeping these unused attributes here? Your call on removing them (from caller as well, of course). ACK either way.

On 02/07/2017 10:59 AM, Martin Kletzander wrote:
On Fri, Jan 20, 2017 at 10:42:46AM +0100, Michal Privoznik wrote:
After previous commit this has become redundant step. Also setting up devices in namespace and setting their label later on are two different steps and should be not done at once.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 112 +------------------------------------------------ 1 file changed, 2 insertions(+), 110 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index c67604222..0f45f753e 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -7707,67 +7655,11 @@ qemuDomainDetachDeviceUnlinkHelper(pid_t pid ATTRIBUTE_UNUSED,
static int -qemuDomainDetachDeviceUnlink(virQEMUDriverPtr driver, +qemuDomainDetachDeviceUnlink(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, virDomainObjPtr vm, - virDomainDeviceDefPtr dev, + virDomainDeviceDefPtr dev ATTRIBUTE_UNUSED,
Any reason for keeping these unused attributes here? Your call on removing them (from caller as well, of course). ACK either way.
As mentioned in previous e-mail, I have another patch set ready that goes on the top of this and removes couple of unused variables/arguments (mostly because I'm fixing another bug by calling qemuDomainNamespaceSetupDisk() from a place where just virStorageSourcePtr is available. So stay tuned :-) Michal

Imagine you have a disk with the following source set up: /dev/disk/by-uuid/$uuid (symlink to) -> /dev/sda After cbc45525cb21 the transitive end of the symlink chain is created (/dev/sda), but we need to create any item in chain too. Others might rely on that. In this case, /dev/disk/by-uuid/$uuid comes from domain XML thus it is this path that secdriver tries to relabel. Not the resolved one. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 150 +++++++++++++++++++++++++++++++++++-------------- 1 file changed, 108 insertions(+), 42 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 0f45f753e..8cbfb2d16 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -68,6 +68,7 @@ #endif #include <libxml/xpathInternals.h> +#include "dosname.h" #define VIR_FROM_THIS VIR_FROM_QEMU @@ -6958,70 +6959,135 @@ qemuDomainCreateDevice(const char *device, bool allow_noent) { char *devicePath = NULL; - char *canonDevicePath = NULL; + char *target = NULL; struct stat sb; int ret = -1; + bool isLink = false; + bool create = false; #ifdef WITH_SELINUX char *tcon = NULL; #endif - if (virFileResolveAllLinks(device, &canonDevicePath) < 0) { + if (lstat(device, &sb) < 0) { if (errno == ENOENT && allow_noent) { /* Ignore non-existent device. */ - ret = 0; - goto cleanup; + return 0; } - - virReportError(errno, _("Unable to canonicalize %s"), device); - goto cleanup; - } - - if (!STRPREFIX(canonDevicePath, DEVPREFIX)) { - ret = 0; - goto cleanup; + virReportSystemError(errno, _("Unable to stat %s"), device); + return ret; } - if (virAsprintf(&devicePath, "%s/%s", - path, canonDevicePath + strlen(DEVPREFIX)) < 0) - goto cleanup; + isLink = S_ISLNK(sb.st_mode); - if (stat(canonDevicePath, &sb) < 0) { - if (errno == ENOENT && allow_noent) { - /* Ignore non-existent device. */ - ret = 0; + /* Here, @device might be whatever path in the system. We + * should create the path in the namespace iff its "/dev" + * prefixed. However, if it is a symlink, we need to traverse + * it too (it might point to something in "/dev"). Just + * consider: + * + * /var/sym1 -> /var/sym2 -> /dev/sda (because users can) + * + * This means, "/var/sym1" is not created (it's shared with + * the parent namespace), nor "/var/sym2", but "/dev/sda". + * + * TODO Remove all `.' and `..' from the @device path. + * Otherwise we might get fooled with `/dev/../var/my_image'. + * For now, lets hope callers play nice. + */ + if (STRPREFIX(device, DEVPREFIX)) { + if (virAsprintf(&devicePath, "%s/%s", + path, device + strlen(DEVPREFIX)) < 0) goto cleanup; - } - - virReportSystemError(errno, _("Unable to stat %s"), canonDevicePath); - goto cleanup; - } - - if (virFileMakeParentPath(devicePath) < 0) { - virReportSystemError(errno, - _("Unable to create %s"), - devicePath); - goto cleanup; - } - if (mknod(devicePath, sb.st_mode, sb.st_rdev) < 0) { - if (errno == EEXIST) { - ret = 0; - } else { + if (virFileMakeParentPath(devicePath) < 0) { virReportSystemError(errno, - _("Failed to make device %s"), + _("Unable to create %s"), devicePath); + goto cleanup; } + create = true; + } + + if (isLink) { + /* We are dealing with a symlink. Create a dangling symlink and descend + * down one level which hopefully creates the symlink's target. */ + if (virFileReadLink(device, &target) < 0) { + virReportSystemError(errno, + _("unable to resolve symlink %s"), + device); + goto cleanup; + } + + if (create && + symlink(target, devicePath) < 0) { + if (errno == EEXIST) { + ret = 0; + } else { + virReportSystemError(errno, + _("unable to create symlink %s"), + devicePath); + } + goto cleanup; + } + + /* Tricky part. If the target starts with a slash then we need to take + * it as it is. Otherwise we need to replace the last component in the + * original path with the link target: + * /dev/rtc -> rtc0 (want /dev/rtc0) + * /dev/disk/by-id/ata-SanDisk_SDSSDXPS480G_161101402485 -> ../../sda + * (want /dev/disk/by-id/../../sda) + * /dev/stdout -> /proc/self/fd/1 (no change needed) + */ + if (IS_RELATIVE_FILE_NAME(target)) { + char *c = NULL, *tmp = NULL, *devTmp = NULL; + + if (VIR_STRDUP(devTmp, device) < 0) + goto cleanup; + + if ((c = strrchr(devTmp, '/'))) + *(c + 1) = '\0'; + + if (virAsprintf(&tmp, "%s%s", devTmp, target) < 0) { + VIR_FREE(devTmp); + goto cleanup; + } + VIR_FREE(devTmp); + VIR_FREE(target); + target = tmp; + tmp = NULL; + } + + if (qemuDomainCreateDevice(target, path, allow_noent) < 0) + goto cleanup; + } else { + if (create && + mknod(devicePath, sb.st_mode, sb.st_rdev) < 0) { + if (errno == EEXIST) { + ret = 0; + } else { + virReportSystemError(errno, + _("Failed to make device %s"), + devicePath); + } + goto cleanup; + } + } + + if (!create) { + ret = 0; goto cleanup; } - if (chown(devicePath, sb.st_uid, sb.st_gid) < 0) { + if (lchown(devicePath, sb.st_uid, sb.st_gid) < 0) { virReportSystemError(errno, _("Failed to chown device %s"), devicePath); goto cleanup; } - if (virFileCopyACLs(canonDevicePath, devicePath) < 0 && + /* Symlinks don't have ACLs. */ + if (!isLink && + virFileCopyACLs(device, devicePath) < 0 && errno != ENOTSUP) { virReportSystemError(errno, _("Failed to copy ACLs on device %s"), @@ -7030,16 +7096,16 @@ qemuDomainCreateDevice(const char *device, } #ifdef WITH_SELINUX - if (getfilecon_raw(canonDevicePath, &tcon) < 0 && + if (lgetfilecon_raw(device, &tcon) < 0 && (errno != ENOTSUP && errno != ENODATA)) { virReportSystemError(errno, _("Unable to get SELinux label from %s"), - canonDevicePath); + device); goto cleanup; } if (tcon && - setfilecon_raw(devicePath, (VIR_SELINUX_CTX_CONST char *) tcon) < 0) { + lsetfilecon_raw(devicePath, (VIR_SELINUX_CTX_CONST char *) tcon) < 0) { VIR_WARNINGS_NO_WLOGICALOP_EQUAL_EXPR if (errno != EOPNOTSUPP && errno != ENOTSUP) { VIR_WARNINGS_RESET @@ -7053,7 +7119,7 @@ qemuDomainCreateDevice(const char *device, ret = 0; cleanup: - VIR_FREE(canonDevicePath); + VIR_FREE(target); VIR_FREE(devicePath); #ifdef WITH_SELINUX freecon(tcon); -- 2.11.0

On Fri, Jan 20, 2017 at 10:42:47AM +0100, Michal Privoznik wrote:
Imagine you have a disk with the following source set up:
/dev/disk/by-uuid/$uuid (symlink to) -> /dev/sda
After cbc45525cb21 the transitive end of the symlink chain is created (/dev/sda), but we need to create any item in chain too. Others might rely on that. In this case, /dev/disk/by-uuid/$uuid comes from domain XML thus it is this path that secdriver tries to relabel. Not the resolved one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 150 +++++++++++++++++++++++++++++++++++-------------- 1 file changed, 108 insertions(+), 42 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 0f45f753e..8cbfb2d16 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -68,6 +68,7 @@ #endif
#include <libxml/xpathInternals.h> +#include "dosname.h"
#define VIR_FROM_THIS VIR_FROM_QEMU
@@ -6958,70 +6959,135 @@ qemuDomainCreateDevice(const char *device, bool allow_noent) { char *devicePath = NULL; - char *canonDevicePath = NULL; + char *target = NULL; struct stat sb; int ret = -1; + bool isLink = false; + bool create = false; #ifdef WITH_SELINUX char *tcon = NULL; #endif
- if (virFileResolveAllLinks(device, &canonDevicePath) < 0) { + if (lstat(device, &sb) < 0) { if (errno == ENOENT && allow_noent) { /* Ignore non-existent device. */ - ret = 0; - goto cleanup; + return 0; } - - virReportError(errno, _("Unable to canonicalize %s"), device); - goto cleanup; - } - - if (!STRPREFIX(canonDevicePath, DEVPREFIX)) { - ret = 0; - goto cleanup; + virReportSystemError(errno, _("Unable to stat %s"), device); + return ret; }
- if (virAsprintf(&devicePath, "%s/%s", - path, canonDevicePath + strlen(DEVPREFIX)) < 0) - goto cleanup; + isLink = S_ISLNK(sb.st_mode);
- if (stat(canonDevicePath, &sb) < 0) { - if (errno == ENOENT && allow_noent) { - /* Ignore non-existent device. */ - ret = 0; + /* Here, @device might be whatever path in the system. We + * should create the path in the namespace iff its "/dev"
s/its/it's/ ACK

When working with symlinks it is fairly easy to get into a loop. Don't. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 28 ++++++++++++++++++++++++---- 1 file changed, 24 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 8cbfb2d16..448583313 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -6954,9 +6954,10 @@ qemuDomainGetPreservedMounts(virQEMUDriverPtr driver, static int -qemuDomainCreateDevice(const char *device, - const char *path, - bool allow_noent) +qemuDomainCreateDeviceRecursive(const char *device, + const char *path, + bool allow_noent, + unsigned int ttl) { char *devicePath = NULL; char *target = NULL; @@ -6968,6 +6969,13 @@ qemuDomainCreateDevice(const char *device, char *tcon = NULL; #endif + if (!ttl) { + virReportSystemError(ELOOP, + _("Too many levels of symbolic links: %s"), + device); + return ret; + } + if (lstat(device, &sb) < 0) { if (errno == ENOENT && allow_noent) { /* Ignore non-existent device. */ @@ -7057,7 +7065,8 @@ qemuDomainCreateDevice(const char *device, tmp = NULL; } - if (qemuDomainCreateDevice(target, path, allow_noent) < 0) + if (qemuDomainCreateDeviceRecursive(target, path, + allow_noent, ttl - 1) < 0) goto cleanup; } else { if (create && @@ -7128,6 +7137,17 @@ qemuDomainCreateDevice(const char *device, } +static int +qemuDomainCreateDevice(const char *device, + const char *path, + bool allow_noent) +{ + long symloop_max = sysconf(_SC_SYMLOOP_MAX); + + return qemuDomainCreateDeviceRecursive(device, path, + allow_noent, symloop_max); +} + static int qemuDomainPopulateDevices(virQEMUDriverPtr driver, -- 2.11.0

On Fri, Jan 20, 2017 at 10:42:48AM +0100, Michal Privoznik wrote:
When working with symlinks it is fairly easy to get into a loop. Don't.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 28 ++++++++++++++++++++++++---- 1 file changed, 24 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 8cbfb2d16..448583313 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -6954,9 +6954,10 @@ qemuDomainGetPreservedMounts(virQEMUDriverPtr driver,
static int -qemuDomainCreateDevice(const char *device, - const char *path, - bool allow_noent) +qemuDomainCreateDeviceRecursive(const char *device, + const char *path, + bool allow_noent, + unsigned int ttl) { char *devicePath = NULL; char *target = NULL; @@ -6968,6 +6969,13 @@ qemuDomainCreateDevice(const char *device, char *tcon = NULL; #endif
+ if (!ttl) { + virReportSystemError(ELOOP, + _("Too many levels of symbolic links: %s"), + device); + return ret; + } + if (lstat(device, &sb) < 0) { if (errno == ENOENT && allow_noent) { /* Ignore non-existent device. */ @@ -7057,7 +7065,8 @@ qemuDomainCreateDevice(const char *device, tmp = NULL; }
- if (qemuDomainCreateDevice(target, path, allow_noent) < 0) + if (qemuDomainCreateDeviceRecursive(target, path, + allow_noent, ttl - 1) < 0) goto cleanup; } else { if (create && @@ -7128,6 +7137,17 @@ qemuDomainCreateDevice(const char *device, }
+static int +qemuDomainCreateDevice(const char *device, + const char *path, + bool allow_noent) +{ + long symloop_max = sysconf(_SC_SYMLOOP_MAX); + + return qemuDomainCreateDeviceRecursive(device, path, + allow_noent, symloop_max);
So you are taking 'long' that can, officially, be '-1' and pass it to a function as unsigned int. Having a maximum is nice, I have no idea why on my system there is no limit apparently (sysconf() returns -1 with no errno set), but if the system doesn't report any (like my case above) it effectively makes the SYMLOOP_MAX = UINT_MAX in this codepath. Should we avoid that or just let it be? This way it's still better than unlimited, so ACK to this version, but I just wanted a (short) discussion about this.
+} +
static int qemuDomainPopulateDevices(virQEMUDriverPtr driver, -- 2.11.0
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 02/07/2017 11:34 AM, Martin Kletzander wrote:
On Fri, Jan 20, 2017 at 10:42:48AM +0100, Michal Privoznik wrote:
When working with symlinks it is fairly easy to get into a loop. Don't.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 28 ++++++++++++++++++++++++---- 1 file changed, 24 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 8cbfb2d16..448583313 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -6954,9 +6954,10 @@ qemuDomainGetPreservedMounts(virQEMUDriverPtr driver,
static int -qemuDomainCreateDevice(const char *device, - const char *path, - bool allow_noent) +qemuDomainCreateDeviceRecursive(const char *device, + const char *path, + bool allow_noent, + unsigned int ttl) { char *devicePath = NULL; char *target = NULL; @@ -6968,6 +6969,13 @@ qemuDomainCreateDevice(const char *device, char *tcon = NULL; #endif
+ if (!ttl) { + virReportSystemError(ELOOP, + _("Too many levels of symbolic links: %s"), + device); + return ret; + } + if (lstat(device, &sb) < 0) { if (errno == ENOENT && allow_noent) { /* Ignore non-existent device. */ @@ -7057,7 +7065,8 @@ qemuDomainCreateDevice(const char *device, tmp = NULL; }
- if (qemuDomainCreateDevice(target, path, allow_noent) < 0) + if (qemuDomainCreateDeviceRecursive(target, path, + allow_noent, ttl - 1) < 0) goto cleanup; } else { if (create && @@ -7128,6 +7137,17 @@ qemuDomainCreateDevice(const char *device, }
+static int +qemuDomainCreateDevice(const char *device, + const char *path, + bool allow_noent) +{ + long symloop_max = sysconf(_SC_SYMLOOP_MAX); + + return qemuDomainCreateDeviceRecursive(device, path, + allow_noent, symloop_max);
So you are taking 'long' that can, officially, be '-1' and pass it to a function as unsigned int. Having a maximum is nice, I have no idea why on my system there is no limit apparently (sysconf() returns -1 with no errno set), but if the system doesn't report any (like my case above) it effectively makes the SYMLOOP_MAX = UINT_MAX in this codepath. Should we avoid that or just let it be? This way it's still better than unlimited, so ACK to this version, but I just wanted a (short) discussion about this.
Sure. I think it's safe to assume if the level of symlinks reaches UINT_MAX your system is terribly broken. Michal

Similarly to one of the previous commits, we need to deal properly with symlinks in hotplug case too. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 120 ++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 94 insertions(+), 26 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 448583313..bcfb2446f 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -7590,6 +7590,7 @@ struct qemuDomainAttachDeviceMknodData { virDomainObjPtr vm; virDomainDeviceDefPtr devDef; const char *file; + const char *target; struct stat sb; void *acl; #ifdef WITH_SELINUX @@ -7605,6 +7606,7 @@ qemuDomainAttachDeviceMknodHelper(pid_t pid ATTRIBUTE_UNUSED, struct qemuDomainAttachDeviceMknodData *data = opaque; int ret = -1; bool delDevice = false; + bool isLink = S_ISLNK(data->sb.st_mode); virSecurityManagerPostFork(data->driver->securityManager); @@ -7614,24 +7616,47 @@ qemuDomainAttachDeviceMknodHelper(pid_t pid ATTRIBUTE_UNUSED, goto cleanup; } - VIR_DEBUG("Creating dev %s (%d,%d)", - data->file, major(data->sb.st_rdev), minor(data->sb.st_rdev)); - if (mknod(data->file, data->sb.st_mode, data->sb.st_rdev) < 0) { - /* Because we are not removing devices on hotunplug, or - * we might be creating part of backing chain that - * already exist due to a different disk plugged to - * domain, accept EEXIST. */ - if (errno != EEXIST) { - virReportSystemError(errno, - _("Unable to create device %s"), - data->file); - goto cleanup; + if (isLink) { + VIR_DEBUG("Creating symlink %s -> %s", data->file, data->target); + if (symlink(data->target, data->file) < 0) { + if (errno != EEXIST) { + virReportSystemError(errno, + _("Unable to create symlink %s"), + data->target); + goto cleanup; + } + } else { + delDevice = true; } } else { - delDevice = true; + VIR_DEBUG("Creating dev %s (%d,%d)", + data->file, major(data->sb.st_rdev), minor(data->sb.st_rdev)); + if (mknod(data->file, data->sb.st_mode, data->sb.st_rdev) < 0) { + /* Because we are not removing devices on hotunplug, or + * we might be creating part of backing chain that + * already exist due to a different disk plugged to + * domain, accept EEXIST. */ + if (errno != EEXIST) { + virReportSystemError(errno, + _("Unable to create device %s"), + data->file); + goto cleanup; + } + } else { + delDevice = true; + } } - if (virFileSetACLs(data->file, data->acl) < 0 && + if (lchown(data->file, data->sb.st_uid, data->sb.st_gid) < 0) { + virReportSystemError(errno, + _("Failed to chown device %s"), + data->file); + goto cleanup; + } + + /* Symlinks don't have ACLs. */ + if (!isLink && + virFileSetACLs(data->file, data->acl) < 0 && errno != ENOTSUP) { virReportSystemError(errno, _("Unable to set ACLs on %s"), data->file); @@ -7639,7 +7664,8 @@ qemuDomainAttachDeviceMknodHelper(pid_t pid ATTRIBUTE_UNUSED, } #ifdef WITH_SELINUX - if (setfilecon_raw(data->file, (VIR_SELINUX_CTX_CONST char *) data->tcon) < 0) { + if (data->tcon && + lsetfilecon_raw(data->file, (VIR_SELINUX_CTX_CONST char *) data->tcon) < 0) { VIR_WARNINGS_NO_WLOGICALOP_EQUAL_EXPR if (errno != EOPNOTSUPP && errno != ENOTSUP) { VIR_WARNINGS_RESET @@ -7671,6 +7697,8 @@ qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver, { struct qemuDomainAttachDeviceMknodData data; int ret = -1; + char *target = NULL; + bool isLink; memset(&data, 0, sizeof(data)); @@ -7679,21 +7707,55 @@ qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver, data.devDef = devDef; data.file = file; - if (stat(file, &data.sb) < 0) { + if (lstat(file, &data.sb) < 0) { virReportSystemError(errno, _("Unable to access %s"), file); return ret; } - if (virFileGetACLs(file, &data.acl) < 0 && + isLink = S_ISLNK(data.sb.st_mode); + + if (isLink) { + if (virFileReadLink(file, &target) < 0) { + virReportSystemError(errno, + _("unable to resolve symlink %s"), + file); + return ret; + } + + if (IS_RELATIVE_FILE_NAME(target)) { + char *c = NULL, *tmp = NULL, *fileTmp = NULL; + + if (VIR_STRDUP(fileTmp, file) < 0) + goto cleanup; + + if ((c = strrchr(fileTmp, '/'))) + *(c + 1) = '\0'; + + if (virAsprintf(&tmp, "%s%s", fileTmp, target) < 0) { + VIR_FREE(fileTmp); + goto cleanup; + } + VIR_FREE(fileTmp); + VIR_FREE(target); + target = tmp; + tmp = NULL; + } + + data.target = target; + } + + /* Symlinks don't have ACLs. */ + if (!isLink && + virFileGetACLs(file, &data.acl) < 0 && errno != ENOTSUP) { virReportSystemError(errno, _("Unable to get ACLs on %s"), file); - return ret; + goto cleanup; } #ifdef WITH_SELINUX - if (getfilecon_raw(file, &data.tcon) < 0 && + if (lgetfilecon_raw(file, &data.tcon) < 0 && (errno != ENOTSUP && errno != ENODATA)) { virReportSystemError(errno, _("Unable to get SELinux label from %s"), file); @@ -7701,17 +7763,22 @@ qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver, } #endif - if (virSecurityManagerPreFork(driver->securityManager) < 0) - goto cleanup; + if (STRPREFIX(file, DEVPREFIX)) { + if (virSecurityManagerPreFork(driver->securityManager) < 0) + goto cleanup; - if (virProcessRunInMountNamespace(vm->pid, - qemuDomainAttachDeviceMknodHelper, - &data) < 0) { + if (virProcessRunInMountNamespace(vm->pid, + qemuDomainAttachDeviceMknodHelper, + &data) < 0) { + virSecurityManagerPostFork(driver->securityManager); + goto cleanup; + } virSecurityManagerPostFork(driver->securityManager); - goto cleanup; } - virSecurityManagerPostFork(driver->securityManager); + if (isLink && + qemuDomainAttachDeviceMknod(driver, vm, devDef, target) < 0) + goto cleanup; ret = 0; cleanup: @@ -7719,6 +7786,7 @@ qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver, freecon(data.tcon); #endif virFileFreeACLs(&data.acl); + VIR_FREE(target); return ret; } -- 2.11.0

On Fri, Jan 20, 2017 at 10:42:49AM +0100, Michal Privoznik wrote:
Similarly to one of the previous commits, we need to deal properly with symlinks in hotplug case too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 120 ++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 94 insertions(+), 26 deletions(-)
ACK to this, but ...
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 448583313..bcfb2446f 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -7701,17 +7763,22 @@ qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver, } #endif
- if (virSecurityManagerPreFork(driver->securityManager) < 0) - goto cleanup; + if (STRPREFIX(file, DEVPREFIX)) { + if (virSecurityManagerPreFork(driver->securityManager) < 0) + goto cleanup;
- if (virProcessRunInMountNamespace(vm->pid, - qemuDomainAttachDeviceMknodHelper, - &data) < 0) { + if (virProcessRunInMountNamespace(vm->pid, + qemuDomainAttachDeviceMknodHelper, + &data) < 0) {
... I'm sure you have patches for this somewhere that are not posted or something =D However now we actually fork for every level of the symlink. Even when everyone is scared of every single fork(). Can't we use transactions for this as well? If not, could we enhance them so that we can use them?
+ virSecurityManagerPostFork(driver->securityManager); + goto cleanup; + } virSecurityManagerPostFork(driver->securityManager); - goto cleanup; }
- virSecurityManagerPostFork(driver->securityManager); + if (isLink && + qemuDomainAttachDeviceMknod(driver, vm, devDef, target) < 0) + goto cleanup;
ret = 0; cleanup:

On 02/07/2017 11:57 AM, Martin Kletzander wrote:
On Fri, Jan 20, 2017 at 10:42:49AM +0100, Michal Privoznik wrote:
Similarly to one of the previous commits, we need to deal properly with symlinks in hotplug case too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 120 ++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 94 insertions(+), 26 deletions(-)
ACK to this, but ...
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 448583313..bcfb2446f 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -7701,17 +7763,22 @@ qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver, } #endif
- if (virSecurityManagerPreFork(driver->securityManager) < 0) - goto cleanup; + if (STRPREFIX(file, DEVPREFIX)) { + if (virSecurityManagerPreFork(driver->securityManager) < 0) + goto cleanup;
- if (virProcessRunInMountNamespace(vm->pid, - qemuDomainAttachDeviceMknodHelper, - &data) < 0) { + if (virProcessRunInMountNamespace(vm->pid, + qemuDomainAttachDeviceMknodHelper, + &data) < 0) {
... I'm sure you have patches for this somewhere that are not posted or something =D However now we actually fork for every level of the symlink. Even when everyone is scared of every single fork(). Can't we use transactions for this as well? If not, could we enhance them so that we can use them?
Transactions are security driver specific. But we can imitate them here too. Instead of direct fork() we would have a list to which we append all the symlinks we want to create and then fork() once and execute the list. Good point. I will work on that. Michal

When working with symlinks it is fairly easy to get into a loop. Don't. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 32 +++++++++++++++++++++++++++----- 1 file changed, 27 insertions(+), 5 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index bcfb2446f..db01bd230 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -7690,16 +7690,24 @@ qemuDomainAttachDeviceMknodHelper(pid_t pid ATTRIBUTE_UNUSED, static int -qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver, - virDomainObjPtr vm, - virDomainDeviceDefPtr devDef, - const char *file) +qemuDomainAttachDeviceMknodRecursive(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainDeviceDefPtr devDef, + const char *file, + unsigned int ttl) { struct qemuDomainAttachDeviceMknodData data; int ret = -1; char *target = NULL; bool isLink; + if (!ttl) { + virReportSystemError(ELOOP, + _("Too many levels of symbolic links: %s"), + file); + return ret; + } + memset(&data, 0, sizeof(data)); data.driver = driver; @@ -7777,7 +7785,8 @@ qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver, } if (isLink && - qemuDomainAttachDeviceMknod(driver, vm, devDef, target) < 0) + qemuDomainAttachDeviceMknodRecursive(driver, vm, devDef, + target, ttl -1) < 0) goto cleanup; ret = 0; @@ -7791,6 +7800,19 @@ qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver, } +static int +qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainDeviceDefPtr devDef, + const char *file) +{ + long symloop_max = sysconf(_SC_SYMLOOP_MAX); + + return qemuDomainAttachDeviceMknodRecursive(driver, vm, devDef, + file, symloop_max); +} + + static int qemuDomainDetachDeviceUnlinkHelper(pid_t pid ATTRIBUTE_UNUSED, void *opaque) -- 2.11.0

On Fri, Jan 20, 2017 at 10:42:50AM +0100, Michal Privoznik wrote:
When working with symlinks it is fairly easy to get into a loop. Don't.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 32 +++++++++++++++++++++++++++----- 1 file changed, 27 insertions(+), 5 deletions(-)
ACK, same worries as before, but no better soluton exists in my mind.

On 01/20/2017 10:42 AM, Michal Privoznik wrote:
The major problem was with symlinks. Imagine the following chain of symlinks:
/dev/my_awesome_disk -> /home/user/blaah -> /dev/disk/by-uuid/$uuid -> /dev/sda
We really need to create all those /dev/* symlinks and /dev/sda device. Also, some other (less critical) bugs are fixed.
Michal Privoznik (10): virProcessRunInMountNamespace: Report errors from child util: Introduce virFileReadLink qemuDomainPrepareDisk: Fix ordering qemuSecurityRestoreAllLabel: Don't use transactions qemu_security: Use more transactions qemuDomain{Attach,Detach}Device NS helpers: Don't relabel devices qemuDomainCreateDevice: Properly deal with symlinks qemuDomainCreateDevice: Don't loop endlessly qemuDomainAttachDeviceMknod: Deal with symlinks qemuDomainAttachDeviceMknod: Don't loop endlessly
src/libvirt_private.syms | 1 + src/qemu/qemu_domain.c | 438 +++++++++++++++++++++++++++-------------------- src/qemu/qemu_hotplug.c | 20 +-- src/qemu/qemu_security.c | 137 +++++++++------ src/util/virfile.c | 12 ++ src/util/virfile.h | 2 + src/util/virprocess.c | 8 +- 7 files changed, 374 insertions(+), 244 deletions(-)
_____ _ | __ (_) | |__) | _ __ __ _ | ___/ | '_ \ / _` | | | | | | | | (_| | |_| |_|_| |_|\__, | __/ | |___/ Michal

On Mon, Jan 30, 2017 at 11:15 AM +0100, Michal Privoznik <mprivozn@redhat.com> wrote:
On 01/20/2017 10:42 AM, Michal Privoznik wrote:
The major problem was with symlinks. Imagine the following chain of symlinks:
/dev/my_awesome_disk -> /home/user/blaah -> /dev/disk/by-uuid/$uuid -> /dev/sda
We really need to create all those /dev/* symlinks and /dev/sda device. Also, some other (less critical) bugs are fixed.
Michal Privoznik (10): virProcessRunInMountNamespace: Report errors from child util: Introduce virFileReadLink qemuDomainPrepareDisk: Fix ordering qemuSecurityRestoreAllLabel: Don't use transactions qemu_security: Use more transactions qemuDomain{Attach,Detach}Device NS helpers: Don't relabel devices qemuDomainCreateDevice: Properly deal with symlinks qemuDomainCreateDevice: Don't loop endlessly qemuDomainAttachDeviceMknod: Deal with symlinks qemuDomainAttachDeviceMknod: Don't loop endlessly
src/libvirt_private.syms | 1 + src/qemu/qemu_domain.c | 438 +++++++++++++++++++++++++++-------------------- src/qemu/qemu_hotplug.c | 20 +-- src/qemu/qemu_security.c | 137 +++++++++------ src/util/virfile.c | 12 ++ src/util/virfile.h | 2 + src/util/virprocess.c | 8 +- 7 files changed, 374 insertions(+), 244 deletions(-)
_____ _ | __ (_) | |__) | _ __ __ _ | ___/ | '_ \ / _` | | | | | | | | (_| | |_| |_|_| |_|\__, | __/ | |___/
Michal
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Tested the patch series and the problems with v3.0.0 - starting domains with hostdev devices - hot plugging seem to be fixed. -- Beste Grüße / Kind regards Marc Hartmayer IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On 01/20/2017 10:42 AM, Michal Privoznik wrote:
The major problem was with symlinks. Imagine the following chain of symlinks:
/dev/my_awesome_disk -> /home/user/blaah -> /dev/disk/by-uuid/$uuid -> /dev/sda
We really need to create all those /dev/* symlinks and /dev/sda device. Also, some other (less critical) bugs are fixed.
Michal Privoznik (10): virProcessRunInMountNamespace: Report errors from child util: Introduce virFileReadLink qemuDomainPrepareDisk: Fix ordering qemuSecurityRestoreAllLabel: Don't use transactions qemu_security: Use more transactions qemuDomain{Attach,Detach}Device NS helpers: Don't relabel devices qemuDomainCreateDevice: Properly deal with symlinks qemuDomainCreateDevice: Don't loop endlessly qemuDomainAttachDeviceMknod: Deal with symlinks qemuDomainAttachDeviceMknod: Don't loop endlessly
src/libvirt_private.syms | 1 + src/qemu/qemu_domain.c | 438 +++++++++++++++++++++++++++-------------------- src/qemu/qemu_hotplug.c | 20 +-- src/qemu/qemu_security.c | 137 +++++++++------ src/util/virfile.c | 12 ++ src/util/virfile.h | 2 + src/util/virprocess.c | 8 +- 7 files changed, 374 insertions(+), 244 deletions(-)
Thanks Martin for the review. I've pushed these. Michal
participants (4)
-
Daniel P. Berrange
-
Marc Hartmayer
-
Martin Kletzander
-
Michal Privoznik