On Thu, Apr 04, 2013 at 11:56:29AM -0600, Eric Blake wrote:
On 04/04/2013 07:40 AM, Daniel P. Berrange wrote:
> From: "Daniel P. Berrange" <berrange(a)redhat.com>
>
> Introduce a method virFileDeleteTree for recursively deleting
> an entire directory tree
>
> Signed-off-by: Daniel P. Berrange <berrange(a)redhat.com>
> ---
> src/libvirt_private.syms | 1 +
> src/util/virfile.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++
> src/util/virfile.h | 2 ++
> 3 files changed, 81 insertions(+)
Don't we already have something like this?
/me goes and looks...
yep, very similar to virCgroupRemoveRecursively - hopefully a later
patch drops that function to use this instead. I like the idea of a
generalized interface.
NB virCgroupRemoveRecursively is a little special - with the cgroups
filesystem, you never have to actually delete any of the files in
the directories. You delete the directory & every file in it magically
goes away too (assuming there are no sub-directories or tasks present).
> + errno = 0;
> + while ((de = readdir(dh)) != NULL) {
> + struct stat sb;
> +
> + if (STREQ(de->d_name, ".") ||
> + STREQ(de->d_name, ".."))
> + continue;
> +
> + if (virAsprintf(&filepath, "%s/%s",
> + dir, de->d_name) < 0) {
> + virReportOOMError();
> + goto cleanup;
> + }
We should use gnulib's LGPL unlinkat. On capable kernels, it avoids
O(n^2) behavior that is inherent in computing filenames in a deep
hierarchy. On less capable kernels, it still makes this code simpler to
write (no virAsprintf needed here).
Yep, I wondered about unlinkat(), then then wondered about portability
forgetting gnulib might help.
> +
> + if (lstat(filepath, &sb) < 0) {
> + virReportSystemError(errno, _("Cannot access '%s'"),
> + filepath);
> + goto cleanup;
> + }
Potentially wasteful on systems like Linux that have d_type. If d_type
exists, and is not DT_UNKNOWN, then the difference between DT_DIR and
other types can save some system call efforts.
Potential TOCTTOU race. POSIX allows unlink("dir") to succeed (although
most platforms reject it, either always [Linux], or based on
capabilities [Solaris, which also has code to give up that capability,
and where gnulib also exposes that]). If we ever manage to unlink() a
directory, because we lost the TOCTTOU race, then we have done bad
things to the file system.
But you are guaranteed that rmdir() on a non-directory will gracefully
fail; so you can minimize the race window by attempting to treat _every_
name as a directory first, then gracefully fall back to unlink() if the
opendir() fails with ENOTDIR, without ever having to waste time
lstat()ing things. Hmm, except that you don't want to follow symlinks,
but opendir() follows them by default; so you would have to use
open(O_DIRECTORY)/fdopendir() instead of raw opendir().
> +
> + if (S_ISDIR(sb.st_mode)) {
> + if (virFileDeleteTree(filepath) < 0)
> + goto cleanup;
> + } else {
> + if (unlink(filepath) < 0 && errno != ENOENT) {
> + virReportSystemError(errno,
> + _("Cannot delete file
'%s'"),
> + filepath);
> + goto cleanup;
What happens on files that have restrictive permissions? Do we need to
worry about chmod()ing files (or containing directories) to give
ourselves enough access so that we can then turn around and unlink()
files regardless of restrictive permissions?
> + }
> + }
> +
> + VIR_FREE(filepath);
> + errno = 0;
> + }
> +
> + if (errno) {
> + virReportSystemError(errno, _("Cannot read dir '%s'"),
> + dir);
> + goto cleanup;
> + }
> +
> + if (rmdir(dir) < 0 && errno != ENOENT) {
> + virReportSystemError(errno,
> + _("Cannot delete directory '%s'"),
> + dir);
> + goto cleanup;
> + }
What you have works, even if it is inherently quadratic compared to
using unlinkat(). What sort of trees do we envision deleting? Do we
need to start worrying about performance, using lessons learned from GNU
coreutils? Or are the trees small enough, with properties of never
being too-restrictive in file mode, and where the trees we are deleting
are unlikely to be hit by a malicious user exploiting a TOCTTOU race,
that we can just stick with this implementation as-is?
> +
> + ret = 0;
> +
> +cleanup:
> + VIR_FREE(filepath);
> + closedir(dh);
> + return ret;
> +}
> diff --git a/src/util/virfile.h b/src/util/virfile.h
> index c885b73..5f0dd2b 100644
> --- a/src/util/virfile.h
> +++ b/src/util/virfile.h
> @@ -108,4 +108,6 @@ int virFileUpdatePerm(const char *path,
> int virFileLoopDeviceAssociate(const char *file,
> char **dev);
>
> +int virFileDeleteTree(const char *dir);
> +
> #endif /* __VIR_FILES_H */
Very weak ACK, depending on what you answer to my commentary.
I guess the thing I should point out here is that I wasn't writing this
to deal with a hostile environment. This function is actually only used
by the testsuite I add in a later patch. It won't actully be used by
libvirtd / libvirt.so. So I wasn't too bothered about perfection, just
something good enough for the immediate need.
I guess the problem is that once we have this function, someone else is
bound to use it elsewhere, where upon my laziness might cause a problem.
I'll make a bunch of the changes you suggest & re-post for further
discussion.
Daniel
--
|:
http://berrange.com -o-
http://www.flickr.com/photos/dberrange/ :|
|:
http://libvirt.org -o-
http://virt-manager.org :|
|:
http://autobuild.org -o-
http://search.cpan.org/~danberr/ :|
|:
http://entangle-photo.org -o-
http://live.gnome.org/gtk-vnc :|