On 19/06/13 21:44, John Ferlan wrote:
On 06/07/2013 01:03 PM, Osier Yang wrote:
> There is POSIX calls to walk through direcotry tree, nftw(3), but
s/is/are
s/direcotry/directory
s/tree/trees or s/tree/a tree/
hm... I reviewed it times before posting it. :(
> there is no way to allow one to pass user data to the callback:
>
> int nftw(const char *dirpath,
> int (*fn) (const char *fpath, const struct stat *sb,
> int typeflag, struct FTW *ftwbuf),
> int nopenfd, int flags);
>
> Assuming a simple requirement: one wants to simply find out all the
> files which have name "vendor" and it has a specific value, under a
> directory, It's not possible to achieve with nftw(3). To solve the
> requirements like this, this new util function comes.
Not sure the last sentence is necessary
> int virTraverseDirectory(const char *dirpath,
> int depth,
> unsigned int flags,
> virTraverseDirectoryCallback cb,
> virTraverseDirectorySkipCallback skipcb,
> void *opaque,
> char ***filepaths);
>
> * Which allow to traverse a directory limited to specified @depth
> (relative to the @dirpath)
>
> * Two callbacks are provided, callback @cb is to handle the file path
> passed to it, with the user data @opaque; callback @skipcb is to
> allow one to make the rules to skip the passed file path. For example,
> one can define regex patterns in @skipcb to skip all the file path
> which matches it.
>
> * @cb should return 0 to indicate the file path is successfully
> handled, otherwise -1 should be returned.
>
> * Like @cb, @skipcb also should return 0 to indiate the file path
> will be skipped to handle, otherwise -1 should be returned.
>
> * If passed @filepath is not NULL, it will be filled with the file
> paths which are successfully handled by @cb.
What use would a NULL filepath be? The whole purpose of the code is to
generate a list of files with a specific pattern. The end result may be
an empty list, but passing NULL
I think there are some use cases won't care about the file paths it
processed, it's a future consideration though, I don't use it in that
way currently.
> * @flags can be used to control the general behaviour of the
> traversing. E.g. Don't follow symbol links.
>
> A simple example:
>
> /*
> * utiltest.c: Prints all the file paths whose basename is
> * "vendor" under directory "./testdir".
> */
> static int
> traverseCallback(const char *path,
> void *opaque)
> {
> const char *name = opaque;
> char *p = NULL;
>
> if ((p = strrchr(path, '/')))
> p++;
>
> if (STRNEQ(p, name))
> return -1;
>
> return 0;
> }
>
> static int
> findVendor(void)
> {
> char **filepaths = NULL;
> unsigned int flags = 0;
> const char *name = "vendor";
> int n;
> int i;
>
> flags |= VIR_TRAVERSE_DIRECTORY_FALL_THROUGH |
> VIR_TRAVERSE_DIRECTORY_IGNORE_HIDDEN_FILES ;
>
> if ((n = virTraverseDirectory("./testdir", 2, flags,
> traverseCallback, NULL,
> (void *)name, &filepaths)) < 0)
> return -1;
>
> for (i = 0; i < n; i++)
> printf("Match%d: %s\n", i, filepaths[i]);
>
> for (i = 0; i < n; i++)
> VIR_FREE(filepaths[i]);
> VIR_FREE(filepaths);
>
> return 0;
> }
>
> Tree view of "./testdir":
> % tree -a ./testdir/
> ./testdir/
> |-- device
> |-- dir1
> | |-- device
> | |-- dir2
> | | |-- device
> | | |-- dir3
> | | | `-- vendor
> | | `-- vendor
> | |-- .dir7
> | | `-- vendor
> | |-- dir8 -> ../dir4
> | `-- vendor
> |-- dir4
> | |-- device
> | |-- dir5
> | | |-- device
> | | `-- vendor
> | `-- vendor
> `-- vendor
>
> 7 directories, 12 files
>
> % ./utiltest
> Match0: ./testdir/dir1/dir2/vendor
> Match1: ./testdir/dir4/dir5/vendor
>
> Only "vendor" in second depth are returned (flag *_FALL_THROUGH took
> effect), and the symbol link is not followed.
>
> "virTraverseDirectory" is implemented using recursion method, which
> is generally not good for performance and memory use, and even may
> eats up the file descriptors. But since we allow to specify the
> tree "depth", and I don't think libvirt has use cases which need to
> traverse a very deep directory tree. And on the other hand,
> Implementing it using either iteration or stack will be much less
> readable.
>
> Later patches will take use of virTraverseDirectory, and tests
> will be added.
> ---
> src/libvirt_private.syms | 1 +
> src/util/virutil.c | 151 ++++++++++++++++++++++++++++++++++++++++++++++-
> src/util/virutil.h | 52 ++++++++++++++++
> 3 files changed, 203 insertions(+), 1 deletion(-)
>
> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
> index 1ea7467..fe182e8 100644
> --- a/src/libvirt_private.syms
> +++ b/src/libvirt_private.syms
> @@ -1965,6 +1965,7 @@ virSetNonBlock;
> virSetUIDGID;
> virSetUIDGIDWithCaps;
> virStrIsPrint;
> +virTraverseDirectory;
> virValidateWWN;
>
>
> diff --git a/src/util/virutil.c b/src/util/virutil.c
> index c5246bc..c1938f9 100644
> --- a/src/util/virutil.c
> +++ b/src/util/virutil.c
> @@ -2048,7 +2048,6 @@ virFindFCHostCapableVport(const char *sysfs_prefix
ATTRIBUTE_UNUSED)
> virReportSystemError(ENOSYS, "%s", _("Not supported on this
platform"));
> return NULL;
> }
> -
> #endif /* __linux__ */
>
> /**
> @@ -2070,3 +2069,153 @@ virCompareLimitUlong(unsigned long long a, unsigned long b)
>
> return -1;
> }
> +
> +/*
> + * virTraverseDirectory:
> + * @dirpath: Directory path, (relative or absolute)
> + * @depth: The max directory tree depth for traversing
> + * @cb: Callback to handle file (not directory) during traversing
> + * @skibcb: Callback to define rules to skip entry
s/skibcb/skipcb
> + * @opaque: User data to pass to @cb
> + * @filepaths: Pointer to array to store the file paths, the file
> + * path passed to @cb is stored into @filepaths as long
> + * as @cb returns 0.
> + *
> + * Traverse a directory tree into specified @depth, handing the file
s/handing/handling
> + * with callback in the process. Caller must free @filepaths and
> + * its elements after use.
> + *
> + * Returns the number of file paths successfully handled by @cb on
> + * success, with @fillpaths (if passed @fillpath is not NULL) filled,
s/@fillpaths/@filepaths (there's 2 of them)
> + * or -1 on failure.
> + */
NOTE: If you had run 'make -C tests valgrind' as part of your process
you would have seen there's a memory leak here and the 'utiltest' fails.
The output from my run has the following footprints repeated a few times:
==29216== 8 bytes in 1 blocks are definitely lost in loss record 2 of 39
==29216== at 0x4A0887C: malloc (vg_replace_malloc.c:270)
==29216== by 0x4A089F0: realloc (vg_replace_malloc.c:662)
==29216== by 0x4C6C7EE: virReallocN (viralloc.c:184)
==29216== by 0x4CBC251: virTraverseDirectory (virutil.c:2592)
==29216== by 0x4CBC384: virTraverseDirectory (virutil.c:2562)
==29216== by 0x4CBC384: virTraverseDirectory (virutil.c:2562)
==29216== by 0x4CBCC7D: virFindPCIDeviceByVPD (virutil.c:2084)
==29216== by 0x40202D: testFindPCIDeviceByVPD (utiltest.c:175)
==29216== by 0x40326F: virtTestRun (testutils.c:158)
==29216== by 0x401D20: mymain (utiltest.c:305)
==29216== by 0x4038AA: virtTestMain (testutils.c:722)
==29216== by 0x37C1021A04: (below main) (libc-start.c:225)
==29216==
==29216== 8 bytes in 1 blocks are definitely lost in loss record 3 of 39
==29216== at 0x4A0887C: malloc (vg_replace_malloc.c:270)
==29216== by 0x4A089F0: realloc (vg_replace_malloc.c:662)
==29216== by 0x4C6C7EE: virReallocN (viralloc.c:184)
==29216== by 0x4CBC3B4: virTraverseDirectory (virutil.c:2569)
==29216== by 0x4CBC384: virTraverseDirectory (virutil.c:2562)
==29216== by 0x4CBCC7D: virFindPCIDeviceByVPD (virutil.c:2084)
==29216== by 0x40202D: testFindPCIDeviceByVPD (utiltest.c:175)
==29216== by 0x40326F: virtTestRun (testutils.c:158)
==29216== by 0x401D20: mymain (utiltest.c:305)
==29216== by 0x4038AA: virtTestMain (testutils.c:722)
==29216== by 0x37C1021A04: (below main) (libc-start.c:225)
Beyond that the traversal seems OK to me - I think you're covering the
basics with hidden files and links. I do remember a recent discussion
regarding virFileDeleteTree() which may be useful to revisit to ensure
nothing else is forgotten.
Hm, what virFileDeleteTree does should be able to achieve with
virTraverseDirectory, that's the use case which doesn't have to care
about the returned file paths I think.
> +int
> +virTraverseDirectory(const char *dirpath,
> + int depth,
> + unsigned int flags,
> + virTraverseDirectoryCallback cb,
> + virTraverseDirectorySkipCallback skipcb,
> + void *opaque,
> + char ***filepaths)
> +{
> + struct dirent *entry = NULL;
> + DIR *dir = NULL;
> + char *fpath = NULL;
> + struct stat st;
> + int nfilepaths = 0;
> + int ret = -1;
> + int i;
> +
> + if (depth < 0)
> + return 0;
> +
> + if (filepaths)
> + *filepaths = NULL;
> +
> + if (!(dir = opendir(dirpath))) {
> + virReportSystemError(errno,
> + _("Failed to open directory '%s'"),
> + dirpath);
> + return -1;
> + }
> +
> + while ((entry = readdir(dir))) {
> + /* Always Ignore "." and ".." */
> + if (STREQ(entry->d_name, ".") || STREQ(entry->d_name,
".."))
> + continue;
> +
> + /* Ignore hidden files if it's required */
> + if (flags & VIR_TRAVERSE_DIRECTORY_IGNORE_HIDDEN_FILES &&
> + entry->d_name[0] == '.')
> + continue;
> +
> + if (virAsprintf(&fpath, "%s/%s", dirpath, entry->d_name)
< 0) {
> + virReportOOMError();
> + goto error;
> + }
> +
> + /* Skip to handle the entry as long as @skipcb returns 0 for it */
> + if (skipcb && skipcb(fpath, opaque) == 0) {
> + VIR_FREE(fpath);
> + continue;
> + }
> +
> + if (lstat(fpath, &st) < 0) {
> + virReportSystemError(errno,
> + _("Failed to stat '%s'"),
technically failed to 'lstat'
> + fpath);
> + goto error;
> + }
> +
> + /* Don't follow symbol link unless it's explicitly required */
> + if (S_ISLNK(st.st_mode) &&
> + !(flags & VIR_TRAVERSE_DIRECTORY_FOLLOW_LINK)) {
> + VIR_FREE(fpath);
> + continue;
> + }
> +
> + if (S_ISDIR(st.st_mode)) {
> + char **tmp = NULL;
> + int rc;
> +
> + if ((rc = virTraverseDirectory(fpath, depth - 1, flags, cb, skipcb,
> + opaque, filepaths ? &tmp : NULL))
< 0)
> + goto error;
> +
> + if (rc > 0) {
> + /* Copy the filepaths returned by recursive call */
> + if (filepaths) {
> + if (VIR_REALLOC_N((*filepaths), nfilepaths + rc) < 0) {
> + for (i = 0; i < rc; i++)
> + VIR_FREE(tmp[i]);
> + VIR_FREE(tmp);
> + virReportOOMError();
> + goto error;
> + }
> +
> + for (i = 0; i < rc; i++)
> + (*filepaths)[nfilepaths + i] = tmp[i];
> + }
> + nfilepaths += rc;
> + }
Perhaps the valgrind error is that the else clause here or the one for
"if (filepaths)" in the (rc > 0) condition.
In either case, tmp wouldn't be free()'d properly.
If you required filepaths to be non NULL - I believe the whole
mess/issue would be irrelevant...
> + } else {
> + /* Don't handle the file unless it's the last depth */
> + if ((flags & VIR_TRAVERSE_DIRECTORY_FALL_THROUGH) &&
> + depth != 0) {
> + VIR_FREE(fpath);
> + continue;
> + }
> +
> + if (cb(fpath, opaque) == 0) {
> + if (filepaths) {
> + if (VIR_REALLOC_N((*filepaths), nfilepaths + 1) < 0) {
> + virReportOOMError();
> + goto error;
> + }
> +
> + if (VIR_STRDUP((*filepaths)[nfilepaths], fpath) < 0)
> + goto error;
> +
> + }
> + nfilepaths++;
> + }
> + }
> +
> + VIR_FREE(fpath);
> + }
> +
> + ret = nfilepaths;
> +
> +cleanup:
> + closedir(dir);
> + return ret;
> +
> +error:
> + VIR_FREE(fpath);
> + if (filepaths) {
> + for (i = 0; i < nfilepaths; i++)
> + VIR_FREE((*filepaths)[i]);
> + VIR_FREE(*filepaths);
> + }
> + goto cleanup;
> +}
> diff --git a/src/util/virutil.h b/src/util/virutil.h
> index 280a18d..6c46f23 100644
> --- a/src/util/virutil.h
> +++ b/src/util/virutil.h
> @@ -166,4 +166,56 @@ char *virFindFCHostCapableVport(const char *sysfs_prefix);
>
> int virCompareLimitUlong(unsigned long long a, unsigned long b);
>
> +/**
> + * virTraverseDirectoryCallback:
> + * @fpath: file path to handle
> + * @opaque: User data to pass to the callback
> + *
> + * Callback for "virTraverseDirectory".
> + *
> + * Returns 0 on success, -1 on failure.
> + */
> +typedef int (*virTraverseDirectoryCallback)(const char *fpath,
> + void *opaque);
> +
> +/**
> + * virTraverseDirectorySkipCallback:
> + * @fpath: file path to handle
> + * @opaque: User data to pass to the callback
> + *
> + * Callback to control whether virTraverseDirectory should ship
> + * @fpath E.g. Skipping files whose name match a regex pattern.
> + *
> + * Return 0 to let "virTraverseDirectory" skip handling the @fpath,
> + * -1 to not skip.
> + */
> +typedef int (*virTraverseDirectorySkipCallback)(const char *fpath,
> + void *opaque);
> +
> +/**
> + * virTraverseDirectoryFlags:
> + *
> + * Except virTraverseDirectorySkipCallback, one can use these flags to
> + * control the general behaviour of virTraverseDirectory
> + */
> +enum {
> + /* Follow symbol links */
> + VIR_TRAVERSE_DIRECTORY_FOLLOW_LINK = 1 << 0,
> +
> + /* Ignore hidden files or directory */
> + VIR_TRAVERSE_DIRECTORY_IGNORE_HIDDEN_FILES = 1 << 1,
> +
> + /* Don't handle files unless it's in the last depth */
> + VIR_TRAVERSE_DIRECTORY_FALL_THROUGH = 1 << 2,
Better named "MATCH_DEPTH_ONLY"? FALL_THROUGH just has other
Not sure, I know FALL_THROUGH is not a good name, but I don't see
MATCH_DEPTH_ONLY makes it better either.. :-), since whether it's
matching or anything else, completely depends on what @cb does.
> +} virTraverseDirectoryFlags;
> +
> +int virTraverseDirectory(const char *dirpath,
> + int depth,
> + unsigned int flags,
> + virTraverseDirectoryCallback cb,
> + virTraverseDirectorySkipCallback skipcb,
> + void *opaque,
> + char ***filepaths);
> + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(3)
Not sure "3" is what you mean - perhaps you meant 4 (eg, cb).
yes.. I made too many typos..
It also stands to reason filepaths shouldn't be NULL. I guess I just
don't see the reason to return just a count, but perhaps there's a use
case I'm not thinking about or will be apparent in future patches.
John
> +
> #endif /* __VIR_UTIL_H__ */
>