On 10/26/2011 05:36 AM, Stefan Berger wrote:
Add a function to the virHashTable for getting an array of the hash
table's
keys and have the keys (optionally) sorted.
Signed-off-by: Stefan Berger <stefanb(a)linux.vnet.ibm.com>
---
src/libvirt_private.syms | 3 +
src/util/hash.c | 98 +++++++++++++++++++++++++++++++++++++++++++++++
src/util/hash.h | 14 ++++++
3 files changed, 115 insertions(+)
Reordering for review purposes:
Index: libvirt-acl/src/util/hash.h
===================================================================
--- libvirt-acl.orig/src/util/hash.h
+++ libvirt-acl/src/util/hash.h
@@ -130,6 +130,20 @@ void *virHashLookup(virHashTablePtr tabl
*/
void *virHashSteal(virHashTablePtr table, const void *name);
+/*
+ * Get the set of keys and have them optionally sorted.
+ */
+typedef struct _virHashKeyValuePair virHashKeyValuePair;
+typedef virHashKeyValuePair *virHashKeyValuePairPtr;
+struct _virHashKeyValuePair {
+ void *key;
Why isn't this 'const void *key'?
+ const void *value;
+};
+typedef int (*virHashKeyComparator)(const virHashKeyValuePairPtr ,
+ const virHashKeyValuePairPtr );
+void **virHashGetKeys(virHashTablePtr table, virHashKeyComparator
compar);
So the caller only knows how large the returned array is by calling
virHashSize? I take it passing NULL for compar is what gets the keys in
an unsorted order.
Should this return 'const void **', that is, an array that can be
modified, but whose elements are const void * pointers into the hash
table, where the keys are not modifiable through this array view?
+void virHashFreeKeys(virHashTablePtr table, void **);
I'm not sure we need this. It seems like the array returned by
virHashGetKeys should be able to just pass directly to VIR_FREE, without
reference to which table it was copied from, since all the elements of
the array are just pointers into the hash table. I guess I could see
using this if you are cloning keys in the process of creating the array,
but I'm not sure that cloning keys is worth it.
* Iterators
Index: libvirt-acl/src/libvirt_private.syms
===================================================================
--- libvirt-acl.orig/src/libvirt_private.syms
+++ libvirt-acl/src/libvirt_private.syms
@@ -559,6 +559,8 @@ virHashAddEntry;
virHashCreate;
virHashForEach;
virHashFree;
+virHashFreeKeys;
This line might not be needed, per my above questioning.
+virHashGetKeys;
virHashLookup;
virHashRemoveEntry;
virHashRemoveSet;
@@ -566,6 +568,7 @@ virHashSearch;
virHashSize;
virHashSteal;
virHashTableSize;
+virHashUpdateEntry;
Did we really have this one missing?
+++ libvirt-acl/src/util/hash.c
@@ -618,3 +618,101 @@ void *virHashSearch(virHashTablePtr tabl
return NULL;
}
+
+
+struct getKeysIter
+{
+ virHashTablePtr table;
+ void **array;
+ virHashKeyValuePair *sortArray;
+ unsigned arrayIdx;
+ bool error;
If you go with my idea of VIR_FREE() to clean up the returned array,
then you don't need bool error.
+};
+
+static void virHashGetKeysIterator(void *payload,
+ const void *name, void *data)
+{
+ struct getKeysIter *iter = data;
+ void *key;
+
+ if (iter->error)
+ return;
+
+ key = iter->table->keyCopy(name);
+
+ if (key == NULL) {
+ virReportOOMError();
+ iter->error = true;
+ return;
+ }
These statements disappear, replaced by:
key = name;
(const-correcness may require that you use const void * in more places).
+
+ if (iter->sortArray) {
+ iter->sortArray[iter->arrayIdx].key = key;
+ iter->sortArray[iter->arrayIdx].value = payload;
+ } else {
+ iter->array[iter->arrayIdx] = iter->table->keyCopy(name);
and this becomes iter->array[iter->arrayIdx] = key;
+ }
+ iter->arrayIdx++;
+}
+
+void virHashFreeKeys(virHashTablePtr table, void **keys)
+{
+ unsigned i = 0;
+
+ if (keys == NULL)
+ return;
+
+ while (keys[i])
+ table->keyFree(keys[i++]);
+
+ VIR_FREE(keys);
+}
then this function disappears.
+
+typedef int (*qsort_comp)(const void *, const void *);
+
+void **virHashGetKeys(virHashTablePtr table,
+ virHashKeyComparator compar)
This function needs better documentation; compare to virHashForEach to
see an example. Mention that the returned array must be passed to VIR_FREE
+{
+ int i, numElems = virHashSize(table);
+ struct getKeysIter iter = {
+ .table = table,
+ .arrayIdx = 0,
+ .error = false,
You need to explicitly initialize .sortArray to NULL, otherwise,
virHashGetKeysIterator will see it as uninitialized garbage, and may
attempt to store into .sortArray instead of the intended .array.
+ };
+
+ if (numElems < 0) {
+ return NULL;
+ }
+
+ if (VIR_ALLOC_N(iter.array, numElems + 1)) {
+ virReportOOMError();
+ return NULL;
+ }
+
+ /* null-terminate array */
+ iter.array[numElems] = NULL;
The array is already NULL-terminated, by virtue of VIR_ALLOC_N, so this
line is not necessary.
+
+ if (compar &&
+ VIR_ALLOC_N(iter.sortArray, numElems)) {
+ VIR_FREE(iter.array);
+ virReportOOMError();
+ return NULL;
+ }
+
+ virHashForEach(table, virHashGetKeysIterator, &iter);
+ if (iter.error) {
+ VIR_FREE(iter.sortArray);
+ virHashFreeKeys(table, iter.array);
+ return NULL;
+ }
By returning const void* into the array, the virHashForEach no longer
does allocation (just copying in-hash pointers into the array), so it
can't fail, and this if statement can be dropped.
+
+ if (compar) {
+ qsort(&iter.sortArray[0], numElems, sizeof(iter.sortArray[0]),
+ (qsort_comp)compar);
+ for (i = 0; i < numElems; i++)
+ iter.array[i] = iter.sortArray[i].key;
+ VIR_FREE(iter.sortArray);
+ }
+
+ return iter.array;
+}
Also a meta-question. I guess I can see where this might be useful -
the act of sorting hash keys is not worth reinventing with every caller.
And making the user store data in a sorted array instead of a hash
table hurts lookup performance; so if lookup is more frequent than
listing, but listing must be sorted, then this is a useful abstraction.
But why are we just returning keys, instead of key/value pairs? If
we're going to do sorting on key-value pairs, would it be any easier to
hand the user back key-value pairs, so they don't have to do key lookups
on every element of the returned array?
Also, some food for thought. Gnulib provides a GPLv3+ implementation of
a sorted list backed by a hash table, which provides the best of both
ideals of sorted traversal and fast lookup. While we can't use that
module for licensing reasons, the data structure used there may prove an
interesting study for what you are using this function for.
http://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/gl_list.h
In particular, if you make the hash payload carry linked-list data
alongside the normal lookup value, and maintain the list order on
insertion, then you have the benefit of both orderly traversal and fast
lookups, without having to add any new functions to util/hash.h, but at
the expense of more legwork in the caller. So, if we ever have more
than one client that wants to maintain a sorted list while still having
fast key lookup, it may make more sense to provide a
src/util/sortedhash.h that wraps hash.h with the additional legwork to
provide the sorting.
--
Eric Blake eblake(a)redhat.com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org