libdacav 0.9.0
|
This module provides a simple hash table implementation. More...
Typedefs | |
typedef struct dhash | dhash_t |
Opaque type for hash table. | |
typedef uintptr_t(* | dhash_cb_t )(const void *key) |
Callback for hashing of a generic object. | |
typedef struct pair | dhash_pair_t |
Opaque structure that contains the <key,value> pair. | |
Enumerations | |
enum | dhash_result_t { DHASH_FOUND = 0, DHASH_NOTFOUND = 1 } |
Return values for hash lookup functions. More... | |
Functions | |
const void * | dhash_key (dhash_pair_t *p) |
Given a dhash_pair_t element, returns the corresponding key. | |
void * | dhash_val (dhash_pair_t *p) |
Given a dhash_pair_t element, returns the corresponding value. | |
dhash_t * | dhash_new (unsigned nbuckets, dhash_cb_t hf, dcmp_cb_t cmp, const dcprm_t *key_cprm, const dcprm_t *val_cprm) |
Complete constructor for a hash table. | |
void | dhash_free (dhash_t *htab) |
Destructor for a hash table. | |
dhash_result_t | dhash_insert (dhash_t *htab, const void *key, const void *value) |
Value insertion procedure. | |
dhash_result_t | dhash_search_default (dhash_t *htab, const void *key, void **found, dcreate_cb_t create, void *ctx) |
Search for a value by providing a key. | |
dhash_result_t | dhash_search (dhash_t *htab, const void *key, void **found) |
Search for a value by providing a key. | |
dhash_result_t | dhash_delete (dhash_t *htab, const void *key, void **found) |
Delete the value associated to the given key. | |
diter_t * | dhash_iter_new (dhash_t *t) |
Build an iterator for the hash table. | |
void | dhash_iter_free (diter_t *i) |
Destroy the iterator. |
This module provides a simple hash table implementation.
When allocating a new hash table, five parameters are usually needed:
The number of buckets provided as first parameter is a trade-off between performances in time and space required. It should depend on the predicted number of stored element. The more objects are sparse among buckets, the more data retrieving will be optimized. Choosing a prime number will help in reducing hash clashes.
Providing no hash function will result in the library using directly the value of the pointer as hash. Providing no comparison function will result in comparison of stored object addresses and search keys addresses. This can be useful, for instance, if you are storing primitive values and don't want to allocate memory just for them: just use the pointer as it were an integer.
Providing no copy-constructor and destructor for the keys will result in the library directly storing the user-provided keys. This is usually a good idea only if you plan to store primitive types! You may do it also for allocated composite types, but this may result in memory leaks if you overwrite a stored value with dhash_insert(), as only the value (and not the key) will be overwritten, and the key you provided is simply not used. If you are crazy enough to do something like that, please check the return value of dhash_insert() in this case to decide what to do with your local copy of the key.
Example of hash table allocation:
dcprm_t cprm_k = { .cp = copy_key, // copy-constructor for key .rm = free_key // destructor for key }; dcprm_t cprm_v = { .cp = copy_key, // copy-constructor for value .rm = free_key // destructor for value }; dhash_t *tab = dhash_new(N_BUCKETS, hash_function, cmp_function, &cprm_k, &cprm_v);
In this example we show to use integer values as keys.
A good hashing function for integer (the one used as default hashing function) may be:
static unsigned hash_int (const void *key) { return (int) key; }
Since the required semantic for the comparison is the same used by strcmp(3), a simple subtraction between two integers behaves correctly:
static int cmp_int (const void *v0, const void *v1) { return (int) v0 - (int) v1; }
Since integers are primitive types, we don't need to specify how to copy and destroy them. The declaration of such a hash table is so far the following:
dhash_t *tab = dhash_new(N_BUCKETS, hash_int, cmp_int, NULL, NULL);
Actually the shown hash and comparison functions are already defined internally, so in order to obtain the same result, the hash table initialization can be done by passing NULL as second and third argument.
dhash_t *tab = dhash_new(N_BUCKETS, NULL, NULL, NULL, NULL);
Note that this kind of behavior can also be exploited to achieve comparison based on object identity instead of object equality.
Suppose you want to map strings to strings with an hash table.
A good hashing function may be the following (which btw is the one standardized by the ELF specification):
static unsigned hash_str (const void *key) { const char *name; unsigned h = 0, g; while (*name) { h = (h << 4) + *name++; if ((g = h & 0xf00000000) != 0) { h &= g >> 24; } h &= ~g; } return h; }
A good destructor is just free(3), while a good copy-constructor for strings is strcpy(3), except it doesn't work with NULL pointers. We can wrap it, just beware of zero-termination:
static void * copy_str (const void *key) { if (key == NULL) return NULL; return strdup((const char *)key); }
The string comparison function is simply supplied by strcmp(3), however since the signature of strcmp(3) is slightly different from the dcmp_func_t data type, a casting is needed.
The initialization would be the following
dhash_cprm_t cprm = { .cp = copy_str, .rm = free }; dhash_t *tab = dhash_new(N_BUCKETS, hash_str, (dcmp_func_t)strcmp, &cprm, &cprm);
Because of the particular nature of a hash table, for which every entry is composed by <key,value> couples, the return value of a diter_next() function called on a hash table iterator returns a pointer to an object of type dhash_pair_t
In order to obtain the correct key and the correct value for a dhash_pair_t object, the functions dhash_key() and dhash_val() must be called respectively.
By example, the following procedure will print all the <key, value> couples contained in a hash table and will remove them as long as they are listed:
void print_key_values () { diter_t *iter = dhash_iter_new(table); while (diter_hasnext(iter)) { dhash_pair_t *pair = diter_next(iter); int k = (int)dhash_key(pair); int v = (int)dhash_val(pair); printf("key=%d, value=%d\n", k, v); dhash_remove(iter); } dhash_iter_free(iter); }
typedef uintptr_t(* dhash_cb_t)(const void *key) |
Callback for hashing of a generic object.
typedef struct pair dhash_pair_t |
Opaque structure that contains the <key,value> pair.
typedef struct dhash dhash_t |
Opaque type for hash table.
enum dhash_result_t |
Return values for hash lookup functions.
dhash_result_t dhash_delete | ( | dhash_t * | htab, |
const void * | key, | ||
void ** | found | ||
) |
Delete the value associated to the given key.
This function behaves exactly as dhash_search() except that, if the object is found, it gets removed.
[in] | htab | The hash table to search in; |
[in] | key | The key to be searched; |
[out] | found | A pointer to a location in which the value will be stored; You may pass NULL as found parameter in order to simply delete the entry without having it back. |
dhash_result_t::DHASH_FOUND | on success; |
dhash_result_t::DHASH_NOTFOUND | on lookup failure. |
void dhash_free | ( | dhash_t * | htab | ) |
Destructor for a hash table.
[in] | htab | The table to destroy; |
dhash_result_t dhash_insert | ( | dhash_t * | htab, |
const void * | key, | ||
const void * | value | ||
) |
Value insertion procedure.
Depending on how the constructor has been parametrized, this function may directly store the given pointers for key and value, or may build an internal copy of them. In the latter case the copies will also be freed when calling dhash_free().
[in] | htab | The table in which insertion must be achieved; |
[in] | key | Key for the inserted value; |
[in] | value | The value to be inserted. |
dhash_result_t::DHASH_FOUND | if there has been a replacement; |
dhash_result_t::DHASH_NOTFOUND | if a new object has been added; |
void dhash_iter_free | ( | diter_t * | i | ) |
Build an iterator for the hash table.
[in] | t | The table on which the iterator must be built. |
const void* dhash_key | ( | dhash_pair_t * | p | ) |
Given a dhash_pair_t element, returns the corresponding key.
This is used to extract the key from pairs yielded by the iterator. The returned key is internal to the table. As such it cannot be modified.
[in] | p | The <key,value> pair. |
dhash_t* dhash_new | ( | unsigned | nbuckets, |
dhash_cb_t | hf, | ||
dcmp_cb_t | cmp, | ||
const dcprm_t * | key_cprm, | ||
const dcprm_t * | val_cprm | ||
) |
Complete constructor for a hash table.
[in] | nbuckets | The number of buckets of the table (should be somehow proportional to the number of inserted objects). A prime number would be a good choice; |
[in] | hf | The callback used to retrieve the hash value for the keys. Provide NULL if the key pointer should be directly used as hash; |
[in] | cmp | The callback used to compare two keys, must behave like the strcmp(3) function. Provide NULL if the key pointer should be directly used as hash; |
[in] | key_cprm | A pointer to a copy-constructor and destructor pair to be used with keys. If NULL is provided, the inserted keys shall be directly stored, and no internal copy for keys shall be allocated. |
[in] | val_cprm | A pointer to a copy-constructor and destructor pair to be used with values. If NULL is provided, the inserted keys shall be directly stored, and no internal copy for values shall be allocated. |
dhash_result_t dhash_search | ( | dhash_t * | htab, |
const void * | key, | ||
void ** | found | ||
) |
Search for a value by providing a key.
[in] | htab | The hash table to search in; |
[in] | key | The key to be searched; |
[out] | found | A pointer to a location in which a pointer to the value will be stored. You may pass NULL as found parameter in order to simply test the presence of the required <key,value> pair. |
dhash_result_t::DHASH_FOUND | on success; |
dhash_result_t::DHASH_NOTFOUND | on lookup failure. |
dhash_result_t dhash_search_default | ( | dhash_t * | htab, |
const void * | key, | ||
void ** | found, | ||
dcreate_cb_t | create, | ||
void * | ctx | ||
) |
Search for a value by providing a key.
If it doesn't exist create it.
This function works as dhash_search(), but also allows to specify a callback to be called when the object you are searching cannot be found. The callback shall return an object which will be directly inserted as value.
[in] | htab | The table in which insertion must be achieved; |
[in] | key | Key for the inserted value; |
[out] | found | The value to be inserted. |
[in] | create | The callback for object creation; |
[in] | ctx | The context for the callback. |
dhash_result_t::DHASH_FOUND | if the element has been found; |
dhash_result_t::DHASH_NOTFOUND | if the element has not been found and the new element has been allocated. |
void* dhash_val | ( | dhash_pair_t * | p | ) |
Given a dhash_pair_t element, returns the corresponding value.
This is used to extract the value from pairs yielded by the iterator. The returned value is internal to the table, and unlike for the key, modifying it is allowed (it will result in changing the stored value).
[in] | p | The <key,value> pair. |