This module provides a simple hash table implementation. More...

Typedefs
typedef struct dhash	dhash_t
	Opaque type for hash table.
typedef uintptr_t(*	dhash_cb_t )(const void *key)
	Callback for hashing of a generic object.
typedef struct pair	dhash_pair_t
	Opaque structure that contains the <key,value> pair.
Enumerations
enum	dhash_result_t { DHASH_FOUND = 0, DHASH_NOTFOUND = 1 }
	Return values for hash lookup functions. More...
Functions
const void *	dhash_key (dhash_pair_t *p)
	Given a dhash_pair_t element, returns the corresponding key.
void *	dhash_val (dhash_pair_t *p)
	Given a dhash_pair_t element, returns the corresponding value.
dhash_t *	dhash_new (unsigned nbuckets, dhash_cb_t hf, dcmp_cb_t cmp, const dcprm_t key_cprm, const dcprm_t val_cprm)
	Complete constructor for a hash table.
void	dhash_free (dhash_t *htab)
	Destructor for a hash table.
dhash_result_t	dhash_insert (dhash_t htab, const void key, const void *value)
	Value insertion procedure.
dhash_result_t	dhash_search_default (dhash_t htab, const void key, void *found, dcreate_cb_t create, void ctx)
	Search for a value by providing a key.
dhash_result_t	dhash_search (dhash_t htab, const void key, void **found)
	Search for a value by providing a key.
dhash_result_t	dhash_delete (dhash_t htab, const void key, void **found)
	Delete the value associated to the given key.
diter_t *	dhash_iter_new (dhash_t *t)
	Build an iterator for the hash table.
void	dhash_iter_free (diter_t *i)
	Destroy the iterator.

Detailed Description

This module provides a simple hash table implementation.

When allocating a new hash table, five parameters are usually needed:

The number of buckets of the hash table;
A callback that computes a hash of a key;
A callback that compares two keys;
A pair of callbacks for respectively copying and freeing internal copies of the keys;
A similar pair of callbacks for the values.

The number of buckets provided as first parameter is a trade-off between performances in time and space required. It should depend on the predicted number of stored element. The more objects are sparse among buckets, the more data retrieving will be optimized. Choosing a prime number will help in reducing hash clashes.

Providing no hash function will result in the library using directly the value of the pointer as hash. Providing no comparison function will result in comparison of stored object addresses and search keys addresses. This can be useful, for instance, if you are storing primitive values and don't want to allocate memory just for them: just use the pointer as it were an integer.

Providing no copy-constructor and destructor for the keys will result in the library directly storing the user-provided keys. This is usually a good idea only if you plan to store primitive types! You may do it also for allocated composite types, but this may result in memory leaks if you overwrite a stored value with dhash_insert(), as only the value (and not the key) will be overwritten, and the key you provided is simply not used. If you are crazy enough to do something like that, please check the return value of dhash_insert() in this case to decide what to do with your local copy of the key.

Allocation and deallocation:

Example of hash table allocation:


dcprm_t cprm_k = {
    .cp = copy_key, // copy-constructor for key
    .rm = free_key  // destructor for key
};

dcprm_t cprm_v = {
    .cp = copy_key, // copy-constructor for value
    .rm = free_key  // destructor for value
};

dhash_t *tab = dhash_new(N_BUCKETS, hash_function, cmp_function,
                         &cprm_k, &cprm_v);

See also:: dhash_cb_t; dcmp_cb_t; dprm_t; dhash_new

Example using integers as keys

In this example we show to use integer values as keys.

A good hashing function for integer (the one used as default hashing function) may be:

static unsigned hash_int (const void *key)
{
    return (int) key;
}

Since the required semantic for the comparison is the same used by strcmp(3), a simple subtraction between two integers behaves correctly:

static int cmp_int (const void *v0, const void *v1)
{
    return (int) v0 - (int) v1;
}

Since integers are primitive types, we don't need to specify how to copy and destroy them. The declaration of such a hash table is so far the following:

dhash_t *tab = dhash_new(N_BUCKETS, hash_int, cmp_int, NULL, NULL);

Actually the shown hash and comparison functions are already defined internally, so in order to obtain the same result, the hash table initialization can be done by passing NULL as second and third argument.

dhash_t *tab = dhash_new(N_BUCKETS, NULL, NULL, NULL, NULL);

Note that this kind of behavior can also be exploited to achieve comparison based on object identity instead of object equality.

Using strings as keys

Suppose you want to map strings to strings with an hash table.

A good hashing function may be the following (which btw is the one standardized by the ELF specification):

static unsigned hash_str (const void *key)
{
    const char *name;
    unsigned h = 0, g;

    while (*name) {
        h = (h << 4) + *name++;
        if ((g = h & 0xf00000000) != 0) {
            h &= g >> 24;
        }
        h &= ~g;
    }
    return h;
}

A good destructor is just free(3), while a good copy-constructor for strings is strcpy(3), except it doesn't work with NULL pointers. We can wrap it, just beware of zero-termination:

static void * copy_str (const void *key)
{
    if (key == NULL) return NULL;
    return strdup((const char *)key);
}

The string comparison function is simply supplied by strcmp(3), however since the signature of strcmp(3) is slightly different from the dcmp_func_t data type, a casting is needed.

The initialization would be the following

dhash_cprm_t cprm = {
    .cp = copy_str,
    .rm = free
};
dhash_t *tab = dhash_new(N_BUCKETS, hash_str, (dcmp_func_t)strcmp, &cprm,
                         &cprm);

Iterators for hash tables:

Because of the particular nature of a hash table, for which every entry is composed by <key,value> couples, the return value of a diter_next() function called on a hash table iterator returns a pointer to an object of type dhash_pair_t

In order to obtain the correct key and the correct value for a dhash_pair_t object, the functions dhash_key() and dhash_val() must be called respectively.

By example, the following procedure will print all the <key, value> couples contained in a hash table and will remove them as long as they are listed:

void print_key_values ()
{
    diter_t *iter = dhash_iter_new(table);
    while (diter_hasnext(iter)) {
        dhash_pair_t *pair = diter_next(iter);
        int k = (int)dhash_key(pair);
        int v = (int)dhash_val(pair);
        printf("key=%d, value=%d\n", k, v);
        dhash_remove(iter);
    }
    dhash_iter_free(iter);
}

Note:: Keys and values yielded through iterators are exactly the ones stored into the hash table. Keys are constant, but values are not, thus if you stored pointers you may modified stored objects. Of course this is not true if you stored primitve types.

See also:: diter_next; dhash_pair_t; dhash_key; dhash_val; Iterators

Typedef Documentation

typedef uintptr_t(* dhash_cb_t)(const void *key)

Callback for hashing of a generic object.

typedef struct pair dhash_pair_t

Opaque structure that contains the <key,value> pair.

See also:: dhash_key is a function for key extraction;; dhash_val is a function for value extraction;; Hash Tables contains a section with further information about iterator semantics for hash tables.

typedef struct dhash dhash_t

Opaque type for hash table.

Enumeration Type Documentation

enum dhash_result_t

Return values for hash lookup functions.

See also:: dhash_search;; dhash_delete;; dhash_insert.

Enumerator:

DHASH_FOUND	Object found;.
DHASH_NOTFOUND	Object not found.

Function Documentation

dhash_result_t dhash_delete	(	dhash_t *	htab,
		const void *	key,
		void **	found
	)

Delete the value associated to the given key.

This function behaves exactly as dhash_search() except that, if the object is found, it gets removed.

Parameters:

[in]	htab	The hash table to search in;
[in]	key	The key to be searched;
[out]	found	A pointer to a location in which the value will be stored; You may pass NULL as found parameter in order to simply delete the entry without having it back.

Return values:

dhash_result_t::DHASH_FOUND	on success;
dhash_result_t::DHASH_NOTFOUND	on lookup failure.

Note:: In case of failure the table remains unchanged.

void dhash_free ( dhash_t * htab )

Destructor for a hash table.

Parameters:

[in] htab The table to destroy;

dhash_result_t dhash_insert	(	dhash_t *	htab,
		const void *	key,
		const void *	value
	)

Value insertion procedure.

Depending on how the constructor has been parametrized, this function may directly store the given pointers for key and value, or may build an internal copy of them. In the latter case the copies will also be freed when calling dhash_free().

Parameters:

[in]	htab	The table in which insertion must be achieved;
[in]	key	Key for the inserted value;
[in]	value	The value to be inserted.

Return values:

dhash_result_t::DHASH_FOUND	if there has been a replacement;
dhash_result_t::DHASH_NOTFOUND	if a new object has been added;

void dhash_iter_free ( diter_t * i )

Destroy the iterator.

Parameters:

[in] i The iterator to destroy.

See also:: diter_t.

diter_t* dhash_iter_new ( dhash_t * t )

Build an iterator for the hash table.

Parameters:

[in] t The table on which the iterator must be built.

Returns:: The iterator.

See also:: Hash Tables contains a section with further information about iterator semantics for hash tables.

const void* dhash_key ( dhash_pair_t * p )

Given a dhash_pair_t element, returns the corresponding key.

This is used to extract the key from pairs yielded by the iterator. The returned key is internal to the table. As such it cannot be modified.

See also:: Hash Tables contains a section with further information about iterator semantics for hash tables.

Parameters:

[in] p The <key,value> pair.

Returns:: The key.

dhash_t* dhash_new	(	unsigned	nbuckets,
		dhash_cb_t	hf,
		dcmp_cb_t	cmp,
		const dcprm_t *	key_cprm,
		const dcprm_t *	val_cprm
	)

Complete constructor for a hash table.

Parameters:

[in]	nbuckets	The number of buckets of the table (should be somehow proportional to the number of inserted objects). A prime number would be a good choice;
[in]	hf	The callback used to retrieve the hash value for the keys. Provide NULL if the key pointer should be directly used as hash;
[in]	cmp	The callback used to compare two keys, must behave like the strcmp(3) function. Provide NULL if the key pointer should be directly used as hash;
[in]	key_cprm	A pointer to a copy-constructor and destructor pair to be used with keys. If NULL is provided, the inserted keys shall be directly stored, and no internal copy for keys shall be allocated.
[in]	val_cprm	A pointer to a copy-constructor and destructor pair to be used with values. If NULL is provided, the inserted keys shall be directly stored, and no internal copy for values shall be allocated.

See also:: dhash_new_simple;; dcprm_t.

Returns:: A pointer to the allocated hash table.

dhash_result_t dhash_search	(	dhash_t *	htab,
		const void *	key,
		void **	found
	)

Search for a value by providing a key.

Parameters:

[in]	htab	The hash table to search in;
[in]	key	The key to be searched;
[out]	found	A pointer to a location in which a pointer to the value will be stored. You may pass NULL as found parameter in order to simply test the presence of the required <key,value> pair.

Return values:

dhash_result_t::DHASH_FOUND	on success;
dhash_result_t::DHASH_NOTFOUND	on lookup failure.

Note:: In case of success *found will point to the internally stored value, thus modification will result in internal modification. In case of lookup failure, the *found pointer won't be modified.

dhash_result_t dhash_search_default	(	dhash_t *	htab,
		const void *	key,
		void **	found,
		dcreate_cb_t	create,
		void *	ctx
	)

Search for a value by providing a key.

If it doesn't exist create it.

This function works as dhash_search(), but also allows to specify a callback to be called when the object you are searching cannot be found. The callback shall return an object which will be directly inserted as value.

Parameters:

[in]	htab	The table in which insertion must be achieved;
[in]	key	Key for the inserted value;
[out]	found	The value to be inserted.
[in]	create	The callback for object creation;
[in]	ctx	The context for the callback.

Return values:

dhash_result_t::DHASH_FOUND	if the element has been found;
dhash_result_t::DHASH_NOTFOUND	if the element has not been found and the new element has been allocated.

See also:: dcreate_cb_t.

void* dhash_val ( dhash_pair_t * p )

Given a dhash_pair_t element, returns the corresponding value.

This is used to extract the value from pairs yielded by the iterator. The returned value is internal to the table, and unlike for the key, modifying it is allowed (it will result in changing the stored value).

See also:: Hash Tables contains a section with further information about iterator semantics for hash tables.

Parameters:

[in] p The <key,value> pair.

Returns:: The value.

Typedefs

Enumerations

Functions

Detailed Description

Allocation and deallocation:

Example using integers as keys

Using strings as keys

Iterators for hash tables:

Typedef Documentation

Enumeration Type Documentation

Function Documentation