.ft 5 #include <dict.h> .ft 1.Ss "DICTIONARY TYPES"
.ft 5 Void_p; Dict_t; Dtlink_t; Dtstat_t; Dtdisc_t; Dtmethod_t; typedef Void_p (*Dtmake_f)(Void_p); typedef void (*Dtfree_f)(Void_p); typedef int (*Dtcompar_f)(char*, char*); typedef unsigned long (*Dthash_f)(char*); .ft 1.Ss "CREATING/CLOSING DICTIONARIES"
.ft 5 Dict_t* dtopen(Dtdisc_t* disc, Dtmethod_t method) int dtclose(Dict_t* dict) Dtlink_t* dtextract(Dict_t* dict) void dtreset(Dict_t* dict, Dtlink_t* link) .ft 1.Ss "DICTIONARY CONTROL"
.ft 5 int dtmethod(Dict_t* dict, Dtmethod_t method, int size) Dict_t* dtview(Dict_t* dict, Dict_t* view) int dtreorder(Dict_t* dict, Dtcompar_f comparf) .ft 1.Ss "OBJECT OPERATIONS"
.ft 5 Void_p dtinsert(Dict_t* dict, Void_p obj) Void_p dtdelete(Dict_t* dict, Void_p obj) Void_p dtsearch(Dict_t* dict, Void_p obj) Void_p dtfirst(Dict_t* dict) Void_p dtnext(Dict_t* dict, Void_p obj) Void_p dtlast(Dict_t* dict) Void_p dtprev(Dict_t* dict, Void_p obj) int dtwalk(Dict_t* dict, int (*userf)(Void_p)) Dtlink_t* dtflatten(Dict_t* dict) Dtlink_t* dtlink(Dict_t*, Dtlink_t* link) Void_p dtobj(Dict_t* dict, Dtlink_t* link) .ft 1.Ss "DICTIONARY STATUS"
.ft 5 int dtsize(Dict_t* dict) Dict_t* dtgetview(Dict_t* dict) Dtstat_t* dtstat(Dict_t* dict, int all) .ft 1.Ss "HASH FUNCTIONS"
.ft 5 #define dtcharhash(unsigned long h, unsigned char c) #define dtstrhash(unsigned long h, unsigned char* str, int n) .ft 1
libdict provides functions to manage run-time dictionaries. Hash tables are used for unordered objects and self-adjusting binary trees for ordered ones. A dictionary may switch between hash mode and order mode. It can also employ different orderings at different times. As an example of using this, consider an application that builds and outputs a list of words. During its construction, for fast search and elimination of duplications, the dictionary can be in hash mode. In this phase, a simple and fast string comparison method such as \f5strcmp(3) could be used. However, at output time, it may be desirable to have the words in, say, the lexicographic order. This can be done with libdict in two function calls, one to change the comparison function and one to turn to order mode.
.Ss "DICTIONARY TYPES"
.Ss " Void_p" This defines a suitable type to pass objects between libdict and the application code. \f5Void_p is defined as \f5void* for ANSI-C and C++ and \f5char* for other compilation environment. .Ss " Dict_t" .Ss " Dtlink_t" .Ss " Dtdisc_t" .Ss " Dtmethod_t" .Ss " typedef Void_p (*Dtmake_f)(Void_p)" .Ss " typedef void (*Dtfree_f)(Void_p)" .Ss " typedef int (*Dtcompar_f)(char*, char*)" .Ss " typedef unsigned long (*Dthash_f)(char*)" .Ss " Dtstat_t" \f5Dict_t is the type of a dictionary. See \f5dtopen() for detailed discussion of the types: \f5Dict_t, \f5Dtlink_t, \f5Dtdisc_t, \f5Dtmake_f, \f5Dtfree_f, \f5Dtcompar_f, and \f5Dthash_f. See \f5dtstat() for \f5Dtstat_t which give statistics on a dictionary.
.Ss "CREATING/CLOSING DICTIONARIES"
.Ss " Dict_t* dtopen(Dtdisc_t* disc, Dtmethod_t method)" This function creates a new dictionary. It requires two arguments: \f5disc and \f5method. \f5method can be \f5Dttree for order mode or \f5Dthash for hash mode. \f5disc specifies methods to create/delete/compare objects in the dictionary. A discipline structure is of the type \f5Dtdisc_t:
\f5
Dtdisc_t
{
int key, size;
Dtmake_f makef;
Dtfree_f freef;
Dtcompar_f; comparf;
Dthash_f hashf;
}
.Tp
\f5int key, size:
\f5key defines the offset from the starting address of a user's object
to the key and \f5size is the length of the key.
Keys are used in comparing objects
and in calculating their hash values (see \f5(*comparf)() and \f5(*hashf)()).
If \f5key is negative, the object's address is the key.
In this case, the \f5size field is ignored.
If \f5key is non-negative, the key is offset \f5key bytes
into the object. Let's call this address the key address.
The key itself is defined by the value of \f5size.
If \f5size is positive, the key is a \f5char array of \f5size bytes
starting at the key address.
If \f5size is zero, the key is a null-terminated string starting at the key address.
Finally, if \f5size is negative, the key is
a null-terminated string whose address is stored at the key address.
.Tp
\f5Void_p (*makef)(Void_p obj):
This makes a copy of the object \f5obj suitable for
insertion into the dictionary (\f5dtinsert()).
Within the library,
object holders of the type \f5Dtlink_t are created to hold such objects.
To save space, it is possible to create a dictionary where
object holder data are embedded in the objects themselves.
This is done by specifying \f5NULL for \f5(*makef)().
Then, an object passed to a dictionary function must be
declared with \f5Dtlink_t as the first field as follows:
.ft 5
struct user_struct
{
Dtlink_t link;
...application-specific fields
}
.ft 1
Note that when \f5Dtlink_t is embedded in an object,
the object cannot be in more than one dictionaries.
.Tp
\f5void (*freef)(Void_p obj):
If not \f5NULL, it is called when an object is being deleted (\f5dtdelete()).
Any space or information associated with \f5obj can be freed or erased at this time.
.Tp
\f5int (*comparf)(char* key1, char* key1):
This compares the keys of two given objects.
It should return \f5<0, \f5=0, or \f5>0 to indicate
whether \f5key1 is smaller, equal to, or larger than \f5key2.
When \f5(*comparf)() is \f5NULL, there are two cases.
If \f5key is negative,
objects are compared by their addresses.
If \f5key is non-negative,
object keys are compared
either as \f5char arrays of fixed size or null-terminated strings.
See \f5key and \f5size for details on key definition.
.Tp
\f5long (*hashf)(char* key):
This computes the hash value of
\f5key when the dictionary is in hash mode.
When \f5(*hashf)() \f5NULL, there are two cases.
If \f5key is negative, objects are hashed by their addresses.
If \f5key is non-negative, an object's key is hashed
either as \f5char arrays of fixed size or null-terminated strings.
See \f5key and \f5size for details on key definition.
.Ss " int dtclose(Dict_t* dict)"
This function deletes all objects in \f5dict,
then frees all space associated with it.
It is an error to close a dictionary that is being viewed by
some other dictionaries.
\f5dtclose() returns \f50 on success and \f5-1 on failure.
.Ss " Dtlink_t* dtextract(Dict_t* dict)"
.Ss " void dtreset(Dict_t* dict, Dtlink_t* link)"
These functions extract and reset the objects in the dictionary \f5dict.
They are useful in applications that wish to save space by sharing
a common \f5dict structure among many dictionaries.
Though both of these functions operate on either ordered or hashed dictionaries,
their time complexity depends on the dictionary type. On an ordered dictionary,
the time complexity is constant while on a hashed dictionary, the time complexity
is proportional to the number of objects involved.
\f5dtextract() returns the root of the tree of objects if the
dictionary is in ordered mode or a linked list of objects if the dictionary is
in hashed mode.
\f5dtreset() repopulates \f5dict with
objects previously obtained via \f5dtextract().
Note that dictionaries to be operated on by these functions
should not be on a view path of dictionaries (see \f5dtview()).
.Ss "DICTIONARY CONTROL"
.Ss " Dtmethod_t dtmethod(Dict_t* dict, Dtmethod_t method, int size)" This function changes the manipulation method of \f5dict to \f5method. If \f5method is \f5Dthash, the \f5size argument is used as follows. If \f5size is \f5>0, the hash table is fixed at this size. If \f5size is \f5<=0, the number of slots in the hash table is dynamically sized by libdict. \f5dtmethod() returns the previous method or \f5NULL on failure. .Ss " Dict_t* dtview(Dict_t* dict, Dict_t* view)" This function sets a view path from \f5dict to \f5view. This means that a search for an object in \f5dict or a walk through it will continue to \f5view and any dictionaries recursively viewed thereof. If \f5dict is currently viewing a dictionary, that view is terminated before the new view is set. Then, if \f5view is not \f5NULL, a new view from \f5dict to \f5view is attempted. Thus, \f5dtview(dict,(Dict_t*)0) can be used to terminate a view. An attempt to set a new view will fail if there is already a direct or indirect viewpath from \f5view to \f5dict or if the two dictionaries are not of the same type. Two dictionaries are of the same type if their discipline structures (\f5Dtdisc_t) have the same values for \f5key, \f5size, and \f5(*comparf)(). \f5dtview() returns \f5NULL on errors. On success, if \f5view is \f5NULL, \f5dtview() returns the dictionary whose view from \f5dict was ended. Otherwise, \f5dtview() returns \f5view. .Ss " int dtreorder(Dict_t* dict, Dtcompar_f comparf)" This function changes the comparison function of \f5dict to \f5(*comparf)(). This change is allowed only if \f5dict is not currently on a view path of dictionaries (see \f5dtview()). Any duplicates found by the new comparison function will be deleted (see \f5Dtfree_f). \f5dtreorder() returns \f50 on success and \f5-1 on error.
.Ss "OBJECT OPERATIONS"
.Ss " Void_p dtinsert(Dict_t* dict, Void_p obj)" This function inserts \f5obj (or a copy of it made by \f5(*makef)() when \f5(*makef)() is not \f5NULL) into the dictionary unless the dictionary already contains an object matching \f5obj. Note that object'as addresses are passed between libdict and the application code via the type \f5Void_p. \f5dtinsert() only operates on \f5dict even if \f5dict has a view on another dictionary (see \f5dtview()). It returns the address of the new or matching object or \f5NULL on failure. .Ss " Void_p dtdelete(Dict_t* dict, Void_p obj)" This function deletes the object matching \f5obj from the dictionary. If \f5obj is \f5NULL, all objects are deleted. This function only operates on \f5dict even if \f5dict has a view on another dictionary. It always returns \f5NULL. .Ss " Void_p dtsearch(Dict_t* dict, Void_p obj)" This function returns an object matching \f5obj in \f5dict or other dictionaries being viewed from \f5dict (see \f5dtview()). It returns \f5NULL for no match. .Ss " Void_p dtfirst(Dict_t* dict)" .Ss " Void_p dtnext(Dict_t* dict, Void_p obj)" \f5dtfirst() returns the first object in the dictionary. \f5dtnext() returns the object following \f5obj as defined by the dictionary ordering. If \f5obj is \f5NULL, \f5dtnext() is equivalent to \f5dtfirst(). Note that the ordering of objects in a hashed dictionary is not well-defined and may change dynamically on calls to \f5dtsearch() and \f5dtinsert().
A way to walk a dictionary is: \f5for(obj = dtfirst(dict); obj; obj = dtnext(dict,obj))
Note that if \f5dict has a view onto some other dictionaries (see \f5dtview()), the objects in these dictionaries will also be traversed during the loop. Objects will be traversed dictionary by dictionary and any duplications will be ignored. In addition, when a view path is involved, only one \f5for(;;) loop should be used. Nested loops may result in unexpected behaviors. See also \f5dtwalk() and \f5dtflatten() for alternative walking methods. .Ss " Void_p dtlast(Dict_t* dict)" .Ss " Void_p dtprev(Dict_t* dict, Void_p obj)" \f5dtlast() returns the last object in \f5dict or the last object in the bottom dictionary on a view path from \f5dict. \f5dtprev() returns the object preceding \f5obj (see also \f5dtnext()). If \f5obj is \f5NULL, \f5dtprev() is equivalent to \f5dtlast(). Similar to \f5dtfirst()/dtnext(), \f5dtlast()/dtprev() can be used to traverse a dictionary or a view path of dictionaries in reverse order. .Ss " dtwalk(Dict_t* dict, int (*userf)(Void_p obj))" This function calls \f5(*userf)(obj) on each object in their order. If \f5userf() returns a \f5<0 value, \f5dtwalk() terminates and returns the same value. Upon a successful traversal, \f5dtwalk() returns \f50. .Ss " Dtlink_t* dtflatten(Dict_t* dict)" .Ss " Dtlink_t* dtlink(Dict_t* dict, Dtlink_t* link)" .Ss " Void_p dtobj(Dict_t* dict, Dtlink_t* link)" Using \f5dtfirst()/dtnext() to walk dictionaries can incur large cost due to function calls. \f5dtflatten() and \f5dtlink() can be used to flatten the objects in \f5dict into a linked list and walk them as follows: \f5for(link = dtflatten(dict); link; link = dtlink(dict,link) )
Here, \f5dtlink() is a fast macro that returns the object in the linked list following \f5link. Note that the return value of \f5dtflatten() is of the type \f5Dtlink_t* not \f5Void_p. That is, it returns a dictionary object pointer not a user object pointer. The macro function \f5dtobj(dict,link) should be used to get at the user object associated with \f5link, If \f5dict is on a view path of dictionaries (see \f5dtview()), the flattened list of objects will also contain objects in other dictionaries viewable from \f5dict where repeated objects are ignored. The flattened object list will be unflattened on a subsequent dictionary operation on any of the involved dictionaries.
.Ss "DICTIONARY STATUS"
.Ss " Dict_t* dtsize(Dict_t* dict)" This function returns the current number of objects stored in \f5dict. .Ss " Dict_t* dtgetview(Dict_t* dict)" This function returns the dictionary immediately viewed from \f5dict if any. .Ss " Dtstat_t* dtstat(Dict_t *dict, int all)" This function returns the pointer to a static \f5Dtstat_t structure that reports statistics on the dictionary \f5dict. If \f5all is non-zero, all fields are filled. Otherwise, only the \f5dt_hash and \f5dt_size fields are filled. Following are the elements in \f5Dtstat_t: .Tp \f5int\ dt_hash: This is non-zero only if the dictionary is in hash mode. .Tp \f5int\ dt_size: This contains the number of objects in the dictionary. .Tp \f5int\ dt_n: In hashed mode, this is the number of non-empty chains in the hash table. In ordered mode, this is the maximum level number of the levels in the tree. Each level in the tree contains all nodes of equal distance from the root node. .Tp \f5int\ dt_max: In hashed mode, this contains the size of a largest chain. In ordered mode, this is the size of a largest level. .Tp \f5int*\ dt_count: In hashed mode, this is the list of counts for chains of particular sizes. For example, \f5dt_count[1] is the number of chains of size \f51. In ordered mode, this is the list of sizes of the levels. For example, \f5dt_count[1] is the size of level \f51.
.Ss "HASH FUNCTIONS"
.Ss " dtcharhash(ulong h, uchar c)" .Ss " dtstrhash(ulong h, uchar* str, int n)" These are macro functions to compute hash values from bytes or strings. They are useful for building customized hash functions. \f5dtcharhash() computes a new hash value given the byte \f5c and the current hash value accumulator \f5h. Since \f5dtcharhash() is a macro, \f5h must be passed as itself, not by its address. \f5dtstrhash() is a macro function to compute hash values for byte strings. If \f5n is negative, \f5str is considered a null-terminated string. Otherwise, \f5str is a string of length \f5n which may nor may not contain \f50's. As with \f5dtcharhash(), \f5h must be passed to \f5dtstrhash() as itself, not by its address. In this case, \f5dtstrhash() will initialize \f5h as appropriate.
When a dictionary is ordered, objects are stored in a Tarjan-Sleator splay tree. This data structure guarantees that each search, insert, or delete operation has O(logn) amortized time complexity where n is the total number of objects. In addition, searching all objects in order is O(n). The data structure is space economical as each object requires only two pointers for storage. When a dictionary is unordered, a hash table with chaining is used for fast access. In this case, the space overhead per object is slightly more but all access operations have O(1) averaged time complexity.
|
Hush Online Technology
hush@cs.vu.nl
09/24/99 |
|
|