Static hashing

Static Hashing is another form of the hashing problem which allows users to perform lookups on a finalized dictionary set (all objects in the dictionary are final and not changing).

Usage [1]

Application

Since static hashing requires that the database, its objects and reference remain the same its applications are limited. Databases which contain information which changes rarely are also eligible as it would only require a full rehash of the entire database on rare occasion. Examples of this include sets of words and definitions of specific languages, sets of significant data for an organization's personnel, etc.

Perfect Hashing

Perfect hashing is a model of hashing in which any set of n elements can be stored in a hash table of equal size and can have lookups done in constant time. It was specifically discovered and discussed by Fredman, Komlos and Szemeredi (1984) and has therefore been nicknamed "FKS Hashing".[2]

FKS Hashing

FKS Hashing makes use of a hash table with two levels in which the top level contains n buckets which each contain their own hash table. FKS hashing requires that if collisions occur they must do so only on the top level.

Implementation

The top level contains a randomly created hash function, h(x), which fits within the constraints of a Carter and Wegman hash function - seen in Universal hashing. Having done so the top level shall contain n buckets labeled k1, k2, k2, ..., kn. Following this pattern, all of the buckets hold a hash table of size si and a respective hash function, hi(x). The hash function will be decided by setting si to ki2 and randomly going through functions until there are no collisions. This can be done in constant time.

Performance

Because there are n \choose 2 pairs of elements, of which have a probability of collision equal to 1/n, FKS hashing can expect to have strictly less than n/2 collisions. Based on this fact and that each h(x) was selected so that the number of collisions would be at most n/2, the size of each table on the lower level will be no greater than 2n.

References

  1. Daniel Roche (2013). SI486D: Randomness in Computing, Hashing Unit. United States Naval Academy, Computer Science Department.
  2. Michael Fredman, János Komlós, Endre Szemerédi (1984). Storing a Sparse Table with O(1) Worst Case Access Time. Journal of the ACM (Volume 31, Issue 3).
This article is issued from Wikipedia - version of the Sunday, May 24, 2015. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.