Hashing: Hash Table

A hash tableA hash table (sometimes also called a hash map) is a data structure that is useful in situations when: you need very quick insertion and retrieval, you very rarely need to iterate over all your items, you don't care what order the items are stored (they don't need to be sorted, etc.), and the data you want to store has... (sometimes also called a hash mapA hash table (sometimes also called a hash map) is a data structure that is useful in situations when: you need very quick insertion and retrieval, you very rarely need to iterate over all your items, you don't care what order the items are stored (they don't need to be sorted, etc.), and the data you want to store has...) is a data structure that is useful in situations when:

  • you need very quick insertion and retrieval,
  • you very rarely need to iterate over all your items,
  • you don’t care what order the items are stored (they don’t need to be sorted, etc.),
  • and the data you want to store has a unique attribute (an ID number, a username, an email address, etc.)

Underlying a hash table is an array. The array should be about twice the size of your data (but don’t worry, if your data grows, your hash table can grow, too).

Based on your data, you will choose a hash functionA hash function exists within the context of hashing and hash tables. Data to be stored in hash tables includes a unique attribute (id number, username, etc.). Hash functions take the unique attribute and translate it into a hash index. Most algorithms take a numerical value as input, so if your unique attribute is text-based, translate your text into a.... Hash functions take the unique attribute for your data point and translate it into an index into the array. The simplest hash function called the “remainder method”: 1) turn the unique attribute into a numerical value (perhaps adding ASCII values if it is a string) and 2) perform a modulus operation on that numerical value by the size of the array. So if your unique attribute had a numerical value of 214 and your array had a size of 23, you would perform 214 % 23, which equals 7. The number 7 becomes your hash indexA hash index is an index into the underlying array that supports a hash table. A hash index is determined by a hash function.....

Now you’ll look into your array and try to store your data at index 7. If the slot at index 7 is empty, then store your data in the slot, and you’re done! If index 7 is already occupied, you’ll need to employ a collision functionA collision function exists within the context of hashing and hash tables. Collision functions determine how to handle situations when the array underlying the hash table already has another datapoint in the slot chosen by the hash function. Here are some of the most popular collision functions: Linear Probing - Linear probing simply increments the hash index each time it.... The simplest collision function is called “linear probing”. Basically, linear probingA collision function exists within the context of hashing and hash tables. Collision functions determine how to handle situations when the array underlying the hash table already has another datapoint in the slot chosen by the hash function. Here are some of the most popular collision functions: Linear Probing - Linear probing simply increments the hash index each time it... dictates that if slot 7 is occupied, try slot 8, then slot 9, slot 10, etc. until you find an empty slot. When you find an empty slot, you’ll store your item there.

When you want to retrieve an item (based on it’s unique attribute), you’ll perform the hash function to find the hash index, then you’ll follow the directives of the collision function until the item is found. If an empty slot is encountered before the item is found, the item does not exist in the hash table, and the search function should return false.

Posted in