A hash table is a data structure that implements an dictionary abstract data type, and is used to map keys to values.
array (key)
┌───┐
0 │ ⊙─┼─► value
├───┤
1 │ ⊙─┼─► value
├───┤
2 │ │ A Hash table is implemented by using a large array.
├───┤ This is usually called the table's "backing array
3 │ │ because it backs the hash table.
├───┤
To place an item A 4 │ │
into the backing array, ├───┤
calculate a hash value. 5 │ │
├───┤
6 │ │
├╶╶╶│
┌──────┐ │ │
"cab" ──► │ hash │ │ │
│ func │ │ │
└───┼──┘ │ │ Multiple items might have the same hash value.
│ │ │ This is called a collision. There are 2 ways
│ │ │ to handle collisions: chaining and linear probing.
│ ├╶╶╶│
│ │ │ linked-list
│ ├───┤ ┌───────┬───┐ ┌─────────┬───┐
└──► 98 │ ⊙─┼──► │ "cab" │ ⊙─┼──┼─► "cat" │ ⊙─┼────►
the hash value is ├───┤ └───────┴───┘ └─────────┴───┘
then mod(%) with 99 │ │
the array size. └───┘
- Load Factor is the Number of Entries in the Hash Table divided by the size of the backing array. The higher the load factor, the greater the probability of collisions.
- A Load factor of 0.75 is generally considered a good point to double the size of the backing array.
- When you resize the backing array, you have to remap all the entries in the table.
- Deterministic
- Evenly distributed
- Should use all values of the input
To generate hash codes, we use polynomials. We set x to a prime
number, (e.g, 31, 53 or 101).