A hash table is a data structure that implements an dictionary abstract data type, and is used to map keys to values.
array (key) ┌───┐ 0 │ ⊙─┼─► value ├───┤ 1 │ ⊙─┼─► value ├───┤ 2 │ │ A Hash table is implemented by using a large array. ├───┤ This is usually called the table's "backing array 3 │ │ because it backs the hash table. ├───┤ To place an item A 4 │ │ into the backing array, ├───┤ calculate a hash value. 5 │ │ ├───┤ 6 │ │ ├╶╶╶│ ┌──────┐ │ │ "cab" ──► │ hash │ │ │ │ func │ │ │ └───┼──┘ │ │ Multiple items might have the same hash value. │ │ │ This is called a collision. There are 2 ways │ │ │ to handle collisions: chaining and linear probing. │ ├╶╶╶│ │ │ │ linked-list │ ├───┤ ┌───────┬───┐ ┌─────────┬───┐ └──► 98 │ ⊙─┼──► │ "cab" │ ⊙─┼──┼─► "cat" │ ⊙─┼────► the hash value is ├───┤ └───────┴───┘ └─────────┴───┘ then mod(%) with 99 │ │ the array size. └───┘
- Load Factor is the Number of Entries in the Hash Table divided by the size of the backing array. The higher the load factor, the greater the probability of collisions.
- A Load factor of 0.75 is generally considered a good point to double the size of the backing array.
- When you resize the backing array, you have to remap all the entries in the table.
- Deterministic
- Evenly distributed
- Should use all values of the input
To generate hash codes, we use polynomials. We set x
to a prime
number, (e.g, 31, 53 or 101).