Home | Projects | Notes > Data Structures & Algorithms > Hash Tables

Hash Tables

Why is Map Necessary?

In storing key-value pairs,

Arrays limit the key to integer values which will be used as an index.


4
1
a1[1] = 3;
2
a2[3] = "three";
3
a3[0] = 'a';
4
...

Maps can store any pairs of any types.


4
1
m1["three"] = 3
2
m2[5] = 2
3
m3['A'] = 'a'
4
...

C++ Built-in Maps

std::map (header: <map>)
- Implemented using binary search tree (BST). Due to the nature of the BST, it is ordered map.
- Iterator will iterate through the elements based on key order.
- The time complexity of all the functions is O(log n).
std::unordered_map (header: <unordered_map>)
- Implemented using hash table.
- Keys are unique.
- It is NOT guaranteed that the iterator will iterate through the elements based on key order.
- Time complexity of all the functions is O(1).
- Functions: insert(), at(), size(), erase(), etc.

Hash Tables (Hash Maps)

basic-concept-of-a-hash-table

Hash tables support one of the most efficient types of searching: hashing.
A hash table consists of an array (a.k.a. bucket array) in which data is accessed via a special index called key.
The primary idea behind a hash table is to establish a mapping between the set of all possible keys and positions in the array using a hash function.

A hash function accepts a key and returns its hash coding, or hash value.

Keys vary in type, but hash codings are always integers.

Two components of hash functions:

Hash code (converts a key to an integer)


2
1
IF key = 'A', hash code = convert character to ascii
2
THEN h('A') = 97

Compression function (scales the converted integer to a valid index of the bucket array)


3
1
IF bucket array can contain 26 entries
2
THEN good compression function can be '% 26' since it scales whatever the hash
3
value is to an integer (hash coding, or hash value) ranging 0-25.

A good hash function produces wide-spread hash values as randomly as possible within the valid range of integers. Some well-known hash functions:

Base-26


13
1
// e.g., A hash function which gets passed a 4-character string and returns a
2
// hash index as an int. Assume the table size is defined as 'ts'.
3

4
int hashBase26(char *key, int ts)
5
{
6
    int index;
7
    index = ((key[0] - 'A' + 1) * pow(26,3) + 
8
            ((key[1] - 'A' + 1) * pow(26,2) + 
9
            ((key[2] - 'A' + 1) * pow(26,1) + 
10
            ((key[3] - 'A' + 1) * pow(26,0) + 
11

12
    return index % ts;
13
}

Folding


10
1
// e.g., A hash function using Folding
2

3
int hashFolding(char *key, int ts)
4
{
5
    int index;
6
    index = ((key[0] - 'A' + 1) * (key[1] - 'A' + 1)) + 
7
            ((key[2] - 'A' + 1) * (key[3] - 'A' + 1));  
8

9
    return index % ts;
10
}

Middle Squaring


9
1
// e.g., A hash function using Middle Squaring
2

3
int hashMiddleSquaring(char *key, int ts)
4
{
5
    int index;
6
    index = ((key[1] - 'A' + 1) * (key[2] - 'A' + 1));
7

8
    return index % ts;
9
}

Division Method
A good general approach:
- Select a table size that is a prime number
- Translate the key into an integer
- Divide Key / M = Q with remainder R
- Use R as a hash value
- Use max(1, Q) as the probe in/decrement auxiliary hash function (double hashing)

Collision Handling Techniques

Closed Addressing (Closed Hashing)

A key is always stored in the bucket it is hashed to.
Collisions are handled by using separate data structure on a per-bucket basis.
Arbitrary number of keys per bucket.
e.g., Chained Hash Table (Separate Chaining)
- Consists of an array of linked lists.
- Each list forms a bucket in which we place all elements hashing to a specific position in the array.
- Performs as good as open addressing techniques.

concept-of-chained-hash-table

Open Addressing (Open Hashing)

At most one key per bucket
$h(k,i) = h(k) + f(i)$ where,
- $k$ is the key
- $i$ is the number of attempts starting at 0
- $h(k)$ is an auxiliary hash function
- $f(i)$ $f(0) = 0$ ) that is responsible for distributing the keys upon collision
  $f(i)$ ) are:
  linear probing
  - $h(k, i) = h(k) + i % n$ $n$ is the table size
  - Although it is simple to implement, it is prone to primary clustering (cluster developed due to the local popular index).
  quadratic probing
  - $h(k, i) = h(k) + i^2 % n$ $n$ is the table size
  - This alleviates the primary clustering to some extent.
  double hashing
  - $h(k, i) = h_1(k) + h_2(k)$
  - Two auxiliary hash functions should be carefully chosen so that the elements are distributed in a uniform and random manner.

Implementation (C++) - Chained Hash Table

Header (`chtable.hpp`)


x
1
#ifndef CHTABLE_HPP
2
#define CHTABLE_HPP
3

4
#include <vector>
5
#include <list>
6
#include <string>
7

8
class pair 
9
{
10
public:
11
    pair(const std::string &key, const int val) : k(key), v(val) {}
12

13
    std::string k;
14
    int v;
15
};
16

17
class chtable
18
{
19
public:
20
    chtable() : bucket_arr(NUM_OF_BUCKETS) {}
21
    void insert(const std::string &key, const int val);
22
    bool get(const std::string &key, int &val_out) const;
23
    bool contains(const std::string &key) const;
24
    void remove(const std::string &key);
25
    void print() const;
26

27
private:
28
    static const int NUM_OF_BUCKETS = 11; // prime number is good
29
    std::vector<std::list<pair>> bucket_arr;
30

31
    int hash(const std::string &key) const;
32
};
33

34
#endif

Source (`chtable.cpp`)


85
1
#include "chtable.hpp"
2
#include <iostream>
3

4
void chtable::insert(const std::string &key, const int val)
5
{
6
    int i = hash(key);
7

8
    for (pair &p : bucket_arr[i])
9
    {
10
        if (p.k == key)
11
        {
12
            p.v = val; // update value
13
            return;
14
        }
15
    }
16

17
    // NOTE: here push_back(key, val) won't work!
18
    bucket_arr[i].emplace_back(key, val); // insert a new pair
19
}
20

21
bool chtable::get(const std::string &key, int &val_out) const
22
{
23
    int i = hash(key);
24

25
    for (const pair &p : bucket_arr[i])
26
    {
27
        if (p.k == key)
28
        {
29
            val_out = p.v;
30
            return true;
31
        }
32
    }
33

34
    // key not found
35
    return false;
36
}
37

38
bool chtable::contains(const std::string &key) const
39
{
40
    int dummy;
41
    return get(key, dummy);
42
}
43

44
void chtable::remove(const std::string &key)
45
{
46
    int i = hash(key);
47
    auto &bucket = bucket_arr[i];
48

49
    for (auto it = bucket.begin(); it != bucket.end(); ++it)
50
    {
51
        if (it->k == key)
52
        {
53
            bucket.erase(it);
54
            return;
55
        }
56
    }
57
}
58

59
void chtable::print() const
60
{
61
    for (int i = 0; i < NUM_OF_BUCKETS; ++i)
62
    {
63
        std::cout << i << ": ";
64

65
        for (const pair &p : bucket_arr[i])
66
        {
67
            std::cout << "{" << p.k << ", " << p.v << "} ";
68
        }
69
        
70
        std::cout << std::endl;
71
    }
72
}
73

74
// private function
75
int chtable::hash(const std::string &key) const
76
{
77
    int sum = 0;
78

79
    for (char c : key)
80
    {
81
        sum += c; // add ascii values
82
    }
83

84
    return sum % NUM_OF_BUCKETS;
85
}

L19: Revisit why you can't use push_back(key, val) there!

Test Driver


29
1
#include <iostream>
2
#include "chtable.hpp"
3

4
int main(int argc, char *argv[])
5
{
6
    chtable ht;
7

8
    ht.insert("apple", 10);
9
    ht.insert("banana", 20);
10
    ht.insert("grape", 30);
11
    ht.insert("banana", 25); // update
12

13
    ht.print();
14
    std::cout << std::endl;
15

16
    int value;
17
    if (ht.get("banana", value))
18
    {
19
        std::cout << "banana -> " << value << std::endl;
20
    }
21

22
    std::cout << std::endl;
23

24
    ht.remove("apple");
25
    std::cout << "After removing 'apple':" << std::endl;
26
    ht.print();
27

28
    return 0;
29
}


26
1
0:
2
1:
3
2: {apple, 10}
4
3:
5
4: {banana, 25}
6
5:
7
6:
8
7:
9
8:
10
9:
11
10: {grape, 30}
12

13
banana -> 25
14

15
After removing 'apple':
16
0:
17
1:
18
2:
19
3:
20
4: {banana, 25}
21
5:
22
6:
23
7:
24
8:
25
9:
26
10: {grape, 30}

Chained Hash Table 1 (C++)

Interface


49
1
//==============================================================================
2
// File     : hash_table.h
3
// Brief    : Interface for Hash Table
4
// Author   : Kyungjae Lee
5
// Date     : Jul 31, 2023
6
//
7
// Note     : It is known that when the number of bucket array elements is a 
8
//            prime number, the (key, value) pairs are distributed more 
9
//            randomely (i.e., less collision).
10
//==============================================================================
11

12
#ifndef HASH_TABLE_H
13
#define HASH_TABLE_H
14

15
#include <iostream>
16
#include <string>
17
#include <vector>
18

19
using namespace std;
20

21
// Class for hash table nodes
22
class Node
23
{
24
public:
25
    string key;
26
    int value;
27
    Node *next;
28

29
    Node(string key, int value);    // Constructor
30
};
31

32
// Class for hash table 
33
class HashTable
34
{
35
public:
36
    HashTable(int size);        // Constructor
37
    ~HashTable(void);           // Destructor
38
    void printHashTable(void);  // Prints all the nodes in the hash table
39
    void insert(string key, int value); // Insert a node into the hash table
40
    int lookup(string key);     // Looks up a value by key from the hash table
41
    vector<string> keys(void);  // Returns a vector of keys in the hash table
42
    int hash(string key);       // Hash function
43

44
private:
45
    int bucketArrSize;          // Bucket array size (prime number recommended)
46
    vector<Node *> bucketArr;   // Bucket array
47
};
48

49
#endif // HASH_TABLE_H

Implementation


164
1
//==============================================================================
2
// File     : hash_table.cpp
3
// Brief    : Implementation of Hash Table
4
// Author   : Kyungjae Lee
5
// Date     : Jul 31, 2023
6
//
7
// Note     : It is known that when the number of bucket array elements is a 
8
//            prime number, the (key, value) pairs are distributed more 
9
//            randomely (i.e., less collision).
10
//==============================================================================
11

12
#include "hash_table.h"
13

14
using namespace std;
15

16
//------------------------------------------------------------------------------
17
// Implementation of Node class interface
18
//------------------------------------------------------------------------------
19

20
// Constructor
21
// T = O(1)
22
Node::Node(string key, int value)
23
{
24
    this->key = key;
25
    this->value = value;
26
    next = nullptr;
27
} // End of Node class constructor
28

29
//------------------------------------------------------------------------------
30
// Implementation of Hash Table class interface
31
//------------------------------------------------------------------------------
32

33
// Constructor
34
// T = O(1)
35
HashTable::HashTable(int size)
36
{
37
    bucketArrSize = size;
38
    bucketArr.reserve(size);    // Size of prime number recommended
39
} // End of Hash Table class constructor
40

41
// Destructor
42
// T = O(n)
43
HashTable::~HashTable(void)
44
{
45
    // Destroy all the nodes
46
    Node *delNode, *head;
47

48
    for (int i = 0; i < bucketArrSize; i++)
49
    {
50
        delNode = bucketArr[i];
51
        head = bucketArr[i];
52

53
        while (head)
54
        {
55
            head = head->next;
56
            delete delNode;
57
            delNode = head;
58
        }
59
    }
60

61
    // Destory the bucket array
62
    bucketArrSize.~vector();
63
} // End of Desctructor
64

65
// Prints all the nodes in the hash table
66
// T = O(n)
67
void HashTable::printHashTable(void)
68
{
69
    for (int i = 0; i < bucketArrSize; i++)
70
    {
71
        cout << i << ": ";
72
        if (bucketArr[i])
73
        {
74
            Node *temp = bucketArr[i];
75
            while (temp)
76
            {
77
                cout << "(" << temp->key << ", " << temp->value << "} ";
78
                temp = temp->next;
79
            }
80
        }
81
        cout << endl;
82
    }
83
} // End of printHashTable
84

85
// Inserts a node into the hash table (Closed addressing; chaining)
86
// T = O(1)
87
void HashTable::insert(string key, int value)
88
{
89
    int index = hash(key);
90
    Node *newNode = new Node(key, value);
91
    
92
    if (bucketArr[index] == nullptr)
93
    {
94
        // No collision detected, so simply insert the new node
95
        bucketArr[index] = newNode;
96
    }
97
    else
98
    {
99
        // Collision detected
100
        Node *temp = bucketArr[index];
101

102
        // Find the last node of the chain
103
        while (temp->next != nullptr)
104
            temp = temp->next;
105

106
        // Insert the new node after the last node in the chain
107
        temp->next = newNode;
108
    }
109
} // End of insert
110

111
// Looks up a value by key from the hash table
112
// T = O(1)
113
int HashTable::lookup(string key)
114
{
115
    int index = hash(key);
116
    Node *temp = bucketArr[index];
117
    
118
    // Search the key
119
    while (temp)
120
    {
121
        if (temp->key == key)
122
            return temp->value;
123
        
124
        temp = temp->next;
125
    }
126
    
127
    // Key is not found, return 0
128
    return 0;
129
} // End of lookup
130

131
// Returns a vector of keys in the hash table
132
// T = O(n)
133
vector<string> HashTable::keys()
134
{
135
    vector<string> allKeys;
136
    for (int i = 0; i < bucketArrSize; i++)
137
    {
138
        Node *temp = bucketArr[i];
139

140
        while (temp)
141
        {
142
            allKeys.push_back(temp->key);
143
            temp = temp->next;
144
        }
145
    }
146

147
    return allKeys;
148
} // End of keys
149

150
// Hash function
151
int HashTable::hash(string key)
152
{
153
    int hash = 0;
154
    for (int i = 0; i < key.length(); i++)
155
    {
156
        int ascii = key[i];
157

158
        // Multiply by 23, a prime number, to make the result more random
159
        // Divide by bucketArrSize to normalize the result
160
        hash = (hash + ascii * 23) % bucketArrSize;
161
    }
162

163
    return hash;
164
} // End of hash

Test Driver


45
1
//==============================================================================
2
// File     : main.cpp
3
// Brief    : Test driver for Hash Table
4
// Author   : Kyungjae Lee
5
// Date     : Jul 31, 2023
6
//==============================================================================
7

8
#include <iostream>
9
#include "hash_table.h"
10

11
using namespace std;
12

13
int main(int argc, char *argv[])
14
{
15
    // Create a hash table
16
    HashTable *ht = new HashTable(7);
17
    
18
    // Insert elements
19
    ht->insert("nails", 100);
20
    ht->insert("tile", 50);
21
    ht->insert("lumber", 80);
22
    ht->insert("bolts", 200);
23
    ht->insert("screws", 140);
24

25
    // Print hash table
26
    ht->printHashTable();
27

28
    // Create a vector of keys present in the hash table
29
    vector<string> allKeys = ht->keys();
30

31
    cout << endl;
32

33
    // Print the keys
34
    for (auto key : allKeys)
35
        cout << key << " ";
36

37
    cout << endl;
38

39
    cout << "tile: " << ht->lookup("tile") << endl;
40

41
    // Destroy the hash table
42
    delete ht;
43

44
    return 0;
45
}


10
1
0: 
2
1: 
3
2: 
4
3: (screws, 140} 
5
4: (bolts, 200} 
6
5: 
7
6: (nails, 100} (tile, 50} (lumber, 80} 
8

9
screws bolts nails tile lumber 
10
tile: 50

Chained Hash Table 2 (C++)

For the ease of implementation, key will be string type, and value will be template.

Interface


60
1
//==============================================================================
2
//  File    : chtbl.h
3
//  Brief   : Interface for chained hash table (in C++)
4
//  Author  : Kyungjae Lee
5
//  Date    : Dec 23, 2022
6
//==============================================================================
7

8
#ifndef CHTBL_H
9
#define CHTBL_H
10

11
#define LOAD_FACTOR_THRESHOLD 0.7
12

13
using namespace std;
14

15
// Forward declaration
16
template <typename T> class CHTbl;
17

18
// Chained Hash Table Node class
19
template <typename T>
20
class CHTblNode
21
{
22
public:
23
    CHTblNode(string key, T value);     // parameterized constructor                 
24
    ~CHTblNode();                       // parameterized constructor                 
25

26
    // friend declaration
27
    template <typename> friend class CHTbl;
28
    
29
private:
30
    string key;
31
    T value;
32
    CHTblNode *next;
33
};
34

35
// Chained Hash Table
36
template <typename T>
37
class CHTbl
38
{
39
public:
40
    // function
41
    CHTbl();                        // default constructor
42
    ~CHTbl();                       // destructor
43
    int size();                     // retrieve the total number of entries in the CHTbl     
44
    void insert(string key, T value);   // insert an entry
45
    T remove(string key);           // remove an entry by key
46
    T getValue(string key);         // retrieve the value corresponding to the given key
47
    double getLoadFactor();         // retrieve the current load factor
48
    
49
private:
50
    // variable
51
    CHTblNode<T> **bucketArr;       // bucket array
52
    int buckets;                    // max number of elements in the bucket array
53
    int count;                      // total number of entries in the CHTbl  
54

55
    // function
56
    int hash(string key);           // convert key into the bucketArr index (hash value)
57
    void rehash();                  // transfer entries to the resized bucket array
58
};
59

60
#endif // CHTBL_H

Implementation


247
1
//==============================================================================
2
//  File    : chtbl.c
3
//  Brief   : Implementation of chained hash table (in C++)
4
//  Author  : Kyungjae Lee
5
//  Date    : Dec 23, 2022
6
//==============================================================================
7

8
#include <iostream>
9

10
// CHTblNode -------------------------------------------------------------------
11

12
// Parameterized constructor
13
// T = O(1)
14
template <typename T>
15
CHTblNode<T>::CHTblNode(string key, T value)
16
{
17
    this->key = key;
18
    this->value = value;
19
    next = nullptr;
20
} // end parameterized constructor
21

22
// destructor
23
// T = O(1)
24
template <typename T>
25
CHTblNode<T>::~CHTblNode()
26
{
27
    delete next;                // recursively destroy the list
28
} // end destructor
29

30
// CHTbl -----------------------------------------------------------------------
31

32
// Default constructor
33
// T = O(1)
34
template <typename T>
35
CHTbl<T>::CHTbl()
36
{
37
    count = 0;
38
    buckets = 5;                // set the initial number of buckets
39
    bucketArr = new CHTblNode<T>*[buckets];
40
} // end default constructor
41

42
// destructor
43
// T = O(n)
44
template <typename T>
45
CHTbl<T>::~CHTbl()
46
{
47
    // traverse each element of the bucketArr and delete lists associated with it
48
    // (just deleting the bucketArr won't delete these lists)
49
    for (int i = 0; i < buckets; i++)
50
        delete bucketArr[i];
51

52
    delete[] bucketArr;                 // delete the bucketArr
53
} // end destructor
54

55
// Retrieve the total number of entries in the CHTbl     
56
// T = O(1)
57
template <typename T>
58
int CHTbl<T>::size()                     
59
{
60
    return count;
61
} // end size
62

63
// Insert an entry
64
// T = O(n/b) on average ; // T = O(1)
65
//  - n: the total number of entries, b: the number of buckets, n/b: load factor
66
//     - If we could keep n/b < 1, then T = O(1)
67
//       - If this is the case, getValue and delete function will also run at O(1).
68
//     - One way is to implement the number of buckets doubles the size every time the 
69
//       load factor exceeds the properly chosen threshold. (e.g., n/b < 0.7)
70
//       - When this happens, you need to 'rehash' all the exsisting keys because the hash
71
//         value is dependent on the number of buckets. 
72
//  - Hash function runs at O(l) where l is the length of the key, but it is ignored
73
//    because l << n.
74
//  - In the worst case, where the hash function is really bad such that it produces only
75
//    one hash value for all the given keys, insert function will run at T = O(n). (i.e.,
76
//    inserting and entry at the end of the list that contains all the n entries)
77
//    But, this is unlikely to happen with good hash functions that are proven to work.
78
template <typename T>
79
void CHTbl<T>::insert(string key, T value)
80
{
81
    int bucketIndex = hash(key);
82
    CHTblNode<T> *curr = bucketArr[bucketIndex];        // pointer to traverse the list
83

84
    // search the list for the key
85
    while (curr)
86
    {
87
        // if the key was found
88
        if (curr->key == key)
89
        {
90
            curr->value = value;        // just update the value
91
            return;
92
        }
93

94
        curr = curr->next;
95
    }
96

97
    // if key was not found insert a new node
98
    CHTblNode<T> *newNode = new CHTblNode(key, value);  // create a node
99
    newNode->next = bucketArr[bucketIndex];             // insert at the head of the list
100
    bucketArr[bucketIndex] = newNode;                   // update the head
101
    count++;                                            // increment count
102
    double loadFactor = (1.0 * count) / buckets;        // update the load factor
103

104
    // if the load factor becomes greater than LOAD_FACTOR_THRESHOLD
105
    if (loadFactor > LOAD_FACTOR_THRESHOLD)
106
        rehash();
107
} // end insert
108

109
// Remove an entry by key
110
// T = O(1)
111
template <typename T>
112
T CHTbl<T>::remove(string key)
113
{
114
    int bucketIndex = hash(key);
115
    CHTblNode<T> *curr = bucketArr[bucketIndex];        // pointer to traverse the list
116
    CHTblNode<T> *prev = nullptr;                       // pointer to traverse the list
117

118
    // search the list for the key
119
    while (curr)
120
    {
121
        // if the key was found
122
        if (curr->key == key)
123
        {
124
            // remove the head from the list
125
            if (prev == nullptr)
126
                bucketArr[bucketIndex] = curr->next;
127
            // remove the node other than the head from the list
128
            else
129
                prev->next = curr->next;
130

131
            // delete the removed node
132
            T value = curr->value;      // set aside the value to return
133
            curr->next = nullptr;       // isolate curr to prevent recursive deletion
134
            delete curr;                // deallocate memory space pointed at by curr
135
            count--;                    // decrement count
136
            
137
            return value;
138
        }
139
        
140
        prev = curr;                    // advance prev
141
        curr = curr->next;              // advance curr
142
    }
143

144
    return 0;                           // '0' as a cross-type signal for key not found
145
} // end remove
146

147
// Retrieve the value corresponding to the given key
148
// T = O(1)
149
template <typename T>
150
T CHTbl<T>::getValue(string key)
151
{
152
    int bucketIndex = hash(key);
153
    CHTblNode<T> *curr = bucketArr[bucketIndex];        // pointer to traverse the list
154

155
    // search the list for the key
156
    while (curr)
157
    {
158
        // if the key was found
159
        if (curr->key == key)
160
            return curr->value;
161

162
        curr = curr->next;              // advance curr
163
    }
164

165
    return 0;                           // '0' as a cross-type singal for key not found
166
} // end getValue
167

168
// Hash function - Convert key into bucketArr index (hash value)
169
// T = O(l) where l is the length of the key
170
template <typename T>
171
int CHTbl<T>::hash(string key)
172
{
173
    int hashCode = 0;
174
    int base = 37;
175
    int place = 1;          // b^0 (ones') place, b^1s' place, b^2s' place and so on
176

177
    // step1: hash code (convert a key to an integer)
178

179
    // base-37
180
    // "abc" -> ('a' * 37^0) + ('b' * 37^1) + ('c' * 37^2)
181
    for (int i = 0; i < key.size(); i++)
182
    {
183
        hashCode += key[i] * place;
184
        place *= base;
185
        
186
        // Since place and hashCode grow fast, let's apply the property of modulo
187
        // operation and stay away from int overflow.
188
        // e.g., (a*b)%m = [(a%m)*(b%m)]%m
189
        hashCode %= buckets;
190
        base %= buckets;
191
    }
192

193
    // setp 2: compression (scale the converted integer to a valid bucketArr index)
194
    return hashCode % buckets;
195
    
196
} // end hash
197

198
// Transfer entries to the resized bucket array
199
// T = O(n)
200
template <typename T>
201
void CHTbl<T>::rehash()
202
{
203
    // create a new bucket array with double the size of the original one
204
    CHTblNode<T> **oldBucketArr = bucketArr;
205
    bucketArr = new CHTblNode<T>*[2 * buckets];
206

207
    // initialize the new bucket array
208
    for (int i = 0; i < 2 * buckets; i++)
209
        bucketArr[i] = nullptr;
210

211
    // update the number of buckets
212
    int oldBuckets = buckets;
213
    buckets *= 2;
214

215
    // reset count
216
    count = 0;
217

218
    // transfer the entries in the old bucket array to the new bucket array
219
    for (int i = 0; i < oldBuckets; i++)
220
    {
221
        CHTblNode<T> *curr = oldBucketArr[i]; // pointer to traverse the old bucket array
222

223
        while (curr)
224
        {
225
            string key = curr->key;
226
            T value = curr->value;
227
            insert(key, value);         // entry gets inserted into the new bucket array
228

229
            curr = curr->next;          // advance curr
230
        }
231
    }
232

233
    // deallocate the memory used for the lists associated the old bucket array
234
    for (int i = 0; i < oldBuckets; i++)
235
        delete oldBucketArr[i];         // recursively delete a list
236

237
    // deallocate the memory used for the old bucket array 
238
    delete[] oldBucketArr;
239
} // end rehash
240

241
// Retrieve the current load factor
242
// T = O(1)
243
template <typename T>
244
double CHTbl<T>::getLoadFactor()
245
{
246
    return (1.0 * count) / buckets;
247
} // end getLoadFactor

Test Driver


73
1
//==============================================================================
2
//  File    : chtbl_main.c
3
//  Brief   : Test driver for chained hash table (in C++)
4
//  Author  : Kyungjae Lee
5
//  Date    : Dec 23, 2022
6
//==============================================================================
7

8
#include <iostream>
9
#include "chtbl.h"
10

11
using namespace std;
12

13
int main(int argc, char *argv[])
14
{
15
    CHTbl<int> map;       
16

17
    cout << "Initial map info" << endl;
18
    cout << "size : " << map.size() << endl;
19
    cout << endl;
20

21

22
    // insert ("abc0", 1), ("abc1", 2), ("abc2", 3) ... , ("abc9", 10) into the map
23
    cout << "Insert (\"abc0\", 1), (\"abc1\", 2), ... , (\"abc9\", 10) "
24
        << "into the map" << endl;
25
        
26
    for (int i = 0; i < 10; i++)
27
    {
28
        // construct the key
29
        char c = '0' + i;
30
        string key = "abc";
31
        key = key + c;
32
        
33
        // construct the value
34
        int value = i + 1;
35

36
        // insert
37
        map.insert(key, value);
38

39
        // print the insertion status
40
        cout << "(" << key << ", " << value << ") inserted - ";
41
        cout << "load factor: " << map.getLoadFactor() << endl;
42
    }
43

44
    cout << "size : " << map.size() << endl;
45
    cout << endl;
46

47
    // remove entry
48
    cout << "Remove (\"abc1\", 2) and (\"abc6\", 7)" << endl;
49
    map.remove("abc1");
50
    map.remove("abc6");
51

52
    cout << "size : " << map.size() << endl;
53
    cout << endl;
54

55
    // printt the key, value pairs
56
    cout << "Print the key, value pairs (value of the removed entry : 0)" << endl;
57
    
58
    for (int i = 0; i < 10; i++)
59
    {
60
        // construct the key
61
        char c = '0' + i;
62
        string key = "abc";
63
        key = key + c;
64

65
        // print the key, value pairs
66
        cout << key << ", " << map.getValue(key) << endl;
67
    }
68

69
    cout << "size : " << map.size() << endl;
70
    cout << endl;
71

72
    return 0;
73
}


31
1
Initial map info
2
size : 0                                <---- initial buckets: 5
3

4
Insert ("abc0", 1), ("abc1", 2), ... , ("abc9", 10) into the map
5
(abc0, 1) inserted - load factor: 0.2
6
(abc1, 2) inserted - load factor: 0.4
7
(abc2, 3) inserted - load factor: 0.6
8
(abc3, 4) inserted - load factor: 0.4   <---- rehash, buckets: 10
9
(abc4, 5) inserted - load factor: 0.5
10
(abc5, 6) inserted - load factor: 0.6
11
(abc6, 7) inserted - load factor: 0.7
12
(abc7, 8) inserted - load factor: 0.4   <---- rehash, buckets: 20
13
(abc8, 9) inserted - load factor: 0.45
14
(abc9, 10) inserted - load factor: 0.5
15
size : 10
16

17
Remove ("abc1", 2) and ("abc6", 7)
18
size : 8
19

20
Print the key, value pairs (value of the removed entry : 0)
21
abc0, 1
22
abc1, 0     <---- removed entry
23
abc2, 3
24
abc3, 4
25
abc4, 5
26
abc5, 6
27
abc6, 0     <---- removed entry
28
abc7, 8
29
abc8, 9
30
abc9, 10
31
size : 8

Hash Tables

Why is Map Necessary?

C++ Built-in Maps

Hash Tables (Hash Maps)

Collision Handling Techniques

Closed Addressing (Closed Hashing)

Open Addressing (Open Hashing)

Implementation (C++) - Chained Hash Table

Header (chtable.hpp)

Source (chtable.cpp)

Test Driver

Chained Hash Table 1 (C++)

Interface

Implementation

Test Driver

Chained Hash Table 2 (C++)

Interface

Implementation

Test Driver

Header (`chtable.hpp`)

Source (`chtable.cpp`)