Hash Tables tutorial (C#, C++, Java)


Hash Tables are a type of abstract data structure that are used to implement a type of associative array. That is, an array that stores data using key and value components like so: (Key, Value). Much like a dictionary or a map, hash tables store some type of data that has a key element used to retrieve data, and the data itself. When using a hash table, you specify an object that is used as a key, and the value that you want linked to that key. The key is then hashed, and the resulting hash code is used as the index at which the value is stored within the table. Later on when you go to get something from the hash table, if you already know the key associated with some data, it is quickly retrieved in constant time.

In simple terms, a hash table usually has the following functions:

  • Add(key, data): inserts a new entry into the hash table with its associated key, can return either bool or void. Insertion is done by a method of hashing (open, closed) and location in the table can be determined by a method of probing (linear, quadratic, double).
  • Retrieve(key): retrieves some data given a key, can return a reference or pointer to the data. Retrieval is done by
  • Remove(key): removes an entry from the hash table given a key.
  • Print(): prints entire hash table contents

In C#, you can easily use the given hash table by “using System.Collections;” and then declaring a new generic hash table structure like so: “Hashtable ht = new Hashtable();”. For this tutorial, because we are learning about how a hash table works, we are implementing this data structure from scratch.

See the Java and C++ implementations here.

Hashing techniques:

  • Closed: closed hashing means having an array of some size, each new entry will be placed in some open index in the array (to avoid collisions). When all spots in the array have been filled, no more new entries can be added. In other words, we never leave the hash table, and its always guaranteed that any new entry will be stored directly into the hash table.
  • Open: we have an array of some size, however, the index at which on object is stored in the hash table is not completely determined by its hash code. Instead, the index may vary depending on what’s already in the hash table. Open hashing also means the ability to leave the hash table and store entries in a linked list that starts at a particular index in an array, this is called “separate chaining”. Using this linked list technique is a good way of avoiding collisions when we insert new entries. [For example, if we have two keys 37 and 47, both entries end up with the same index in the array, they can be stored in a linked list.]

    • Separate chaining: all keys that map to the same hash value are kept in a linked list (or bucket). It does has its drawbacks, while it is a good way to resolve collisions, it has additional memory costs because of the linked list.

Open Hashing/Separate chaining illustration

Closed Hashing

Closed Hashing

 

 

 

 

 

 

 

 

 

 

 

Probing methods and Collisions:

No matter what the hash function, there is the possibility that two keys could resolve to the same hash key. This situation is known as a collision. There’s some ways to handle and prevent collisions.

  • Linear: The most simple of methods, inserting linearly yields the function: H(x, i) = (h(x) + i) % n, where h(x) = key % n and it is the starting value, “n” is the size of the table, and i is the step size. Linear probing means that the step size “i” is constant, so usually i = 1. In simpler terms, linear probing can be expressed as: newLocation = (startingValue + stepSize) % arraySize. Lastly, the result from performing the function will give the index value to insert into the hash table.
  • Quadratic: This method works just like the linear version, except the step size ‘i’ is squared, and it yields the function: H(x, i) = (h(x) + i^2) % n.
  • Double: double hashing works by have two functions that compute a hash value, and in the insertion, both values are used to hash a new entry. Double hashing yields the function: H(x,i) = (h1(x) + i * h2(x)) % n.
    • In the first hash function h1(x), we compute a hash value by simply taking the key and doing modulus on the size of the table like so: “return key % size”
    • In the second hash function h2(x), we compute the second hash function by taking a value (must be less than the table size, non-zero, and ideally odd/prime) and doing this calculation: value – (key % value).
    • In the insertion, we insert just like we did with linear and quadratic hashing, except this time, we use the double hashing function above.
Quadratic hashing/probing

Quadratic hashing/probing

Double probing/hashing

Double probing/hashing

Big-O Complexity:

  • Insert: O(1) best/average case, O(n) for separate chaining and worst case
  • Retrieve: O(1) best/average case, O(n) for separate chaining and worst case
  • Remove: O(1) best/average case, O(n) for separate chaining and worst case

What are the uses of a hash table?

  • Phone book/company database example: your phone stores contacts with a name, and number associated with a contact
  • Associative arrays, Dictionaries, Maps
  • Searching strings (such as frequent search keywords)
  • Internet routers: router tables that use IP addresses to store hundreds of gateways/packets need to be able to access the table quickly
  • Game data: such as keeping track of high scores, in game items a player can use, etc
  • And many more uses 🙂

Closed Hashing Implementation in C# (this has linear, quadratic, and double probing):

    class hashtable
    {
        class hashentry
        {
            int key;
            string data;
            public hashentry(int key, string data)
            {
                this.key = key;
                this.data = data;
            }
            public int getkey()
            {
                return key;
            }
            public string getdata()
            {
                return data;
            }
        }
        const int maxSize = 10; //our table size
        hashentry[] table;
        public hashtable()
        {
            table = new hashentry[maxSize];
            for (int i = 0; i < maxSize; i++)
            {
                table[i] = null;
            }
        }
        public string retrieve(int key)
        {
            int hash = key % maxSize;
            while (table[hash] != null && table[hash].getkey() != key)
            {
                hash = (hash + 1) % maxSize;
            }
            if (table[hash] == null)
            {
                return "nothing found!";
            }
            else
            {
                return table[hash].getdata();
            }
        }
        public void insert(int key, string data)
        {
            if (!checkOpenSpace())//if no open spaces available
            {
                Console.WriteLine("table is at full capacity!");
                return;
            }
            int hash = (key % maxSize);
            while (table[hash] != null && table[hash].getkey() != key)
            {
                hash = (hash + 1) % maxSize;
            }
            table[hash] = new hashentry(key, data);
        }
        private bool checkOpenSpace()//checks for open spaces in the table for insertion
        {
            bool isOpen = false;
            for (int i = 0; i < maxSize; i++)
            {
                if (table[i] == null)
                {
                    isOpen = true;
                }
            }
            return isOpen;
        }
        public bool remove(int key)
        {
            int hash = key % maxSize;
            while (table[hash] != null && table[hash].getkey() != key)
            {
                hash = (hash + 1) % maxSize;
            }
            if (table[hash] == null)
            {
                return false;
            }
            else
            {
                table[hash] = null;
                return true;
            }
        }
        public void print()
        {
            for (int i = 0; i < table.Length; i++)
            {
                if (table[i] == null && i <= maxSize)//if we have null entries
                {
                    continue;//dont print them, continue looping
                }
                else
                {
                    Console.WriteLine("{0}", table[i].getdata());
                }
            }
        }
        public void quadraticHashInsert(int key, string data)
        {
            //quadratic probing method
            if (!checkOpenSpace())//if no open spaces available
            {
                Console.WriteLine("table is at full capacity!");
                return;
            }

            int j = 0;
            int hash = key % maxSize;
            while(table[hash] != null && table[hash].getkey() != key)
            {
                j++;
                hash = (hash + j * j) % maxSize;
            }
            if (table[hash] == null)
            {
                table[hash] = new hashentry(key, data);
                return;
            }
        }
        public void doubleHashInsert(int key, string data)
        {
            if (!checkOpenSpace())//if no open spaces available
            {
                Console.WriteLine("table is at full capacity!");
                return;
            }

            //double probing method
            int hashVal = hash1(key);
            int stepSize = hash2(key);

            while(table[hashVal] != null && table[hashVal].getkey() != key)
            {
                hashVal = (hashVal + stepSize * hash2(key)) % maxSize;
            }
            table[hashVal] = new hashentry(key, data);
            return;
        }
        private int hash1(int key)
        {
            return key % maxSize;
        }
        private int hash2(int key)
        {
            //must be non-zero, less than array size, ideally odd
            return 5 - key % 5;
        }
    }
}

Open Hashing (Separate Chaining) Implementation in C#:

//implements a separate chaining hash table
class opentable 
{
    class hashnode
    {
        int key;
        string data;
        hashnode next;
        public hashnode(int key, string data)
        {
            this.key = key;
            this.data = data;
            next = null;
        }
        public int getkey()
        {
            return key;
        }
        public string getdata()
        {
            return data;
        }
        public void setNextNode(hashnode obj)
        {
            next = obj;
        }
        public hashnode getNextNode()
        {
            return this.next;
        }
    }

    hashnode[] table;
    const int size = 10;

    public opentable()
    {
        table = new hashnode[size];
        for (int i = 0; i < size; i++)
        {
            table[i] = null;
        }
    }
    public void insert(int key, string data)
    {
        hashnode nObj = new hashnode(key, data);
        int hash = key % size;
        while (table[hash] != null && table[hash].getkey() % size != key % size)
        {
            hash = (hash + 1) % size;
        }
        if (table[hash] != null && hash == table[hash].getkey() % size)
        {
            nObj.setNextNode(table[hash].getNextNode());
            table[hash].setNextNode(nObj);
            return;
        }
        else
        {
            table[hash] = nObj;
            return;
        }
    }
    public string retrieve(int key)
    {
        int hash = key % size;
        while (table[hash] != null && table[hash].getkey() % size != key % size)
        {
            hash = (hash + 1) % size;
        }
        hashnode current = table[hash];
        while (current.getkey() != key && current.getNextNode() != null)
        {
            current = current.getNextNode();
        }
        if (current.getkey() == key)
        {
            return current.getdata();
        }
        else
        {
            return "nothing found!";
        }
    }
    public void remove(int key)
    {
        int hash = key % size;
        while (table[hash] != null && table[hash].getkey() % size != key % size)
        {
            hash = (hash + 1) % size;
        }
        //a current node pointer used for traversal, currently points to the head
        hashnode current = table[hash];
        bool isRemoved = false;
        while (current != null)
        {
            if (current.getkey() == key)
            {
                table[hash] = current.getNextNode();
                Console.WriteLine("entry removed successfully!");
                isRemoved = true;
                break;
            }
            
            if (current.getNextNode() != null)
            {
                if (current.getNextNode().getkey() == key)
                {
                    hashnode newNext = current.getNextNode().getNextNode();
                    current.setNextNode(newNext);
                    Console.WriteLine("entry removed successfully!");
                    isRemoved = true;
                    break;
                }
                else
                {
                    current = current.getNextNode();
                }
            }
            
        }

        if (!isRemoved)
        {
            Console.WriteLine("nothing found to delete!");
            return;
        }
    }
    public void print()
    {
        hashnode current = null;
        for (int i = 0; i < size; i++)
        {
            current = table[i];
            while (current != null)
            {
                Console.Write(current.getdata() + " ");
                current = current.getNextNode();
            }
            Console.WriteLine();
        }
    }

}
Advertisements

13 thoughts on “Hash Tables tutorial (C#, C++, Java)

  1. Pingback: 09 Oracle SQL Tuning for Join Methods – Life Less Ordinary

  2. What’s the point of the parameter constructor in the open hash version? Why won’t the array of table classes initialize without it?

    =======================
    public opentable()
    {
    table = new hashnode[size];
    for (int i = 0; i < size; i++)
    {
    table[i] = null;
    }
    }
    =========================

    Like

  3. Thanks so much for writing this! A couple of questions:
    1. In the open chaining implementation for the Remove method, what if table[hash] contains the key you want to delete? I think you need to check for that first before checking getNextNode()

    while (current.getNextNode().getkey() != key && current.getNextNode() != null)
    {
    current = current.getNextNode();
    }
    if (current.getNextNode().getkey() == key)
    {
    current.deleteNode();
    Console.WriteLine(“entry removed successfully!”);
    return;
    }

    2. When you are deleting a node you found, you call DeleteNode() which does this.next = null. But what if you have a long linked list after it? If you just set the next to null, don’t you break the link to next.next?

    I believe this line: current.deleteNode();

    Should be: current.setNextNode(current.getNextNode().getNextNode());

    (Or something more elegant.)

    Liked by 1 person

    • Thank you for commenting! Yes, that is a much better solution than what I originally wrote. I will make changes to it. Its been a little while since I did work using hash tables.

      Like

      • Hi Karim, sorry to nitpick. About the new added lines, do you need that many lines of code? I think if you set table[hash] = current.getNextNode(); it should work just fine whether the next node is a node or null, right? What do you think?

        if (current.getNextNode() != null)
        {//deleting the head, find the successor
        table[hash] = current.getNextNode();
        Console.WriteLine(“entry removed successfully!”);
        isRemoved = true;
        break;
        }
        if(current.getNextNode() == null)
        {//case of deleting table[hash] as the only node in there
        table[hash] = null;
        Console.WriteLine(“entry removed successfully!”);
        isRemoved = true;
        break;
        }

        Liked by 1 person

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s