Hashing and Consistent Hashing : Simplifying Data Management and Distribution

4 min readJul 15, 2024

In the world of computer science, hashing is a common technique used to manage and store data efficiently. Let’s dive into what hashing is and why consistent hashing is so important.

What is Hashing?

Hashing is like a magical way to quickly find and store data. Think of it as putting your keys in a specific drawer so you can find them easily later.

How it works:

Input Data: You have some data (like a key).
Hash Function: This data goes through a special function called a hash function.
Hash Value: The hash function converts the input data into a unique number called a hash value.
Storage: This hash value tells you exactly where to store or find your data in a database or memory.

Example: Imagine you have a list of names, and you want to store them in a way that you can quickly find them later. Using hashing, the name “Alice” might be turned into a number like 123. This number tells you where “Alice” is stored.

Use Cases of Hashing:

Fast Data Retrieval: Hashing helps you find data quickly without searching through everything.
Databases: Used to index and retrieve data efficiently.
Password Storage: Hashing stores passwords securely by converting them into hash values.

In a distributed system, the number of server nodes often changes due to scaling or unexpected failures. We can’t assume the number of nodes will always stay the same. If a node fails with a basic hashing approach, we’d need to rehash all the data because the mapping of keys depends on the number of nodes and their positions.

Issues in Hashing when no. of available locations/server nodes changes

Here comes the concept of Consistent hashing :

What is Consistent Hashing?

Consistent Hashing is a technique used in distributed systems to efficiently distribute data across multiple servers. It ensures that the system can handle changes like adding or removing servers without needing to reassign all the data.

Sure! Here’s an easy-to-understand explanation of consistent hashing:

What is Consistent Hashing?

How Consistent Hashing Works:

Hash Ring: Imagine a big circle (the hash ring) where both servers and data requests are placed. Each point on the circle is a possible position for a server or a data request.

2. Placing Servers on the Ring :

Servers are assigned positions on the circle using a hash function, which generates a unique number (position) for each server.
These positions are random and spread around the circle.

3. Placing Data Requests on the Ring : Data requests (like user queries or stored data) are also assigned positions on the circle using the same hash function.

4. Assigning Requests to Servers :

To determine which server handles a request, start at the request’s position and move clockwise around the circle.
The first server you encounter in this direction will handle the request.
If the request is beyond the last server’s position on the circle, it wraps around to the first server.

Server A,B,C & Request R1 on the HashRing

Server A is at position 10.
Server B is at position 50.
Server C is at position 80.
Request R1 is at position 5.
Request R1 is assigned to server A (the first server clockwise from R1’s position).

Handling Server Failures:

1. If a Server Fails:

Suppose server B (at position 50) fails.
Any requests that server B was handling will now go to the next server clockwise, which is server C.

2. Minimal Data Movement:

Only the data handled by the failed server (B) needs to be reassigned.
The rest of the data on the ring remains unaffected.

Why is Consistent Hashing Useful?

Scalability: Easily add or remove servers with minimal disruption.
Efficiency: Only a small portion of data is moved when servers are added or removed.
Reliability: System continues to function smoothly even if servers fail.

Conclusion

Consistent hashing is like a smart way of organizing a distributed system so that changes in the number of servers cause minimal disruption. It places both servers and data requests on a circle, ensuring that data is evenly distributed and easily reassigned when needed.

I hope this explanation helps you understand consistent hashing!

That’s it!. Feel free to follow me and share your thoughts on what else I can improve.

See you in the next part! 😊

Hashing and Consistent Hashing : Simplifying Data Management and Distribution

What is Hashing?

What is Consistent Hashing?

What is Consistent Hashing?

How Consistent Hashing Works:

Handling Server Failures:

Why is Consistent Hashing Useful?

Conclusion

Sign up to discover human stories that deepen your understanding of the world.

Free

Membership

Written by Jaspreet Singh Sodhi

No responses yet