Hashing Algorithm Overview: Types, Methodologies & Usage
A hashing algorithm is a mathematical function that transforms data of any size into a fixed-length value that reveals nothing readable about the original.
Hashing algorithms are one-way programs, so the text can’t be unscrambled and decoded by anyone else. And that’s the point. Hashing protects data at rest, so even if someone gains access to your server, the items stored there remain unreadable.
Hashing can also help you prove that data isn’t adjusted or altered after the author is finished with it. And some people use hashing to help them make sense of reams of data.
What Is a Hashing Algorithm?
Dozens of different hashing algorithms exist, and they all work a little differently. But in each one, people type in data, and the program alters it to a different form.
All hashing algorithms are:
- Mathematical. Strict rules underlie the work an algorithm does, and those rules can’t be broken or adjusted.
- Uniform. Choose one type of hashing algorithm, and data of any character count put through the system will emerge at a length predetermined by the program.
- Consistent. The same input always produces the same output. The algorithm does just one thing (compress data into a hash) and nothing else.
- One way. Once transformed by the algorithm, it’s nearly impossible to revert the data to its original state.
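The properties above can be sketched in a few lines. This is a minimal illustration, assuming Python's standard hashlib module and SHA-256 as the chosen algorithm; neither is prescribed by this article, and the sample inputs and the helper name sha256_hex are made up for the example.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_digest = sha256_hex("a")
long_digest = sha256_hex("a" * 10_000)

# Uniform: inputs of very different lengths give hashes of the same length.
print(len(short_digest), len(long_digest))  # 64 64

# Consistent: hashing the same input again gives the same hash.
print(sha256_hex("a") == short_digest)  # True

# A different input gives a different hash.
print(sha256_hex("b") == short_digest)  # False
```

The one-way property shows up as an absence: hashlib offers no function that turns a hash back into its input.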
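It’s important to understand that hashing and encryption are different functions. Encryption is designed to be reversed by anyone who holds the right key, while hashing is not designed to be reversed at all. You might use them in concert with one another. But don’t use the terms interchangeably.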
How Does a Hashing Algorithm Work?
It's possible to create an algorithm with nothing more than a chart, a calculator, and a basic understanding of math. But most people use computers to help.
Most hashing algorithms follow this process:
- Create the message. A user determines what should be hashed.
- Choose the type. Dozens of hashing algorithms exist, and the user might decide which works best for this message.
- Enter the message. The user taps out the message into a computer running the algorithm.
- Start the hash. The system transforms the message, which might be of any length, to a predetermined bit size. Typically, programs break the message into a series of equal-sized blocks, and each one is compressed in sequence.
- Store or share. The user sends the hash (also called the "message digest") to the intended recipient, or the hashed data is saved in that form.
The process is complicated, but it works very quickly. In seconds, the hash is complete.
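The steps above might look like this in practice. This is a sketch under assumptions, not a definitive implementation: it uses Python's standard hashlib and hmac modules, SHA-256 as the default algorithm, and a made-up message. The name digest_stream and the BLOCK_SIZE value are hypothetical. BLOCK_SIZE only controls how much data is read at a time; the algorithm's own internal block handling is done by the library.

```python
import hashlib
import hmac
import io

BLOCK_SIZE = 64 * 1024  # hypothetical read size; any positive value works


def digest_stream(stream, algorithm: str = "sha256") -> str:
    """Hash a binary stream piece by piece and return the hex digest."""
    hasher = hashlib.new(algorithm)
    while True:
        block = stream.read(BLOCK_SIZE)
        if not block:
            break
        hasher.update(block)  # feed the next piece of the message
    return hasher.hexdigest()


# Sender: create the message, choose the algorithm, compute the digest.
message = b"Quarterly report, final version." * 5000
sent_digest = digest_stream(io.BytesIO(message))

# Recipient: recompute the digest and compare it with the one received.
received_ok = hmac.compare_digest(digest_stream(io.BytesIO(message)), sent_digest)
tampered_ok = hmac.compare_digest(digest_stream(io.BytesIO(message + b"!")), sent_digest)
print(received_ok, tampered_ok)  # True False
```

Feeding the message in pieces gives the same digest as hashing it in one call, which is why large files can be hashed without loading them into memory.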
What Are Hashing Algorithms Used For?
The very first hashing algorithm, developed in 1958, was used for classifying and organizing data. Since then, developers have discovered dozens of uses for the technology.
Your company might use a hashing algorithm for:
- Password storage. You must keep records of all of the username/password combinations people use to access your resources. But if a hacker gains entry, stealing unprotected data is easy. Hashing ensures that the data is stored in a scrambled state, so it's harder to steal.
- Digital signatures. A tiny bit of data proves that a note wasn't modified between the time it leaves a user's outbox and the time it reaches your inbox.
- Document management. Hashing algorithms can be used to authenticate data. The writer uses a hash to secure the document when it's complete. The hash works a bit like a seal of approval.
A recipient can generate a hash and compare it to the original. If the two are equal, the data is considered genuine. If they don't match, the document has been changed.
- File management. Some companies also use hashes to index data, identify files, and delete duplicates. If a system has thousands of files, using hashes can save a significant amount of time.
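For the password storage case, a single pass of a fast hash is generally considered too weak, because attackers can test billions of guesses quickly. The sketch below assumes a common alternative: a random per-user salt plus a deliberately slow, hash-based key derivation function (PBKDF2 from Python's standard hashlib). The function names, the iteration count, and the sample password are all assumptions for illustration, not something this article specifies.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumed work factor; tune it for your own hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) to store in place of the password."""
    salt = os.urandom(16)  # random salt, different for every user
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, stored_key: bytes) -> bool:
    """Re-derive the key from a login attempt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored_key)


salt, stored = hash_password("correct horse battery staple")
ok = verify_password("correct horse battery staple", salt, stored)
bad = verify_password("wrong guess", salt, stored)
print(ok, bad)  # True False
```

The server keeps only the salt and the derived key. The salt means two users with the same password get different stored values, and the iteration count makes each guess expensive for an attacker.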
Hashing Algorithm Examples
It may be hard to understand just what these specialized programs do without seeing them in action.
Imagine that we'd like to hash the answer to a security question. We've asked, "Where was your first home?" The answer we're given is, "At the top of an apartment building in Queens." Here's how the hashes look with:
- MD5: 72b003ba1a806c3f94026568ad5c5933
- SHA-256: f6bf870a2a5bb6d26ddbeda8e903f3867f729785a36f89bfae896776777d50af
Now, imagine that we've asked the same question of a different person, and her response is, "Chicago." Here's how hashes look with:
- MD5: 9cfa1e69f507d007a516eb3e9f5074e2
- SHA-256: 0f5d983d203189bbffc5f686d01f6680bc6a83718a515fe42639347efc92478e
Notice that the original messages don't have the same number of characters. But the algorithms produce hashes of a consistent length each time.
And notice that the hashes look completely garbled. Nothing about them reveals what the original answers said.
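The fixed-length behavior is easy to check yourself. The sketch below assumes Python's standard hashlib module and hashes the two sample answers as UTF-8 text. It prints only the lengths, because the exact hash depends on the exact bytes that go in (punctuation, quotation marks, a trailing newline), so your values may not match the ones listed above.

```python
import hashlib

answers = [
    "At the top of an apartment building in Queens.",
    "Chicago.",
]

lengths = []
for answer in answers:
    data = answer.encode("utf-8")
    # usedforsecurity=False (Python 3.9+) marks this as a non-security use of MD5.
    md5_hex = hashlib.md5(data, usedforsecurity=False).hexdigest()
    sha256_hex = hashlib.sha256(data).hexdigest()
    lengths.append((len(md5_hex), len(sha256_hex)))
    print(f"{len(answer)} characters in -> MD5: {len(md5_hex)} hex characters, "
          f"SHA-256: {len(sha256_hex)} hex characters")
```

Each hexadecimal character encodes 4 bits, so 32 characters correspond to MD5's 128-bit output and 64 characters to SHA-256's 256-bit output.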
Popular Hashing Algorithms Explained
Many different types of programs can transform text into a hash, and they all work slightly differently.
Common hashing algorithms include:
- MD5. This is one of the first algorithms to gain widespread approval. It was designed in 1991, and at the time, it was considered remarkably secure.
Since then, researchers have discovered how to generate collisions (two different inputs that produce the same hash), and they can do so in seconds. Most experts feel it's not safe for security-sensitive use.
- RIPEMD-160. The RACE Integrity Primitives Evaluation Message Digest (or RIPEMD-160) was developed in Belgium in the mid-1990s. It produces a 160-bit hash, and it's considered secure, as no practical attack on the full algorithm has been published.
- SHA. Algorithms in the Secure Hash Algorithm (SHA) family are the most widely used today. The first versions were developed by the United States government, and later versions were designed to be more stringent and harder to break. The number directly after the letters "SHA" marks the generation (SHA-1, SHA-2, SHA-3), and a larger number such as 256 or 512 gives the length of the hash in bits. SHA-1 is no longer considered safe against collision attacks.
For example, SHA-3 uses a different internal design (known as a sponge construction) from the versions that came before, so an attack on the older designs is unlikely to carry over to it. It became a standard hashing algorithm in 2015.
- Whirlpool. In 2000, designers created this algorithm based on the Advanced Encryption Standard. It's also considered very secure.
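To see which of these algorithms a given system supports, and how long their hashes are, a short check like the one below can help. It is a sketch that assumes Python's standard hashlib module. The names in CANDIDATES follow the spellings hashlib and OpenSSL commonly use, and some of them (MD5, RIPEMD-160, and Whirlpool in particular) may be unavailable depending on how Python and OpenSSL were built.

```python
import hashlib

# Spellings as hashlib/OpenSSL commonly use them; availability varies by build.
CANDIDATES = ["md5", "ripemd160", "sha1", "sha256", "sha512", "sha3_256", "whirlpool"]


def digest_bits(name: str):
    """Return the hash size in bits, or None if this build lacks the algorithm."""
    try:
        return hashlib.new(name).digest_size * 8
    except ValueError:
        return None


sizes = {name: digest_bits(name) for name in CANDIDATES}
for name, bits in sizes.items():
    label = "unavailable" if bits is None else f"{bits} bits"
    print(f"{name:<10} {label}")
```

A missing entry says nothing about the algorithm's strength. It only means the local cryptographic library does not ship it.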
Government agencies no longer write every hashing algorithm themselves. But the authorities do have a role to play in protecting data. The Cryptographic Module Validation Program, run in part by the National Institute of Standards and Technology, validates cryptographic modules. Companies can use this resource to ensure that they're using technologies that are both safe and effective.
How Much Should You Know About Hashing?
If you work in security, it's critical for you to know the ins and outs of protection. Hashing is a key way you can ensure important data, including passwords, isn't stolen by someone with the means to do you harm.
Private individuals might also appreciate understanding hashing concepts. If you've ever wanted to participate in Bitcoin, for example, it helps to be well versed in hashing. Your trading partners are heavy users of the technology, as they use it within blockchain processes.
But if the math behind algorithms seems confusing, don't worry. Most computer programs tackle the hard work of calculations for you.
At Okta, we also make protecting your data very easy. We have sophisticated programs that can keep hackers out and your work flowing smoothly.
And we're always available to answer questions and step in when you need help. Contact us to find out more.