All about bcrypt

By Eric Lewis

This is an introduction to storing user passwords responsibly, the bcrypt library, and using bcrypt in Node.js projects. bcrypt is available in nearly every programming language, so this article should be useful even if you're not a Node.js developer.

When you create a password on a website, a responsible site doesn't store it in plain text. Data breaches unfortunately happen, and when they do, passwords stored in plain text give an attacker immediate access to log in as any user.

To protect user accounts even in the case of a data breach, passwords are typically stored as a hash, which is garbled text like this:

AsxIMAKrV6KlBzOiEbInqcrMARG7fJq

A hash function converts plain text into this kind of nonsense. For example, my password steam3dhams is put through a hash function, and the result is what gets stored in the website's database.

hash(password) = hashed password
hash(steam3dhams) = AsxIMAKrV6KlBzOiEbInqcrMARG7fJq

When I log into a website, the password is put through the hash function, and compared to the stored hash value. If they're the same, my password is correct and the website logs me in.
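
The login check can be sketched in a few lines of Node.js. This is only an illustration and not bcrypt: it assumes SHA-256 from Node's built-in crypto module as a stand-in hash function, and the names toyHash, storedHash, and checkLogin are made up for this example.

```javascript
const { createHash, timingSafeEqual } = require("crypto");

// Stand-in hash function for illustration only (assumption: SHA-256, not bcrypt).
function toyHash(password) {
  return createHash("sha256").update(password, "utf8").digest();
}

// What the website would keep in its database instead of the password itself.
const storedHash = toyHash("steam3dhams");

// Hash the submitted password and compare the result with the stored hash.
function checkLogin(submittedPassword) {
  // Both buffers are always 32 bytes, which timingSafeEqual requires.
  return timingSafeEqual(toyHash(submittedPassword), storedHash);
}

console.log(checkLogin("steam3dhams")); // true
console.log(checkLogin("steamedhams")); // false
```

The website never needs the original password again after sign-up; it only ever compares hashes.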

A hash function has an interesting quality: you can hash a password, but you can't unhash it. So, if an attacker gains access to a user database, they would have access to my hashed password, but not the original password required to log in.

Although a hash function hides the original password, hashing by itself does not offer strong security. An attacker could come up with a list of millions of possible passwords, put them all through the hash function, and check if any match my user's hashed password.

Attacker searching for a password that matches hash:
AsxIMAKrV6KlBzOiEbInqcrMARG7fJq

# Millions of previous attempts...

Checking: steamedhams 
Hash: 3WFs2gtV3YlSmcH1y6Bhkl33uFrfLvC 
✖ No match

Checking: st3amedhams
Hash: 1sjUFB0RvkNiJz8nh3ljbF2FEIFZzoC
✖ No match

Checking: steamedh4ms
Hash: BGK1wfhINXs.gmZpxdocJ45CkfUpy2q
✖ No match

Checking: steam3dhams
Hash: AsxIMAKrV6KlBzOiEbInqcrMARG7fJq 
✔ Voila! The user's password is steam3dhams
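
The attack above amounts to a simple loop. Here is a rough sketch, again assuming SHA-256 from Node's crypto module as a stand-in for an unsalted hash function; toyHash, dictionaryAttack, and the guess list are hypothetical names and data for illustration.

```javascript
const { createHash } = require("crypto");

// Stand-in unsalted hash (assumption: SHA-256 in hex, not bcrypt).
function toyHash(password) {
  return createHash("sha256").update(password, "utf8").digest("hex");
}

// Hash every candidate and return the one that matches the target, or null.
function dictionaryAttack(targetHash, candidates) {
  for (const candidate of candidates) {
    if (toyHash(candidate) === targetHash) {
      return candidate;
    }
  }
  return null;
}

// Pretend this hash was leaked in a breach.
const leakedHash = toyHash("steam3dhams");
const guesses = ["steamedhams", "st3amedhams", "steamedh4ms", "steam3dhams"];

console.log(dictionaryAttack(leakedHash, guesses)); // prints steam3dhams
```

With a fast hash function, a loop like this can get through millions of guesses very quickly, which is the problem the rest of this article deals with.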

bcrypt protects hashed passwords from brute-force dictionary attacks like this one in two ways: salts and slowdown.

Salts make for flavorful hashes

A salt is a bit of random text that is prefixed to a password before it goes through the hash function. It adds some flavor to the hash.

hash(salt + password) = hashed password

bcrypt can generate a random salt for every password. When Ada creates a password, her hash might look like this:

hash("d4jf02" + "jamiexx") = aifubvjkbrvjkbvkjrv

Even if two users have the same password, their hashes will be different because the salts are different.

hash("rf9hv3" + "jamiexx") = ijrkjvbrvkjrnrihrfiurbvr
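
To see salting in action, here is a small sketch. It assumes SHA-256 and randomBytes from Node's crypto module rather than bcrypt's own salt generation, and saltedHash and newSalt are hypothetical helper names.

```javascript
const { createHash, randomBytes } = require("crypto");

// Stand-in for hash(salt + password) (assumption: SHA-256, not bcrypt).
function saltedHash(salt, password) {
  return createHash("sha256").update(salt + password, "utf8").digest("hex");
}

// A fresh random salt: 8 random bytes, hex encoded into 16 characters.
function newSalt() {
  return randomBytes(8).toString("hex");
}

const adaSalt = newSalt();
const otherSalt = newSalt();

// Two users pick the same password but get different salts.
const adaHash = saltedHash(adaSalt, "jamiexx");
const otherHash = saltedHash(otherSalt, "jamiexx");

console.log(adaHash === otherHash); // false, barring an astronomically unlikely salt collision
console.log(saltedHash(adaSalt, "jamiexx") === adaHash); // true
```

The second check shows why the salt has to be stored next to the hash: without it, the website could not recompute the same hash at login.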

bcrypt outputs hashes in a digest, like this:

$2a$04$hz9nbNQO0w9bwFJiLsyFge74NnLei9CUr/XNNbZAPbujL2b5GRfc.

The digest text includes a few pieces of information:

$2a$ - the version of the hashing function, which distinguishes bcrypt digests from those of other hash functions. $2a$ is one of bcrypt's version identifiers; you may also see $2b$ or $2y$.

04$ - the number 04 is the cost factor, sometimes called the number of salt rounds. More on this in a bit.

hz9nbNQO0w9bwFJiLsyFge - the following 22 characters are the salt.

74NnLei9CUr/XNNbZAPbujL2b5GRfc. - the last 31 characters are the actual hashed value.
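
The pieces listed above can be pulled apart with a regular expression. This is a minimal sketch; parseDigest is a hypothetical helper written for this article, not a function from any bcrypt package.

```javascript
// Split a bcrypt digest into version, cost factor, salt, and hash.
// parseDigest is a hypothetical helper, not part of any bcrypt package.
function parseDigest(digest) {
  const pattern = /^\$(2[abxy]?)\$(\d{2})\$([./A-Za-z0-9]{22})([./A-Za-z0-9]{31})$/;
  const match = pattern.exec(digest);
  if (match === null) {
    throw new Error("not a bcrypt digest");
  }
  const [, version, cost, salt, hash] = match;
  return { version, cost: Number(cost), salt, hash };
}

const parts = parseDigest("$2a$04$hz9nbNQO0w9bwFJiLsyFge74NnLei9CUr/XNNbZAPbujL2b5GRfc.");
console.log(parts.version); // 2a
console.log(parts.cost);    // 4
console.log(parts.salt);    // hz9nbNQO0w9bwFJiLsyFge
console.log(parts.hash);    // 74NnLei9CUr/XNNbZAPbujL2b5GRfc.
```

Because the salt and cost factor travel inside the digest, a single database column is enough to store everything needed to verify a password later.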

So when I log into cool-videos.com, the website looks up the password digest for my user, hashes my provided password with the salt, and compares the hash with the stored hash.

hash("asdfjrnf" + "steam3dhams") = adsfaivbrivljbrlijrfkljnf

If every user's password has a different salt, an attacker has to run their entire password dictionary through the hash function separately for each user, instead of hashing the dictionary once and checking the results against every stolen hash.

Up to now I've described the generalities of hash functions that predate bcrypt, like crypt or MD5. What makes bcrypt special?

bcrypt is slow, on purpose

For an attacker who gains access to my user's password digest, all they need to do is run the hash function against tons of potential passwords. The only thing standing between the attacker and my original password is the time it takes to run that function millions of times.

We can slow the attacker down by setting a cost factor for the hash, sometimes called salt rounds. bcrypt repeats its internal hashing process 2^cost times, so every increase of one in the cost factor makes hashing twice as slow.

So, even if the attacker gets our password digest, every guess they try takes as long as the cost factor we chose demands. Being able to slow the process down as much as you like was novel in bcrypt.
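
The doubling is easy to see with a little arithmetic. bcrypt's internal setup loop runs 2^cost times, so the sketch below just computes that count; iterationsForCost and slowdown are hypothetical helper names, and real timings depend on your hardware.

```javascript
// Number of internal setup iterations for a given cost factor: 2^cost.
function iterationsForCost(cost) {
  return 2 ** cost;
}

// How many times slower hashing gets when moving from one cost factor to another.
function slowdown(fromCost, toCost) {
  return iterationsForCost(toCost) / iterationsForCost(fromCost);
}

for (const cost of [4, 10, 11, 12]) {
  console.log(`cost ${cost}: ${iterationsForCost(cost)} iterations`);
}
// cost 4: 16 iterations
// cost 10: 1024 iterations
// cost 11: 2048 iterations
// cost 12: 4096 iterations

console.log(slowdown(10, 12)); // 4
```

Going from a cost factor of 10 to 12 makes each hash roughly four times slower, for you and for the attacker alike.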

Earlier hashing functions only stayed safe for so long. crypt, another hashing function, came out in 1976. On computers of that era, crypt could run 4 times a second. 20 years later, crypt could run 200,000 times a second.

The ability to configure how long the hashing function takes to run means bcrypt can still be useful 20 years from now. As processors get faster, you can turn the cost factor up in your implementation.

In Node.js, use the bcrypt package to hash a plain-text password into a password digest for storage in a database.

When a user submits a plain-text password to log in, you can compare it to the digest stored in the database using the package's compare function.