Authenticated Encryption and how not to get caught chasing a coyote
by Jeffrey Goldberg
I introduced HMAC (Hash-based Message Authentication Code) through the back door when talking about the Time-based One Time Password (TOTP) of Dropbox’s two-step verification. But TOTP is actually a peculiar way to use HMAC. Let’s explore what Message Authentication Codes (MACs) are normally used for and why they play such an important role in the future of 1Password. If you guessed that MACs are for authenticating messages, you’re on the right track.
In a sense (but this is a lie), you can think of MACs as kind of like encrypting things that aren’t secret. The recipient doesn’t decrypt the data; instead, the recipient verifies that it really was the data sent by the sender and that it hasn’t been tampered with. MACs work by having the sender and the recipient share the same secret key. (This is the key difference between MACs and digital signatures, where the signer keeps a private key to herself and anyone holding the matching public key can verify.)
Only someone with knowledge of the secret MAC key could have created the MAC for some data, and (unlike the case of digital signatures) only someone with knowledge of the secret MAC key can verify that the MAC is valid for the data.
Suppose my dog Patty (the clever one) leaves a warning for Molly (a simple dog) that says, “There’s a coyote in the back yard”. The message isn’t secret. Neither Patty nor Molly cares who reads it. But now suppose that the neighbor’s cat, Mr Talk, in his attempt to get rid of Molly, changes the message to “There’s a squirrel in the back yard”. This would not be good news for Molly, who would blindly run behind the house, chase a squirrel up a tree and then bark at it for the next thirty minutes. Coyotes, however, do not climb trees, and so Molly would have an entirely different experience if she tried the same action against a coyote.
To detect tampering with the message, and to prevent Mr Talk from sending a counterfeit message in Patty’s name, Patty and Molly can share a secret key which is used to create a MAC of the message. Suppose that Patty and Molly, ignoring earlier advice on creating passwords, have secretly agreed on the key (well, a password from which a key is derived), “Kitty Smells”, for their MACs. When Patty constructs the message, she will calculate the MAC. If she is using HMAC with SHA1 as the hashing algorithm and “Kitty Smells” as the password, the HMAC should come out as:
Patty will leave the message, “There’s a coyote behind the house”, along with the MAC. When Molly sees a message, she should verify the MAC before even looking at the contents of the message. If Mr Talk has changed the message to say “squirrel” instead of “coyote,” the MACs won’t match up. Mr Talk cannot create a valid MAC because he doesn’t know the shared secret password used to create the secret key that Molly and Patty have.
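For readers who like to see this in code, here is a minimal sketch of what Patty and Molly are doing, using Python’s standard hmac module. One loud assumption: this post does not say how the key is derived from the password, so the sketch simply uses the bytes of “Kitty Smells” as the key. The tag it prints will therefore not necessarily match the value Patty computed. The function names make_mac and verify_mac are made up for illustration.

```python
import hashlib
import hmac


def make_mac(key: bytes, message: bytes) -> str:
    """Return the HMAC-SHA1 tag of message as 40 hexadecimal digits."""
    return hmac.new(key, message, hashlib.sha1).hexdigest()


def verify_mac(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_mac(key, message), tag)


# ASSUMPTION: the password bytes are used directly as the MAC key.
# A real system would run the password through a key derivation function.
key = "Kitty Smells".encode("utf-8")

message = "There’s a coyote behind the house".encode("utf-8")
tag = make_mac(key, message)  # Patty leaves this next to the message

# Mr Talk swaps the animal but cannot recompute the tag without the key.
forged = "There’s a squirrel behind the house".encode("utf-8")

print("Patty's message verifies:", verify_mac(key, message, tag))
print("Mr Talk's message verifies:", verify_mac(key, forged, tag))
```

Note that verify_mac never interprets the message. It only feeds the bytes to HMAC, which is the coded equivalent of Molly checking the MAC before she reads a single word.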
Sadly, Mr Talk’s trick will still work. That is because as soon as Molly encounters the word “squirrel” she will react. All thoughts of verifying the MAC will be pushed out of her brain, which will now be entirely occupied by the single thought, “squirrel”. Indeed, if I were reading this aloud, Molly would have run out the back in blind excitement. This is why it is very important to not even look at the contents of a message before verifying its authenticity.
In this example, the original message isn’t secret and so didn’t need to be encrypted. But with authenticated encryption, we first encrypt the message with an encryption key and then compute a MAC of the encrypted text using an authentication key. Just as it was important for Molly to not do anything with the message until she verified the MAC, it is vital that we don’t try to decrypt an encrypted message before verifying the MAC. This system is called “Encrypt-then-MAC”, but the emphasis should be put on the other end of the process. I like calling it “Verify-then-decrypt”.
A scheme like Encrypt-then-MAC that both encrypts a message and provides authentication (proof of who it came from) is called “authenticated encryption”. Encrypt-then-MAC isn’t the only secure authenticated encryption scheme out there, but it is the one that we use in the 1Password 4 Cloud Keychain format.
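Here is a rough sketch of Encrypt-then-MAC and its mirror image, verify-then-decrypt. It is not the Cloud Keychain format. To keep it runnable with nothing but the Python standard library, the “cipher” is a stand-in keystream built from HMAC-SHA256 in a counter arrangement, where a real design would use a vetted cipher such as AES. The names seal and open_sealed, the 16-byte nonce, and the layout nonce + ciphertext + tag are all assumptions made for this illustration.

```python
import hashlib
import hmac
import os

NONCE_LEN = 16
TAG_LEN = 32  # HMAC-SHA256 output size


def _keystream(enc_key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in cipher (illustration only): HMAC-SHA256 over nonce + counter."""
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(8, "big")
        out += hmac.new(enc_key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def seal(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt first, then MAC the nonce and ciphertext with a separate key."""
    nonce = os.urandom(NONCE_LEN)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def open_sealed(enc_key: bytes, mac_key: bytes, sealed: bytes) -> bytes:
    """Verify the MAC first; decrypt only if it checks out."""
    if len(sealed) < NONCE_LEN + TAG_LEN:
        raise ValueError("message too short")
    nonce = sealed[:NONCE_LEN]
    ciphertext = sealed[NONCE_LEN:-TAG_LEN]
    tag = sealed[-TAG_LEN:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("MAC check failed; refusing to decrypt")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ k for c, k in zip(ciphertext, stream))


enc_key = os.urandom(32)  # encryption key
mac_key = os.urandom(32)  # separate authentication key

secret = "There’s a coyote behind the house".encode("utf-8")
sealed = seal(enc_key, mac_key, secret)
print(open_sealed(enc_key, mac_key, sealed).decode("utf-8"))
```

The part worth staring at is the order of operations in open_sealed: the keystream is not even generated until the tag has been checked, so a tampered message never reaches the decryption code at all.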
You might think that there is no reason to authenticate encrypted data. After all, the data was encrypted with a secret key, so if you can decrypt the message with that secret key, then you know it only could have been encrypted by someone with knowledge of the secret key. Many people have thought that, but they were wrong.
Suppose that Molly sends an encrypted message to Patty, but doesn’t use authenticated encryption. Now when Patty gets a message from Molly she decrypts it. If the decrypted message is garbled in a specific way, Patty tells Molly that it didn’t decrypt properly and that Molly should send it again. If it isn’t garbled in a particular way, Patty will just let Molly know she got the message.
Mr Talk can listen to this exchange between Patty and Molly. Without the secret encryption key, Mr Talk won’t be able to figure out what is in the message that Molly sent to Patty. But now suppose that Mr Talk is able to send encrypted messages (seemingly from Molly) to Patty. Mr Talk can send Patty modified forms of the message that Molly sent and find out whether Patty got a garbled message when she decrypted it. Mr Talk makes small changes to the original encrypted message to create a bunch of new slightly different encrypted messages. By finding out which ones are garbled and which ones aren’t, Mr Talk can actually decrypt the original message. This is a type of “Chosen Ciphertext Attack (CCA)”. It is called this because the attacker is able to choose ciphertext for the recipient to attempt to decrypt.
Now, the particular attack that Mr Talk used depends on the precise details of how Patty determines whether a message was garbled. That means changing those details can defend against this particular attack. Those who are familiar with all of this will know that I’m talking about the “padding oracle attack”, and will know that it can be defended against by using different padding or using a different encryption mode. But such a defense only addresses this particular attack. Is there a way to defend against all CCAs, even those that haven’t been invented yet?
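For the curious, here is a toy version of Mr Talk’s trick. Everything in it is an assumption for the sake of illustration: the block cipher is a small Feistel construction I made up for this sketch (standing in for something like AES), the mode is CBC with PKCS#7-style padding, and padding_ok plays the part of Patty telling the sender whether the message was “garbled”. Mr Talk never sees the key; he only gets yes-or-no answers.

```python
import hashlib
import hmac
import os

BLOCK = 16


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key, i, half):
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:8]


def block_encrypt(key, block):
    """Toy 4-round Feistel cipher (a stand-in, not a real cipher)."""
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def block_decrypt(key, block):
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def cbc_encrypt(key, plaintext):
    pad = BLOCK - len(plaintext) % BLOCK
    padded = plaintext + bytes([pad]) * pad
    prev = os.urandom(BLOCK)  # random IV, sent as the first block
    out = [prev]
    for i in range(0, len(padded), BLOCK):
        prev = block_encrypt(key, _xor(padded[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def padding_ok(key, ciphertext):
    """Patty's reply: True unless the padding is garbled."""
    blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]
    plain = b"".join(_xor(block_decrypt(key, c), p)
                     for p, c in zip(blocks, blocks[1:]))
    pad = plain[-1]
    return 1 <= pad <= BLOCK and plain.endswith(bytes([pad]) * pad)


def recover_block(oracle, prev, target):
    """Recover one plaintext block using only the oracle's yes/no answers."""
    inter = bytearray(BLOCK)  # block_decrypt(key, target), learned byte by byte
    for pad in range(1, BLOCK + 1):
        pos = BLOCK - pad
        forged = bytearray(BLOCK)
        for j in range(pos + 1, BLOCK):
            forged[j] = inter[j] ^ pad  # force known tail bytes to equal pad
        for guess in range(256):
            forged[pos] = guess
            if not oracle(bytes(forged) + target):
                continue
            if pad == 1:
                # Rule out an accidental longer padding such as 02 02.
                forged[pos - 1] ^= 0xFF
                still_ok = oracle(bytes(forged) + target)
                forged[pos - 1] ^= 0xFF
                if not still_ok:
                    continue
            inter[pos] = guess ^ pad
            break
        else:
            raise RuntimeError("oracle never accepted a guess")
    return _xor(inter, prev)


def recover_message(oracle, ciphertext):
    blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]
    padded = b"".join(recover_block(oracle, p, c)
                      for p, c in zip(blocks, blocks[1:]))
    return padded[:-padded[-1]]  # strip the padding


key = os.urandom(32)  # known to Patty and Molly only
secret = "There’s a coyote in the back yard".encode("utf-8")
ciphertext = cbc_encrypt(key, secret)


def oracle(forged_ciphertext):
    # Mr Talk can call this, but it closes over a key he never sees.
    return padding_ok(key, forged_ciphertext)


recovered = recover_message(oracle, ciphertext)
print(recovered.decode("utf-8"))
```

Each byte costs Mr Talk at most a few hundred forged messages, which is why an automated oracle makes this attack practical.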
The good news is that it is possible to defend against all Chosen Ciphertext Attacks. The way to do that is to properly use authenticated encryption. Padding or other “oracles” are not a particular threat to 1Password, as there is no back-and-forth exchange in normal operation. These sorts of attacks are practical when there is an automated oracle on the network or in a specific device that will attempt to decrypt ciphertext on demand. In 1Password, there is no opportunity for an attacker to set up the kind of dialogue needed for this kind of attack. But we also know that theoretical weaknesses have a habit of turning into exploitable weaknesses over time. So we look ahead, and build authenticated encryption into 1Password now.
What happened to Molly?
I am pleased to say that no harm came to Molly or the coyote. She was on her leash, and you can see in this staged and contrived photo that I was – just barely – able to restrain her from running to her doom after a coyote. However, our attempts to teach her that she must verify messages before she does any other processing of said messages have not gone well.