Colin Percival and I often disagree about how cryptography fits into software design. My opinion is that custom crypto designs always require intense third-party review, leaving companies with two undesirable possibilities. Either the review and maintenance cost of custom crypto increases the expense over competing design options that don’t require crypto (such as keeping critical state on the server) or leads to security flaws if third-party review is skipped. But we often agree on the concrete details, such as the preference for encrypt-then-MAC construction.
If you MAC the plaintext, the recipient must decrypt the message and then verify the MAC before rejecting it. An attacker can send garbage or carefully chosen ciphertexts and they can’t be rejected until they are decrypted. If the ciphertext is authenticated, then the server can reject the forged or modified message earlier.
This has a number of advantages, including some not emphasized in Colin’s post. He points out that this reduces the surface area of your code accessible to an attacker. This is especially true if the higher-level protocol is designed to minimize interpretation of unprotected header fields (and hopefully, not parsing anything unprotected). The earlier the MAC can be performed, the less code accessible to an attacker who is choosing messages.
Another advantage is that this helps prevent side channel attacks. If you decrypt and then validate the plaintext, an attacker can send chosen ciphertext and collect information from a side channel to recover the encryption key. However, if you reject such messages early, the decryption key is never used. The attacker is reduced to replaying valid messages seen previously, which limits (but doesn’t prevent) side channel attacks.
However, one point he gets wrong is the emphasis on counter mode (CTR) to prevent side channel attacks. A former colleague of mine, Josh Jaffe, published a paper on attacking counter mode at CHES 2007. The common lore before his paper was that counter mode limits side channel attacks because the attacker does not get to choose the plaintext input to the block cipher. Instead, a possibly-unknown counter value is fed in as plaintext and is incremented with each block.
Colin’s post states this as follows:
In CTR mode, you avoid passing attacker-provided data to the block cipher (with the possible exception of the nonce which forms part of each block). This reduces the attack surface even further: Using CTR mode, an attacker cannot execute a chosen-ciphertext (or chosen-plaintext) attack against your block cipher, even if (in the case of Encrypt-then-MAC) he can correctly forge the MAC (say, if he stole the MAC key but doesn’t have the Encrypt key). Is this an attack scenario worth considering? Absolutely — the side channel attacks published by Bernstein (exploiting a combination of cache and microarchitectural characteristics) and Osvik, Shamir, and Tromer (exploiting cache collisions) rely on gaining statistical data based on a large number of random tests, and it appears unlikely that such attacks would be feasible in a context where an attacker could not provide a chosen input.
Josh’s paper shattered this assumption by showing how to attack several versions of counter mode with AES. He treats the unknown counter value as part of the key, solving for the combined key+counter using DPA. This gives intermediate values that have an error component. He first solves for varying portions of the inputs, leaving this unknown but constant error term. These error terms then cancel out between rounds, revealing the secret round keys.
His attack works because counter mode only changes a few bits each time the counter is incremented. Seeding an LFSR to use as the counter could make this attack more difficult. I would hesitate to say it makes it impossible since it seems like a more complex version of Josh’s attack could succeed against this. In conclusion, DPA countermeasures are still needed, even with counter mode.