I find it interesting that people tend to follow the same stages of awareness when first hearing about timing attacks. A recent discussion about Django redesigning their cookie scheme shows people at the various stages.
1. The timing difference is too small to attack
This is the result of not understanding statistics or how jitter can be modeled and filtered. Usually, people are thinking that a single measurement for a given input is probably obscured by noise. This is correct. That’s why you take many measurements for the same input and average them. The more noise, the more measurements you need. But computers are fast, so it doesn’t take long to do thousands or even millions of runs per guessed byte.
The basic statistical method is simple. You apply a hypothesis test of the mean of all runs with a given guess for a byte (say, 0) against the mean for all the others (1, 2, …, 255) If there is a significant difference in the means, then you have guessed that byte correctly. If no difference, repeat for the other 255 values. If still no difference, take more measurements.
The other misconception is that jitter is too great to get anything useful out of the measurements, especially in a network. There is an excellent paper by Crosby and Wallach that debunks this myth by carefully analyzing noise and its causes as well as how to filter it. They conclude that an attacker can reliably detect processing differences as low as 200 nanoseconds on a LAN or 30 microseconds on a WAN given only 1000 measurements. If your server is hosted somewhere an attacker can also buy rackspace, then you are vulnerable to LAN timing attacks.
2. I’ll just add a random delay to my function
Then I’ll take more measurements. See above.
3. I’ll add a deadline to my function and delay returning until it is reached
This usually has a higher overhead and is more prone to errors than implementing your function such that its timing does not vary, based on the input data. If you make a mistake in your interval calculation or an attacker is able to “stun” your function somewhere in the middle such that all target computations occur after the interval timer has expired, then you’re vulnerable again. This is similar to avoiding buffer overflow attacks by lengthening the buffer — better to fix the actual problem instead.
4. Ok, ok, I developed a constant-time function
def full_string_cmp(s1, s2): total = 0 for a, b in zip(s1, s2): total += (a != b) return not total
The first bug is that this opens a worse hole if the two strings are not the same length. In fact, an attacker can send in a zero-length string and the result of zip() is an empty array. This results in the for() loop never executing and any input being accepted as valid! The fix is to check the two lengths first to make sure they’re equal or in C, compare two fixed-length arrays that are guaranteed to be the same size.
The next bug is smaller, but still valid. There’s a timing leak in the comparison of a and b. When you write:
total += (a != b)
Both Python and a C compiler actually generate low-level instructions equivalent to:
if (a != b) tmp = 1 else tmp = 0 total += tmp
Thus, there is still a small timing leak. If they are equal, an additional branch instruction is executed. If they are not equal, this is skipped. A common intuitive mistake among programmers is that single lines of code are atomic. This is why you might find a C version of this such as “total += (a != b) ? 1 : 0″, which generates the same code. Just because it fits on a single line does not mean it is atomic or constant-time.
5. Ok, I fixed those bugs. Are we done yet?
Let’s see what we have now (adapted from my keyczar posting).
if len(userMsg) != len(correctValue): return False result = 0 for x, y in zip(userMsg, correctValue): result |= ord(x) ^ ord(y) return result == 0
This now uses an arithmetic operator (XOR) instead of a logical compare (!=), and thus should be constant-time. The += operator could possibly have a leak if carries take longer than an add without carry, so we went with |= instead. We check the lengths and abort if they don’t match. This does leak timing information about the length of the correct value, so we have to make sure that is not a problem. Usually it’s not, but if these were passwords instead of cryptographic values, it would be better to hash them with PBKDF2 or bcrypt instead of working with them directly. But are we done?
Maybe. But first we need to figure out if our high-level language has any behavior that affects the timing of this function. For example, what if there is sign-bit handling code in the Python interpreter that behaves differently if x or y is negative? What if the zip() operator has an optimization that compares two arrays first to see if they’re identical before returning their union?
The answer is you can never be sure. In C, you have more assurance but still not enough. For example, what if the compiler optimization settings change? Is this still safe? What about the microarchitecture of the CPU? What if Intel came out with a new byte-wise cache alignment scheme in the Pentium 1000?
I hope this has convinced some people that side channel attacks are not easily solved. Important code should be carefully reviewed to have higher assurance this class of attack has been mitigated.