Next Baysec: Jan 19 at Irish Bank

The next Baysec meeting is Tuesday, January 19, 7 pm at the Irish Bank (note: new location). Come out and meet fellow security people from all over the Bay Area. As always, this is not a sponsored meeting, there is no agenda or speakers, and no RSVP is needed.

This is our first meeting at the new location so it would be great if we could get a good turnout. We have the entire back room reserved for our group. They will keep some tables on the edges for dining and clear out
the center.

See you there!

The Irish Bank
10 Mark Lane
San Francisco, CA
415.788.7152
theirishbank.com

Side channel attacks on cryptographic software

Below is a recent article I wrote for IEEE Security and Privacy magazine titled “Side Channel Attacks on Cryptographic Software” (pdf). It covers simple timing attacks against HMAC validation, AES cache timing, and RSA branch prediction attacks. It’s a survey article, covering excellent research in side channel attacks over the past few years.

I think the attacker position is gaining an advantage recently. As CPU microarchitecture gets more complicated, more covert channels appear. Also, the move to virtualization and high-resolution timers gives better quality measurements and more opportunities to exploit even the tiniest of leaks. We’ll need to come up with more clever ways of modeling and eliminating covert channels, moving crypto operations into hardware, and giving software more control over microarchitectural state.

Let me know what you think.

Interesting talks at 26c3

I hope to attend a CCC event some day. While there are many great talks in the 26c3 schedule, here are some talks that look particularly interesting.

Others that may be interesting but haven’t posted slides or papers yet:

Hope everyone at 26c3 has a great time. Best wishes for a safe and secure 2010.

Stop using unsafe keyed hashes, use HMAC

The HMAC construction turns a cryptographic hash algorithm into a keyed hash. It is commonly used for integrity protection when the sender and recipient share a secret key. It was developed to address various problems with arbitrary keyed hash constructions. So why are developers still rolling their own?

One of the original papers on keyed hash constructions describes the motivations for developing a standard for HMAC. In 1995, there was no standardization and cryptographers only worked from hunches as to what constituted a secure keyed hash. This paper summarized two known attacks on some common schemes that had evolved in the absence of a standard.

The first construction the paper attacks is H(k || m), aka “secret prefix”. The key and the message to be authenticated were concatenated and hashed. The authenticator was the resulting hash. This was fatally flawed, as I mentioned in my previous talks on web crypto. Standard hash algorithms that use the Merkle-Damgard construction (like SHA-1) are subject to a length-extension attack. An attacker can trivially create an authenticator for m’ where m’ = m1 || pad || m2 if they have seen the authenticator for m1. (The “pad” value makes the input a multiple of the compression function block size and includes the total length hashed). This flaw was most recently found in the Flickr API.

The second construction was H(m || k), aka “secret suffix”. While the length-extension attack no longer applies because k is unknown to the attacker, this still maximally exposes you to weaknesses in the hash algorithm. Preneel et al described two attacks on this approach.

The first attack is that secret suffix is weaker against offline second-preimage attacks. That is, an attacker can take an authenticator for a known plaintext m and calculate their own plaintext m’ that hashes to the same value as the block just before k. If the input to the hash function just before k is identical, then the output is also the same. This means the attacker can just send m’ and the previously seen authenticator for m and the two will match.

For a secure cryptographic hash function, a second-preimage attack takes 2n tries where n is the hash size in bits[1]. However, the secret suffix approach is marginally weaker to this kind of attack. If an attacker has seen t text and authenticator pairs, then the effort is only 2n / t since they can attempt a second-preimage match against any of the authenticators they have seen. This is usually not a problem since second-preimage attacks are usually much harder than finding collisions. As they have aged, all widely-used hash algorithms have fallen to collisions before second-preimage attacks.

The other attack is much more powerful. If the attacker can submit a chosen message to be authenticated, she can attempt an offline collision search. In this case, an attacker searches for two messages, m and m’, that hash to the same value. Once they are found, she requests an authenticator for the innocuous message m. Since a collision means the intermediate hash state before k is mixed in is identical (an “internal collision”), the final authenticator for both will be identical. The attacker then sends the evil message m’ with the authenticator for m, the two match, and the message is accepted as authentic.

This means the secret suffix construction is insecure if collisions can be found in the underlying hash function. Due to the birthday paradox, this takes 2n/2 work even for a secure hash function (e.g., 264 operations for a 128-bit hash). But it gets worse if the hash is weaker to collisions.

MD5 has multiple demonstrated collisions. Many systems continue to use HMAC-MD5 because a collision alone is not enough to compromise it. Because of the way the key is applied in HMAC, an attacker would have to generate an internal collision with the secret key, which is much harder than colliding with a chosen message[2]. Although this may provide some immediate comfort, it is still important to move to HMAC-SHA256 soon if you are using HMAC-MD5.

In contrast, MD5 with secret suffix is completely compromised due to collisions, especially with the recent advance in chosen-prefix collisions. Currently, this takes about 30 seconds on a laptop. To repeat, under no circumstances should you use an arbitrary hash construction instead of HMAC, and MD5 with secret suffix is completely broken. If you were putting off moving away from MD5(m || k), now would be an excellent time to move to HMAC-SHA256.

Thanks go to Trevor Perrin and Peter Gutmann for comments on this article.

[1] This is not true for longer messages. Multicollisions can be used against each block of a longer message. See the work by Kelsey and Schneier and Joux for more details.

[2] This is a very broad statement about HMAC. A more detailed analysis of its security will have to wait for another post.

Just another day at the office

The following does not take place over a 24 hour period. But any one of these situations is a good example of a typical day at Root Labs.

Attack Windows device driver protection scheme

Certain drivers in Windows must implement software protection (PMP) in order to prevent audio/video ripping attacks. Since drivers run in ring 0, they can do a lot more than just the standard SEH tricks. It’s 1992 all over again as we dig into the IDT and attempt to find how they are trying to protect their decryption process.

Whitebox cryptography is a method of combining a key and cipher implementation to create a keyed cipher. It can only do whatever operation it was initialized with and uses a hard-coded key. The entire operation becomes a series of table lookups. The intermediate values are obscured by randomness merged into the tables. But it’s not impossible to defeat.

To get at the cipher though, we’re going to need a way to bypass some of these anti-debugging traps. Since ring 0 code has direct access to all the CPU’s registers (including the debug registers, MSRs, and page tables), it is free to wreak havoc with our attempts to circumvent it. A common approach is to disable any breakpoints in the registers by overwriting them. A less common method is to use the debug registers as a local procedure call gate so that an attacker that writes to them breaks the main loop.

We write our own hook and patch into the int 1 handler at a point that isn’t integrity-checked and set the GD bit. This causes our hook to be called whenever anyone writes or reads the debug registers. The trap springs! There’s the heart of the protection code right there. Now to figure out what it’s doing.

Design self-reinforcing integrity checks

Preventing attackers from patching your code is very difficult. One approach is to insert small hash functions that verify a region of code or data has not been changed. Of course, there’s the obvious problem of “who watches the watcher?” Since it’s easy for attackers to NOP out the checksum routine or modify their patch to compensate if you use CRC, our mission today is to design and implement a more robust approach.

First, we analyze the general problem of mutually-reinforcing checks. The code and data for the check itself both need to be covered. But if two checks exactly mirrored each other, you’d have a chicken-and-egg problem in building them since a change in one would require changes in the other, and so on. It’s like standing between two mirrors, infinite recursion.

A data structure that describes the best we can do is a directed acyclic graph. There are no cycles so the checks (and checksums) can be generated in reverse order. But the multiple paths through the graph provide overlapping coverage so we can be certain that a single patch cannot bypass the protection. Of course, the roots of each of these paths needs to be hidden throughout the code. Putting them all in one big loop run by a single watchman thread would be a mistake. Now we have to come up with a way of automatically generating, randomizing, and inserting them into the code while also hiding the root nodes in various places.

Reverse-engineer RFID transponder

What exactly is in those FasTrak transponders? Do they securely identify themselves via cryptographic challenge/response? Do they preserve your privacy? What about this rumor that they are tracked by antennas all over Bay Area freeways?

Not being an existing user, we bought one at a local supermarket and took it apart. Inside was a microcontroller, some passive electronics, and not a whole lot else. An older one turned out to still have JTAG enabled, so it was simple to dump its firmware. The newer one did have the lock bit set, and Flylogic Engineering was kind enough to decap it and zap it for us. The firmware was identical. Does IDA have an MSP430 plugin? Well, there was one but it was on Geocities and then vaporized. Time to dig around a bit. Find the source code and hack it to work with the newer SDK.

Then it’s off to drop in the code and analyze it for switch statements (protocol handlers usually). Trace everything that starts from the IO pins that map to the receive side of the antenna. Then manually walk up the stack to see where it goes. Hey, there’s an over-the-air update function in here. Just send the magic handshake, then the next packet gets written to flash. There are some checks to try to maintain the writes within the area that stores the ID. But there are multiple paths to this function and one of them is obviously not tested because it pushes the wrong size argument on the stack (a 2-byte instead of 1-byte length argument). Time to notify the agency that 1 million transponders are at risk of a permanent DoS at best and code execution at worst.

Coda

If you’ve made it this far, you might be interested to know we’re hiring. We tackle difficult security problems in environments with clever, persistent adversaries. We’re just as likely to design a system as attack it (and often do both for the same customer). If this sounds like your kind of job, please see here for more details.