Glitch attacks revealed

(First in a series of articles on attacking hardware and software by inducing faults)

One of the common assumptions software authors make is that the underlying hardware works reliably. Very few operating systems add their own parity bits or CRC to memory accesses. Even fewer applications check the results of a computation. Yet when it comes to cryptography and software protection, the attacker controls the platform in some manner and thus faulty operation has to be considered.

Fault induction is often used to test hardware during production or simulation runs. It was probably first observed when mildly radioactive material that is a natural part of chip packaging led to random memory bit flips.

When informed that an attacker in possession of a device can induce faults, most engineers respond that nothing useful could come of it. This is similar to the response when buffer overflows were first discovered in software (“so what, the software crashes?”). I often find this “engineering mentality” gets in the way of improving security, with some even insisting you must prove exploitability before fixing a problem.

A good overview paper is “The Sorcerer’s Apprentice Guide to Fault Attacks” by Bar-El et al. In their 1997 paper, “Low Cost Attacks on Tamper Resistant Devices,” Anderson and Kuhn conclude:

“We have improved on Differential Fault Analysis. Rather than needing about 200 faulty ciphertexts to recover a DES key, we need between one and ten. We can factor RSA moduli with a single faulty ciphertext. We can also reverse engineer completely unknown algorithms; this appears to be faster than Biham and Shamir’s approach in the case of DES, and is particularly easy with algorithms that have a compact software implementation such as RC5.”

This is quite a powerful class of attacks, and it is sometimes applicable to software-only systems as well. For instance, a signal handler can often be triggered remotely, inducing faults in execution if the programmer wasn’t careful.

Of course, glitch attacks are most applicable to smart cards, HSMs, and other tamper-resistant hardware. Given the movement to DRM and trusted computing, we can expect to see this category of attack and its defenses become more sophisticated.  Why rob banks? Because that’s where the money is.

RSA public keys are not private (implementation)

Previously, I described a proposed system that will both sign and encrypt code updates using an RSA private key. The goal is to derive the corresponding public key even though it is kept securely within the device.

If you delve into the details of RSA, the public key is the exponent and modulus pair (e, n) and private key is (d, n). The modulus is shared between the two keys and e is almost always 3, 17, or 65537. So in the vendor’s system, only d and n are actually secrets. If we could figure out d, we’ve broken RSA, so that’s likely a dead-end. But if we figure out e and n, we can decrypt the updates.

Let’s see how to efficiently figure out n using guesses of e and a pair of encrypted updates. The signature verification process, RSA-Verify, can be represented by the standard equation shown below in various forms.

sig^e mod n = m
sig^e - m ≡ 0 (mod n)
sig^e - m = k * n

The first step is the standard RSA-Pub-Op, and the result is m, the padded hash of the message. The remaining steps in RSA-Verify (checking the padding and comparing hashes) are not relevant to this attack. In the second equation, m is moved to the left side. Finally, the “≡ 0 (mod n)” form is rewritten as an exact multiple, k * n. The particular value of k is just how many times the “odometer” wrapped around the modulus during exponentiation; it’s not important to us, only the modulus n is.

Now, assume we’ve seen two updates and their signatures, sig1 and sig2. We can convert them to the above form by raising each signature to a guess for e and subtracting the associated value m (the padded hash of each update blob). Note that the update itself is still encrypted; we never need its plaintext for this step.

sig1^e - m1 = k1 * n
sig2^e - m2 = k2 * n
gcd(sig1^e - m1, sig2^e - m2) → n

This gives us two values with n as a common factor. The GCD algorithm finds the common factor of two values very quickly. The reason there is no “=” sign on the last line above is that we actually get n times some small prime factors, not just n. Those can easily be removed by trial division by primes up to 1000, since the real factors of n (p and q) are at least 100-digit numbers. Once we have n, we can take each encrypted update c and decrypt it with c^e mod n as usual, giving us the plaintext code updates.
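In outline, the whole recipe fits in a few lines of Python. This is a minimal sketch with illustrative names, assuming a small public exponent (e.g., e = 3); sig1/sig2 are the signatures and m1/m2 the corresponding padded-hash values, all as integers:

    from math import gcd

    # Primes below 1000, used to strip small spurious factors from the GCD result.
    SMALL_PRIMES = [p for p in range(2, 1000)
                    if all(p % q for q in range(2, int(p ** 0.5) + 1))]

    def recover_modulus(sig1, m1, sig2, m2, e=3):
        # Each sig_i^e - m_i is a multiple of n, so their GCD is n times a few
        # small factors, which trial division removes.
        candidate = gcd(abs(sig1 ** e - m1), abs(sig2 ** e - m2))
        for p in SMALL_PRIMES:
            while candidate % p == 0:
                candidate //= p
        return candidate

    def decrypt_update(ciphertext, e, n):
        # Undo the vendor's private-key "encryption": compute c^e mod n.
        return pow(ciphertext, e, n)

If the guess for e is wrong, the recovered value comes out as nonsense (usually 1 or some other small number), so just try the next common exponent.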

I implemented this attack in Python for small values of e. Try it out and let me know what you think.

Doing this attack with e = 65537 is difficult because the intermediate values are millions of bits long (a 2048-bit signature raised to the 65537th power is over 100 million bits). However, it is feasible and was implemented by Jaffe and Kocher in 1997 using an interesting FFT-multiply method. Perhaps they will publish it at some point. Thanks go to them for explaining this attack to me a few years ago.

Finally, I’d like to reiterate what I’ve said before: public key cryptosystems, and RSA in particular, are extremely fragile. Do not use them in ways they were not designed for. Better yet, seek expert review or assistance for any design involving crypto.

RSA public keys are not private

RSA is an extremely malleable primitive and can hardly be called “encryption” in the traditional sense of block ciphers. (If you disagree, try to imagine how AES could be vulnerable to full plaintext exposure to anyone who can submit a message and get back only “ok” or “decryption incorrect” based on the first couple of bits.) When I am reviewing a system and see the signing operation described as “RSA-encrypt the signature with the private key,” I know I’ll be finding some kind of flaw in that area of the implementation.
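To make “malleable” concrete, here is a toy Python illustration (deliberately tiny, insecure parameters, chosen only for demonstration): multiplying two RSA ciphertexts produces a valid ciphertext for the product of the plaintexts, a property no block cipher shares.

    # Toy RSA multiplicative malleability demo -- parameters far too small for real use.
    p, q, e = 61, 53, 17
    n = p * q                            # 3233
    d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (requires Python 3.8+)

    m1, m2 = 42, 99
    c1, c2 = pow(m1, e, n), pow(m2, e, n)

    # The product of the ciphertexts decrypts to the product of the plaintexts.
    forged = (c1 * c2) % n
    assert pow(forged, d, n) == (m1 * m2) % n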

The entire PKCS#1 standard exists to add structure to the payload of RSA to eliminate as much malleability as possible. To distinguish the RSA primitive operation from higher-level constructs that use the RSA primitive, I use the following terminology.

RSA-Pub-Op / RSA-Priv-Op: perform the raw RSA operation with the appropriate key (exponent and modulus): data^exp mod n
RSA-Encrypt: add random padding, then transform with RSA-Pub-Op
RSA-Decrypt: transform with RSA-Priv-Op and check the padding in the result
RSA-Sign: add padding, then transform with RSA-Priv-Op
RSA-Verify: transform with RSA-Pub-Op and check the padding in the result; hash the data and compare with the hash in the payload; return True if they match, else False
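In code, the distinction looks roughly like the sketch below (Python, with a toy padding helper standing in for real PKCS#1 padding; the names are illustrative, not a real library API):

    import hashlib

    def _toy_pad(digest: bytes, n: int) -> int:
        # Stand-in for PKCS#1-style signature padding: 0x00 0x01 FF..FF 0x00 || digest,
        # sized to the modulus. Real code must use a vetted crypto library instead.
        size = (n.bit_length() + 7) // 8
        padded = b"\x00\x01" + b"\xff" * (size - len(digest) - 3) + b"\x00" + digest
        return int.from_bytes(padded, "big")

    def rsa_op(data: int, exp: int, n: int) -> int:
        # RSA-Pub-Op / RSA-Priv-Op: the raw primitive, data^exp mod n.
        return pow(data, exp, n)

    def rsa_sign(message: bytes, d: int, n: int) -> int:
        # RSA-Sign: pad the message hash, then apply the private-key operation.
        return rsa_op(_toy_pad(hashlib.sha256(message).digest(), n), d, n)

    def rsa_verify(message: bytes, sig: int, e: int, n: int) -> bool:
        # RSA-Verify: apply the public-key operation, re-derive the padded hash, compare.
        return rsa_op(sig, e, n) == _toy_pad(hashlib.sha256(message).digest(), n)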

With that out of the way, a common need in embedded systems is secure code updates. A good approach is to RSA-Sign the code updates at the manufacturer. The fielded device then loads the update into RAM, performs RSA-Verify to be sure it is valid, checks version and other info, then applies the update to flash.

One vendor was signing their updates, but decided they also wanted privacy for the update contents. Normally, this would involve encrypting the update with AES-CBC before signing. However, they didn’t want to add another key to the secure processor since the design was nearly complete.

The vendor decided that since they already had an RSA public key in the device for signature verification, they would also encrypt the update to that key: the manufacturer encrypts with the corresponding private key, and the device “decrypts” with the public key it already holds. That “public” key was never revealed outside their secure processor, so it was considered secret.

Let’s call the public/private key pair P1 and S1, respectively. Here’s their new update process:

1. Encrypt: ciphertext = RSA-Priv-Op(update, S1)
2. Sign: sig = RSA-Sign(ciphertext, S1)
3. Send (ciphertext, sig) to the device

On receiving the update, the secure processor would do the inverse, verifying the signature and then decrypting the code. (Astute reader: yes, there would be a chaining problem if encrypting more than one RSA block’s worth of data, but let’s assume the code update is small.)

What’s wrong? Nothing, if you’re thinking about RSA in terms of conventional crypto. The vendor is encrypting something to a key that is not provided to an attacker. Shouldn’t this be as secure as a secret AES key? Next post, I’ll reveal how the malleability of the RSA primitive allows n to be easily calculated.

JTAG attacks and PR submarines

Security research publications come in two varieties: genuine advances and PR submarines (stories that sound like real advances but are more clever PR than substance). Barnaby Jack’s recent announcement of attacking embedded systems via JTAG is definitely the latter. Since the trade press is always looking for interesting angles, it is especially susceptible to PR submarines.

Background: the attack uses the standard JTAG port present on nearly all chipsets and CPUs. This port is used for factory and field diagnostics and provides device-specific access to the internal flip-flops that store all the chip’s state. A technician typically uses a GUI (aka in-circuit emulator) on the host PC to set breakpoints, read/write internal registers, dump memory, and perform other debugger-like functions. Secure processors like smart cards already disable JTAG before the chips leave the factory to prevent this kind of attack.

Like Schneier’s snake oil crypto test, let’s examine how to identify security PR submarines.

1. Attack has been done before (bonus: no citation of prior work in the same area)

Check. Since JTAG access gives the hardware equivalent of a software debugger, attackers have been using it from the beginning. The first attackers were probably competitors reverse engineering designs to copy them or improve their own. A packaged version of this attack has been in use for years to get free satellite TV. No mention of any of this history appears in the article.

2. Researcher previously gave same talk at another conference

Check. Keep these slides open for reference below. He is probably speaking on another application of the same attack, but count on the talk being quite similar.

3. Implications of attack wildly speculative

An attacker with physical access to the circuit board can control the device. Yes, that’s what JTAG is for. But there is no way this allows an attacker to “redirect Internet traffic on routers” without physical access to those routers. Perhaps Mr. Jack was unaware that this attack primarily matters for tamper-resistant devices (i.e., smart cards), where the device itself must protect stored cash, authentication secrets, or other data against physical attacks. That may be why he added a nice but wholly unnecessary application: modifying the software on a home router to insert trojan code into EXEs (slides 35-38).

4. Attack uses very polished, mature tools and requires little or no custom development

Check. Note the use of a GUI in-circuit emulator on slides 18 and 21. The only custom development I can see is the ARM code to modify the TCP packets. He could have inserted that code via a socketed flash chip instead of using JTAG, but that would not sound as cool.

5. Deployed systems already have defenses against the attack

Check. JTAG is already disabled in any tamper-resistant processor, and nearly every microcontroller made has a fuse to disable it.

6. New researcher or new field for existing researcher

Barnaby Jack (formerly of eEye) has done awesome work on win32 kernel shellcode. Not to slight his previous work, but hardware is a new direction for him.

7. Venue is a talk at a minor conference, not a peer-reviewed paper (bonus: no details given)

Check. CanSecWest does not require a paper, and I don’t expect Mr. Jack to publish one, although it’s possible he might. And what’s this about Juniper, his employer, sponsoring CanSecWest?

8. Announcement first appears in trade press or Slashdot

Check and check.

9. Slogan or catch-phrase consistently used to advertise attack

Check. Closing quote for the article is “I’m looking at my microwave oven right now, but I don’t think there’s much I could do with that.” See also intro slide 3 for the previous talk.

What is it about CanSecWest that attracts such sensationalism? Is there just no other way to justify a trip to Canada in your travel budget?