Chris Tarnovsky demos smart card hacking

May 30, 2008 ~ Nate Lawson

Chris Tarnovsky spoke to Wired and demonstrated his smart card hacking techniques. Chris became famous for hacking Echostar cards, which resulted in a multibillion-dollar lawsuit against News Corp. The trial ended in a pretty big win for DirecTV smart card supplier NDS, with the jury finding them liable only for trying out a packaged attack that converted a DirecTV smart card to decrypt Dish Network video. The penalty for this? A whopping $46.95 (i.e., one month’s subscription fee) plus statutory damages.

In the video, Chris demonstrates some interesting techniques. It shows him decapsulating an ST16 CPU made by ST Micro from an IBM Java card. Then, he uses nail polish to mask the die and rust remover (i.e., hydrofluoric acid) to etch away the top metal layer of protective mesh to get at the CPU’s bus. He then uses a sewing needle to tap each line of the 8-bit bus in turn and reassembles the data in software. He could just as easily have driven the lines, allowing him to change instructions or data at will.

This attack is pretty amazing. It is simple in that it does not require an expensive FIB or other rework tools. Instead, a microscope and careful work with some household chemicals are all it takes. While the ST16 is a bit old, it was used in Echostar, hence the relevance of Chris’s demonstration. It will be interesting to see what happens next in the evolving capabilities for home-based silicon analysis.
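The "reassemble the data in software" step could look something like the sketch below. It is only an illustration under assumptions not stated in the video: that the card runs the same deterministic code on every power-up, so eight separate single-line captures can be aligned clock cycle by clock cycle, and that line 0 is the least significant bit. The function names are made up for this example.

```python
def reassemble_bus(line_captures):
    """Combine eight per-line bit captures into a list of bus bytes.

    line_captures[i] holds the samples (0 or 1) seen on bus line i,
    one sample per clock cycle. Line 0 is assumed to be the LSB.
    """
    if len(line_captures) != 8:
        raise ValueError("expected one capture per line of an 8-bit bus")
    cycles = len(line_captures[0])
    if any(len(capture) != cycles for capture in line_captures):
        raise ValueError("captures must cover the same number of cycles")
    return [
        sum(line_captures[bit][cycle] << bit for bit in range(8))
        for cycle in range(cycles)
    ]


def split_into_lines(data):
    """Inverse helper: what each of the eight lines would show for `data`."""
    return [[(byte >> bit) & 1 for byte in data] for bit in range(8)]


if __name__ == "__main__":
    # Pretend these three bytes crossed the bus during one run.
    captures = split_into_lines(b"\xa0\x4f\x12")
    print([hex(value) for value in reassemble_bus(captures)])
```

A real capture would also need a trigger (such as reset release) to line the runs up, which this sketch takes for granted.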

Anti-debugging: using up a resource versus checking it

May 21, 2008 (updated May 22, 2008) ~ Nate Lawson ~ 7 Comments

After my last post claiming anti-debugging techniques are overly depended on, various people asked how I would improve them. My first recommendation is to go back to the drawing board and make sure all the components of your software protection design hang together in a mesh. In other words, each has the property that the system will fail closed if a given component is circumvented in isolation. However, anti-debugging does have a place (alongside other components including obfuscation, integrity checks, etc.), so let’s consider how to implement it better.

The first principle of implementing anti-debugging is to use up a resource instead of checking it. The initial idea most novices have is to check a resource to see if a debugger is present. This might mean calling IsDebuggerPresent() or, if more advanced, using GetThreadContext() to read the values of the hardware debug registers DR0-7. Then, the implementation is often as simple as “if (debugged): FailNoisily()” or, if more advanced, a few inline versions of that check function scattered throughout the code.

The problem with this approach is that it provides a choke point for the attacker: a single bottleneck where one change circumvents all instances of a check, no matter where they appear in the code or how obfuscated they are. For example, IsDebuggerPresent() reads a value from the PEB struct in the process’s own memory. So attaching a debugger and then setting that variable to zero circumvents all uses of IsDebuggerPresent(). Looking for suspicious values in DR0-7 means the attacker merely has to hook GetThreadContext() and return 0 for those registers. In both cases, the attacker could also skip over the check so it always returns “OK”.

Besides the API bottleneck, anti-debugging via checking is fundamentally flawed because it suffers from “time of check, time of use” problems, aka race conditions. An attacker could quickly swap out the values or virtualize accesses to the particular resources in order to lie about their state. Meanwhile, they could still be using the debug registers for their own purposes while stubbing out GetThreadContext(), for example.

Contrast this with anti-debugging by using up a resource. For example, a timer callback could be set up to vector through PEB!BeingDebugged. (Astute readers will notice this is a byte-wide field, so it would actually have to be an offset into a jump table). This timer would fire normally until a debugger was attached. Then the OS would store a non-zero value in that location, overwriting the timer callback vector. The next time the timer fired, the process would crash. Even if an attacker poked the value to zero, it would still crash. They would have to find another route to read the magic value in that location to know what to store there after attaching the debugger. To prevent merely disabling the timer, the callback should perform some essential operation to keep the process operating.
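A toy model may make the difference concrete. The sketch below is a plain Python simulation, not Windows code: `being_debugged` stands in for the byte-wide PEB field, `MAGIC_OFFSET` is an arbitrary value invented for the example, and `Crash` stands in for the process jumping into the weeds. There is no check anywhere; the field is simply used as an index into a jump table.

```python
MAGIC_OFFSET = 0x5A  # hypothetical offset the program picks at startup


class Crash(Exception):
    """Stands in for the process vectoring through a bad offset."""


class ToyProcess:
    def __init__(self):
        self.work_done = 0
        # 256-entry jump table: every slot crashes except the magic one.
        self.jump_table = [self._crash] * 256
        self.jump_table[MAGIC_OFFSET] = self._essential_work
        # Use up the resource: the flag byte now holds our table offset.
        self.being_debugged = MAGIC_OFFSET

    def _crash(self):
        raise Crash("timer vectored through a clobbered offset")

    def _essential_work(self):
        # The callback does something the program cannot live without,
        # so simply disabling the timer is not an option.
        self.work_done += 1

    def timer_tick(self):
        # No "if debugged" test: the byte is consumed as a table index.
        self.jump_table[self.being_debugged]()

    def attach_debugger(self):
        self.being_debugged = 1  # models the OS storing a non-zero flag


def survives_tick(process):
    """Return True if the next timer tick runs the essential work."""
    try:
        process.timer_tick()
        return True
    except Crash:
        return False


if __name__ == "__main__":
    process = ToyProcess()
    print("before attach:", survives_tick(process))
    process.attach_debugger()
    print("after attach:", survives_tick(process))
    process.being_debugged = 0  # attacker pokes the flag back to zero
    print("after poking zero:", survives_tick(process))
```

In this model the only way to keep the process alive after attaching is to learn `MAGIC_OFFSET` by some other route and write it back.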

Now this is a simple example and there are numerous ways around it, but it illustrates my point. If an application uses a resource that the attacker also wants to use, it’s harder to circumvent than a simple check of the resource state.

An old example of this form of anti-debugging in the C64 was storing loader code in screen memory. As soon as the attacker broke into a debugger, the prompt would overwrite the loader code. Resetting the machine would also overwrite screen memory, leaving no trace of the loader. Attackers had to realize this and jump to a separate routine elsewhere in memory that made a backup copy of screen memory first.

For x86 hardware debug registers, a program could keep an internal table of functions but do no compile-time linkage. Each site of a function call would be replaced with code to store an execute watchpoint for EIP+4 in a debug register and a “JMP SlowlyCrash” following that. Before executing the JMP, the CPU would trigger a debug exception (INT1) since the EIP matched the watchpoint. The exception handler would look up the calling EIP in a table, identify the real target function, set up the stack, and return directly there. The target could then return back to the caller. All four debug registers should be utilized to avoid leaving room for the attacker’s breakpoints.

There are a number of desirable properties for this approach. If the attacker executes the JMP, the program goes off into the weeds but not immediately. The same thing happens as soon as they set a hardware breakpoint, since the attacker’s breakpoint overwrites the function’s watchpoint and the JMP is executed. The attacker would have to write code that virtualized calls to get and set DR0-7 and performed virtual breakpoints based on what the program thought should be the correct values. Such an approach would work, but would slow down the program enough that timing checks could detect it.
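The control flow of the debug register scheme can be modeled in a few lines. This is a simulation only: `ToyCPU`, `arm`, and `run_past` are names invented for the sketch, the addresses are arbitrary, and a Python exception stands in for the "JMP SlowlyCrash" path. Real code would program DR0-DR3 and handle INT1 in an exception handler.

```python
class SlowlyCrash(Exception):
    """Stands in for the JMP SlowlyCrash path being executed."""


class ToyCPU:
    def __init__(self, targets):
        self.dr = [0, 0, 0, 0]  # models the four address registers DR0-DR3
        self.targets = targets  # watched address -> real target function

    def arm(self, site_addr, slot):
        """Call-site prologue: store an execute watchpoint for site_addr + 4."""
        self.dr[slot] = site_addr + 4

    def run_past(self, site_addr, *args):
        """Reach the instruction at site_addr + 4 (the JMP SlowlyCrash)."""
        eip = site_addr + 4
        if eip in self.dr:
            # Watchpoint hit before the JMP runs: the debug exception
            # handler looks up the real target and transfers control there.
            return self.targets[eip](*args)
        # Watchpoint was lost, so the JMP really executes.
        raise SlowlyCrash(hex(eip))


def add(a, b):
    return a + b


if __name__ == "__main__":
    cpu = ToyCPU({0x1004: add})
    cpu.arm(0x1000, slot=0)
    print("normal call:", cpu.run_past(0x1000, 2, 3))

    cpu.arm(0x1000, slot=0)
    cpu.dr[0] = 0x2000  # attacker's hardware breakpoint clobbers the slot
    try:
        cpu.run_past(0x1000, 2, 3)
    except SlowlyCrash:
        print("attacker breakpoint: program heads for SlowlyCrash")
```

The model leaves out the timing side entirely; it only shows why a clobbered register turns a function call into the crash path.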

I hope these examples make the case for anti-debugging via using up a resource versus checking it. This is one implementation approach that can make anti-debugging more effective.

[Followup: Russ Osterlund has pointed out in the comments section why my first example is deficient. There’s a window of time between CreateProcess(DEBUG_PROCESS) and the first instruction execution where the kernel has already set PEB!BeingDebugged but the process hasn’t been able to store its jump table offset there yet. So this example is only valid for detecting a debugger attach after startup and another countermeasure is needed to respond to attaching beforehand. Thanks, Russ!]

Debian needs some serious commit review

May 19, 2008 ~ Nate Lawson ~ 22 Comments

You’ve probably heard by now about the gaping hole in keys generated by Debian’s OpenSSL. If not, the summary is that your SSH keys and SSL certs were selected from a fixed pool of 2¹⁵ (32,768) possibilities, and are thus easy to brute-force over the network. If you have any keys generated on a Debian system, you need to immediately replace them or disable the associated service. It’s that bad: remote login or root with only a few thousand tries.
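To see why a pool that small is fatal, here is a minimal sketch of the attack shape. Everything in it is a stand-in: `toy_keygen` uses Python's `random.Random` rather than OpenSSL's PRNG, the "key" is just 16 bytes, and the fingerprint is a SHA-1 of those bytes. The only property it shares with the real bug is that the generator's output is fully determined by one of 2¹⁵ seeds.

```python
import hashlib
import random

POOL_SIZE = 2 ** 15  # number of distinct seeds the generator can ever see


def toy_keygen(seed):
    """Stand-in key generator whose only entropy is `seed`.

    Not OpenSSL: random.Random merely models a seed-limited PRNG.
    """
    rng = random.Random(seed)
    return rng.getrandbits(128).to_bytes(16, "big")


def fingerprint(key):
    """Public value an attacker can observe (hypothetical format)."""
    return hashlib.sha1(key).hexdigest()


def recover_key(target_fingerprint):
    """Enumerate the whole pool until a key matches the fingerprint."""
    for seed in range(POOL_SIZE):
        key = toy_keygen(seed)
        if fingerprint(key) == target_fingerprint:
            return seed, key
    return None


if __name__ == "__main__":
    victim_key = toy_keygen(12345)
    seed, key = recover_key(fingerprint(victim_key))
    print("recovered seed", seed, "key matches:", key == victim_key)
```

Because the pool is fixed, an attacker can also generate every possible key once, ahead of time, and simply try them in turn against a live service.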

Luckily, Debian recently fixed this 2-year-old hole in this commit. Great, right? Except that a quick comparison against the commit that introduced the bug shows they reverted only one of the two places where it was added. So they still didn’t fix it completely.

As a past FreeBSD committer and crypto engineer, I knew any commits to the PRNG or other such critical code were subject to intense review. That’s before they could be committed. If a committer were found to have introduced such a fatal flaw, the patch to fix it would have been doubly-scrutinized before being allowed into the tree. Apparently, the same guy who introduced the bug was left to screw up the fix.

Once more, this time with prior review!

Edit: a commenter informed me that there was review of this fix, and Debian decided to leave their implementation silently incompatible with the OpenSSL API docs.

“If a Microsoft developer commented out seeding in Vista CryptGenRandom(), they would be fired 12 times. Then Microsoft would buy the next company that hired them in order to fire them again.”
— Thomas Ptacek

Are single-purpose devices the solution to malware?

May 12, 2008 ~ Nate Lawson ~ 1 Comment

I recently watched this fascinating talk by Jonathan Zittrain, author of The Future of the Internet — And How to Stop It. He covers everything from the Apple II to the iPhone and trusted computing. His basic premise is that malware is driving the resurgence of locked-down, single-purpose devices.

I disagree with that conclusion. I think malware will always infect the most valuable platform. If the iPhone were as widely deployed as Windows PCs, you can bet people would be targeting it with keyloggers, closed platform or not. In fact, the motivation of people to find ways around the vendor’s protection on their own phone leads to a great malware channel (trojaned jailbreak apps, anyone?).

However, I like his analysis of what makes some open systems resilient (his example: Wikipedia defacers) and some susceptible to being gamed (Digg users selling votes). He claims it’s a matter of how much members consider themselves a part of the system versus outside it. I agree that designing in aspects of accountability and aggregated reputation helps, whereas excessive perceived anonymity can lead to antisocial behavior.

Warning signs you need a crypto review

May 6, 2008 ~ Nate Lawson ~ 4 Comments

At the RSA Conference, attendees were given a SanDisk Cruzer Enterprise flash drive. I decided to look up the user manual and see what I could find before opening up the part itself. The manual appears to be an attempt at describing its technical security aspects without giving away too much of the design. Unfortunately, it seems more targeted at buzzword compliance and leaves out some answers critical to determining how secure its encryption is.

There are some good things to see in the documentation. It uses AES and SHA-1 instead of keeping quiet about some proprietary algorithm. It appears to actually encrypt the data (instead of hiding it in a partition that can be made accessible with a few software commands). However, there are also a few troublesome items that are good examples of signs that a more in-depth review is needed.

1. Defines new acronyms not used by cryptographers

Figure 2 is titled “TDEA Electronic Code Book (TECB) Mode”. I had to scratch my head for a while. TDEA is another term for Triple DES, an older NIST encryption standard. But this documentation said it uses AES, which is the replacement for DES. Either the original design used DES and they moved to AES, or someone got their terms mixed up and confused a cipher name for a mode of operation. Either way, “TECB” is meaningless.

2. Uses ECB for bulk encryption

Assuming they do use AES-ECB, that’s nothing to be proud of. ECB encrypts one cipher-block-sized chunk at a time, each independently of the others. This results in a “spreading” of data by the cipher block size. However, patterns are still visible since any two identical 16-byte plaintext blocks will encrypt to the same ciphertext.

All flash memory is accessed in pages much bigger than the block size of AES. Flash page sizes are typically 1024 bytes or more versus AES’s 16-byte blocksize. So there’s no reason to only encrypt in 16-byte units. Instead, a cipher mode like CBC where all the blocks in the page are chained together would be more secure. A good review would probably recommend that, along with careful analysis of how to generate the IV, supply integrity protection, etc.
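The pattern leak is easy to demonstrate. The sketch below does not use AES: to stay within the Python standard library, `toy_block_encrypt` is a truncated HMAC-SHA256 standing in for a keyed block cipher (deterministic, but not invertible, so it can only illustrate the encrypt direction). The key, IV, and page contents are made-up test values. What matters is the structure: ECB handles each 16-byte block alone, while CBC-style chaining mixes each block with the previous ciphertext.

```python
import hashlib
import hmac

BLOCK = 16  # AES block size in bytes


def toy_block_encrypt(key, block):
    """Keyed, deterministic stand-in for encrypting one block (NOT AES)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def split_blocks(page):
    if len(page) % BLOCK:
        raise ValueError("page length must be a multiple of the block size")
    return [page[i:i + BLOCK] for i in range(0, len(page), BLOCK)]


def ecb_encrypt(key, page):
    """Each block is encrypted on its own, so equal inputs give equal outputs."""
    return [toy_block_encrypt(key, block) for block in split_blocks(page)]


def cbc_encrypt(key, iv, page):
    """Each block is XORed with the previous ciphertext block before encryption."""
    out, previous = [], iv
    for block in split_blocks(page):
        mixed = bytes(a ^ b for a, b in zip(block, previous))
        previous = toy_block_encrypt(key, mixed)
        out.append(previous)
    return out


if __name__ == "__main__":
    key = b"k" * 16
    iv = b"\x01" * 16
    page = b"SAME 16B PATTERN" * 64  # one 1024-byte page of repeated data
    print("distinct ECB blocks:", len(set(ecb_encrypt(key, page))))
    print("distinct CBC blocks:", len(set(cbc_encrypt(key, iv, page))))
```

With ECB the 64 identical plaintext blocks collapse to a single repeated ciphertext block, which is exactly the visible pattern described above. Chaining hides the repetition within a page, though it still depends on the IV being chosen carefully per page.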

3. Key management not defined

The device “implements a SHA-1 hash function as part of access control and creation of a symmetric encryption key”. It also “implements a hardware Random Number Generator”.

Neither of these statements is sufficient to understand how the bulk encryption key is derived. Is it a single hash iteration of the password? Then it is more open to dictionary attacks. Passphrases longer than the input size would also be less secure since the second half of the passphrase might be hashed by itself. This is the same attack that was usable against Microsoft LANMAN hashes, but that scheme was designed in the late 1980s, not 2007.
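As a rough illustration of the dictionary attack concern, suppose (purely as an assumption, since the manual does not say) that the key were the first 16 bytes of SHA-1 applied to the password. The `derive_key` and `dictionary_attack` functions below are hypothetical names for this sketch. The `iterations` parameter shows how repeating the hash multiplies the attacker's cost per guess.

```python
import hashlib


def derive_key(password, iterations=1):
    """Hypothetical derivation: SHA-1 applied `iterations` times, truncated.

    This is a guess at one possible design, not the device's actual scheme.
    """
    digest = password.encode("utf-8")
    for _ in range(iterations):
        digest = hashlib.sha1(digest).digest()
    return digest[:16]  # 128-bit key


def dictionary_attack(target_key, wordlist, iterations=1):
    """Try each candidate; return (password or None, hash operations spent)."""
    hashes = 0
    for candidate in wordlist:
        hashes += iterations
        if derive_key(candidate, iterations) == target_key:
            return candidate, hashes
    return None, hashes


if __name__ == "__main__":
    wordlist = ["letmein", "password", "rsa2008"]
    print(dictionary_attack(derive_key("rsa2008"), wordlist))
    print(dictionary_attack(derive_key("rsa2008", 1000), wordlist, 1000))
```

With one iteration each guess costs a single hash, so a large wordlist can be tried almost for free. Iterating the hash does not fix a weak password, but it raises the price of every guess by the iteration count.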

4. No statements about tamper resistance, side channels, etc.

For all its faults, the smart card industry has been hardening chips against determined attackers for many years now. I have higher hopes for an ASIC design that originated in the satellite TV or EMV world where real money is at stake than in complex system-on-chip designs. They just have a different pedigree. Some day, SoC designs may have weathered their own dark night of the soul, but until then, they tend to be easy prey for Christopher Tarnovsky.

Finally, I popped open the case (glued, no epoxy) to analyze it. Inside are the flash chips and a single system-on-chip that contains the ARM CPU, RAM, USB, and flash controller. It would be interesting to examine the test points for JTAG, decap it, etc.

Knowing only what I’ve found so far, I would be uncomfortable recommending such a device to my clients. There are many signs that an independent review would yield a report better suited to understanding the security architecture and even lead to fixing various questionable design choices.

Next Baysec: May 7 at Pete’s Tavern

May 5, 2008 ~ Nate Lawson

The next Baysec meeting is this Wednesday at Pete’s Tavern again. Come out and meet fellow security people from all over the Bay Area. As always, this is not a sponsored meeting, there is no agenda or speakers, and no RSVP is needed.

See you on Wednesday, May 7th, 7-11 pm.

Pete’s Tavern
128 King St. (at 2nd)
San Francisco