Next Baysec: June 19th at Pete’s Tavern

The next Baysec meeting is Thursday at Pete’s Tavern. Come out and meet fellow security people from all over the Bay Area.  As always, this is not a sponsored meeting, there is no agenda or speakers, and no RSVP is needed.  Thanks go to Ryan for planning all this.

See you on Thursday, June 19th, 7-11 pm.

Pete’s Tavern
128 King St. (at 2nd)
San Francisco

China hax0rs US

Like any mainstream article on security, this recent AP article sensationalizes China’s response to multiple accusations of state-sponsored hacking. First, the money quote:

“Is there any evidence? … Do we have such advanced technology? Even I don’t believe it.”
— Foreign Ministry spokesman Qin Gang

Is this supposed to play into some pompous Western belief that China is a backwater and thus incapable of hacking computers? Does anyone believe it takes advanced technology to break into PCs?

Next we have the meaningless numbers. The Pentagon claims its network is scanned or attacked 300 million times a day. For this to be true, that would be an average of roughly 3,500 times per second. If we consider every packet to be a scan, that is about 200 KB/second. However, the entire port scan should be considered a single attempt. Of course, bigger numbers sound scarier and justify a higher budget. Perhaps each TCP option in the header of each packet could be considered a separate attempt since they could be attacking both timestamp and window scaling implementations!

The more interesting allegations are that China copied the contents of a laptop belonging to the visiting U.S. Commerce Secretary and hacked into the office computers of two House representatives. The laptop incident stands out since it seems easier to prove. Did they confiscate the laptop and take it to another room? Did the file access times change, or was it powered off? I assume he continued using the laptop during the trip, which would make tampering harder to detect. Was he using disk encryption? Why not?

The allegations regarding the two House members are much less provable. The FBI investigated their computers and said they’d been accessed by people in China. How did they first decide they should call the FBI? Porn popups? Without more evidence showing a clear intent, this is more likely a malware incident. It is surprisingly convenient that their allegations appear alongside House Intelligence committee meetings on hacking.

Interview about DRM on Security Focus

Security Focus just posted this interview with me about DRM. Here are a few choice quotes.

On authoring software protection for Vista:

The rules of the game are changing recently with Microsoft Vista kernel patch protection. If you’re a rootkit author, you just bypass it. If you’re a software protection designer, you have to play by its rules. For the first time in the PC’s history, it’s not a level playing field any more. Virus scanner authors were the first to complain about this, and it will be interesting to see how this fundamental change affects the balance of power in the future.

On using custom hardware for protection:

Custom hardware often gives you a longer period until the first break since it requires an attacker’s time and effort to get up to speed on it. However, it often fails more permanently once cracked since the designers put all their faith in the hardware protection.

Chris Tarnovsky demos smart card hacking

Chris Tarnovsky spoke to Wired and demonstrated his smart card hacking techniques. Chris became famous for hacking Echostar cards, which resulted in a multi-billion dollar lawsuit against News Corp. The trial ended in a pretty big win for DirecTV smart card supplier NDS, with the jury finding NDS liable only for trying out a packaged attack that converted a DirecTV smart card to decrypt Dish Network video. The penalty for this? A whopping $46.95 (i.e., one month’s subscription fee) plus statutory damages.

In the video, Chris demonstrates some interesting techniques. It shows him decapsulating an ST16 CPU (made by ST Micro) from an IBM Java card. He uses nail polish to mask the die and rust remover (i.e., hydrofluoric acid) to etch away the top metal layer of protective mesh to get at the CPU’s bus. He then uses a sewing needle to tap each line of the 8-bit bus in turn, reassembling the data in software. He could just as easily have driven the lines, allowing him to change instructions or data at will.

This attack is pretty amazing. It is simple in that it does not require an expensive FIB or other rework tools. Instead, a microscope and careful work with some household chemicals are all it takes. While the ST16 is a bit old, it was used in Echostar, hence the relevance of Chris’s demonstration. It will be interesting to see what happens next in the evolving capabilities for home-based silicon analysis.

Anti-debugging: using up a resource versus checking it

After my last post claiming anti-debugging techniques are relied on too heavily, various people asked how I would improve them. My first recommendation is to go back to the drawing board and make sure all the components of your software protection design hang together in a mesh. In other words, each has the property that the system fails closed if that component is circumvented in isolation. However, anti-debugging does have a place (alongside other components such as obfuscation, integrity checks, etc.), so let’s consider how to implement it better.

The first principle of implementing anti-debugging is to use up a resource instead of checking it. The initial idea most novices have is to check a resource to see if a debugger is present. This might be calling IsDebuggerPresent() or, if more advanced, using GetThreadContext() to read the values of the hardware debug registers DR0-7. The implementation is then often as simple as “if (debugged): FailNoisily()” or, at best, a few inline copies of that check scattered throughout the code.
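As a concrete illustration, here is roughly what the naive check-based approach looks like on Windows. The API calls are real; FailNoisily() and the overall structure are hypothetical.

    #include <windows.h>

    /* Hypothetical response routine; a real protection scheme would do
     * something less obvious than exiting immediately. */
    static void FailNoisily(void)
    {
        ExitProcess(1);
    }

    /* The naive "check the resource" style of anti-debugging. */
    static void CheckForDebugger(void)
    {
        CONTEXT ctx = { 0 };

        /* Check 1: the documented API, which just reads PEB!BeingDebugged. */
        if (IsDebuggerPresent())
            FailNoisily();

        /* Check 2: look for hardware breakpoints in DR0-DR3. */
        ctx.ContextFlags = CONTEXT_DEBUG_REGISTERS;
        if (GetThreadContext(GetCurrentThread(), &ctx) &&
            (ctx.Dr0 || ctx.Dr1 || ctx.Dr2 || ctx.Dr3))
            FailNoisily();
    }

    int main(void)
    {
        CheckForDebugger();
        /* protected work continues here */
        return 0;
    }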

The problem with this approach is that it provides a choke point for the attacker: a single bottleneck that, once defeated, circumvents every instance of the check no matter where it appears in the code or how obfuscated it is. For example, IsDebuggerPresent() reads a value from the PEB struct in local memory. So attaching a debugger and then setting that variable to zero circumvents all uses of IsDebuggerPresent(). Looking for suspicious values in DR0-7 means the attacker merely has to hook GetThreadContext() and return 0 for those registers. In both cases, the attacker could also skip over the check so it always returned “OK”.
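The corresponding bypass is just as small. Whether done from the debugger’s command line or from a tiny injected stub like this sketch (which assumes x64 and the PEB definition from winternl.h), one byte write defeats every IsDebuggerPresent() check in the process:

    #include <windows.h>
    #include <winternl.h>
    #include <intrin.h>

    /* Attacker-side sketch: clear the flag that IsDebuggerPresent() reads. */
    static void HideDebugger(void)
    {
        PPEB peb = (PPEB)__readgsqword(0x60);  /* PEB pointer from the TEB (x64) */
        peb->BeingDebugged = 0;                /* IsDebuggerPresent() now returns FALSE */
    }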

Besides the API bottleneck, anti-debugging via checking is fundamentally flawed because it suffers from “time of check, time of use” race conditions. An attacker can quickly swap out the values or virtualize accesses to the particular resources in order to lie about their state. Meanwhile, they can still use the debug registers for their own purposes while stubbing out GetThreadContext(), for example.

Contrast this with anti-debugging by using up a resource. For example, a timer callback could be set up to vector through PEB!BeingDebugged. (Astute readers will notice this is a byte-wide field, so it would actually have to be an offset into a jump table). This timer would fire normally until a debugger was attached. Then the OS would store a non-zero value in that location, overwriting the timer callback vector. The next time the timer fired, the process would crash. Even if an attacker poked the value to zero, it would still crash. They would have to find another route to read the magic value in that location to know what to store there after attaching the debugger. To prevent merely disabling the timer, the callback should perform some essential operation to keep the process operating.
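Here is a minimal sketch of that timer idea on x64 Windows. MAGIC, the table layout, and DoEssentialWork() are all invented for illustration; the point is that the dispatch only works while the program’s own value is stored in PEB!BeingDebugged.

    #include <windows.h>
    #include <winternl.h>
    #include <intrin.h>

    #define MAGIC      0x2a   /* offset the program expects to find in the PEB byte */
    #define TABLE_SIZE 256

    typedef void (*work_fn)(void);

    static void DoEssentialWork(void)   { /* e.g. refresh a license heartbeat */ }
    static void GoOffIntoTheWeeds(void) { *(volatile int *)0 = 1; }

    static work_fn g_table[TABLE_SIZE];

    static PPEB CurrentPeb(void)
    {
        return (PPEB)__readgsqword(0x60);   /* x64-specific */
    }

    static VOID CALLBACK TimerProc(HWND hwnd, UINT msg, UINT_PTR id, DWORD time)
    {
        (void)hwnd; (void)msg; (void)id; (void)time;
        /* Vector through whatever byte is currently stored in the PEB. When a
         * debugger attaches, the OS overwrites it with 1; poking it back to 0
         * still misses the MAGIC slot, so either way the process dies. */
        g_table[CurrentPeb()->BeingDebugged]();
    }

    int main(void)
    {
        for (int i = 0; i < TABLE_SIZE; i++)
            g_table[i] = GoOffIntoTheWeeds;
        g_table[MAGIC] = DoEssentialWork;

        CurrentPeb()->BeingDebugged = MAGIC;   /* claim the byte for ourselves */

        SetTimer(NULL, 0, 1000, TimerProc);
        MSG m;
        while (GetMessage(&m, NULL, 0, 0) > 0)
            DispatchMessage(&m);
        return 0;
    }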

Now this is a simple example and there are numerous ways around it, but it illustrates my point. If an application uses a resource that the attacker also wants to use, it’s harder to circumvent than a simple check of the resource state.

An old example of this form of anti-debugging on the C64 was storing loader code in screen memory. As soon as the attacker broke into a debugger, the prompt would overwrite the loader code. Resetting the machine would also overwrite screen memory, leaving no trace of the loader. Attackers had to realize this and jump to a separate routine elsewhere in memory that made a backup copy of screen memory first.

For x86 hardware debug registers, a program could keep an internal table of functions but do no compile-time linkage between call sites and their targets. Each function call site would be replaced with code that stores an execute watchpoint for EIP+4 in a debug register, followed by a “JMP SlowlyCrash”. Before executing the JMP, the CPU would trigger a debug exception (INT1) since EIP matched the watchpoint. The exception handler would look up the calling EIP in a table, identify the real target function, set up the stack, and return directly there. The target could then return back to the caller. All four debug registers should be used to avoid leaving room for the attacker’s breakpoints.
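Below is a simplified sketch of that dispatch scheme. Rather than rewriting call sites with a stub and a JMP, it arms a single execute watchpoint on a decoy function and lets a vectored exception handler reroute the call to the real target; the names, the single call site, and the x64/MSVC assumptions are mine, not a description of any shipping protection.

    #include <windows.h>
    #include <stdio.h>

    #define ARM_REQUEST 0xE0004242   /* arbitrary user-defined exception code */

    /* volatile so the compiler cannot optimize away the indirection */
    static void * volatile g_decoy;    /* the address call sites actually reference */
    static void * volatile g_target;   /* the real function, never named at a call site */

    static int RealTarget(int x)
    {
        return x * 3;
    }

    static int Decoy(int x)
    {
        /* If control genuinely reaches here, our watchpoint is gone (for example,
         * the attacker claimed DR0), so quietly break future dispatches rather
         * than failing on the spot. */
        (void)x;
        g_target = NULL;
        return 0;
    }

    static LONG CALLBACK Dispatcher(PEXCEPTION_POINTERS ep)
    {
        CONTEXT *c = ep->ContextRecord;
        DWORD code = ep->ExceptionRecord->ExceptionCode;

        if (code == ARM_REQUEST) {
            /* Arm an execute breakpoint on the decoy for this thread. */
            c->ContextFlags |= CONTEXT_DEBUG_REGISTERS;
            c->Dr0 = (DWORD64)g_decoy;
            c->Dr7 |= 1;                   /* L0=1, R/W0=00 (execute), LEN0=00 */
            return EXCEPTION_CONTINUE_EXECUTION;
        }
        if (code == EXCEPTION_SINGLE_STEP &&
            ep->ExceptionRecord->ExceptionAddress == g_decoy) {
            /* The CPU stopped us just before the decoy ran: reroute the call. */
            c->Dr6 = 0;
            c->Rip = (DWORD64)g_target;
            return EXCEPTION_CONTINUE_EXECUTION;
        }
        return EXCEPTION_CONTINUE_SEARCH;
    }

    int main(void)
    {
        g_decoy  = (void *)Decoy;
        g_target = (void *)RealTarget;
        AddVectoredExceptionHandler(1, Dispatcher);
        RaiseException(ARM_REQUEST, 0, 0, NULL);   /* set DR0/DR7 for this thread */

        int (*fn)(int) = (int (*)(int))g_decoy;    /* call sites only ever see the decoy */
        printf("%d\n", fn(14));                    /* actually runs RealTarget(14) */
        return 0;
    }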

There are a number of desirable properties to this approach. If the attacker executes the JMP, the program goes off into the weeds, but not immediately. This happens as soon as they set a hardware breakpoint, since the attacker’s breakpoint overwrites the function’s watchpoint and the JMP then executes. To get around it, the attacker would have to write code that virtualized calls to get and set DR0-7 and performed virtual breakpoints based on what the program thought the correct values should be. Such an approach would work, but it would slow down the program enough that timing checks could detect it.

I hope these examples make the case for anti-debugging via using up a resource versus checking it. This is one implementation approach that can make anti-debugging more effective.

[Followup: Russ Osterlund has pointed out in the comments section why my first example is deficient.  There’s a window of time between CreateProcess(DEBUG_PROCESS) and the first instruction execution where the kernel has already set PEB!BeingDebugged but the process hasn’t been able to store its jump table offset there yet.  So this example is only valid for detecting a debugger attach after startup and another countermeasure is needed to respond to attaching beforehand.   Thanks, Russ!]

Debian needs some serious commit review

You’ve probably heard by now about the gaping hole in keys generated by Debian’s OpenSSL. If not, the summary is that your SSH keys and SSL certs were selected from a fixed pool of 2^15 (32,767) possibilities, and are thus easy to brute-force over the network. If you have any keys generated on a Debian system, you need to immediately replace them or disable the associated service. It’s that bad — remote login, even as root, with only a few thousand tries.
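To put that keyspace in perspective, here is a back-of-the-envelope sketch (not a working exploit): with the bug, a key is determined entirely by the PID of the process that generated it, so an attacker just walks the candidates. try_pregenerated_key() is a hypothetical stand-in for one login attempt using the key pre-generated for that PID.

    #include <stdio.h>

    /* Hypothetical stand-in: attempt a login with the key generated by `pid`. */
    static int try_pregenerated_key(const char *host, int pid)
    {
        (void)host; (void)pid;
        return 0;   /* a real attempt would talk to the SSH server here */
    }

    int main(void)
    {
        const char *host = "victim.example.org";
        for (int pid = 1; pid <= 32767; pid++) {   /* the entire keyspace */
            if (try_pregenerated_key(host, pid)) {
                printf("matched key generated by pid %d\n", pid);
                return 0;
            }
        }
        printf("no Debian-weak key found\n");
        return 0;
    }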

Luckily, Debian recently fixed this 2-year-old hole in this commit. Great, right? Except a quick comparison to the commit that introduced the bug shows they did not revert the change in both places it was added. So they still didn’t fix it completely.

As a past FreeBSD committer and crypto engineer, I knew any commits to the PRNG or other such critical code were subject to intense review before they could be committed. If a committer were found to have introduced such a fatal flaw, the patch to fix it would have been doubly scrutinized before being allowed into the tree. Apparently, the same guy who introduced the bug was left to screw up the fix.

Once more, this time with prior review!


Edit: a commenter informed me that there was review of this fix, and Debian decided to leave their implementation silently incompatible with the OpenSSL API docs.

“If a Microsoft developer commented out seeding in Vista CryptGenRandom(), they would be fired 12 times. Then Microsoft would buy the next company that hired them in order to fire them again.”
— Thomas Ptacek