Anti-debugging techniques of the past

April 24, 2007April 24, 2007 ~ Nate Lawson ~ 1 Comment

During the C64 and Apple II years, a number of interesting protection schemes were developed that still have parallels in today’s systems. The C64/1541 disk drive schemes relied on cleverly exploiting hardware behavior since there was no security processor onboard to use as a root of trust. Nowadays, games or DRM systems for the PC still have the same limitation, while modern video game systems rely more on custom hardware capabilities.

Most targeted anti-debugger techniques rely on exploiting shared resources. For example, a single interrupt vector cannot be used by both the application and the debugger at the same time. Reusing that resource as part of the protection scheme and for normal application operations forces the attacker to modify some other shared resource (perhaps by hooking the function prologue) instead.

One interesting anti-debugger technique was to load the computer’s protection code into screen memory. This range of RAM, as with modern integrated video chipsets, is regular system memory that is also mapped to the display chip. Writing values to this region changes the display. Reading it returns the pixel data that is onscreen. Since it’s just RAM, the CPU can execute out of it as well but the user would see garbage on the screen. The protection code would change the foreground and background colors to be the same, making the data invisible while the code executed.

When an attacker broke into program execution with a machine language monitor (i.e., debugger), the command prompt displayed by the monitor would overwrite the protection code. If it was later resumed, the program would just crash because the critical protection code was no longer present in RAM. The video memory was a shared resource used by both the protection and the debugger.

Another technique was to load code into an area of RAM that was cleared during system reset. If an attacker reset the machine without powering it off, the data would ordinarily still be present in RAM. However, if it was within a region that was zeroed by the reset code in ROM, nothing would remain for the attacker to examine.

As protection schemes became stronger in the late 1980’s, users resorted to hardware-based attacks when software-only copying was no longer possible. Many protection schemes took advantage of the limited RAM in the 1541 drive (2 KB) by using custom bit encoding on the media and booting a custom loader/protection routine into the drive RAM to read it. The loader would lock out all access by the C64 to drive memory so it could not easily be dumped and analyzed.

The copy system authors responded in a number of ways. One of them was to provide a RAM expansion board (8 KB) that allowed the custom bit encoding to be read all in one chunk, circumventing the boundary problems that occurred when copying in 2 KB chunks and trying to stitch them back together. It also allowed the protection code in the drive to be copied up to higher memory, saving it from being zeroed when the drive was reset. This way the drive would once again allow memory access by the C64 and the loader could be dumped and analyzed.

Protection authors responded by crashing the drive when expanded RAM was present. With most hardware memory access, there’s a concept known as “mirroring.” If the address space is bigger than the physical RAM present, accesses to higher addresses wrap around within the actual memory. For example, accesses to address 0, 2 KB and 4 KB would map to the same RAM address (0) on a 1541 with stock memory. But on a drive with 8 KB expanded RAM, these would map to three different locations.

One technique was to scramble the latter part of the loader. The first part would descramble the remainder but use addresses above 2 KB to do its work. On a stock 1541, the memory accesses would wrap around, descrambling the proper locations in the loader. On a modified drive, they would just write garbage to upper memory and the loader would crash once it got to the still-scrambled code in the lower 2 KB.

These schemes were later circumvented by adding a switch to the RAM expansion that allowed it to be switched off when it wasn’t in use, but this did add to the annoyance factor of regularly using a modified drive.

Reverse engineering with a VM

April 21, 2007April 22, 2007 ~ Nate Lawson ~ 11 Comments

In a previous comment, Tim Newsham mentions reverse engineering an application by running it in a VM. As it so happened, I gave a talk on building and breaking systems using VMs a couple years ago. One very nice approach is ReVirt, which records the state of a VM, allowing debugging to go forwards or backwards. That is, you can actually rewind past interrupts, IO, and other system events to examine the state of the software at any arbitrary point. Obviously, this would be great for reverse engineering though, as Tim points out, there haven’t been many public instances of people doing this. (If there have, can you please point them out to me?)

An idea I had a few years back was to design a VM-based system to assist in developing Linux or FreeBSD drivers when only Windows drivers are available. The VM would be patched to record data associated with all IO instructions (inb, outb, etc.), PCI config space access, and memory-mapped IO (a “wedge” device.) It would pass through the data for a single real hardware device. To the guest OS, it would appear to be a normal VM with one non-virtual device.

To reverse engineer a device, you would configure the VM with the bus:slot:function of the device to pass through. Boot Windows in the VM with the vendor driver installed. Use the device normally, marking the log at various points (“boot probe”, “associating with an AP”). Pass that log on to the open source developer to assist in implementing or improving a driver.

A similar approach without involving a VM would be to make a Windows service that loads early and hooks HAL.DLL as well as sets protection on any memory mappings of the target device. Similar to copy-on-write, access to that memory would trigger an exception that the service could handle, recording the data and permitting access. This could be distributed to end users to help in remote debugging of proprietary hardware.

Anti-debugger techniques are overrated

April 19, 2007April 22, 2007 ~ Nate Lawson ~ 49 Comments

Most protection schemes include various anti-debugger techniques. They can be as simple as IsDebuggerPresent() or complex as attempting to detect or crash a particular version of SoftICE. The promise of these techniques is that they will prevent attackers from using their favorite tools. The reality is that they are either too simple and thus easy to bypass or too specific to a particular type or version of debugger. When designing software protection, it’s best to build a core that is resistant to reverse-engineering of all kinds and not rely on anti-debugger techniques.

One key point that is often overlooked is that anti-debugger techniques, at best, increase the difficulty of the first break. This characteristic is similar to other approaches, including obfuscation. Such techniques do nothing to prevent optimizing the first attack or packaging an attack for distribution.

In any protection system, there are two kinds of primitives: checks and landmines. Checks produce a changing value or code execution path based on the status of the item checked. Landmines crash, hang, scramble, or otherwise interfere with the attacker’s tools themselves. Anti-debugger techniques come in both flavors.

IsDebuggerPresent() is an example of a simple check. It is extremely general but can be easily bypassed with a breakpoint script or by traditional approaches like patching the import table (IAT). However, since the implementation of this function merely returns a value from the process memory, it can even be overwritten to always be 0 (False). The approach is to find the TIB (Thread Information Block) via the %fs segment register, dereference a pointer to the PEB (Process Environment Block), and overwrite a byte at offset 2. Since it is so general, it has little security value.

More targeted checks or landmines have been used before, against SoftICE for example. (Note that the SIDT method listed is very similar to the later Red Pill approach — everything old is new again.) To the protection author, targeted anti-debugger techniques are like using a 0day exploit. Once you are detected and the hole disabled or patched, you have to find a new one. That may be a reasonable risk if you are an attacker secretly trying to compromise a few valuable servers. But as a protection author, you’re publishing your technique to the entire world of reverse engineers, people especially adept at figuring it out. You may slow down the first one for a little while, but nothing more than that.

Anti-debugging techniques do have a small place if their limits are recognized. IsDebuggerPresent() can be used to provide a courtesy notice reminding the user of their license agreement (i.e., the “no reverse engineering” clause.) However, since it is so easily bypassed, it should not be used as part of any protection scheme. Debugger-specific checks and landmines can be sprinkled throughout the codebase and woven into the overall scheme via mesh techniques. However, their individual reliability should be considered very low due to constantly improving reversing tools and the ease with which attackers can adapt to any specific technique.

Bright future for counter-attacks

April 10, 2007April 11, 2007 ~ Nate Lawson ~ 3 Comments

Counter-attack stories come and go, but this time it’s supported by the courts. The question was whether the defendant’s 4th Amendment rights against unreasonable search and seizure were violated by the campus system administrator logging into his dorm computer without authorization. Campus police and the administrator then followed up by visiting the dorm room and gathering evidence.

The 9th Circuit Court ruled that the physical search was not justified, but since the same evidence was found independently by the remote search that the court ruled was justified, it did not violate the defendant’s 4th Amendment rights and was admissible.

Electronic access was justified, under a “special needs” exemption. The administrator supposedly went against Qualcomm requests to wait for an FBI warrant and didn’t make an effort to collect extensive evidence (or create/delete files), reinforcing the point that it was not an evidentiary search. The claim that he was acting to protect the campus email server (and not on behalf of stopping the Qualcomm intrusion, which was outside campus jurisdiction) also helped.

The conclusion is that “requiring a warrant … would disrupt the operation of the university and the network that it relies upon to function.” This seems like a very weak claim. The administrator had already successfully blocked the connection to the email server once, and could presumably have put in a firewall rule blocking all SSH (or backdoor) TCP connections to the email server from the dorms.The ruling presumably quotes his testimony that the “user had obtained access … restricted to specific system administrators, none of whom would be working from the university’s dormitories.” The IP addresses the attacker used were referenced only by the last 8-bits, indicating a simple class-C filter rule would have been sufficient. Better yet, block all Internet access by disabling the Ethernet port on the switch the attacker connected through.

A member of the University of Wisconsin IT staff posted a curious commentary along these lines:

If you hack University servers from your computer (or even if the computer is being used a zombie), and then take steps to hide your identity or otherwise conceal your activities, your network access will be removed, such removal will be actively enforced and verified, and any immediate actions required to protect the security and integrity of the University network and computing resources will be taken.

In other words, we can and will block your access. But then later,

Academic, legal, and possible criminal action will then follow, as warranted. These were exigent circumstances, and not done under the guise of law enforcement, but rather the protection of critical university resources from activities clearly and explicitly disallowed by numerous University information technology, housing, academic, and general policies (not to mention various federal and state laws).

Sorry, but if you’re so capable of removing, enforcing, and verifying network access, why was a raid of the dorm room by the administrator and campus police so urgent that a warrant couldn’t be obtained? They can’t say “because he might destroy evidence since he probably knew we’d detected him” without treading into the waters of this being a law enforcement action, and thus an illegal search. So they try to have it both ways, claiming that the electronic and subsequent physical search were necessary to take immediate action to protect the email server, and not for evidentiary reasons, while their previous actions showed that they were fully aware and capable of preventing the attacker from accessing the email server via filter rules.

The court also ruled that computers and the data they contain are considered private, even when attached to the university network. However, connecting a computer to the network implies assent to the network owner’s policies (even more vague an action than click-through licenses since it’s not clear you have the policy in front of you when you plug into a network jack.)

The university should have had a monitoring clause in their computer policy. Instead they had limitations on access to data (“[i]n general, all computer and electronic files should be free from access by any but the authorized users of those files.”) This helped reinforce the point that his computer could be considered private.

Finally, if you’re a hacker:

Don’t hack hosts on your local network. The more entities (read: bureaucracy) you can layer between you and the target, the better.
Don’t do activities from your computer that identify you (check email, log into your legit account, etc.) while hacking.
Don’t hack from your own computer or any computer remotely associated with you.

If you’re a security vendor:

Core Impact just got a whole lot more valuable. Ivan, can we have a “just looking” mode that gathers info without touching anything that looks private?
NAC doesn’t solve the problem of taking a set of connections and tracking it back to a port. Who’s going to help administrators like this who have problems installing firewall rules?

Mesh design pattern: hash-and-decrypt

April 9, 2007 ~ Nate Lawson ~ 4 Comments

Hash functions are an excellent way to tie together various parts of a protection mechanism. Our first mesh design pattern, hash-and-decrypt, uses a hash function to derive a key that is then used to decrypt the next stage. Since a cryptographic hash (e.g., SHA-1) is sensitive to a change of even a single bit of input, this pattern provides a strong way to insure the next stage (code, data, more checks) is not accessible unless all the input bits are correct.

Diagram of hashing and then decrypting

For example, consider a game with different levels, each encrypted with a different AES key. The key to decrypt level N+1 can be derived by hashing together data which only is present in RAM after the player has beat level N with an unmodified game (e.g., correct items in inventory, state of treasure chests, map of locations visited, etc.) If an attacker tries to cheat on level N by modifying the game state, they won’t know what items they need to have, may load up their character with items that are impossible to have at that point in the game, or one or more map positions won’t have been marked as visited. In this case, the hash and thus the next level key will be incorrect. Any difference in the hashed data produces an incorrect key and the level cannot be decrypted without the exact key.

In software protection, the focus is on verifying that security checks are intact and running properly. Hash-and-decrypt would cover code and data locations that might be modified by an attacker who is debugging or patching the application in order to reverse engineer it. This includes locations that might be changed by setting breakpoints (i.e., int 3 or Detours-style function hooking, debug registers DR0-3, IDTR a la Red Pill) or self-check functions that may be disabled or paused while analyzing the executable. The encrypted stage N+1 can be parts of the application as well as other self-check functions.

To tie together multiple self-check functions, hash-and-decrypt can also be layered. For example, each self-check function, which monitors one or more different hook points, can iteratively hash a unique seed each time it runs along with the data it observes. The unlock function hashes each self-check’s hash, decrypts, and then executes the protected code. If an attacker pauses one or more of the self-check threads, the hash will be incorrect when the unlock function runs.

To slow down the attacker, invariants that are not security-critical can be included in the hash (“chaff”). For example, the game could hash parts of RAM that are known to be constants or map data. This keeps the hashed addresses from pointing out what’s important. This falls under the category of obfuscation techniques, but is a good example of how each software protection method mutually enforces each other.

Finally, this technique can be implemented in hardware to prevent glitching attacks. In a glitching attack, a pulse is used to disrupt a processor’s execution (usually via the clock line), causing some operation to be performed incorrectly. Attackers can use this to target internal logic and trigger malicious code execution in tamper-resistant devices. To counter this, the internal state (i.e. pipeline registers, address lines, caches, IR) of the processor can be hashed by custom hardware each clock cycle. At the end of each instruction, the final hash is used to unlock the next operation. If a glitch caused the value of any flip-flop in the CPU to be incorrect during any of the intermediate clock cycles, the next operation will not decrypt properly.

JTAG attacks and PR submarines

April 6, 2007April 6, 2007 ~ Nate Lawson ~ 37 Comments

Security research publication comes in two varieties: genuine advances and PR submarines (stories that sound like real advances but are more clever PR than substance.) Barnaby Jack’s recent announcement of attacking embedded systems via JTAG is definitely the latter. Since the trade press is always looking for interesting angles, they are especially susceptible to PR submarines.

Background: the attack uses the standard JTAG port present on nearly all chipsets and CPUs. This port is used for factory and field diagnostics and provides device-specific access to the internal flip-flops that store all the chip’s state. A technician typically uses a GUI (aka in-circuit emulator) on the host PC to set breakpoints, read/write internal registers, dump memory, and perform other debugger-like functions. Secure processors like smart cards already disable JTAG before the chips leave the factory to prevent this kind of attack.

Like Schneier’s snake oil crypto test, let’s examine how to identify security PR submarines.

1. Attack has been done before (bonus: no citation of prior work in the same area)

Check. Since JTAG access gives the hardware equivalent of a software debugger, attackers have been using it from the beginning. The first attackers were probably competitors reverse engineering designs to copy them or improve their own. Currently, a packaged version of this attack has been in use for years to get free satellite TV. No mention of any of this history can be found in the article.

2. Researcher previously gave same talk at another conference

Check. Keep these slides open for reference below. He is probably speaking on another application of the same attack, but count on the talk being quite similar.

3. Implications of attack wildly speculative

An attacker with physical access to the circuit board can control a device. Yes, that’s what JTAG is for. But there is no way this allows an attacker to “redirect Internet traffic on routers” without physical access to those routers. Perhaps Mr. Jack was unaware that this attack primarily matters to tamper-resistant devices (i.e., smart cards) where the device itself must protect stored cash, authentication secrets, or other data subject to physical attacks. That may be why he added a nice, but wholly-unnecessary application of modifying the software on a home router to insert trojan code in EXEs (slides 35-38.)

4. Attack uses very polished, mature tools and requires little or no custom development

Check. Note use of GUI in-circuit emulator on slides 18 and 21. The only custom development I can see is for the ARM code to modify the TCP packets. He could have inserted that code via a socketed flash chip instead of using JTAG but that would not sound as cool.

5. Deployed systems already have defenses against the attack

Check. JTAG is already disabled with any use of a tamper-resistant processor, and nearly every microcontroller made has a fuse to disable JTAG.

6. New researcher or new field for existing researcher

Barnaby Jack (formerly of eEye) has done awesome work on win32 kernel shellcode. Not to slight his previous work, but hardware is a new direction for him.

7. Venue is a talk at a minor conference, not a peer-reviewed paper (bonus: no details given)

Check. CanSecWest does not require a paper, and I don’t expect Mr. Jack to publish one although it’s possible he might. And what’s this about Juniper, his employer, sponsoring CanSecWest?

8. Announcement first appears in trade press or Slashdot

Check and check.

9. Slogan or catch-phrase consistently used to advertise attack

Check. Closing quote for the article is “I’m looking at my microwave oven right now, but I don’t think there’s much I could do with that.” See also intro slide 3 for the previous talk.

What is it about CanSecWest that attracts such sensationalism? Is there just no other way to justify a trip to Canada in your travel budget?