Reverse engineering with a VM

In a previous comment, Tim Newsham mentions reverse engineering an application by running it in a VM. As it so happened, I gave a talk on building and breaking systems using VMs a couple of years ago. One very nice approach is ReVirt, which records the state of a VM, allowing debugging to go forwards or backwards. That is, you can actually rewind past interrupts, IO, and other system events to examine the state of the software at any arbitrary point. Obviously, this would be great for reverse engineering, though, as Tim points out, there haven't been many public instances of people doing this. (If there have been, can you please point them out to me?)

An idea I had a few years back was to design a VM-based system to assist in developing Linux or FreeBSD drivers when only Windows drivers are available. The VM would be patched to record data associated with all IO instructions (inb, outb, etc.), PCI config space access, and memory-mapped IO (a “wedge” device.) It would pass through the data for a single real hardware device. To the guest OS, it would appear to be a normal VM with one non-virtual device.

vmreveng.png

To reverse engineer a device, you would configure the VM with the bus:slot:function of the device to pass through. Boot Windows in the VM with the vendor driver installed. Use the device normally, marking the log at various points (“boot probe”, “associating with an AP”). Pass that log on to the open source developer to assist in implementing or improving a driver.
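
A sketch of what the wedge's logging path might look like: a hook called from the VMM's port IO emulation that records each access before forwarding it to the real device. All names here (wedge_port_io, host_port_read/write, the log record layout) are invented for illustration, since the actual hook points depend on the VMM.

    /* Hypothetical wedge hook, called from the VMM's port IO emulation
     * path. Logs every access the guest driver makes to the passed-through
     * device, then forwards it to the real hardware. */
    #include <stdint.h>
    #include <stdio.h>
    #include <time.h>

    struct io_record {
        uint64_t timestamp;   /* for correlating with user log marks */
        uint16_t port;
        uint8_t  size;        /* 1, 2, or 4 bytes */
        uint8_t  is_write;
        uint32_t value;
    };

    static FILE *wedge_log;

    /* Forward the access to the real device; platform-specific glue. */
    extern uint32_t host_port_read(uint16_t port, int size);
    extern void host_port_write(uint16_t port, int size, uint32_t value);

    uint32_t wedge_port_io(uint16_t port, int size, int is_write, uint32_t value)
    {
        struct io_record rec = {
            .timestamp = (uint64_t)clock(),
            .port      = port,
            .size      = (uint8_t)size,
            .is_write  = (uint8_t)is_write,
        };

        if (is_write)
            host_port_write(port, size, value);
        else
            value = host_port_read(port, size);

        rec.value = value;
        fwrite(&rec, sizeof(rec), 1, wedge_log);
        return value;
    }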

A similar approach without involving a VM would be to make a Windows service that loads early and hooks HAL.DLL as well as sets protection on any memory mappings of the target device. Similar to copy-on-write, access to that memory would trigger an exception that the service could handle, recording the data and permitting access. This could be distributed to end users to help in remote debugging of proprietary hardware.
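
A rough sketch of the memory-protection half of that idea, assuming the device's MMIO mapping has already been located (the logging itself is just a printf here). Guard pages stand in for copy-on-write: each access faults, gets recorded, is single-stepped over, and the guard is re-armed.

    /* Sketch: trap accesses to a device's memory-mapped IO region using
     * guard pages. Assumes mmio_base/mmio_len describe an existing
     * mapping of the target device's registers. */
    #include <windows.h>
    #include <stdio.h>

    static void *mmio_base;
    static SIZE_T mmio_len;

    static void arm_guard(void)
    {
        DWORD old;
        VirtualProtect(mmio_base, mmio_len, PAGE_READWRITE | PAGE_GUARD, &old);
    }

    static LONG CALLBACK mmio_logger(EXCEPTION_POINTERS *info)
    {
        DWORD code = info->ExceptionRecord->ExceptionCode;

        if (code == STATUS_GUARD_PAGE_VIOLATION) {
            /* Operand address of the faulting access */
            ULONG_PTR addr = info->ExceptionRecord->ExceptionInformation[1];
            printf("MMIO access at %p\n", (void *)addr);

            /* The guard bit is now cleared; single-step over the access
             * so we can re-arm it afterward. */
            info->ContextRecord->EFlags |= 0x100;   /* x86 trap flag */
            return EXCEPTION_CONTINUE_EXECUTION;
        }
        if (code == STATUS_SINGLE_STEP) {
            arm_guard();    /* re-arm for the next access */
            return EXCEPTION_CONTINUE_EXECUTION;
        }
        return EXCEPTION_CONTINUE_SEARCH;
    }

    void start_mmio_logging(void)
    {
        AddVectoredExceptionHandler(1, mmio_logger);
        arm_guard();
    }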

Anti-debugger techniques are overrated

Most protection schemes include various anti-debugger techniques. They can be as simple as IsDebuggerPresent() or as complex as attempting to detect or crash a particular version of SoftICE. The promise of these techniques is that they will prevent attackers from using their favorite tools. The reality is that they are either too simple and thus easy to bypass or too specific to a particular type or version of debugger. When designing software protection, it's best to build a core that is resistant to reverse-engineering of all kinds and not rely on anti-debugger techniques.

One key point that is often overlooked is that anti-debugger techniques, at best, increase the difficulty of the first break. This characteristic is similar to other approaches, including obfuscation. Such techniques do nothing to prevent optimizing the first attack or packaging an attack for distribution.

In any protection system, there are two kinds of primitives: checks and landmines. Checks produce a changing value or code execution path based on the status of the item checked. Landmines crash, hang, scramble, or otherwise interfere with the attacker’s tools themselves. Anti-debugger techniques come in both flavors.

IsDebuggerPresent() is an example of a simple check. It is extremely general but can be easily bypassed with a breakpoint script or traditional approaches like patching the import address table (IAT). In fact, since the implementation of this function merely returns a flag from process memory, the flag itself can be overwritten to always read 0 (False). The approach is to find the TIB (Thread Information Block) via the %fs segment register, dereference a pointer to the PEB (Process Environment Block), and overwrite the byte at offset 2. Since this check is so general, it has little security value.
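
As a concrete illustration, here is a minimal 32-bit sketch of that overwrite, using the MSVC intrinsic for reading %fs:

    /* 32-bit sketch: clear PEB->BeingDebugged (byte at offset 2) so that
     * IsDebuggerPresent() always returns FALSE afterward. */
    #include <windows.h>
    #include <intrin.h>
    #include <stdio.h>

    int main(void)
    {
        /* fs:[0x30] in the TIB holds the pointer to the PEB */
        unsigned char *peb = (unsigned char *)__readfsdword(0x30);

        printf("before: %d\n", IsDebuggerPresent());
        peb[2] = 0;                      /* PEB->BeingDebugged */
        printf("after:  %d\n", IsDebuggerPresent());
        return 0;
    }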

More targeted checks or landmines have been used before, against SoftICE for example. (Note that the SIDT method listed is very similar to the later Red Pill approach — everything old is new again.) To the protection author, targeted anti-debugger techniques are like using a 0day exploit. Once you are detected and the hole disabled or patched, you have to find a new one. That may be a reasonable risk if you are an attacker secretly trying to compromise a few valuable servers. But as a protection author, you’re publishing your technique to the entire world of reverse engineers, people especially adept at figuring it out. You may slow down the first one for a little while, but nothing more than that.

Anti-debugging techniques do have a small place if their limits are recognized. IsDebuggerPresent() can be used to provide a courtesy notice reminding the user of their license agreement (i.e., the “no reverse engineering” clause.) However, since it is so easily bypassed, it should not be used as part of any protection scheme. Debugger-specific checks and landmines can be sprinkled throughout the codebase and woven into the overall scheme via mesh techniques. However, their individual reliability should be considered very low due to constantly improving reversing tools and the ease with which attackers can adapt to any specific technique.

Bright future for counter-attacks

Counter-attack stories come and go, but this time it’s supported by the courts. The question was whether the defendant’s 4th Amendment rights against unreasonable search and seizure were violated by the campus system administrator logging into his dorm computer without authorization. Campus police and the administrator then followed up by visiting the dorm room and gathering evidence.

The 9th Circuit Court ruled that the physical search was not justified, but because the same evidence had been found independently by the remote search, which the court held was justified, it did not violate the defendant's 4th Amendment rights and was admissible.

Electronic access was justified under a "special needs" exemption. The administrator supposedly went against Qualcomm's requests to wait for an FBI warrant and didn't make an effort to collect extensive evidence (or create/delete files), reinforcing the point that it was not an evidentiary search. The claim that he was acting to protect the campus email server (and not on behalf of stopping the Qualcomm intrusion, which was outside campus jurisdiction) also helped.

The conclusion is that "requiring a warrant … would disrupt the operation of the university and the network that it relies upon to function." This seems like a very weak claim. The administrator had already successfully blocked the connection to the email server once, and could presumably have put in a firewall rule blocking all SSH (or backdoor) TCP connections to the email server from the dorms. The ruling quotes his testimony that the "user had obtained access … restricted to specific system administrators, none of whom would be working from the university's dormitories." The IP addresses the attacker used are referenced only by their last 8 bits, indicating a simple class-C filter rule would have been sufficient. Better yet, block all Internet access by disabling the Ethernet port on the switch the attacker connected through.
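
For reference, such a class-C rule is a one-liner on commodity firewalls. All addresses here are invented for illustration:

    # drop SSH from the dorm class-C to the mail server
    # (repeat for any backdoor ports observed)
    iptables -A FORWARD -s 192.168.10.0/24 -d 10.0.0.25 -p tcp --dport 22 -j DROP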

A member of the University of Wisconsin IT staff posted a curious commentary along these lines:

If you hack University servers from your computer (or even if the computer is being used as a zombie), and then take steps to hide your identity or otherwise conceal your activities, your network access will be removed, such removal will be actively enforced and verified, and any immediate actions required to protect the security and integrity of the University network and computing resources will be taken.

In other words, we can and will block your access. But then later,

Academic, legal, and possible criminal action will then follow, as warranted. These were exigent circumstances, and not done under the guise of law enforcement, but rather the protection of critical university resources from activities clearly and explicitly disallowed by numerous University information technology, housing, academic, and general policies (not to mention various federal and state laws).

Sorry, but if you’re so capable of removing, enforcing, and verifying network access, why was a raid of the dorm room by the administrator and campus police so urgent that a warrant couldn’t be obtained? They can’t say “because he might destroy evidence since he probably knew we’d detected him” without treading into the waters of this being a law enforcement action, and thus an illegal search. So they try to have it both ways, claiming that the electronic and subsequent physical search were necessary to take immediate action to protect the email server, and not for evidentiary reasons, while their previous actions showed that they were fully aware and capable of preventing the attacker from accessing the email server via filter rules.

The court also ruled that computers and the data they contain are considered private, even when attached to the university network. However, connecting a computer to the network implies assent to the network owner's policies (an even vaguer form of assent than a click-through license, since it's not clear you have the policy in front of you when you plug into a network jack.)

The university should have had a monitoring clause in their computer policy. Instead they had limitations on access to data (“[i]n general, all computer and electronic files should be free from access by any but the authorized users of those files.”) This helped reinforce the point that his computer could be considered private.

Finally, if you’re a hacker:

  • Don’t hack hosts on your local network. The more entities (read: bureaucracy) you can layer between you and the target, the better.
  • Don’t do activities from your computer that identify you (check email, log into your legit account, etc.) while hacking.
  • Don’t hack from your own computer or any computer remotely associated with you.

If you’re a security vendor:

  • Core Impact just got a whole lot more valuable. Ivan, can we have a “just looking” mode that gathers info without touching anything that looks private?
  • NAC doesn’t solve the problem of taking a set of connections and tracking it back to a port. Who’s going to help administrators like this who have problems installing firewall rules?

JTAG attacks and PR submarines

Security research publication comes in two varieties: genuine advances and PR submarines (stories that sound like real advances but are more clever PR than substance.) Barnaby Jack’s recent announcement of attacking embedded systems via JTAG is definitely the latter. Since the trade press is always looking for interesting angles, they are especially susceptible to PR submarines.

Background: the attack uses the standard JTAG port present on nearly all chipsets and CPUs. This port is used for factory and field diagnostics and provides device-specific access to the internal flip-flops that store all the chip’s state. A technician typically uses a GUI (aka in-circuit emulator) on the host PC to set breakpoints, read/write internal registers, dump memory, and perform other debugger-like functions. Secure processors like smart cards already disable JTAG before the chips leave the factory to prevent this kind of attack.
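
To make the level of access concrete, here is a minimal bit-banged sketch of just the first conversation with a TAP: reading the chip's 32-bit IDCODE. The gpio_* functions are hypothetical platform glue; real in-circuit emulators hide all of this sequencing behind the GUI.

    /* Bit-banged sketch of reading a chip's 32-bit IDCODE through the
     * JTAG TAP state machine (IEEE 1149.1). */
    #include <stdint.h>

    extern void gpio_set_tck(int level);   /* hypothetical GPIO glue */
    extern void gpio_set_tms(int level);
    extern void gpio_set_tdi(int level);
    extern int  gpio_get_tdo(void);

    /* One TCK pulse: drive TMS/TDI, sample TDO before the rising edge. */
    static int jtag_clock(int tms, int tdi)
    {
        int tdo;
        gpio_set_tck(0);
        gpio_set_tms(tms);
        gpio_set_tdi(tdi);
        tdo = gpio_get_tdo();
        gpio_set_tck(1);
        return tdo;
    }

    uint32_t jtag_read_idcode(void)
    {
        uint32_t idcode = 0;
        int i;

        /* Five TCKs with TMS high reach Test-Logic-Reset, which selects
         * the device ID register by default on parts that support IDCODE. */
        for (i = 0; i < 5; i++)
            jtag_clock(1, 0);

        jtag_clock(0, 0);       /* Run-Test/Idle */
        jtag_clock(1, 0);       /* Select-DR-Scan */
        jtag_clock(0, 0);       /* Capture-DR */
        jtag_clock(0, 0);       /* Shift-DR */

        /* Shift out 32 bits, LSB first; TMS high on the last bit exits. */
        for (i = 0; i < 32; i++)
            idcode |= (uint32_t)jtag_clock(i == 31, 0) << i;

        return idcode;
    }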

Like Schneier’s snake oil crypto test, let’s examine how to identify security PR submarines.

1. Attack has been done before (bonus: no citation of prior work in the same area)

Check. Since JTAG access gives the hardware equivalent of a software debugger, attackers have been using it from the beginning. The first attackers were probably competitors reverse engineering designs to copy them or improve their own. Currently, a packaged version of this attack has been in use for years to get free satellite TV. No mention of any of this history can be found in the article.

2. Researcher previously gave same talk at another conference

Check. Keep these slides open for reference below. He is probably speaking on another application of the same attack, but count on the talk being quite similar.

3. Implications of attack wildly speculative

An attacker with physical access to the circuit board can control a device. Yes, that's what JTAG is for. But there is no way this allows an attacker to "redirect Internet traffic on routers" without physical access to those routers. Perhaps Mr. Jack was unaware that this attack primarily matters to tamper-resistant devices (e.g., smart cards) where the device itself must protect stored cash, authentication secrets, or other data subject to physical attacks. That may be why he added a nice but wholly unnecessary application of modifying the software on a home router to insert trojan code in EXEs (slides 35-38.)

4. Attack uses very polished, mature tools and requires little or no custom development

Check. Note use of GUI in-circuit emulator on slides 18 and 21. The only custom development I can see is for the ARM code to modify the TCP packets. He could have inserted that code via a socketed flash chip instead of using JTAG, but that would not sound as cool.

5. Deployed systems already have defenses against the attack

Check. JTAG is already disabled with any use of a tamper-resistant processor, and nearly every microcontroller made has a fuse to disable JTAG.

6. New researcher or new field for existing researcher

Barnaby Jack (formerly of eEye) has done awesome work on win32 kernel shellcode. Not to slight his previous work, but hardware is a new direction for him.

7. Venue is a talk at a minor conference, not a peer-reviewed paper (bonus: no details given)

Check. CanSecWest does not require a paper, and I don't expect Mr. Jack to publish one, although it's possible he might. And what's this about Juniper, his employer, sponsoring CanSecWest?

8. Announcement first appears in trade press or Slashdot

Check and check.

9. Slogan or catch-phrase consistently used to advertise attack

Check. Closing quote for the article is “I’m looking at my microwave oven right now, but I don’t think there’s much I could do with that.” See also intro slide 3 for the previous talk.

What is it about CanSecWest that attracts such sensationalism? Is there just no other way to justify a trip to Canada in your travel budget?

Protecting the protection code

Now that we’ve prevented casual copying using media binding, the next step is to protect the protection code itself. This is the largest and hardest task in software protection and is typically why most attackers first try to separate the prize (video/audio or software functionality) from the protection code.

The first two techniques we’ll consider are obfuscation and code encryption.

  1. Obfuscation uses non-standard combinations of data/code to make it difficult for an attacker (or attack software) to understand or target specific components of the software in order to compromise them
  2. Encryption uses cryptography (often block ciphers) to prevent the data/code from being analyzed without first obtaining the key necessary to decrypt it

Obfuscation techniques include misleading symbol names, linking code as data segments (or vice versa), jump tables, self-modifying code, randomization of memory allocation, and interrupts/exceptions. The goal is to perform the normal software functionality and protection code in a convoluted way that makes it difficult to understand or reliably modify the program flow. Typically, obfuscation is done at the assembly language level after the main software has been compiled, although it can be done at other stages as well. This transforms the original program P into P', which should perform the same functions as P but in a very different way.
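
As a toy example of the jump-table flavor, here is a sketch that replaces a direct serial check (serial == 0x1F3A; value invented) with a table-driven state machine, so no single comparison against the full value exists to patch:

    /* P': the check `serial == 0x1F3A` dispersed into a table-driven
     * state machine. A correct nibble advances the state, anything else
     * drops into trap state 4; state 5 means accept. In a real build the
     * table would be precomputed and scattered through the binary. */
    #include <stdint.h>
    #include <stdio.h>

    static int check_obfuscated(uint16_t serial)
    {
        static const uint8_t expect[4] = { 0xA, 0x3, 0xF, 0x1 }; /* low nibble first */
        uint8_t next[4][16];
        int state, i;

        for (state = 0; state < 4; state++)
            for (i = 0; i < 16; i++)
                next[state][i] = (i == expect[state])
                               ? (uint8_t)(state == 3 ? 5 : state + 1)
                               : (uint8_t)4;

        for (state = 0, i = 0; i < 4 && state < 4; i++)
            state = next[state][(serial >> (4 * i)) & 0xF];

        return state == 5;
    }

    int main(void)
    {
        printf("%d %d\n", check_obfuscated(0x1F3A), check_obfuscated(0x1234));
        return 0;
    }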

Encryption involves pre-processing code/data in the program with a cipher (e.g., AES, RC4) and then adding a routine to perform the decryption before executing or processing the protected data. Encryption with a standard cipher is theoretically much stronger than obfuscation if the attacker does not have the key. However, the key often needs to be stored in the program or transmitted to the program at some point so it can decrypt the data. At that point, the decrypted data or the key can be grabbed by the attacker.
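
A bare-bones sketch of this decrypt-before-use pattern, using RC4 with an embedded key. The hard-coded key is exactly the single point of failure discussed below:

    /* RC4 decryption of an embedded blob before use. Note the key sits
     * in plain view in the binary. */
    #include <stdint.h>
    #include <stddef.h>

    static void rc4(const uint8_t *key, size_t keylen, uint8_t *buf, size_t len)
    {
        uint8_t S[256], tmp;
        size_t i, j, k;

        for (i = 0; i < 256; i++)                 /* key schedule */
            S[i] = (uint8_t)i;
        for (i = 0, j = 0; i < 256; i++) {
            j = (j + S[i] + key[i % keylen]) & 0xFF;
            tmp = S[i]; S[i] = S[j]; S[j] = tmp;
        }
        for (i = 0, j = 0, k = 0; k < len; k++) { /* keystream XOR */
            i = (i + 1) & 0xFF;
            j = (j + S[i]) & 0xFF;
            tmp = S[i]; S[i] = S[j]; S[j] = tmp;
            buf[k] ^= S[(S[i] + S[j]) & 0xFF];
        }
    }

    int main(void)
    {
        /* Key and ciphertext embedded in the program (illustrative values). */
        static const uint8_t key[] = "embedded-key";
        uint8_t blob[] = { /* ... encrypted code/data ... */ 0x00 };

        rc4(key, sizeof(key) - 1, blob, sizeof(blob));
        /* blob is now plaintext; so is the key, to anyone with a debugger. */
        return 0;
    }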

These two techniques are very different, and it's important to understand why. Obfuscation, if done properly, has the advantage of having no single point of failure. The code/data is always protected during processing until the attacker carefully unwinds the obfuscation. However, it is often slower than straightforward processing, requires manual intervention to design and insert, and is rarely integrated with the rest of the software protection well enough that an attacker can't just go around it. Encryption offers strong theoretical security, but has a single point of failure: the key. If the key is not present in the software (e.g., a PIN entered to unlock add-on features), encryption offers very strong protection. However, often a weak cipher like simple XOR is used, or the key is not protected very well when stored in the program itself.

The best approach is to combine obfuscation and encryption, along with all the other techniques, offsetting the weaknesses of each with the strengths of the others. For example, the traditional approach to implementing software decryption would be to take a stock C implementation of AES and call it with the key and a flag indicating it should perform decryption.

aes1.png
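
In code, the traditional approach looks something like this, where aes_decrypt() stands in for any stock implementation's interface:

    #include <stddef.h>

    /* Stand-in for a stock AES implementation's decrypt interface. */
    extern void aes_decrypt(const unsigned char key[16],
                            unsigned char *buf, size_t len);

    /* The key sits in the data segment, clearly separable from the cipher. */
    static const unsigned char g_key[16] = { 0x2b, 0x7e, 0x15, 0x16 /* ... */ };

    void unprotect(unsigned char *buf, size_t len)
    {
        aes_decrypt(g_key, buf, len);
    }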

The weakness here is that the key is stored separately from the decryption routine and thus may be easy to isolate and extract. Since this function is not being used for encryption or with different sets of keys, it makes sense to combine the key and cipher implementation into a single primitive: an obfuscated hard-coded decryption function.

aes2.png

One approach is to take a set of random bijections and combine it with the key and cipher to produce this fixed decryption function. Specifically, each component of the cipher is replaced with an equivalent, randomized function that computes one stage of the decryption using the fixed key. This is possible because block ciphers are composed of a number of small functions (substitute, permute or shuffle around, add key material), run over and over for a number of rounds. As long as each modified function still produces the correct results for the fixed key and variable data, the combination of them will still compute the same result as stock AES. (More details on this technique, called “white-box cryptography,” can be found in these papers on DES and AES.)
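
A sketch of the first step for AES: fold each round-key byte into the S-box lookup so the key never appears in memory as distinct bytes. The random input/output encodings that protect real white-box tables are omitted here for brevity.

    /* White-box first step: merge AddRoundKey and SubBytes into per-byte
     * "T-boxes", Tbox[i][x] = Sbox[x ^ roundkey[i]]. */
    #include <stdint.h>

    extern const uint8_t Sbox[256];     /* standard AES S-box (omitted) */

    static uint8_t Tbox[16][256];

    /* Build step: done once, offline, by the protection tool. */
    void build_tboxes(const uint8_t roundkey[16])
    {
        for (int i = 0; i < 16; i++)
            for (int x = 0; x < 256; x++)
                Tbox[i][x] = Sbox[x ^ roundkey[i]];
    }

    /* Runtime: the shipped program contains only the tables. */
    void round_addkey_subbytes(uint8_t state[16])
    {
        for (int i = 0; i < 16; i++)
            state[i] = Tbox[i][state[i]];
    }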

Of course, if an attacker can just isolate and enslave this decryption function to recover arbitrary data from the program, it isn’t very effective. That’s why other techniques are needed to tie the decryption function to the entire software protection.

Next, we’ll cover anti-debugging and anti-tampering techniques.

Media binding techniques

Media binding, the first step of copy protection (a specific type of software protection), is a set of techniques for preventing access to the protected software without the original media. If the attacker can't just duplicate the media itself, he will have to attack the software's implementation of media binding.

Content protection (a specific form of copy protection) starts by encrypting the video/audio data with a unique key, and then uses media binding and other software protection techniques to regulate access to that key. Since the protected data is completely passive (i.e., stands alone as playable once decrypted), managing access to the key is critical. However, software copy protection is more flexible since software is active and its copy protection can be integrated throughout its runtime.

The Blu-ray content protection system, BD+, is more of a software protection scheme (versus a content protection scheme) since the per-disc security code is run throughout disc playback to implement the descrambling process. Thus, each disc is more of a unique software application instead of passive video/audio data. AACS is more of a traditional, key-based content protection system.

There are three ways to bind software to media:

  1. Verify unique characteristics of the original media
  2. Encode data in a form that can’t be written by consumer equipment
  3. Attack particular copying processes and equipment involved

Verifying the media involves one or more checks for physical aspects that are difficult to duplicate. Checks involve intentional errors, invalid filesystem data, data layout alignment, and timing of drive operations or media. The key point is that some logic in the software is making a decision (i.e., if/then) based on the results of these checks. So attackers will often go after the software protection if they can’t easily duplicate the media characteristics.
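
For example, a check for an intentional error might try to read a sector that was mastered as unreadable and treat success as evidence of a copy. A POSIX-style sketch, with an invented sector number:

    /* Sketch of one media check: the sector below was mastered with an
     * intentional error, so it must fail to read on the original disc.
     * A burned copy will have a readable sector there instead. */
    #include <fcntl.h>
    #include <unistd.h>

    #define BAD_SECTOR   12345L     /* invented example */
    #define SECTOR_SIZE  2048L

    int media_is_original(const char *device)
    {
        char buf[SECTOR_SIZE];
        ssize_t n;
        int fd = open(device, O_RDONLY);

        if (fd < 0)
            return 0;
        lseek(fd, BAD_SECTOR * SECTOR_SIZE, SEEK_SET);
        n = read(fd, buf, sizeof(buf));
        close(fd);

        /* Original media: the drive reports a read error here.
         * A copy: the read succeeds, exposing the duplication. */
        return n < 0;
    }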

Encoding data involves modifying the duplication equipment, consumer drive, and/or recordable media such that the real data cannot be read or written on unmodified consumer equipment and media. This is usually more powerful than verification alone because attackers have to modify their drives or circumvent the software protection before getting access to the software in the first place. Game consoles often use this approach since they can customize their drives and media, even though they usually start with consumer designs (i.e., DVD). Custom encodings can be used with verification of the encoded data for increased strength.

Attacking the copying process involves analyzing the equipment used to make copies and exploiting its flaws. For example, most DVD ripping software uses UDF filesystem information to locate files, while hardware DVD players use simpler metadata. So phantom files that only the copying software sees can be planted to corrupt the rip. The problem with attacking the copying process is that copying software is relatively easy to update, so this technique usually has a short shelf life. However, it can be useful as part of an overall strategy of increasing an attacker's costs.

Obviously, if the attacker has access to the same duplication equipment that the author uses, nothing can prevent them from making their own media. Other mechanisms, such as revocation, must handle this case farther down the chain.

Next, we’ll discuss protecting the protection code itself.