How the PS3 hypervisor was hacked

George Hotz, previously known as an iPhone hacker, announced that he had hacked the PlayStation 3 and then provided exploit details. Various articles have been written about this, but none of them appear to have analyzed the actual code. Because of the various conflicting reports, here is some more analysis to help understand the exploit.

The PS3, like the Xbox 360, depends on a hypervisor for security enforcement. Unlike the 360, the PS3 allows users to run ordinary Linux if they wish, but Linux still runs under the hypervisor's supervision. The hypervisor does not allow the Linux kernel to access various devices, such as the GPU. If a way can be found to compromise the hypervisor, direct access to the hardware becomes possible, and other, less privileged code can be monitored and controlled by the attacker.

Hacking the hypervisor is not the only step required to run pirated games. Each game has an encryption key stored in an area of the disc called the ROM Mark. The drive firmware reads this key and supplies it to the hypervisor, which uses it to decrypt the game during loading. The hypervisor would need to be subverted to reveal this key for each game. Another approach would be to compromise the Blu-ray drive firmware, or to skip extracting the keys and instead drive the existing decryption code as an oracle to decrypt each game. After this, any software protection measures in the game would need to be disabled. It is unknown what self-protection measures might be lurking beneath the encryption of a given game. Some authors might trust in the encryption alone, while others might implement something like SecuROM.

The hypervisor code runs on both the main CPU (PPE) and one of its seven Cell coprocessors (SPEs). The SPE thread seems to be launched in isolation mode, where access to its private code and data memory is blocked, even from the hypervisor. The root hardware keys used to decrypt the bootloader and then the hypervisor are present only in the hardware, possibly through the use of eFUSEs. This could also mean that each Cell processor has some unique keys, and that decryption does not depend on a single global root key (contrary to some articles claiming such a key exists).

George’s hack compromises the hypervisor after booting Linux via the “OtherOS” feature. He has used the exploit to add arbitrary read/write RAM access functions and dump the hypervisor. Access to lv1 is a necessary first step in order to mount other attacks against the drive firmware or games.

His approach is clever and is known as a “glitching attack”. This kind of hardware attack involves sending a carefully timed voltage pulse in order to cause the hardware to misbehave in some useful way. It has long been used by smart card hackers to unlock cards. Typically, hackers would time the pulse to target a loop termination condition, causing the loop to continue forever and dump the contents of the secret ROM to an accessible bus. The clock line is often glitched, but some data lines are also a useful target. The pulse timing does not always have to be precise, since hardware is designed to tolerate some out-of-spec conditions and the attack can usually be repeated many times until it succeeds.

George connected an FPGA to a single line on his PS3’s memory bus. He programmed the chip with very simple logic: send a 40 ns pulse via the output pin when triggered by a pushbutton. This can be done with a few lines of Verilog. While the pulse itself is short (though still about 100 memory clock cycles on the PS3), the triggering is extremely imprecise. However, he used software to set up the RAM to give a higher likelihood of success than it would first appear.

His goal was to compromise the hashed page table (HTAB) in order to get read/write access to the main segment, which maps all memory including the hypervisor. The exploit is a Linux kernel module that calls various system calls in the hypervisor dealing with memory management. It allocates, deallocates, and then tries to use the deallocated memory as the HTAB for a virtual segment. If the glitch successfully desynchronizes the hypervisor from the actual state of the RAM, it will allow the attacker to overwrite the active HTAB and thus control access to any memory region. Let’s break this down some more.

The first step is to allocate a buffer. The exploit then requests that the hypervisor create lots of duplicate HTAB mappings pointing to this buffer. Any one of these mappings can be used to read or write to the buffer, which is fine since the kernel owns it. In Unix terms, think of these as multiple file handles to a single temporary file. Any file handle can be closed, but as long as one open file handle remains, the file’s data can still be accessed.

The next step is to deallocate the buffer without first releasing all the mappings to it. This is ok since the hypervisor will go through and destroy each mapping before it returns. Immediately after calling lv1_release_memory(), the exploit prints a message for the user to press the glitching trigger button. Because there are so many HTAB mappings to this buffer, the user has a decent chance of triggering the glitch while the hypervisor is deallocating a mapping. The glitch probably prevents one or more of the hypervisor’s write cycles from hitting memory. These writes were intended to deallocate each mapping, but if they fail, the mapping remains intact.

At this point, the hypervisor has an HTAB with one or more read/write mappings pointing to a buffer it has deallocated. Thus, the kernel no longer owns that buffer and supposedly cannot write to it. However, the kernel still has one or more valid mappings pointing to the buffer and can actually modify its contents. But this is not yet useful since it’s just empty memory.

The exploit then creates a virtual segment and checks to see if the associated HTAB is located in a region spanning the freed buffer’s address. If not, it keeps creating virtual segments until one does. Now, the user has the ability to write directly to this HTAB instead of the hypervisor having exclusive control of it. The exploit writes some HTAB entries that will give it full access to the main segment, which maps all of memory. Once the hypervisor switches to this virtual segment, the attacker now controls all of memory and thus the hypervisor itself. The exploit installs two syscalls that give direct read/write access to any memory address, then returns back to the kernel.
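
To make the sequence concrete, here is a heavily simplified sketch in C of the flow described above. The lv1_* names are modeled on the PS3 Linux hypercall wrappers but are stubbed as no-ops here (the real calls exist only under the hypervisor), and all arguments and constants are illustrative assumptions, not geohot's actual code.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-ins for the real lv1 hypercall wrappers (names modeled on the
 * PS3 Linux support code). They are stubbed as no-ops so this sketch
 * compiles; the real calls exist only under the hypervisor, and the
 * argument lists here are simplified. */
static int lv1_allocate_memory(uint64_t size, uint64_t *addr)
{ (void)size; *addr = 0xdead0000ULL; return 0; }
static int lv1_insert_htab_entry(uint64_t va, uint64_t pa)
{ (void)va; (void)pa; return 0; }
static int lv1_release_memory(uint64_t addr) { (void)addr; return 0; }
static int lv1_construct_virtual_address_space(uint64_t *htab_addr)
{ *htab_addr = 0xdead0000ULL; return 0; }

#define NUM_MAPPINGS 1024   /* lots of duplicates widen the glitch window */
#define BUF_SIZE     4096

int main(void) {
    uint64_t buf, htab;

    lv1_allocate_memory(BUF_SIZE, &buf);           /* 1: victim buffer */

    for (uint64_t i = 0; i < NUM_MAPPINGS; i++)    /* 2: duplicate mappings */
        lv1_insert_htab_entry(0x10000000ULL + i * BUF_SIZE, buf);

    puts("press the glitch trigger now");          /* 3: race the teardown; */
    lv1_release_memory(buf);                       /*    a dropped write leaves
                                                         a mapping alive */

    do {                                           /* 4: hunt for a segment
                                                         whose HTAB lands in
                                                         the freed region */
        lv1_construct_virtual_address_space(&htab);
    } while (!(htab >= buf && htab < buf + BUF_SIZE));

    /* 5: through the surviving mapping, forge HTAB entries that map all
     * of memory read/write; once the hypervisor switches to this segment,
     * the attacker owns the machine */
    return 0;
}
```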

It is quite possible someone will package this attack into a modchip since the glitch, while somewhat narrow, does not need to be very precisely timed. With a microcontroller and a little analog circuitry for the pulse, this could be quite reliable. However, it is more likely that a software bug will be found after reverse-engineering the dumped hypervisor and that is what will be deployed for use by the masses.

Sony appears to have done a great job with the security of the PS3. It all hangs together well, with no obvious weak points. However, the low-level access given to guest OS kernels means that any bug in the hypervisor is likely to be reachable from attacker code, due to the broad API it offers. One simple fix would be to read back the state of each mapping after changing it. If the write failed for some reason, the hypervisor would see this and halt.
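
As a sketch of that fix (my illustration, not Sony's code), assuming the HTAB entries are plain words in RAM:

```c
#include <stdint.h>
#include <stdlib.h>

/* the volatile qualifier keeps the compiler from optimizing away the
 * read-back of the value we just wrote */
static void clear_htab_entry(volatile uint64_t *pte) {
    *pte = 0;            /* intended invalidation write */
    if (*pte != 0)       /* did the write actually reach RAM? */
        abort();         /* glitch detected: halt rather than trust state */
}
```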

It will be interesting to see how Sony responds with future updates to prevent this kind of attack.

[Edit: corrected the description of virtual segment allocation based on a comment by geohot.]

Just another day at the office

The following does not all take place over a single 24-hour period, but any one of these situations is a good example of a typical day at Root Labs.

Attack Windows device driver protection scheme

Certain drivers in Windows must implement software protection (PMP, the Protected Media Path) in order to prevent audio/video ripping attacks. Since drivers run in ring 0, they can do a lot more than the standard SEH tricks. It’s 1992 all over again as we dig into the IDT and attempt to find how they are trying to protect their decryption process.

Whitebox cryptography is a method of combining a key and a cipher implementation to create a keyed cipher. It can only perform the operation it was initialized with, using a hard-coded key. The entire operation becomes a series of table lookups. The intermediate values are obscured by randomness merged into the tables. But it’s not impossible to defeat.
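
Here is a toy illustration of the table-lookup idea in C. This is not a real whitebox AES construction (real designs also wrap every table in random input/output encodings, the "randomness" mentioned above); it only shows how a key disappears into precomputed tables:

```c
#include <stdint.h>
#include <stdio.h>

#define BLOCK 8   /* toy 8-byte block */

static uint8_t S[256];          /* public "cipher" S-box (toy permutation) */
static uint8_t T[BLOCK][256];   /* keyed tables: T[i][x] = S[x ^ k[i]] */

/* done offline by the protection vendor; only T ships with the program */
static void build_tables(const uint8_t k[BLOCK]) {
    for (int x = 0; x < 256; x++)
        S[x] = (uint8_t)(x * 7 + 3);   /* bijective toy S-box */
    for (int i = 0; i < BLOCK; i++)
        for (int x = 0; x < 256; x++)
            T[i][x] = S[x ^ k[i]];     /* key folded into the table */
}

/* the deployed "keyed cipher": pure table lookups, no key in sight */
static void wb_round(uint8_t blk[BLOCK]) {
    for (int i = 0; i < BLOCK; i++)
        blk[i] = T[i][blk[i]];
}

int main(void) {
    const uint8_t key[BLOCK] = {0x13, 0x37, 0xc0, 0xde, 1, 2, 3, 4};
    uint8_t msg[BLOCK] = "ABCDEFG";    /* 7 chars + NUL */
    build_tables(key);
    wb_round(msg);
    for (int i = 0; i < BLOCK; i++)
        printf("%02x", msg[i]);
    printf("\n");
    return 0;
}
```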

To get at the cipher though, we’re going to need a way to bypass some of these anti-debugging traps. Since ring 0 code has direct access to all the CPU’s registers (including the debug registers, MSRs, and page tables), it is free to wreak havoc with our attempts to circumvent it. A common approach is to disable any breakpoints in the registers by overwriting them. A less common method is to use the debug registers as a local procedure call gate, so that an attacker who overwrites them breaks the protection’s main loop.

We write our own hook, patch it into the int 1 handler at a point that isn’t integrity-checked, and set the GD bit. This causes our hook to be called whenever anyone reads or writes the debug registers. The trap springs! There’s the heart of the protection code right there. Now to figure out what it’s doing.
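
For the curious, the GD-bit half of that trick looks roughly like this on x86 (ring-0 only; the IDT patching itself is omitted):

```c
#include <stdint.h>

#define DR7_GD (1UL << 13)   /* General Detect enable bit in DR7 */

/* Ring-0 only: with GD set, any mov to/from DR0-DR7 faults into the
 * #DB handler (int 1) before the access happens, so our hook sees
 * every attempt to touch the debug registers. */
static inline void arm_general_detect(void) {
    unsigned long dr7;
    __asm__ __volatile__("mov %%dr7, %0" : "=r"(dr7));
    dr7 |= DR7_GD;
    __asm__ __volatile__("mov %0, %%dr7" : : "r"(dr7));
}

/* Note: the CPU clears GD (and sets DR6.BD) when delivering the fault so
 * the handler itself can use the debug registers; the hook must re-arm
 * GD on exit to keep the trap live. */
```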

Design self-reinforcing integrity checks

Preventing attackers from patching your code is very difficult. One approach is to insert small hash functions that verify a region of code or data has not been changed. Of course, there’s the obvious problem of “who watches the watcher?” Since it’s easy for attackers to NOP out the checksum routine, or to modify their patch to compensate if you use a simple CRC, our mission today is to design and implement a more robust approach.

First, we analyze the general problem of mutually-reinforcing checks. The code and data for the check itself both need to be covered. But if two checks exactly mirrored each other, you’d have a chicken-and-egg problem in building them, since a change in one would require changes in the other, and so on. It’s like standing between two mirrors: infinite recursion.

A data structure that describes the best we can do is a directed acyclic graph. There are no cycles, so the checks (and checksums) can be generated in reverse order. But the multiple paths through the graph provide overlapping coverage, so we can be certain that a single patch cannot bypass the protection. Of course, the roots of each of these paths need to be hidden throughout the code. Putting them all in one big loop run by a single watchman thread would be a mistake. Now we have to come up with a way of automatically generating, randomizing, and inserting them into the code while also hiding the root nodes in various places.
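
A toy version of the generation order might look like the following C sketch. The regions here are dummy buffers rather than the program's own text, and the hash is a simple FNV-1a stand-in; the point is only that expected checksums can be filled in back-to-front because the "covers" edges form a DAG, with each check also covering the next two so coverage overlaps:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NREG 4

typedef struct {
    const uint8_t *data;   /* region of code/data being guarded */
    size_t len;
    uint32_t expect;       /* expected checksum, filled in back-to-front */
} check_t;

static uint8_t region[NREG][16];   /* dummy stand-ins for code regions */
static check_t checks[NREG];

static uint32_t cksum(const uint8_t *p, size_t n) {
    uint32_t h = 2166136261u;              /* FNV-1a, stand-in hash */
    while (n--) h = (h ^ *p++) * 16777619u;
    return h;
}

/* check i covers its region plus the structs of checks i+1 and i+2,
 * giving overlapping coverage along multiple paths of the DAG */
static uint32_t compute(int i) {
    uint32_t h = cksum(checks[i].data, checks[i].len);
    for (int j = i + 1; j < NREG && j <= i + 2; j++)
        h ^= cksum((const uint8_t *)&checks[j], sizeof(check_t));
    return h;
}

int main(void) {
    for (int i = 0; i < NREG; i++) {
        memset(region[i], 'A' + i, sizeof region[i]);
        checks[i] = (check_t){ region[i], sizeof region[i], 0 };
    }
    /* generation in reverse order: every value a check depends on is
     * already final, so there is no chicken-and-egg cycle */
    for (int i = NREG - 1; i >= 0; i--)
        checks[i].expect = compute(i);

    region[2][5] ^= 1;   /* simulate an attacker's one-byte patch */

    /* verification; in real code these are scattered, not one loop */
    for (int i = 0; i < NREG; i++)
        if (compute(i) != checks[i].expect)
            printf("check %d tripped\n", i);
            /* check 2 fires; "fixing" its .expect would trip 0 and 1 */
    return 0;
}
```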

Reverse-engineer RFID transponder

What exactly is in those FasTrak transponders? Do they securely identify themselves via cryptographic challenge/response? Do they preserve your privacy? What about this rumor that they are tracked by antennas all over Bay Area freeways?

Not being an existing user, we bought one at a local supermarket and took it apart. Inside was a microcontroller, some passive electronics, and not a whole lot else. An older one turned out to still have JTAG enabled, so it was simple to dump its firmware. The newer one did have the lock bit set, and Flylogic Engineering was kind enough to decap it and zap it for us. The firmware was identical. Does IDA have an MSP430 plugin? Well, there was one but it was on Geocities and then vaporized. Time to dig around a bit. Find the source code and hack it to work with the newer SDK.

Then it’s off to drop in the code and analyze it for switch statements (usually protocol handlers). Trace everything that starts from the IO pins mapped to the receive side of the antenna, then manually walk up the stack to see where it goes. Hey, there’s an over-the-air update function in here. Just send the magic handshake, and the next packet gets written to flash. There are some checks that try to keep the writes within the area that stores the ID. But there are multiple paths to this function, and one of them is obviously untested because it pushes the wrong size argument onto the stack (a 2-byte instead of a 1-byte length argument). Time to notify the agency that 1 million transponders are at risk of a permanent DoS at best and code execution at worst.

Coda

If you’ve made it this far, you might be interested to know we’re hiring. We tackle difficult security problems in environments with clever, persistent adversaries. We’re just as likely to design a system as attack it (and often do both for the same customer). If this sounds like your kind of job, please see here for more details.

Repeating a check sometimes makes sense

When reviewing source code, I often see a developer making claims like “this can’t happen here” or “to reach this point, X condition must have occurred”. These kinds of statements make a lot of assumptions about the execution environment. In some cases, the assumptions may be justified but often they are not, especially in the face of a motivated attacker.

One recent gain for disk reliability has been ZFS. This filesystem has a mode that generates SHA-256 hashes for all the data it stores on disk. This is an excellent way to make sure a read error hasn’t been silently “corrected” by the disk to the wrong value, or to catch firmware that lies about the status of the physical media. Checksumming like this is still a new feature in filesystems, due to the performance cost and the lingering doubt over whether it’s needed, since it ostensibly duplicates the drive’s built-in ECC.

Mirroring (RAID 1) is great because if a disk error occurs, you can often fetch a copy from the other disk. However, it does not by itself tell you with certainty that an error occurred, or that the data from the other drive is still valid. ZFS gives this assurance. Mirroring could actually be worse than no RAID if you were silently switched to a corrupted disk due to a transient timeout of the other drive. This could happen if its driver had bugs that caused it to time out on read requests under heavy load, even when the data was perfectly fine. This is not a hypothetical example; it actually happened to me with an old ATA drive.
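
The idea can be shown in miniature. This sketch (my illustration, with FNV-1a standing in for ZFS's fletcher/SHA-256 checksums, and fake in-memory "drives") only serves a block when its contents match the independently stored checksum, rather than trusting the drive's success status:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLK 4096

/* toy "drives": side 0 silently corrupted, side 1 good */
static uint8_t disk[2][BLK];

/* stand-in for a driver read; note it reports success even for bad data */
static int read_mirror_side(int side, uint8_t buf[BLK]) {
    memcpy(buf, disk[side], BLK);
    return 0;
}

/* FNV-1a, standing in for a real block checksum */
static uint32_t fnv1a(const uint8_t *p, size_t n) {
    uint32_t h = 2166136261u;
    while (n--) h = (h ^ *p++) * 16777619u;
    return h;
}

/* accept a block only if its contents match the separately stored
 * checksum, not merely because the drive said the read succeeded */
static int read_verified(uint32_t expected, uint8_t buf[BLK]) {
    for (int side = 0; side < 2; side++) {
        if (read_mirror_side(side, buf) != 0)
            continue;                 /* hard I/O error: try the other copy */
        if (fnv1a(buf, BLK) == expected)
            return side;              /* data proven correct */
        /* "successful" read, wrong bits: silent corruption caught here */
    }
    return -1;
}

int main(void) {
    memset(disk[1], 0x5a, BLK);       /* good copy */
    uint32_t expected = fnv1a(disk[1], BLK);
    memcpy(disk[0], disk[1], BLK);
    disk[0][100] ^= 1;                /* one silent bit flip on side 0 */
    uint8_t buf[BLK];
    printf("served from side %d\n", read_verified(expected, buf));
    return 0;
}
```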

Still, engineers resist adding any kind of redundant checks, especially of computation results. For example, you rarely see code adding the result of a subtraction back to compare against the original input value. Yet in a critical application, like calculating Bill Gates’ bank account statement, perhaps such paranoia would be warranted. In security-critical code, the developer needs even more paranoia, since the threat is active manipulation of the computing environment, not accidental bit flips due to cosmic radiation.
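
The subtraction example would look like this; the identity always holds on correct hardware (even with wraparound), so any mismatch signals a fault or tampering:

```c
#include <stdint.h>
#include <stdlib.h>

/* Redundant inverse check: recompute the input from the result. A
 * mismatch means a fault or active tampering, so we fail closed. */
uint64_t checked_sub(uint64_t a, uint64_t b) {
    uint64_t c = a - b;
    if (c + b != a)     /* the "wasted" cycle that catches manipulation */
        abort();
    return c;
}
```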

Software protection is one area where redundancy is crucial. A common way to bypass various checks in a protection scheme is to patch them out. For example, the conditional branch that checks for a copied disc to determine whether the game should continue to load could be inverted or bypassed. By repeating this check in various places, implemented in different ways, you make it more difficult for an attacker to determine whether they have removed them all.
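
A tiny sketch of what "implemented in different ways" can mean, with a hypothetical is_original_disc() media check (stubbed here): one variant is the obvious branch, the other folds the result into key material so there is no jump to patch:

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* hypothetical media check, stubbed to "genuine" for the demo */
static int is_original_disc(void) { return 1; }

/* variant A: the obvious conditional branch -- easy to spot and invert */
static void check_branch(void) {
    if (!is_original_disc())
        exit(1);
}

/* variant B: branchless -- the check result perturbs key material, so a
 * pirated disc yields garbage keys even if every jump is NOPed out */
static uint8_t derive_key_byte(uint8_t base) {
    return base ^ (uint8_t)(is_original_disc() - 1); /* 1 -> ^0x00, 0 -> ^0xFF */
}

int main(void) {
    check_branch();
    printf("key byte: %02x\n", derive_key_byte(0x42)); /* 42 only if genuine */
    return 0;
}
```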

As another example, many protection schemes aren’t very introspective. If there is a function that decrypts some data, often you can just grab a common set of arguments to that function and repeatedly call it from your own code, mutating the arguments each time. But if that function had a redundant check that walked the stack to make sure its caller was the one it expected, this could be harder to do. Now, if the whole program was sprinkled with various checks that validated the stack in many different ways, it becomes even harder to hack out any one piece to embed in the attacker’s code. To an ordinary developer, such redundant checks seem wasteful and useless.
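
A minimal sketch of such a stack check using GCC/Clang builtins; the linker-provided text-segment bounds (__executable_start, _etext) are GNU ld conventions and vary across toolchains:

```c
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>

extern char __executable_start[], _etext[];  /* GNU ld text-segment bounds */

/* refuse to run unless our immediate caller lies within our own text
 * segment, so the function can't simply be ripped out and driven from
 * an attacker's injected code */
void decrypt_asset(uint8_t *buf, size_t len) {
    uintptr_t ret = (uintptr_t)__builtin_return_address(0);
    if (ret < (uintptr_t)__executable_start || ret >= (uintptr_t)_etext)
        abort();
    (void)buf; (void)len;   /* ... actual decryption elided ... */
}

int main(void) {
    uint8_t buf[16] = {0};
    decrypt_asset(buf, sizeof buf);   /* called from our own text: passes */
    return 0;
}
```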

Besides software protection, crypto is another area that needs this kind of redundant checking. I previously wrote about how you can recover an RSA private key merely by disrupting any part of the signing operation (e.g., by injecting a fault). The solution to this is for the signer to verify the signature they just made before revealing the signature to an attacker. That is, a sign operation now performs both a sign and verify. If you weren’t concerned about this attack, this redundant operation would seem useless. But without it, an attacker can recover a private key after observing only a single faulty signature.
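
With OpenSSL's legacy RSA API, the countermeasure is only a few lines (a sketch; modern code would use the EVP interface): never release a signature until it verifies under the public key.

```c
#include <stdio.h>
#include <openssl/rsa.h>
#include <openssl/bn.h>
#include <openssl/objects.h>
#include <openssl/crypto.h>

/* sign, then immediately verify with the public half; a signature that
 * fails verification (e.g., due to a fault induced during the private
 * operation) is scrubbed and never revealed */
static int safe_sign(RSA *key, const unsigned char *digest, unsigned int dlen,
                     unsigned char *sig, unsigned int *siglen) {
    if (!RSA_sign(NID_sha256, digest, dlen, sig, siglen, key))
        return 0;
    if (!RSA_verify(NID_sha256, digest, dlen, sig, *siglen, key)) {
        OPENSSL_cleanse(sig, *siglen);  /* a faulty signature leaks the key */
        return 0;
    }
    return 1;
}

int main(void) {
    RSA *rsa = RSA_new();
    BIGNUM *e = BN_new();
    BN_set_word(e, RSA_F4);
    RSA_generate_key_ex(rsa, 2048, e, NULL);

    unsigned char digest[32] = {0};  /* stand-in for a real SHA-256 digest */
    unsigned char sig[256];
    unsigned int siglen;
    printf("safe_sign: %d\n", safe_sign(rsa, digest, sizeof digest, sig, &siglen));

    BN_free(e);
    RSA_free(rsa);
    return 0;
}
```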

We need developers to be more paranoid about the validity of stored or computed values, especially in environments where there could be an adversary. Both software protection and crypto are areas where a very powerful attacker is always assumed to be present. However, other areas of software development could also benefit from this attitude, increasing security and reliability by “wasting” a couple of cycles.

Felten on fingerprinting blank paper

Ed Felten posted last week about his upcoming publication on fingerprinting techniques for blank paper. This is a great paper with practical applications. It reminded me to discuss some of the fundamental issues with fingerprinting and the potential problems when it comes to real-world forgery.

Long ago, some copy protections for floppy disks used a method known as “weak bits”. The mastering equipment would write a long string of zeros to the disk. This would cause the read head to return a varying string of bits with some pattern to it. The software would check that this region returned different values each time it was read to make sure it wasn’t a copy.
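
The check itself is simple; here is a sketch with a simulated weak-bit read (rand() standing in for the unstable hardware reads of genuine media):

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* stand-in for a raw read of the weak-bit region; on genuine media the
 * hardware itself returns varying bits, simulated here with rand() */
static void read_weak_region(uint8_t buf[64]) {
    for (int i = 0; i < 64; i++)
        buf[i] = (uint8_t)rand();
}

static int disc_is_original(void) {
    uint8_t first[64], again[64];
    read_weak_region(first);
    for (int i = 0; i < 4; i++) {
        read_weak_region(again);
        if (memcmp(first, again, sizeof first) != 0)
            return 1;    /* reads differ: genuine weak-bit region */
    }
    return 0;            /* perfectly stable reads: likely a bit-exact copy */
}

int main(void) {
    printf("original? %d\n", disc_is_original());
    return 0;
}
```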

Similar techniques have also been applied to magstripe media for credit cards. MagTek makes a reader that attempts to measure physical characteristics of the magnetic stripe and submit them as a fingerprint for a given card. The general idea is that while the data on a card can be easily duplicated with a reader, the manufacturing process for the physical media leaves behind certain random “noise” that is difficult to reproduce.

This is similar to the Felten paper. They attempt to create a unique fingerprint for a piece of paper based on the variations in fibers that are visible with a high-resolution scan. They take multiple scans from different angles as well.

All of these techniques have something in common. The characteristic being measured must actually have some uniqueness. There must be a cost-effective way to measure that characteristic. There must be a sampling mechanism that chooses different areas to examine. The fingerprint algorithm must combine the samples in a way that is resilient to natural errors (i.e., a genuine original is never rejected). Yet it also must be difficult for a forger to create a copy that is close enough to the original to be accepted by the verifier (i.e., a forgery is never accepted).

Both magstripes and paper appear to have enough inherent uniqueness. The manufacturing techniques of both do create a lot of low-level variation. But once this requirement is satisfied, the fingerprint approach itself is still subject to fundamental limitations, and no fingerprinting method can avoid them. It needs to be resilient not only in the face of regular use (e.g., crumpling the paper) but also against intentional, malicious manipulation. The conflicting requirements of tolerating natural wear while remaining difficult to clone are always the hardest part of any kind of fingerprinting scheme. This is a fundamental problem with any kind of statistical decision process.
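
Concretely, a verifier of this type boils down to a distance threshold; the sizes and threshold below are made-up values for illustration. Everything inside the threshold is judged "the same document", and that window is exactly what a collision attacker iterates against:

```c
#include <stdint.h>

#define FP_BYTES  128   /* 1024-bit fingerprint (illustrative) */
#define THRESHOLD 96    /* max differing bits tolerated (illustrative) */

static int popcount8(uint8_t x) {
    int n = 0;
    while (x) { n += x & 1; x >>= 1; }
    return n;
}

/* returns 1 if two scans are judged "the same sheet of paper"; the
 * tolerance that absorbs crumpling and rescanning is the same window a
 * forger only has to land inside */
int fingerprints_match(const uint8_t a[FP_BYTES], const uint8_t b[FP_BYTES]) {
    int dist = 0;
    for (int i = 0; i < FP_BYTES; i++)
        dist += popcount8(a[i] ^ b[i]);   /* Hamming distance */
    return dist <= THRESHOLD;
}
```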

There are two kinds of forgery attacks: second pre-image and collision. The former is the most obvious one, where an attacker creates a copy that matches some existing original. The latter is much harder to prevent. To create a collision, the attacker can pre-process two pieces of paper in order to create two documents that the fingerprint algorithm judges as close enough to be identical. For example, the attacker can write a sequence of small dots to both pages in a similar pattern before printing the text. He can repeat this multiple times, varying the pattern, until the verifier judges the papers as close enough. Depending on the sampling algorithm and the attacker’s printing capabilities, this may take more or less effort. Section 6 of the paper discusses this kind of attack, but it mostly focuses on preventing a second pre-image attack, and most of the analysis is left for future work.

The key thing to remember is that the attacker does not need to make the papers actually identical by reproducing the exact pattern of fibers on the paper. The attacker doesn’t even have to have a particularly fine dot resolution, as long as the position of the dots can be controlled. The idea is that the printed pattern overwhelms the fine characteristics measured by the scanner and thus two documents are judged to be close enough by the verifier. It also would be interesting to see how the fingerprint technique does against darker colored paper.

This attack illustrates the fundamental limitation of this kind of fingerprint method. The verifier has to allow for some variation to prevent false positives. But an attacker can repeatedly try to exploit that rejection region by creating various pairs of documents until they pass.

All of this is based on a preliminary read of the paper, so I’m interested in what the Felten team plans to do to address this kind of problem.

Xbox 360 security talk

This recent video of Michael Steil and Felix Domke talking about the Xbox 360 security scheme is the best overview I’ve seen so far.

Michael previously gave a nice talk summarizing the security flaws in the original Xbox.

The CPU itself supports hashing and/or encryption of physical pages, based on flags set in the upper word of the 64-bit virtual address.  They talk about how Felix was able to leapfrog off shader-based DMA to write to an unencrypted register save state structure, jumping through a syscall gate (sorta like return-to-libc) that was improperly validated by the hypervisor.  The end result was arbitrary code execution in the context of the hypervisor.  Quite impressive.

I’ve always wondered how long it would take for security features like encrypted RAM, long present in smart cards, to trickle up to more complex platforms like game consoles.  While the Xbox 360’s security is much better than the original Xbox’s, it seems like the big-systems people are reinventing techniques already tested and worked out in the microcontroller world.

For example, the 360 was vulnerable to a timing attack, where an internal secret key can be guessed by timing how long it takes to validate the submitter’s HMAC.  I’d be extremely surprised if any mainstream smart card were vulnerable to such a well-known legacy bug.
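
The standard fix is a constant-time comparison that never exits early, so the time taken is independent of how many bytes match:

```c
#include <stddef.h>
#include <stdint.h>

/* compare the full MAC unconditionally: the loop does the same work
 * whether the first byte or the last byte differs, leaking nothing
 * about the position of a mismatch */
int hmac_equal(const uint8_t *a, const uint8_t *b, size_t n) {
    uint8_t diff = 0;
    for (size_t i = 0; i < n; i++)
        diff |= a[i] ^ b[i];   /* accumulate differences; no early exit */
    return diff == 0;
}
```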

I have yet to see anyone publish information about applying power or RF-based side channel analysis to a game console, despite smart cards adding countermeasures to these almost 10 years ago.  Even earlier attacks on encrypted RAM have still not been attempted on modern systems.

These attacks probably haven’t been needed yet since software bugs were still present. However, the push by game consoles and cellphone manufacturers to increase their resistance to software attacks means it won’t be long before side-channel resistance becomes a must-have feature.  It will be interesting to see how long it takes big-system manufacturers to add countermeasures and whether they’ll choose to learn from the hard lessons we have seen in the smart card world.

Hacker or hooker?

Well-funded and motivated attackers are typically the hardest to defend against when designing a system.  Governments can attack systems in different ways and with more resources than a typical threat.  Consider a recent example where a British aide lost his Blackberry after spending the night with a woman who approached him in a Chinese disco.  While it’s possible he just lost it while drunk, this is a good example of how unconventional threats need to be carefully considered.

Let’s analyze the cost of two routes to getting this same information: hacker or hooker.  The hacker might try to crack passwords or develop a 0-day exploit against the Blackberry server.  Or, build a custom trojan and send it via a forged email that appears to come from the Prime Minister.  The hooker would try to get to his hotel room and steal the phone.  It would actually suffice to just borrow it for a few minutes and dump the RAM since passwords are often cached there.  This has the added advantage that he might never know anything had happened.

A 0-day exploit could be in the $20,000 range.  Hiring someone to develop and target a trojan at this aide would cost less, but the chance of succeeding would be lower.  According to the stories about Eliot Spitzer, a high-end call girl is $1,500 per hour.  Assuming it takes four hours, the total cost would be $6,000.  The fact that both of these approaches could be executed from China means the actual costs would be lower, but probably in a similar ratio.

There are a lot of other advantages to the hooker approach besides cost.  There is good deniability if the call girl gets caught.  Since the call girl remains within the attacking country’s jurisdiction, the police can be used to recover the Blackberry if she makes an extortion attempt.  The hacker approach has a lot more uncertainty as flaws could be patched or blocked, making the exploit useless.

I also think this gives good support to my claim that software protection techniques are on the verge of wider adoption.  With cold boot attacks and growing news of governments seizing laptops or stealing cell phones, systems must remain secure even when an attacker has physical possession of a powered-up device.  The only way to do this is to adopt software and hardware techniques that are already used to protect games, DRM, and satellite TV.  Traditional approaches like those used in network security are no longer enough.

I’ll be speaking on this topic along with Thomas Ptacek at WOOT, co-hosted at USENIX on July 28th in San Jose.  Since this event is invite-only, send me email if you’re a security researcher who would like to attend.  Please include a brief summary of your background.