Theories of how PS Jailbreak works

A company recently announced a modchip for the PS3 and claims they will be shipping them soon. It plugs into the USB port and allows running backup games. Inside the device is an ATmega8U2 USB microcontroller, something I’ve worked with before. While I didn’t have access to this device and don’t know PS3 internals, I spent a few minutes looking at a hex dump from a USB trace.

An article on Gamefreax (German, alternate in English) claimed to have reverse-engineered this device. They believe that a stack overflow in USB config descriptor processing allows the device to execute code on the PS3 host. However, their page shows a useless set of packets from a USB trace and some hex code obscured with a logo so it’s not possible to verify their claims from the info given.

In a later comment, a user named Descrambler posted a more complete dump of hex data from the USB trace. It is still not fully complete, but it’s enough to look into more of the details.

The first trace starts with a standard configuration descriptor. It is followed by a relatively standard interface descriptor, except it is class 254, subclass 1, and protocol 2. Since this is in the reserved class range, perhaps this is related to an internal Sony test tool. Following this descriptor is some data and PowerPC code.

The second trace starts with a config descriptor that is a bit weird. It claims the total length is 77 bytes (68 bytes of interface descriptors after the 9-byte config descriptor). It also claims to support 10 interfaces. With the standard interface descriptor length of 9 bytes, the total length should be 99 bytes.

There are multiple ways this might affect the PS3. If it believes the total length field for sizing the buffer, the first bytes after the initial 77 could overflow a buffer (in this case, “00 00 fe 01 02 00 09 04” are what follows). Or it might simply copy it into a static buffer (if 256 bytes long, the first bytes to overflow would be “fe 01 02 00 09 04 00 00”).

If it’s not an overflow, it could be related to how the PS3 parses the first 10 interface descriptors. The sequence is not regular. Taken in 9-byte chunks, it diverges after the first 6 interface descriptors, giving a next descriptor of “09 00 09 04 00 00 00 fe 01”. This is not a valid descriptor. Or, if the PS3 parses lengths of descriptors, it will end up with a few very short ones (“02 00”).

These are all just theories. It’s quite possible the second trace is just a decoy, meant to slow down reversing. The behavior described by Gamefreax cannot be validated from the USB traces posted by Descrambler. It appears Gamefreax may have misread the trace (77 bytes in the second trace is 0x4d, but they claim a descriptor length is 0xAD). Also, Descrambler’s hex dumps are incomplete and don’t show the various phases described in the Gamefreax post.

It’s definitely too early to claim that the PS Jailbreak exploit has been reverse-engineered. However, it should be quite easy to clone since all the data needed to do so is present in a USB trace. Just paste the data into example code for an AT90USB and replay the same descriptors. You might have to add in a bus disconnect in the right place but it should be relatively simple.

A new direction for homebrew console hackers?

A recent article on game console hacking focused on the Wii and a group of enthusiasts who hack it in order to run Linux or homebrew games. The article is very interesting and delves into the debate about those who hack consoles for fun and others who only care about piracy. The fundamental question behind all this: is there a way to separate the efforts of those two groups, limiting one more than the other?

Michael Steil and Felix Domke, who were mentioned in the article, gave a great talk about Xbox 360 security a few years ago. Michael compared the history of Xbox 360 security to the PS3 and Wii, among other consoles. (Here’s a direct link to the relevant portion of the video). Of all the consoles, only the PS3 was not hacked at the time, although it has since been hacked. Since the PS3 had an officially supported method of booting Linux, there was less reason for the homebrew community to attack it. It was secure from piracy for about 3 years, the longest of any of the modern consoles.

Michael’s claim was that all of the consoles had been hacked to run homebrew games or Linux, but the ultimate result was piracy. This was likely due to the hobbyists having more skill than the pirates, something which has also been the case in smart phones but less so in satellite TV. The case of the PS3 also supports his theory.

Starting back in the 1980’s, there has been a history of software crackers getting jobs designing new protection methods. So what if the homebrew hackers put more effort into protecting their methods from the pirates? There are two approaches they might take: software or hardware protection.

Software protection has been used for exploits before. The original Xbox save game exploit used some interesting obfuscation techniques to limit it to only booting Linux. It stored its payload encrypted in the JPEG header of a penguin image. It didn’t bypass code signature verification completely, it modified the Xbox’s RSA public key to have a trivial factor, which allowed the author to sign his own images with a different private key.

With all this work, it took about 3 months for someone to reverse-engineer it. At that point, the same hole could be used to run pirated games. However, this hack didn’t directly enable piracy because there were already modchip-based methods in use. So, while obfuscation can add some time to pirates getting access to the exploit, it wasn’t much.

Another approach is to embed the exploit in a modchip. These have long been used by pirates to protect their exploits from other pirates. As soon as another group clones an exploit, the price invariably goes down. Depending on the exploitation method and protection skill of the designer, reverse-engineering the modchip can be as hard as developing the exploit independently.

The homebrew community does not release many modchips because of the development cost. But if they did, it’s possible they could reduce the risk of piracy from their exploits. It would be interesting to see a homebrew-only modchip, where games were signed by a key that certified they were independently developed and not just a copy of a commercial game. The modchip could even be a platform for limiting exploitation of new holes that were only used for piracy. In effect, the homebrew hackers would be setting up their own parallel system of control to enforce their own code of ethics.

Software and hardware protection could slow down pirates acquiring exploits. However, the approach that has already proven effective is to limit the attention of the homebrew hackers by giving them limited access to the hardware. Game console vendors should take into account the dynamics of homebrew hackers versus the pirates in order to protect their platform’s revenue.

But what can you also do about it, homebrew hackers? Can you design a survivable system for keeping your favorite console safe from piracy while enabling homebrew? Enforce a code of ethics within your group via technical measures? If anyone can make this happen, you can.

Attacking RSA exponentiation with fault injection

A new paper, “Fault-Based Attack of RSA Authentication” (pdf) by Pellegrini et al, is making the rounds. The general idea is that an attacker can disrupt an RSA private key operation to cause an invalid signature to be returned, then use that result to extract the private key. If you’re new to fault injection attacks on RSA, I previously wrote an intro that should help.

The main concept to grasp is that public key crypto is brittle. In the case of RSA’s CRT operation, a single bit error in one multiplication result is enough to fully compromise your private key. We’ve known this since 1997. The solution is simple: validate every signature with the public key before returning it to the caller.

The authors noticed something curious. OpenSSL does verify signatures it generates before returning them, but if it detects a problem, it does not just return an error. It then tries again using a different exponentiation process, and then returns that signature without validating it.

Think about this for a moment. What conditions could cause an RSA private key operation to compute an invalid answer? An innocent possibility is cosmic radiation, bad RAM, etc. In this case, all computations should be considered unreliable and any retried operation should be checked very carefully. The other and more likely possibility is that the system is under attack by someone with physical proximity. In this case, OpenSSL should generate a very obvious log message and the operation should not be retried. If it is, the result should be checked very carefully.

For whatever reason, the OpenSSL programmers decided to retry with fixed-window exponentiation and trust that since there were no published fault attacks for it, they didn’t have to validate its result. This is a foolhardy attitude — not something you want to see in your crypto library. There had been many other fault injection attacks against various components or implementation approaches for RSA, including right-to-left exponentiation. There was no reason to consider left-to-right exponentiation invulnerable to this kind of attack.

Fixed-window exponentiation is a form of sliding window exponentiation. This is just a table-based optimization, where a window (say, 3 bits wide) is moved across the exponent, computing the final result incrementally. While this may be resistant to some timing attacks (but not cache timing or branch prediction attacks), there is nothing that would prevent fault injection attacks.

Indeed, it turns out to be vulnerable. The authors generate a few thousand signatures with single bit-flips in some window of the signature. Then they compare the faulty signatures to a correct signature over the same message. They compute the value for that portion of the private exponent since there are only 2w possibilities for that location if w is the window size in bits. This is repeated until enough of the private key is known.

The method they used to create the faulty signatures was a bit artificial. They built a SPARC system on an FPGA running Linux and OpenSSL. They then decreased the power supply voltage until multiplies started to fail. Since multiplication logic is a relatively long chain, it is often one of the first things to fail. However, a more interesting hardware result would be to attempt this kind of attack on an actual server because FPGAs work differently than ASICs. It might require careful targeting of the right power pins on the CPU. Since power pins are numerous in modern systems, this may be more effective than only modulating the system power supply.

This was a nice attack but nothing earth-shattering. The only thing I was floored by (yet again), was the willingness of crypto implementers to perform unsafe operations in the face of an almost certain attack. Shame on OpenSSL.

Reverse-engineering a smart meter

In 2008, a nice man from PG&E came out to work on my house. He installed a new body for the gas meter and said someone would come by later to install the electronics module to make it a “smart meter“. Since I work with security for embedded systems, this didn’t sound very exciting. I read up on smart meters and found they not only broadcast billing information (something I consider only a small privacy risk) but also provide remote control. A software bug, typo at the control center, or hacker could potentially turn off my power and gas. But how vulnerable was I actually?

I decided to look into how smart meters work. Since the electronics module never was installed, I called up various parts supply houses to try to buy one. They were quite suspicious, requesting company background info and letterhead before deciding if they could send an evaluation sample. Even though this was long before IOActive outed smart meter flaws to CNN, they had obviously gotten the message that these weren’t just ordinary valves or pipes.

Power, gas, and water meters have a long history of tampering attacks. People have drilled into them, shorted them out, slowed them down, and rewired them to run backwards. I don’t think I need to mention that doing those kinds of things is extremely dangerous and illegal. This history is probably why the parts supplier wasn’t eager to sell any smart meter boards to the public.

There’s always an easier way. By analyzing the vendor’s website, I guessed that they use the same radio module across product lines and other markets wouldn’t be so paranoid. Sure enough, the radio module for a water meter made by the same vendor was available on Ebay for $30. It arrived a few days later.

The case was hard plastic to prevent water damage. I used a bright light and careful tapping to be sure I wasn’t going to cut into anything with the Dremel. I cut a small window to see inside and identified where else to cut. I could see some of the radio circuitry and the battery connector.

After more cutting, it appeared that the battery was held against the board by the case and had spring-loaded contacts (see above). This would probably zeroize the device’s memory if it was cut open by someone trying to cheat the system. I applied hot glue to hold the contacts to the board and then cut away the rest of the enclosure.

Inside, the board had a standard MSP430F148 microcontroller and a metal cage with the radio circuitry underneath. I was in luck. I had previously obtained all the tools for working with the MSP430 in the Fastrak transponder. These CPUs are popular in the RFID world because they are very low power. I used the datasheet to identify the JTAG pinouts on this particular model and found the vendor even provided handy pads for them.

Since the pads matched the standard 0.1″ header spacing, I soldered a section of header directly to the board. For the ground pin, I ran a small wire to an appropriate location found with my multimeter. Then I added more hot glue to stabilize the header. I connected the JTAG cable to my programmer. The moment of truth was at hand — was the lock bit set?

Not surprisingly (if you read about the Fastrak project), the lock bit was not set and I was able to dump the firmware. I loaded it into the IDA Pro disassembler via the MSP430 CPU plugin. The remainder of the work would be to trace the board’s IO pins to identify how the microcontroller interfaced with the radio and look for protocol handling routines in the firmware to find crypto or other security flaws.

I haven’t had time to complete the firmware analysis yet. Given the basic crypto flaws in other smart meter firmware (such as Travis Goodspeed finding a PRNG whose design was probably drawn in crayon), I expect there would be other stomach-churning findings in this one. Not even taking rudimentary measures such as setting the lock bit does not bode well for its security.

I am not against the concept of smart meters. The remote reading feature could save a lot of money and dog bites with relatively minimal privacy exposure, even if the crypto was weak. I would be fine if power companies offered an opt-in remote control feature in exchange for lower rates. Perhaps this feature could be limited to cutting a house’s power to 2000 watts or something.

However, something as important as turning off power completely should require a truck roll. A person driving a truck will not turn off the mayor’s power or hundreds of houses at once without asking questions. A computer will. Remote control should not be a mandatory feature bundled with remote reading.

PS3 hypervisor exploit reproduced

There’s a nice series of articles by xorloser on reproducing the recent PS3 hypervisor hack. He used a microcontroller to send the glitch and improved the software exploit to work on multiple firmware revisions. Here’s a picture of his final setup.

It remains to be seen what security measures Sony has taken to address a hypervisor compromise. One countermeasure would be to lock down the OtherOS environment, since the attack depends on the ability to manipulate low-level OS memory structures. They could be using a simpler hypervisor than the GameOS side (say, one that just prevents access to the GPU). Perhaps the SPEs have a disable bit that turns off the hardware decryption unit, and the hypervisor does this before booting OtherOS.

Beyond this, they may not be using a single global key that is shared amongst all SPEs. Broadcast encryption schemes have long been used in the pay TV industry to allow fine-grained revocation of keys that have leaked. They work by embedding a subset of keys from a matrix or tree in each device. If the keys leak, they can be excluded from subsequent software releases. This requires attackers to keep extracting keys and discarding the devices as they are revoked.

Also, it’s possible there are software protection measures in place. For example, the SPE could request hashes of regions of the calling hypervisor and use this to detect patching. This results in a cat-and-mouse game where firmware updates (or even individual games) use different methods of detecting attackers. Meanwhile, attackers would try to come up with new ways to avoid these countermeasures. This has already been happening in the Xbox 360 world, as well as with nearly every other game console before now.

We’ll have to wait and see if Sony used this kind of defense-in-depth and planned for this eventuality or built a really tall wall with nothing more behind it.

How the PS3 hypervisor was hacked

George Hotz, previously known as an iPhone hacker, announced that he hacked the Playstation 3 and then provided exploit details. Various articles have been written about this but none of them appear to have analyzed the actual code. Because of the various conflicting reports, here is some more analysis to help understand the exploit.

The PS3, like the Xbox360, depends on a hypervisor for security enforcement. Unlike the 360, the PS3 allows users to run ordinary Linux if they wish, but it still runs under management by the hypervisor. The hypervisor does not allow the Linux kernel to access various devices, such as the GPU. If a way was found to compromise the hypervisor, direct access to the hardware is possible, and other less privileged code could be monitored and controlled by the attacker.

Hacking the hypervisor is not the only step required to run pirated games. Each game has an encryption key stored in an area of the disc called ROM Mark. The drive firmware reads this key and supplies it to the hypervisor to use to decrypt the game during loading. The hypervisor would need to be subverted to reveal this key for each game. Another approach would be to compromise the Blu-ray drive firmware or skip extracting the keys and just slave the decryption code in order to decrypt each game. After this, any software protection measures in the game would need to be disabled. It is unknown what self-protection measures might be lurking beneath the encryption of a given game. Some authors might trust in the encryption alone, others might implement something like SecuROM.

The hypervisor code runs on both the main CPU (PPE) and one of its seven Cell coprocessors (SPE). The SPE thread seems to be launched in isolation mode, where access to its private code and data memory is blocked, even from the hypervisor.  The root hardware keys used to decrypt the bootloader and then hypervisor are present only in the hardware, possibly through the use of eFUSEs. This could also mean that each Cell processor has some unique keys, and decryption does not depend on a single global root key (unlike some articles that claim there is a single, global root key).

George’s hack compromises the hypervisor after booting Linux via the “OtherOS” feature. He has used the exploit to add arbitrary read/write RAM access functions and dump the hypervisor. Access to lv1 is a necessary first step in order to mount other attacks against the drive firmware or games.

His approach is clever and is known as a “glitching attack“. This kind of hardware attack involves sending a carefully-timed voltage pulse in order to cause the hardware to misbehave in some useful way. It has long been used by smart card hackers to unlock cards. Typically, hackers would time the pulse to target a loop termination condition, causing a loop to continue forever and dump contents of the secret ROM to an accessible bus. The clock line is often glitched but some data lines are also a useful target. The pulse timing does not always have to be precise since hardware is designed to tolerate some out-of-spec conditions and the attack can usually be repeated many times until it succeeds.

George connected an FPGA to a single line on his PS3’s memory bus. He programmed the chip with very simple logic: send a 40 ns pulse via the output pin when triggered by a pushbutton. This can be done with a few lines of Verilog. While the length of the pulse is relatively short (but still about 100 memory clock cycles of the PS3), the triggering is extremely imprecise. However, he used software to setup the RAM to give a higher likelihood of success than it would first appear.

His goal was to compromise the hashed page table (HTAB) in order to get read/write access to the main segment, which maps all memory including the hypervisor. The exploit is a Linux kernel module that calls various system calls in the hypervisor dealing with memory management. It allocates, deallocates, and then tries to use the deallocated memory as the HTAB for a virtual segment. If the glitch successfully desynchronizes the hypervisor from the actual state of the RAM, it will allow the attacker to overwrite the active HTAB and thus control access to any memory region. Let’s break this down some more.

The first step is to allocate a buffer. The exploit then requests that the hypervisor create lots of duplicate HTAB mappings pointing to this buffer. Any one of these mappings can be used to read or write to the buffer, which is fine since the kernel owns it. In Unix terms, think of these as multiple file handles to a single temporary file. Any file handle can be closed, but as long as one open file handle remains, the file’s data can still be accessed.

The next step is to deallocate the buffer without first releasing all the mappings to it. This is ok since the hypervisor will go through and destroy each mapping before it returns. Immediately after calling lv1_release_memory(), the exploit prints a message for the user to press the glitching trigger button. Because there are so many HTAB mappings to this buffer, the user has a decent chance of triggering the glitch while the hypervisor is deallocating a mapping. The glitch probably prevents one or more of the hypervisor’s write cycles from hitting memory. These writes were intended to deallocate each mapping, but if they fail, the mapping remains intact.

At this point, the hypervisor has an HTAB with one or more read/write mappings pointing to a buffer it has deallocated. Thus, the kernel no longer owns that buffer and supposedly cannot write to it. However, the kernel still has one or more valid mappings pointing to the buffer and can actually modify its contents. But this is not yet useful since it’s just empty memory.

The exploit then creates a virtual segment and checks to see if the associated HTAB is located in a region spanning the freed buffer’s address. If not, it keeps creating virtual segments until one does. Now, the user has the ability to write directly to this HTAB instead of the hypervisor having exclusive control of it. The exploit writes some HTAB entries that will give it full access to the main segment, which maps all of memory. Once the hypervisor switches to this virtual segment, the attacker now controls all of memory and thus the hypervisor itself. The exploit installs two syscalls that give direct read/write access to any memory address, then returns back to the kernel.

It is quite possible someone will package this attack into a modchip since the glitch, while somewhat narrow, does not need to be very precisely timed. With a microcontroller and a little analog circuitry for the pulse, this could be quite reliable. However, it is more likely that a software bug will be found after reverse-engineering the dumped hypervisor and that is what will be deployed for use by the masses.

Sony appears to have done a great job with the security of the PS3. It all hangs together well, with no obvious weak points. However, the low level access given to guest OS kernels means that any bug in the hypervisor is likely to be accessible to attacker code due to the broad API it offers. One simple fix would be to read back the state of each mapping after changing it. If the write failed for some reason, the hypervisor would see this and halt.

It will be interesting to see how Sony responds with future updates to prevent this kind of attack.

[Edit: corrected the description of virtual segment allocation based on a comment by geohot.]