Baysec meetup on May 16

Interested in meeting other infosec professionals? Not interested in a big commitment or sales pitch? Live in the SF Bay area? Then Baysec is for you!

We’ll be meeting on Wednesday, May 16 at 7 pm at Zeitgeist in San Francisco. We’ll grab a table, have some drinks, and chat about security. There’s no sponsor currently, so your tab is on you, but that’s the only cost.

To find out more, join the low-traffic mailing list <baysec-subscribe at sockpuppet.org>. No RSVP needed unless you want someone to save you a seat.  Update: here’s the official announcement.

RSA public keys are not private

RSA is an extremely malleable primitive and can hardly be called “encryption” in the traditional sense of block ciphers. (If you disagree, try to imagine how AES could leak the full plaintext to anyone who can submit ciphertexts and get back only “ok” or “decryption incorrect” based on the first couple of bits.) When I am reviewing a system and see the signing operation described as “RSA-encrypt the signature with the private key,” I know I’ll be finding some kind of flaw in that area of the implementation.

The entire PKCS#1 standard exists to add structure to the payload of RSA to eliminate as much malleability as possible. To distinguish the RSA primitive operation from higher-level constructs that use the RSA primitive, I use the following terminology.

RSA-Pubkey-Op / RSA-Privkey-Op: Perform the raw RSA operation with the appropriate key’s exponent and modulus: data^exp mod n
RSA-Encrypt: Add random padding, then transform with RSA-Pubkey-Op
RSA-Decrypt: Transform with RSA-Privkey-Op, then check the padding in the result
RSA-Sign: Hash the data, add padding, then transform with RSA-Privkey-Op
RSA-Verify: Transform with RSA-Pubkey-Op, check the padding in the result, compare the embedded hash with a hash of the data, and return True if they match, else False
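
To make the terminology concrete, here is a minimal Python sketch of these operations. The padding formats below are simplified stand-ins for illustration, not the real PKCS#1 encodings, so treat this as a model of the structure rather than usable crypto.

import hashlib
import os

def rsa_op(data: int, exponent: int, n: int) -> int:
    # The raw primitive behind both RSA-Pubkey-Op and RSA-Privkey-Op:
    # data^exp mod n. Only the exponent (e or d) differs.
    return pow(data, exponent, n)

def _k(n: int) -> int:
    # Modulus length in bytes.
    return (n.bit_length() + 7) // 8

def rsa_encrypt(msg: bytes, e: int, n: int) -> int:
    # RSA-Encrypt: add random padding, then RSA-Pubkey-Op.
    # The padding here is only a stand-in, not real PKCS#1 v1.5.
    filler = bytes((b % 255) + 1 for b in os.urandom(_k(n) - 3 - len(msg)))
    padded = b"\x00\x02" + filler + b"\x00" + msg
    return rsa_op(int.from_bytes(padded, "big"), e, n)

def rsa_decrypt(ct: int, d: int, n: int) -> bytes:
    # RSA-Decrypt: RSA-Privkey-Op, then check and strip the padding.
    padded = rsa_op(ct, d, n).to_bytes(_k(n), "big")
    if not padded.startswith(b"\x00\x02") or b"\x00" not in padded[2:]:
        raise ValueError("decryption incorrect")
    return padded[padded.index(b"\x00", 2) + 1:]

def rsa_sign(msg: bytes, d: int, n: int) -> int:
    # RSA-Sign: hash, add deterministic padding, then RSA-Privkey-Op.
    digest = hashlib.sha256(msg).digest()
    padded = b"\x00\x01" + b"\xff" * (_k(n) - 3 - len(digest)) + b"\x00" + digest
    return rsa_op(int.from_bytes(padded, "big"), d, n)

def rsa_verify(msg: bytes, sig: int, e: int, n: int) -> bool:
    # RSA-Verify: RSA-Pubkey-Op, check the padding, compare the embedded hash.
    digest = hashlib.sha256(msg).digest()
    expected = b"\x00\x01" + b"\xff" * (_k(n) - 3 - len(digest)) + b"\x00" + digest
    return rsa_op(sig, e, n).to_bytes(_k(n), "big") == expected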

With that out of the way, a common need in embedded systems is secure code updates. A good approach is to RSA-Sign the code updates at the manufacturer. The fielded device then loads the update into RAM, performs RSA-Verify to be sure it is valid, checks version and other info, then applies the update to flash.
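
As a sketch of that device-side flow, reusing rsa_verify from above (the 4-byte version header, rollback check, and write_flash callback are hypothetical details, not anything from the original design):

def apply_update(image: bytes, sig: int, e: int, n: int,
                 current_version: int, write_flash) -> bool:
    # Device-side flow: verify the signature over the image in RAM first,
    # then sanity-check the (hypothetical) header before writing to flash.
    if not rsa_verify(image, sig, e, n):        # rsa_verify from the sketch above
        return False
    new_version = int.from_bytes(image[:4], "big")   # hypothetical 4-byte version field
    if new_version <= current_version:               # reject rollbacks to older code
        return False
    write_flash(image)    # e.g. a driver callback supplied by the platform
    return True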

One vendor was signing their updates, but decided they also wanted privacy for the update contents. Normally, this would involve encrypting the update with AES-CBC before signing. However, they didn’t want to add another key to the secure processor since the design was nearly complete.

The vendor decided that since the device already had an RSA public key for signature verification, they would also encrypt the update with the corresponding private key so the device could decrypt it with that public key. The “public” key was never revealed outside their secure processor, so it was considered secret.

Let’s call the public/private key pair P1 and S1, respectively. Here’s their new update process:

[Figure: rsa-sig1.png (the new update process)]

On receiving the update, the secure processor would do the inverse, verifying the signature and then decrypting the code. (Astute reader: yes, there would be a chaining problem if encrypting more than one RSA block’s worth of data, but let’s assume the code update is small.)
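
Here is one plausible reconstruction of that flow in Python, reusing the primitives sketched earlier. The exact ordering and framing in the vendor’s figure may differ, and the single-block assumption from the parenthetical above applies.

def vendor_package(update: bytes, d: int, n: int):
    # Vendor side: raw-"encrypt" the update with the private key S1 = (d, n),
    # then sign the resulting block, also with S1. Assumes the update fits in
    # one RSA block and does not start with zero bytes.
    ct = rsa_op(int.from_bytes(update, "big"), d, n)
    sig = rsa_sign(ct.to_bytes(_k(n), "big"), d, n)
    return ct, sig

def device_unpack(ct: int, sig: int, e: int, n: int) -> bytes:
    # Device side: verify the signature, then "decrypt" with the secret
    # public key P1 = (e, n).
    blob = ct.to_bytes(_k(n), "big")
    if not rsa_verify(blob, sig, e, n):
        raise ValueError("bad signature")
    return rsa_op(ct, e, n).to_bytes(_k(n), "big").lstrip(b"\x00")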

What’s wrong? Nothing, if you’re thinking about RSA in terms of conventional crypto. The vendor is encrypting something to a key that is not provided to an attacker. Shouldn’t this be as secure as a secret AES key? Next post, I’ll reveal how the malleability of the RSA primitive allows n to be easily calculated.

Functional languages and reverse engineering

In a previous discussion, Tim Newsham said:

“I would like to see someone reverse engineer some small Haskell programs. The compilation techniques are totally foreign to anyone familiar with standard imperative languages and there are no tools designed specifically for the task.”

He then provided a link to some examples to analyze. Another commenter brought up Standard ML, another functional language. (I assume he means the NJ Standard ML implementation, but it could also be OCaml or Moscow ML as Dan Moniz pointed out.) Tim responded:

“I don’t know entirely. I’m not familiar with ML compiler implementation. They could use similar compilation techniques, but might not. ML is not ‘pure’ (and additionally is strict) so the compilation techniques might be different.”

He also provided links to a couple of papers on implementing compilers for functional languages. One commenter took a brief look at Tim’s examples:

“I took a look. The compiled Haskell is definitely different from the compiled ML I looked at. Roughly the same order of magnitude as to how terrible it was, though. Mine actually used Peano arithmetic on lists for simple arithmetic operations. What was funny was the authors of that program bragging about how algorithmically fast their technology was. I couldn’t help but think, after examining some entire functions and finding that all of the code was dead except for a tiny fraction of the instructions, how much a decent back-end (something with constant propagation and dead-code elimination) could have improved the runtime performance.”

Since one common obfuscation technique is to implement a VM and then write your protection code in that environment, how obfuscated is compiled object code from standard functional programming languages?

WOOT = Usenix + Blackhat

The call for papers is now up for a new Usenix workshop, WOOT (Workshop On Offensive Technologies, but don’t think the name came before the acronym). The workshop will be co-hosted with Usenix Security and will focus on new practical attacks.

I was recently saying that vulnerability research could use more Peer Review instead of the other kind of PR (i.e., vague news stories, user-scaring Month of X Bugs). So help the community out here by submitting quality papers, especially if you’ve never submitted one before. I think the goal of bridging the gap between slideware (e.g., Blackhat) and 15th-generation theoretical overlay network designs (e.g., Usenix Security) is a great one.

Also, I’m on the program committee but don’t hold that against them.

GPG now requires pinentry package

As a FreeBSD committer, I also run FreeBSD on a lot of my machines. I recently upgraded my desktop with portupgrade and found that gnupg no longer worked. I got the error message:

gpg-agent[13068]: can't connect server: `ERR 67109133 can't exec `/usr/local/bin/pinentry': No such file or directory'
gpg-agent[13068]: can't connect to the PIN entry module: IPC connect call failed
gpg-agent[13068]: command get_passphrase failed: No pinentry
gpg: problem with the agent: No pinentry

I found these two articles and noticed that my gpg had been upgraded from the 1.x to 2.x series. The 1.x gpg had an integrated password entry prompt but 2.x requires an external package. This can be fixed by installing the security/pinentry port. I’m not sure why it wasn’t marked as a dependency for gpg2.

Anti-debugging techniques of the past

During the C64 and Apple II years, a number of interesting protection schemes were developed that still have parallels in today’s systems. The C64/1541 disk drive schemes relied on cleverly exploiting hardware behavior since there was no security processor onboard to use as a root of trust. Nowadays, games or DRM systems for the PC still have the same limitation, while modern video game systems rely more on custom hardware capabilities.

Most targeted anti-debugger techniques rely on exploiting shared resources. For example, a single interrupt vector cannot be used by both the application and the debugger at the same time. Reusing that resource as part of the protection scheme and for normal application operations forces the attacker to modify some other shared resource (perhaps by hooking the function prologue) instead.

One interesting anti-debugger technique was to load the computer’s protection code into screen memory. This range of RAM, as with modern integrated video chipsets, is regular system memory that is also mapped to the display chip. Writing values to this region changes the display. Reading it returns the pixel data that is onscreen. Since it’s just RAM, the CPU can execute out of it as well but the user would see garbage on the screen. The protection code would change the foreground and background colors to be the same, making the data invisible while the code executed.

When an attacker broke into program execution with a machine language monitor (i.e., debugger), the command prompt displayed by the monitor would overwrite the protection code. If it was later resumed, the program would just crash because the critical protection code was no longer present in RAM. The video memory was a shared resource used by both the protection and the debugger.

Another technique was to load code into an area of RAM that was cleared during system reset. If an attacker reset the machine without powering it off, the data would ordinarily still be present in RAM. However, if it was within a region that was zeroed by the reset code in ROM, nothing would remain for the attacker to examine.

As protection schemes became stronger in the late 1980s, users resorted to hardware-based attacks when software-only copying was no longer possible. Many protection schemes took advantage of the limited RAM in the 1541 drive (2 KB) by using custom bit encoding on the media and booting a custom loader/protection routine into the drive RAM to read it. The loader would lock out all access by the C64 to drive memory so it could not easily be dumped and analyzed.

The copy system authors responded in a number of ways. One of them was to provide a RAM expansion board (8 KB) that allowed the custom bit encoding to be read all in one chunk, circumventing the boundary problems that occurred when copying in 2 KB chunks and trying to stitch them back together. It also allowed the protection code in the drive to be copied up to higher memory, saving it from being zeroed when the drive was reset. This way the drive would once again allow memory access by the C64 and the loader could be dumped and analyzed.

Protection authors responded by crashing the drive when expanded RAM was present. With most hardware memory access, there’s a concept known as “mirroring.” If the address space is bigger than the physical RAM present, accesses to higher addresses wrap around within the actual memory. For example, accesses to addresses 0, 2 KB, and 4 KB would all map to the same RAM address (0) on a 1541 with stock memory. But on a drive with 8 KB of expanded RAM, they would map to three different locations.

[Figure: 1541ram1.png]
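
A toy Python model of that address decoding (not actual drive firmware) shows the effect:

def map_address(addr: int, ram_size: int) -> int:
    # Toy model: addresses wrap around (mirror) within the installed RAM.
    return addr % ram_size

for addr in (0x0000, 0x0800, 0x1000):            # 0, 2 KB, 4 KB
    print(hex(addr),
          "stock 2 KB ->", hex(map_address(addr, 0x0800)),
          "expanded 8 KB ->", hex(map_address(addr, 0x2000)))
# Stock drive: all three hit RAM address 0. Expanded drive: three different cells.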

One technique was to scramble the latter part of the loader. The first part would descramble the remainder but use addresses above 2 KB to do its work. On a stock 1541, the memory accesses would wrap around, descrambling the proper locations in the loader. On a modified drive, they would just write garbage to upper memory and the loader would crash once it got to the still-scrambled code in the lower 2 KB.

[Figure: 1541ram2.png]
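
To see why the loader fails on an expanded drive, here is a toy simulation of such a descrambler, reusing map_address() from the sketch above. The XOR scrambling, payload, and addresses are invented for illustration.

def run_descrambler(ram_size: int) -> bytes:
    # The loader lives in the lower 2 KB; the descrambler writes through
    # addresses above 2 KB (0x0C00 = 0x0400 + 2 KB). On a stock drive those
    # writes mirror back onto 0x0400; on an 8 KB drive they do not.
    ram = bytearray(ram_size)
    scrambled = bytes(b ^ 0x5A for b in b"LOADER CODE")   # made-up scrambled payload
    ram[0x0400:0x0400 + len(scrambled)] = scrambled

    for i in range(len(scrambled)):
        target = map_address(0x0C00 + i, ram_size)
        ram[target] = ram[map_address(0x0400 + i, ram_size)] ^ 0x5A

    return bytes(ram[0x0400:0x0400 + len(scrambled)])

print(run_descrambler(0x0800))   # stock 2 KB: b'LOADER CODE', descrambled in place
print(run_descrambler(0x2000))   # expanded 8 KB: lower 2 KB is still scrambled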

These schemes were later circumvented by adding a switch to the RAM expansion so it could be disabled when not in use, but this did add to the annoyance factor of regularly using a modified drive.