Why RSA encryption padding is critical

Public key cryptosystems like RSA are often taught by describing how they work at a high level. Even an algorithm textbook written by one of the inventors of RSA focuses more on exponentiation and how the decryption operation inverts ciphertext than on the underlying security details. However, all public key algorithms have various requirements that are not part of the basic primitive but are vital for security.

RSA requires the plaintext to be armored during encryption/signing and the result to be verified during decryption/verification. Unfortunately, this armoring is commonly called “padding”, which means some implementers think it functions like ordinary protocol padding. The interoperability principle (“be strict in what you send and lenient in what you accept”) is exactly opposite how public key crypto must be implemented. Padding cannot be ignored and if even one bit is out of place, the message is invalid. Failure to implement all the steps correctly could allow attackers to forge signatures, decrypt ciphertext, or even recover the private key.

I’ve written before about how failure to verify a signature result properly could result in forged signatures or recovery of the private key. This time, I’d like to explain an attack that shows why encryption armoring works differently than signature armoring.

Most RSA implementations use an optimization called the Chinese Remainder Theorem or CRT. With CRT, the implementation performs the exponentiation me mod n as two separate operations, me mod p and me mod q. The two parts are combined at the end. Since p and q are about half the size of n (which is p * q), this speeds up the exponentiation a lot because the multi-word numbers are smaller.

CRT can also be used to attack RSA. Consider an implementation that does not use any armoring. The encryption operation is simply the RSA primitive itself. A sender wants to send a message to three separate recipients. Thus, the sender calculates:

me mod A
me mod B
me mod C

The attacker can’t calculate the inverse of one of these encryptions directly because the eth root problem in each ring is difficult. However, because the message is the same for each recipient (but different ciphertexts), she can convert these operations into a group where the inverse operation is easy. To do this, she uses CRT to combine the three ciphertexts to get:

me mod A*B*C

Since m is smaller than each of A, B, and C, me is smaller than A*B*C if e=3. This means the attacker just has to calculate the cube root of the result, an operation that is easy in the monoid of integers modulo A*B*C. This is essentially an integer cube root, ordinary arithmetic. This shows why PKCS #1 armoring for encryption has always been randomized. It can be fatal to encrypt the same message multiple times, even to the same recipient. For signatures, it is more secure to randomize the padding as in RSASSA-PSS, but it is not yet fatal for legacy systems to continue to use PKCS #1 v1.5 signature padding, which is not randomized.

In public key crypto, padding is not an optional feature. It is a critical part of the cryptosystem security. The latest version of PKCS #1 (v2.1 as of this writing) should be used for both encryption/signing and decryption/verification. For new implementation, use the approaches that first appeared in v2.0 (RSAES-OAEP for encryption and RSASSA-PSS for signing). Failure to properly manage RSA armoring could allow attackers to forge signatures, decrypt ciphertext, or even recover your private key.

A traveler’s plea to credit card issuers

Credit cards are all about convenience. Part of the reason for the move to contactless cards is decreasing the “transaction friction”. Studies have shown that people spend money more casually the easier it is to approve the transaction. So why do I feel like a second-class citizen when using my credit card in Europe?

As this fascinating documentary shows, credit cards started as an elite accessory for only the richest people. This is why the banks were able to charge such high interest rates — they restricted their clientele to those who could afford it. As various states (especially Delaware and North Dakota) relaxed the rules, banks moved their card operations there and began offering credit cards to more people. Today, credit cards are a common part of everyday life.

When traveling in Europe, a credit card is very useful. You get an automatic currency exchange with no need to carry around unfamiliar coins or make repeated trips to the bank if you underestimate how much you’ll spend. But if you carry a US credit card, you are shunned.

At nearly every restaurant, I’ve had the pleasure of instructing the waiter how to swipe the magstripe card. Most of them are unfamiliar with the proper orientation of the card or the correct speed. Ending every meal with a delay and apology is no fun.

Want to rent a bicycle from the Velib automatic dispensers all over France? Sorry, you can’t.

Want to take a local train in Geneva but don’t have coins? Sorry, your card won’t work either. (This caused me to miss a train with a connection that only happens every 90 minutes.)

Want to ride the TGV high-speed rail system and didn’t buy a ticket in advance? Sorry, you have to wait in the long line for a live agent. Your card won’t work in the kiosks.

The reason for all this is that European smart cards contain a chip that supports the EMV payment standard. While the US system is stuck in the 1960’s with magstripe and online verification, smart cards provide quick and cryptographically secure offline transactions. To be fair, changing out all the US terminals to support EMV would be an expensive undertaking. Also, there are estimates that smart cards cost the banks around $1.25 each while a mag card is about $0.25. I’ve heard a rumor that most of the cost of a mag card is to license the hologram. Here are two articles that describe why the switch to smart cards is taking so long.

The sad thing is, I’ve worked with smart cards for ten years. My previous company, Cryptography Research, licenses side-channel countermeasures to all the major smart card manufacturers. Experiencing these inconveniences while exhibiting at the biggest smart card trade show is probably the height of irony.

What if the credit card companies offered US citizens an upgrade option to the “International Traveler” card? I’d be happy to pay a one-time fee of $20 for a smart card option. Even though it would currently be useless in the US, at least it would save me some hassle overseas and make my card less vulnerable to skimming attacks in some countries. At a time of declining fees and increased regulation, any credit card company want the additional revenue?

Next Baysec: Sept 23 at Kate O’Briens

The next Baysec meeting is September 23 at Kate O’Briens. Come out and meet fellow security people from all over the Bay Area. As always, this is not a sponsored meeting, there is no agenda or speakers, and no RSVP is needed.

See you Wednesday, September 23, 7-11 pm. We’ll be towards the back.

Kate O’Briens
579 Howard St. @ 2nd, San Francisco
(415) 882-7240

Repeating a check sometimes makes sense

When reviewing source code, I often see a developer making claims like “this can’t happen here” or “to reach this point, X condition must have occurred”. These kinds of statements make a lot of assumptions about the execution environment. In some cases, the assumptions may be justified but often they are not, especially in the face of a motivated attacker.

One recent gain for disk reliability has been ZFS. This filesystem has a mode that generates SHA-1 hashes for all the data it stores on disk. This is an excellent way to make sure a read error hasn’t been silently corrected by the disk to the wrong value or to catch firmware that lies about the status of the physical media. SHA-1 support in filesystems is still a new feature due to the performance loss and the lingering doubt whether it’s needed since it ostensibly duplicates the builtin ECC in the drive.

Mirroring (RAID1) is great because if a disk error occurred, you can often fetch a copy from the other disk. However, it does not address the problem of being certain that an error had occurred and that the data from the other drive was still valid. ZFS gives this assurance. Mirroring could actually be worse than no RAID if you were silently switched to a corrupted disk due to a transient timeout of the other drive. This could happen if its driver had bugs that caused it to timeout on read requests under heavy load, even if the data was perfectly fine. This is not a hypothetical example. It actually happened to me with an old ATA drive.

Still, engineers resist adding any kind of redundant checks, especially of computation results. For example, you rarerly see code adding the result of a subtraction to compare to the original input value. Yet in a critical application, like calculating Bill Gates’ bank account statement, perhaps such paranoia would be warranted. In security-critical code, the developer needs even more paranoia since the threat is active manipulation of the computing environment, not accidental memory flips due to cosmic radiation.

Software protection is one area where redundancy is crucial. A common way to bypass various checks in a protection scheme is to patch them out. For example, the conditional branch that checks for a copied disc to determine if the code should continue to load a game could be inverted or bypassed. By repeating this check in various places, implemented in different ways, it can become more difficult for an attacker to determine if they have removed them all.

As another example, many protection schemes aren’t very introspective. If there is a function that decrypts some data, often you can just grab a common set of arguments to that function and repeatedly call it from your own code, mutating the arguments each time. But if that function had a redundant check that walked the stack to make sure its caller was the one it expected, this could be harder to do. Now, if the whole program was sprinkled with various checks that validated the stack in many different ways, it becomes even harder to hack out any one piece to embed in the attacker’s code. To an ordinary developer, such redundant checks seem wasteful and useless.

Besides software protection, crypto is another area that needs this kind of redundant checking. I previously wrote about how you can recover an RSA private key merely by disrupting any part of the signing operation (e.g., by injecting a fault). The solution to this is for the signer to verify the signature they just made before revealing the signature to an attacker. That is, a sign operation now performs both a sign and verify. If you weren’t concerned about this attack, this redundant operation would seem useless. But without it, an attacker can recover a private key after observing only a single faulty signature.

We need developers to be more paranoid about the validity of stored or computed values, especially in an environment where there could be an adversary. Both software protection and crypto are areas where a very powerful attacker is always assumed to be present. However, other areas of software development could also benefit from this attitude, increasing security and reliability by “wasting” a couple cycles.

Awesome C64 visual debugger

I recently ran across a new visual debugger for C64 emulators called ICU64, written by Mathfigure. It will be released later this year and provides some amazing visualizations of both memory access and memory-mapped devices like the video and sound chip. See the video below for how it works.

There is also another video showing how the classic game Boulder Dash can be expanded to span multiple screens. He implemented this by grabbing graphics from the same RAM areas the game uses and displaying them on a custom screen. On this forum thread, the author discusses his goal of allowing more graphical expandability for classic games.

Learning to program on a small machine with a full view of every memory location is something I’ve always seen as a great way to start. I got a lot of insight into how my computer worked by running a background task that continually copied pages of RAM to screen, so you could “scroll through” memory. For example, the tick counter showed up as a blur.

I’ve always said that my kids would have to learn to program a C64 before they got a more modern system. Wouldn’t this be a great tool for teaching anyone new to low-level computing how assembly language and memory-mapped devices work? What about adapting some of these techniques to fuzzing or modern debuggers?

[Edit: added a link to the project now that it has been released]