Mixed voltage interfacing for design or hacking

Modern digital systems involve a wide array of voltages. Instead of just the classic 5V TTL, they now use components and buses ranging from 3.3V down to 1.0V. Interfacing with these systems is tricky, especially when you have multiple power sources, capacitive loads, and inrush current from devices being powered on. It's important to understand the internal design of all drivers, both on your board and in the target, so that your final product is cheap but reliable.

When doing a security audit of an embedded device, interfacing is one of the main tasks. You have to figure out what signals you want to tap or manipulate. You pick a speed range and signal characteristics. For example, you typically want to sample at 4x the target data rate to ensure a clean signal, or, barring that, lock onto a clock to synchronize at 1x. For active attacks such as glitching, you often have to do some analog design for the pulse generator and trigger it from your digital logic. Most importantly, you have to make sure your monitoring board does not disrupt the target system. The Xbox tap board (pdf) built by bunnie is a great case study for this, as is his recent NeTV device.

Building a mass-market board is different from a quick hack for a one-off review, and cost becomes a big concern. Sure, you can use an external level shifter or transceiver chip. However, these come with a lot of trade-offs. They add latency to switching times. They often require dual power supplies, which may not be available in your particular application. They usually require extra control pins (e.g., to select signal direction). They force you to switch a group of pins all at once, not individually. Most importantly, they add cost and take up board space. With a little more thought, you can often come up with a simpler solution.

There was a great article in Circuit Cellar on mixed-voltage digital interfacing. It goes into a lot of the different design approaches, from current-limiting resistors all the way up to BJTs and MOSFETs. The first step is to understand the internals of I/O pins on most digital devices. Below is a diagram I reproduced from the article.

CMOS I/O pin, showing internal ESD diodes

Whether you are using a microcontroller or logic chip, usually the I/O pins have a similar design. The input to the gate is protected by diodes that will conduct current to ground if the input voltage goes more negative than the GND pin or to Vcc if the input is more positive than Vcc. These diodes are normally used to protect against static discharge, but can also be a useful part of your design if you are careful.

The current-limiting resistor value can be calculated by figuring out the maximum and minimum voltages your input will see and the maximum current you want flowing into your device. You also have to take into account the diode's voltage drop. Here's an example calculation:

Max input level: 6V
Vcc: 5V
Diode drop: 0.7V (typical)
Max current: 100 microamps

R = V / I = (6 – 5 – 0.7) / 0.0001 = 3K ohms
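
To make that arithmetic reusable, here's a minimal sketch of the same calculation in Python. The function name and defaults are mine; plug in whatever maximum injection current and diode drop your part's datasheet actually specifies.

```python
def clamp_resistor(v_in_max, vcc, v_diode=0.7, i_max=100e-6):
    """Minimum series resistance to keep ESD clamp diode current under i_max.

    Once the input exceeds Vcc plus the diode drop, the upper clamp diode
    conducts; whatever voltage is left over appears across the resistor,
    so R = (Vin_max - Vcc - Vdrop) / I_max.
    """
    v_across_r = v_in_max - vcc - v_diode
    if v_across_r <= 0:
        return 0.0  # input never forward-biases the clamp diode
    return v_across_r / i_max

# The example above: 6V input, 5V part, 0.7V drop, 100 uA limit -> ~3K ohms
print(clamp_resistor(6.0, 5.0))
```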

While microcontrollers and 74xx logic devices are often tolerant of some moderate current, FPGAs are extremely sensitive. If you’re using an FPGA, use a level shifter chip or you’ll be sorry. Also, most devices are more sensitive to negative voltage than positive. Even if you’re within the specs for max current, a negative voltage may quickly lead to latch-up.

Level shifters usually take in dual voltages and safely isolate both sides. Some are one-way and can be used as line drivers or line receivers. Others are bidirectional transceivers, with the direction selected by an extra pin. If you can't afford the extra pins, you can often combine an open-collector transmitter with a line receiver. However, if both sides happen to drive the line at the same time, you can fry your chips.

To summarize, mixed voltage interfacing is a skill you’ll need when building your own devices or hacking existing ones. You can get by with just a current-limiting resistor in many cases, but be sure you understand the I/O pin design to avoid costly failures, especially in the field over time. For more assurance or with sensitive parts like FPGAs, you’ll have to use a level shifter.

The lost Van Jacobson paper that could save the Internet

One of my heroes has always been Van Jacobson. His 1988 paper on solving TCP congestion is an enjoyable read, with cross-discipline appeal. The history of all this is fascinating, such as congestion control's roots in hydrodynamics theory. (If you want to kill an afternoon, you can read my collection of the history of internetworking in the '80s and '90s. I especially like the notes on tuning Sun's IP stack with hand-coded assembly.)

Since the old days, the IETF has taken over and our congestion problems are more or less solved, right? Well, not exactly. There’s a new congestion storm brewing with our endpoints that is largely the impetus for the network neutrality dispute.

Back in 2008, I wrote some articles about how Random Early Detection (RED) would be more effective than deep packet inspection at solving the congestion apparently caused by BitTorrent. At the time, some ISPs were terminating BitTorrent uploads, supposedly in order to manage their bandwidth. I thought network admins ignored RED because they were control freaks, since deep packet inspection gives you a lot of control over user behavior. But a lost Van Jacobson paper with a diagram of a toilet might be the key to the new congestion problem.

Jim Gettys of Bell Labs has been blogging for about a year on a phenomenon known as "bufferbloat". This refers to the long queues created by the large buffers of routers, firewalls, cable modems, and other intermediate gateways. Because Moore's Law has made RAM cheap and queue management is often lacking, packets are queued for a long time during congestion instead of being dropped quickly. This misleads TCP congestion control and leads to even more congestion.

Back when RAM was expensive and networks were slow, packets were dropped immediately when congestion was encountered. This created a responsive control system. The transmitter could be sure a packet had been dropped if it didn’t get an ACK within a couple standard deviations of the average round-trip time.

Think of such a network as a stiff spring. As the transmitter “pushed” on one end of the spring, the response force was quickly “felt”, and the sender could back off when the network bandwidth was fully allocated.

Now, increase the bandwidth and intermediate router buffer sizes but maintain the same control system. More bandwidth means that it is normal to have many packets in flight (increased window size). Larger buffers mean more of those packets can be delayed without being dropped. If they are dropped, it happens long after the first congestion actually occurred and the buffer started filling up. Multiply this effect by each hop in the route to the destination.

This gives a control system more like a set of loose springs with gaps in the middle. The transmitter increases the window size until congestion is encountered, probing the available bandwidth. Instead of the first excess packet being dropped, it gets queued somewhere. This happens to many more packets, until the intermediate buffer is full. Finally, a packet gets dropped, but it's too late: by then the sender has an entire bandwidth-delay product plus the contents of one or more intermediate buffers in flight.

Network equipment manufacturers make this worse through a cycle of escalation. When a fast network meets a slower one, there has to be congestion. For example, a wireless router typically offers 50-100 Mbps speeds but is connected to a 5-10 Mbps Internet connection. If the manufacturer provides larger buffers, bursty traffic can be absorbed without packet loss, at least for a little while. But all packets experience a higher latency during this period of congestion, and the delay between transmission and drop grows, making the sender oscillate between over- and under-utilization.
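
To see how quickly a big buffer turns into latency, here's a rough back-of-the-envelope calculation. The 256 KB buffer and 5 Mbps uplink below are illustrative assumptions, not figures from any particular product.

```python
# Worst-case queuing delay added by one full buffer at the bottleneck link.
# Assumed values, for illustration only.
buffer_bytes = 256 * 1024      # 256 KB of packet buffer in the gateway
uplink_bps = 5_000_000         # 5 Mbps Internet connection

delay_s = (buffer_bytes * 8) / uplink_bps
print(f"{delay_s * 1000:.0f} ms of added delay when the buffer is full")
# ~419 ms -- far longer than the RTT the sender's loss detection expects
```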

The congestion problem was solved long ago by RED. When a router starts to experience congestion, it immediately applies an algorithm to fairly drop packets from the queue, weighted by each sender's portion of the bandwidth used. For example, with a simple random algorithm, a sender transmitting 50% of the total bandwidth is twice as likely to have a packet dropped as someone using 25%.

Besides dropping packets, the router can also set an explicit congestion notification (ECN) bit on a packet. This communicates a warning to the sender that future packets will be dropped if it keeps increasing the window size. This is better than just dropping the packet since it avoids discarding useful data that the packet is carrying.
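
Here's a minimal sketch of the simplified policy described in the last two paragraphs: a random drop (or ECN mark) weighted by each sender's share of the bandwidth. Real RED works on the average queue length with tuned thresholds; the names and base probability below are my own placeholders.

```python
import random
from collections import namedtuple

# Illustrative packet type; real routers track much more state than this.
Packet = namedtuple("Packet", ["sender", "ecn_capable"])

BASE_PROBABILITY = 0.1  # assumed overall drop/mark rate once congestion starts

def on_enqueue(packet, bandwidth_share, congested):
    """Decide what to do with an arriving packet.

    bandwidth_share is the sender's fraction of recent traffic, so a
    sender using 50% of the bandwidth is twice as likely to be hit as
    one using 25%.
    """
    if not congested:
        return "forward"
    if random.random() < BASE_PROBABILITY * bandwidth_share:
        # Prefer warning the sender (ECN) over discarding useful data.
        return "mark_ecn" if packet.ecn_capable else "drop"
    return "forward"

print(on_enqueue(Packet("heavy-user", ecn_capable=True), 0.50, congested=True))
```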

It turns out that RED is not enabled on many Internet routers. Jim wrote a fascinating post explaining why. In short, ISPs avoided deploying RED because of some bugs in the original paper and the need to manually tune its parameters. ISPs don't want to do that, and haven't. But years ago, Van Jacobson had begun writing a paper on how to fix RED.

The lost paper was never published. One roadblock was that the diagram of a toilet offended a reviewer. Also, Van changed jobs and never got around to properly finishing it. He lost the draft and the FrameMaker software for editing it. But recently, the original draft was found and converted into a usable format.

Much remains to be done. This is truly a hard problem. Jim Gettys and others have been building tools to analyze bufferbloat and writing new articles. They’re trying to raise visibility of this issue and come up with a new variant of RED that can be widely deployed. If you’re interested in helping, download the tools or check out Netalyzr.

There’s no single correct solution to eliminating bufferbloat, but I’m hoping a self-tuning algorithm based on RED can be widely deployed in the coming years.

Recovering a private key with only a fraction of the bits

Ever since my first post on breaking DSA, I’ve been meaning to write a clear description of how to recover a private key if you only have a fraction of the bits. For example, power analysis attacks may allow you to derive a few bits of the random k value on each measurement. It turns out you can combine multiple measurements to get a single k value and then recover the DSA private key. Of course, all this also applies to ECDSA.
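
To see why recovering even a single k is fatal, here's the final step as a short Python sketch. The toy parameters in the example are made up purely to show the arithmetic; the lattice work needed to get k from partial bits is what the references below cover.

```python
def recover_dsa_private_key(k, r, s, h, q):
    """Recover the (EC)DSA private key x once the nonce k of one signature is known.

    The signing equation is s = k^-1 * (h + x*r) mod q, so
    x = (s*k - h) * r^-1 mod q, where h is the (truncated) message hash.
    Uses Python 3.8+ pow() for the modular inverse.
    """
    return ((s * k - h) * pow(r, -1, q)) % q

# Toy example with tiny, insecure parameters (q=11, x=7, k=3, r=7, s=1, h=9):
print(recover_dsa_private_key(k=3, r=7, s=1, h=9, q=11))  # prints 7
```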

Since I haven’t had time to put together a good summary article, here are some references for learning this on your own. The first paper in this area was Boneh and Venkatesan (1996). They described the basic Hidden Number Problem.

The next important paper was by Howgrave-Graham and Smart (1999) [1]. They used Babai's algorithm [2] and LLL lattice reduction to solve for DSA private nonces. This was improved by Nguyen and Shparlinski (2000) [3] to solve for just the k values.

This attack applies any time the DSA nonce isn't both fully random and used only once. It applies if a few of the bits are constant, if the RNG is biased towards certain values, or if you can recover part of a nonce through side-channel attacks. These references should allow you to implement the attack yourself. It has been used repeatedly in private work, but I haven't seen much public discussion of applying it to real-world systems.
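
For reference, the reduction used in [1] and [3] looks roughly like this (my notation, starting from the DSA signing equation):

```latex
% From s_i \equiv k_i^{-1}\,(H(m_i) + x\,r_i) \pmod{q}, solve for the nonce:
\begin{align*}
  k_i &\equiv s_i^{-1} H(m_i) + s_i^{-1} r_i\, x \pmod{q} \\
      &\equiv u_i + t_i\, x \pmod{q},
      \qquad t_i = s_i^{-1} r_i \bmod q, \quad u_i = s_i^{-1} H(m_i) \bmod q.
\end{align*}
% If some bits of each k_i are known, then t_i x + u_i \bmod q is confined
% to a known narrow interval for every i. That is a Hidden Number Problem
% instance, and x is recovered with LLL reduction plus Babai's
% nearest-plane algorithm [2].
```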

[1] Howgrave-Graham and Smart. "Lattice attacks on Digital Signature Schemes." HP internal publication, 1999.
[2] László Babai. "On Lovász' lattice reduction and the nearest lattice point problem." Combinatorica 6 (1986), pp. 1-13. No online reference found.
[3] Nguyen and Shparlinski. “The Insecurity of the Elliptic Curve Digital Signature Algorithm with Partially Known Nonces.” Journal of Cryptology, volume 15, pp. 151-176, 2000.

Addendum: I found the following references to improve this list.

The Magic Inside Bunnie’s New NeTV

A year ago, what was probably the most important Pastebin posting ever was released by an anonymous hacker. The HDCP master key gave anyone the ability to derive the keys protecting the link between DVD players and TVs. There was no possibility of revocation. The only remaining question was, "who would be the first to deploy this key in an HDCP stripper?"

Last week, the HDCP master key was silently deployed, but surprisingly, not in a stripper or other circumvention device. Instead, it’s enabling a useful new system called the Chumby NeTV. It was created by Bunnie Huang, who is known for inventing the Chumby and hacking the Xbox. He’s driving down the cost of TV-connected hardware with a very innovative approach.

The NeTV displays Internet apps on your TV. You can see Twitter feeds, view photos, and browse the web via an on-screen display. It overlays this information on your video source. You can control it from your iPhone or Android phone. It’s simple to install since you merely plug it inline with your cable box or DVD player’s HDMI connection to the TV. And in true Bunnie fashion, the hardware and software is all open source.

When I first heard of this last week, I didn’t think much of it. It’s a neat concept, but I don’t have an HDTV. Then, a friend contacted me.

“Have you figured out how the NeTV works? There’s a lot of speculation, but I think I’ve figured it out,” he said. I told him I hadn’t thought much about it, then downloaded the source code to the FPGA to take a look.

I was surprised to find an entire HDCP implementation, but it didn’t quite make sense. There was no decryption block or device keys. I emailed Bunnie, asking how it could do alpha blending without decrypting the video. He wrote back from a plane in Tokyo with a cryptic message, “No decryption involved, just chroma key.”

This was the hint I needed. I went back and watched the demo video. The overlay was not transparent as I had first thought. It was opaque. To do alpha blending, you have to have plaintext video in order to mask off the appropriate bits and combine them. But to apply an opaque overlay, you could just overwrite the appropriate video locations with your substituted data. It would require careful timing, but no decryption.

Chroma key (aka “blue/green screen”) uses color for in-band signaling. Typically, an actor performs in front of a green screen. A computer (or a filter, in the old days) substitutes data from another feed wherever there is green. This is the foundation of most special effects in movies. Most importantly, it is simple and can be performed quickly with a minimum of logic.

The NeTV generates its output signal by combining the input video source and the generated overlay with this same technique. The overlay is mostly filled with pixels of an unusual color (Bunnie called it "magic pink"). The FPGA monitors the input signal position (vertical/horizontal sync, which aren't encrypted) to know where it is within each frame of video. When it is within the pink region of the overlay, it just passes through the encrypted input video. Otherwise, it displays the overlay. The HDCP implementation is needed to encrypt the overlay; otherwise, that part of the screen would be scrambled when the TV tries to decrypt it. But, indeed, there is no decryption of the input content.
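
Here's the per-pixel selection logic as a rough Python sketch. The real design is HDL on the FPGA, and the key color and cipher interface below are my placeholders, not names from the NeTV source.

```python
MAGIC_PINK = (0xFF, 0x00, 0xFF)  # placeholder for whatever key color the overlay uses

def output_pixel(encrypted_input_pixel, overlay_pixel, hdcp_cipher):
    """Choose the output for one pixel position in the frame.

    Where the overlay is the key color, the still-encrypted input video is
    passed through untouched. Everywhere else, the overlay pixel is
    encrypted with the same HDCP keystream the source and TV negotiated,
    so the TV decrypts it into the on-screen display. Nothing is decrypted
    by the device itself.
    """
    if overlay_pixel == MAGIC_PINK:
        return encrypted_input_pixel            # pass-through region
    return hdcp_cipher.encrypt(overlay_pixel)   # overlay, encrypted to match the link
```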

This is impressive work, on par with the demoscene. The NeTV synchronizes with every frame of video, no jitter, choosing which pixel stream to output (and possibly encrypt) on-the-fly. But there’s more.

To generate the keystream, the NeTV has to synchronize with the HDCP key exchange between video source and TV. It replicates each step of the process so that it derives the correct stream key. To keep any timing issues with the main CPU from delaying the key exchange, it resets the link after deriving the shared key to be sure everything is aligned again. Since the transport key only depends on the two endpoint device keys, the same shared key is always used.

This is extremely impressive from a technical standpoint, but it’s also interesting from a content protection standpoint. The NeTV has no device keys of its own; it derives the ones in use by your video source and TV as needed. It never decrypts video, only encrypts its on-screen display to match. It can’t easily be turned into an HDCP stripper since that would require a lot of rework of the internals. (The Revue, with its HDMI transceiver chip and Atom processor could probably be turned into an HDCP stripper with a similar level of effort.)

Bunnie has done it again with a cheap device that applies his extensive creativity to not just solve a problem, but do it in style. Whatever the outcome of his maverick engineering is in the marketplace, the internals are a thing of beauty.

Intermediate cryptography resources

People often ask me for a good introduction to intermediate cryptography. It’s often easy to find basic and dangerous introductions (“public key encryption is like a mailbox”), but the next level isn’t as available.

There’s no single source for this, but you can find good coverage of the main practical topics online. Here are some resources to get you started learning beyond cryptography basics.

Cryptography: an Introduction (Nigel Smart)

I can't say enough good things about this book. It is a great way to learn about attacks on public key schemes (see part 4) and has good general coverage as well, including elliptic curve cryptography.

Lecture Notes on Cryptography (Bellare and Goldwasser)

Good for understanding how to model block cipher constructions with PRFs and PRPs. When someone says “that construction is not IND-CPA-secure”, this will tell you what that means. Try chapters 5, 6, and 9. Also, see the class notes page for slides and individual chapters of this series.

Tom’s math and crypto libraries (Tom St. Denis)

It’s impossible to understand practical cryptography without looking at implementations. Tom’s libraries are relatively clear and readable and cover the gamut from low-level integer manipulation all the way up to protocols. There are no external dependencies and they are public domain. For extra credit, implement one of the ciphers yourself before looking at his code, then compare to see how you did.

He also includes a large PDF documenting the library, and it’s available as a book as well.

NIST FIPS, SP and RSA PKCS standards

The NIST standards are pretty clear. The RSA ones are a bit more difficult to read. In any case, it’s very helpful to read through these and ask “why?” for each requirement they make. There’s always a reason for every “shall” or “must”. But are there some “shoulds” that should be “shalls”?

Once you’ve moved beyond these resources, the best next level is to read survey papers (like Boneh’s coverage of RSA) in the specific area you’re interested in. If you have your own favorite resources for intermediate cryptography, let me know in the comments below.

Shatner on the future of microchips

AT&T recently published a lot of videos from their archives. I particularly like this video with William Shatner discussing the magic of the microchip. In hindsight, its view of the future is revealing.

First, it’s still a good introduction to how chips are produced. It shows an automated test machine, wire-bonding, and hand-soldering boards. It also shows microscopic views of a 5 micron circuit and a nice animation of clock pulses and gates. All of these processes are basically the same today, just more sophisticated.

On the flip side, its prediction about the computerization of telephones revolutionizing society has come and gone. Wired telephones connected to smart central computers have been surpassed by smart cellphones.

This video also reminded me that there once were more women in technology. Computer science was originally considered a branch of math, and women have often excelled at mathematics. When this video was made in 1980, women made up about 41% of computer science freshmen. That had dropped to about 12.5% by 2007. If you look at the graph from that article, it seems that women participated almost equally in the early '80s computer craze but sat out the Dotcom boom.

I'm still not sure about all the causes of this, but it is a troubling trend. Any time half of your talent pool sits on the bench, you're going to be at a disadvantage to other countries. Some studies have shown that Chinese women have a more positive view of computers than women in other cultures. Additionally, overall enrollment in computer science, as a percentage, is lower than at any time since the 1970s.

What can be done to increase interest in computer science among all students and especially women?