Encrypted Google Docs done well

There’s a nice new paper out called “Private Editing Using Untrusted Cloud Services” by Yan Huang and David Evans. They also provide a Firefox extension that implements their scheme. I like their approach for a few reasons.

First, their core contribution is an efficient implementation of incremental encryption. Incremental encryption is an often-overlooked technique for performing insert, delete, and replace operations directly on ciphertext, without decrypting and re-encrypting the entire document. It’s a useful branch of applied cryptography, and one that should be used more.

However, a naive implementation of incremental encryption would encrypt each character separately, slowing client/server communication considerably. To get around this, the authors organize deltas in an Indexed Skip List. This groups characters into variable-sized blocks while still allowing fast lookup by character offset and quick updates.
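
To make that concrete, here is a minimal C sketch of block-wise incremental insertion. This is not the authors’ scheme: the plain linked list stands in for their Indexed Skip List (which finds a block by character offset in logarithmic rather than linear time), and the XOR “cipher” is a placeholder for a real authenticated cipher with a fresh nonce per block. The point is only that an edit decrypts and re-encrypts one block, not the whole document.

#include <stdlib.h>
#include <string.h>

struct block {
    struct block *next;
    size_t len;            /* plaintext length of this block */
    unsigned char *ct;     /* ciphertext of this block */
};

/* Placeholder "cipher" (XOR with a constant) so the sketch runs.
   A real implementation would use an authenticated cipher with a
   per-block nonce; this is NOT secure. */
static void toy_crypt(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] ^ 0x5a;
}

/* Insert n plaintext bytes at absolute character offset off. Only the
   block containing the offset is decrypted, spliced, and re-encrypted. */
int insert_chars(struct block *head, size_t off,
                 const unsigned char *ins, size_t n)
{
    struct block *b = head;
    while (b != NULL && off > b->len) {  /* a skip list would jump here */
        off -= b->len;
        b = b->next;
    }
    if (b == NULL)
        return -1;                       /* offset is past end of document */

    unsigned char *pt = malloc(b->len + n);
    unsigned char *ct = malloc(b->len + n);
    if (pt == NULL || ct == NULL) {
        free(pt);
        free(ct);
        return -1;
    }

    toy_crypt(pt, b->ct, b->len);                  /* decrypt one block */
    memmove(pt + off + n, pt + off, b->len - off); /* make room for edit */
    memcpy(pt + off, ins, n);                      /* splice in new text */
    toy_crypt(ct, pt, b->len + n);                 /* re-encrypt the block */

    free(b->ct);
    free(pt);
    b->ct = ct;
    b->len += n;
    return 0;
}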

I am also happy that they deployed their code as a browser extension instead of client-side JavaScript. As I have mentioned before, client-side JS crypto is a bad idea: there are fundamental integrity and trust problems that can’t be solved in that environment. A browser extension avoids those problems, and while it still offers little control over low-level details like key zeroization and leaves open the potential for side-channel attacks, JavaScript crypto there is acceptable as long as it is properly reviewed. This is one use of the Stanford JS crypto library that is defensible.

For those of you implementing “secure” note-taking web services, this is the right way to do it.

Baysec update and announcement change

The next Baysec is April 26, 7-11 pm at Irish Bank. Next month will be the fourth anniversary of Baysec!

I won’t be announcing these events on this blog any more because I’d like to reserve it for articles instead. The Baysec announcements are ephemeral and of no value to people outside the Bay Area.

I will still be posting Baysec announcements on the @rootlabs Twitter account. And if you want to participate in discussing Baysec events, please join the mailing list at baysec.net. It is very low traffic — less than 10 messages per month.

More certs may indicate less security

In my last post, I mentioned how warning users when a previously-seen cert changes may generate false positives for some sites. If a website has multiple servers with different certs, the browser may generate spurious errors for that site. But could this also be a symptom of a genuine security problem?

Citibank appears to have one certificate per server. You can verify this yourself by visiting their website multiple times, clearing your browser’s cache and SSL state between visits. Clicking on the SSL icon to the left of the URL will show a different cert each time. (You can also dump the serial from a saved copy of the cert, as sketched below.)

Here are the first 4 bytes of three serial numbers of certs observed at Citibank:

  • 43:8e:67:66
  • 61:22:d4:81
  • 3e:f4:5b:7c
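
If you save a copy of a cert (browsers can export one as a PEM file from the certificate viewer), you can dump its serial with the openssl x509 command-line tool, or programmatically. Here is a small sketch using OpenSSL’s library API; the usage and error handling are just illustrative.

#include <stdio.h>
#include <openssl/pem.h>
#include <openssl/x509.h>
#include <openssl/bn.h>

/* Print the serial number of a certificate saved as PEM. */
int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s cert.pem\n", argv[0]);
        return 1;
    }

    FILE *f = fopen(argv[1], "r");
    if (f == NULL)
        return 1;
    X509 *cert = PEM_read_X509(f, NULL, NULL, NULL);
    fclose(f);
    if (cert == NULL)
        return 1;

    BIGNUM *bn = ASN1_INTEGER_to_BN(X509_get_serialNumber(cert), NULL);
    char *hex = BN_bn2hex(bn);
    printf("serial: %s\n", hex);

    OPENSSL_free(hex);
    BN_free(bn);
    X509_free(cert);
    return 0;
}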

The Citibank certs are all identical except for a few fields. As you would expect, the domain name (CN) field is identical for each. The organizational unit (OU) differs (e.g., “olb-usmtprweb3” versus “…web1”), but this field is not interpreted by browsers and is more of a convenience. The web server’s public key is different in each cert. And, of course, the serial number and signature fields also differ, as they should for all certs.

On the other hand, Wells Fargo appears to have only one cert. This cert (serial 41:c5:cd:90) is the same even after accessing their site via a proxy to ensure some load-balancing magic isn’t getting in the way. It’s easy to ignore this difference, but there might be something else going on.

Protecting the web server’s private key is one of the most important operational security duties. If it is ever disclosed, all past and present encrypted sessions are compromised. (Yes, DHE key exchange provides forward secrecy, but it’s not widely used.) After cleaning up the mess, the organization needs to get a new certificate and revoke the old one. This is no easy task, as CRLs and OCSP both have their downsides.

One key question to ask an opsec department is “have you ever done a live cert revocation?” It’s one of those things that has to be experienced to be understood. In the recent Comodo fiasco, leaf cert revocations were embedded in browser software updates because the existing revocation mechanisms weren’t reliable enough.

Since web servers run commodity operating systems, most big sites use a hardware security module (HSM) to protect the private key. This is a dedicated box with some physical tamper resistance that is optimized for doing private key operations. By limiting the API to the server, HSMs can be hardened to prevent compromise, even if the server is hacked. The main downsides are that HSMs are expensive and may not live up to the original security guarantees as the API surface area grows.
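
For illustration, here is roughly what a private-key operation looks like through PKCS#11, the interface many HSMs expose. This is a hedged sketch: the mechanism, session setup, and the platform macros the cryptoki header requires all vary by vendor. The server only ever holds an opaque handle to the key; the key bits themselves never cross this narrow interface, which is what makes the hardening possible.

#include "pkcs11.h"  /* OASIS cryptoki header; the spec requires the
                        application to define CK_PTR, NULL_PTR, and the
                        other platform macros before including it */

/* Ask the HSM to sign a digest. The caller holds only an opaque
   handle (key); the private key material stays inside the module. */
CK_RV hsm_sign(CK_FUNCTION_LIST_PTR p11, CK_SESSION_HANDLE session,
               CK_OBJECT_HANDLE key,
               CK_BYTE *digest, CK_ULONG digest_len,
               CK_BYTE *sig, CK_ULONG *sig_len)
{
    CK_MECHANISM mech = { CKM_RSA_PKCS, NULL_PTR, 0 };

    CK_RV rv = p11->C_SignInit(session, &mech, key);
    if (rv != CKR_OK)
        return rv;
    return p11->C_Sign(session, digest, digest_len, sig, sig_len);
}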

Now, back to the two banks. Why would one have multiple certs but not the other? Certificates cost money, so if you’re offloading SSL to a single accelerator, there’s no reason to give it multiple certs. If each server has a dedicated HSM, you could use separate certs, or generate a single key and cert and export the key to the other HSMs. You need that export capability anyway for backups.

This is just supposition, but one thing this could indicate is a different approach to securing the private key. Instead of generating a single key and cert for all servers, you create one per server and store it without an HSM. If a server gets compromised, you revoke that server’s cert and move on. This might seem like a good idea since a cert costs far less than an HSM. However, the ineffectiveness of revocation today makes this a dangerous choice.

There may be other explanations for this. Perhaps Citi uses individual HSMs and Wells Fargo has a single SSL accelerator with plaintext HTTP in the backend. Perhaps they got a bargain on certs by buying in bulk. However, any time a system has more keys than necessary, it can lead to complicated key management. Or worse, it may indicate a weaker system design overall.

There’s no way to know the real story, but it’s good food for thought for anyone else who might be considering multiple certs as a substitute for strong private key protection. Cert revocation doesn’t currently work and should not be relied on.

Fixing the SSL cert nightmare

Recently, there has been an uproar as Comodo failed to do the one thing their business is supposed to do — issue certificates only to authenticated parties. (But what do you expect for a $5, few-questions-asked cert?) I’m glad there’s renewed focus on fixing the current CA and SSL browser infrastructure because this is one of the largest and most obvious security flaws in an otherwise successful protocol.

In response to this compromise, many people are recommending drastic changes. One really bad idea is getting rid of root certs in favor of SSH-style host verification. There are also some good proposals though:

Paring down the number of root certs in common browsers

Root cert proliferation has gotten out of hand in all the major browsers. It would be great if someone analyzed which CAs are essential for the hostnames in, say, the top 1000 sites. That info could be used to prune the root certs down to a smaller initial set.

It is unlikely the browser vendors will do this work themselves. They have clear financial and political incentives to add new CAs and few incentives to remove them. Even if you just consider the collateral damage to innocent sites when a root cert is removed, the costs can be huge.

The EFF’s SSL Observatory had a promising start, but not much appears to have been done with it recently. However, they did publish this article yesterday, including a list of CAs sorted by the number of times each CA signed an unqualified hostname. (Comodo is second only to Go Daddy, by the way, and Verisign is pretty high as well.)

Notifying the user when a site’s cert changes

This is a pretty simple idea that has some merit. Your browser would cache the cert the first time you connect to a given server. If it changes when you revisit the site, you would get a warning. (Bonus: maintain a worldwide cache of these certs and correlate observations from various locations, like Perspectives or Google’s DNS-based cert history.)
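
Here is a sketch of the core check in C, using OpenSSL’s X509_digest() to fingerprint the server cert and compare it to a cached value. A browser would hook this into the TLS handshake and key the cache by hostname; the single-file cache and the function name are just for illustration.

#include <stdio.h>
#include <string.h>
#include <openssl/x509.h>
#include <openssl/evp.h>

/* Compare a server cert's SHA-256 fingerprint against the one cached
   on first visit. Returns 1 on match, 0 on mismatch (warn the user),
   -1 if no cached value exists yet (caller should cache it now). */
int check_pinned_cert(X509 *cert, const char *cache_path)
{
    unsigned char fp[EVP_MAX_MD_SIZE], cached[EVP_MAX_MD_SIZE];
    unsigned int fp_len = 0;

    if (!X509_digest(cert, EVP_sha256(), fp, &fp_len))
        return 0;   /* treat digest failure as a mismatch */

    FILE *f = fopen(cache_path, "rb");
    if (f == NULL)
        return -1;  /* first visit: no pin stored yet */

    size_t n = fread(cached, 1, sizeof(cached), f);
    fclose(f);

    return (n == fp_len && memcmp(fp, cached, fp_len) == 0) ? 1 : 0;
}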

This would have helped in the Comodo case but wouldn’t notify the user if the compromised CA were the same as the server’s current one. This scenario actually occurred in 2001 when Verisign issued another Microsoft code-signing cert to someone posing as an employee.

One usability problem with persistent cert chains is that some sites use many different certs. For example, Citibank appears to have one cert per web server, something we discuss more in our next post. Users of such a site would get many spurious warnings.

Keep a hit count of CAs previously seen

This is a simple idea I came up with the other day that may be worth investigating. I’d like to see a CA “hit count” displayed alongside the list of root certs in my browser. This would help in auditing which certs are actually used by sites I visit and which are dormant. This could include the hostnames themselves as a collapsible list under each CA cert.

The important goal in considering all these proposals is to avoid making the problem worse. Nearly everyone agrees that the current situation has become untenable, and we need solutions to certificate problems now.

Memory address layout vulnerabilities

This post is about a programming mistake we have seen a few times in the field. If you live and breathe TAOSSA (The Art of Software Security Assessment), you probably already avoid this, but it’s a surprisingly tricky and persistent bug.

Assume you’d like to exploit the function below on a 32-bit system. You control len and the contents of src, and they can be up to about 1 MB in size before malloc() failures or earlier input checks cause the caller to bail out without ever reaching this function.

int target_fn(char *src, int len)
{
    char buf[32];
    char *end;

    if (len < 0) return -1;
    end = buf + len;
    if (end > buf + sizeof(buf)) return -1;
    memcpy(buf, src, len);
    return 0;
}

Is there a flaw? If so, what conditions are required to exploit it? Hint: the obvious integer overflow using len is caught by the first if statement.

The bug is an ordinary integer overflow, but it is only exploitable under certain conditions, and those depend entirely on the address of buf in memory. If the stack is located at the bottom of the address space on your system, or if buf were on the heap, it is probably not exploitable. If buf is near the top of the address space, as with most stacks, it may be. Exploitability thus depends on the containing program itself and how it uses other memory.

The issue is that buf + len may wrap the end pointer to memory below buf. This may happen even for small values of len, if buf is close enough to the top of memory. For example, if buf is at 0xffff0000, a len as small as 64KB can be enough to wrap the end pointer. This allows the memcpy() to become unbounded, up to the end of RAM. If you’re on a microcontroller or other system that allows accesses to low memory, memcpy() could wrap internally after hitting the top of memory and continue storing data at low addresses.
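
One way to fix it is to bound len itself, before any pointer arithmetic happens, so no pointer can ever wrap. Here is a sketch of one safe version; comparing lengths rather than pointers is the key point.

#include <stddef.h>
#include <string.h>

int target_fn(char *src, int len)
{
    char buf[32];

    /* Reject negative lengths and anything larger than the buffer
       before forming any pointer. The comparison is done on the
       integer itself, so it cannot wrap. */
    if (len < 0 || (size_t)len > sizeof(buf))
        return -1;
    memcpy(buf, src, len);
    return 0;
}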

Of course, these kinds of functions are never neatly packaged in a small wrapper and easy to find. There’s usually a sea of them, and the copy happens many function calls later, based on stored values. In that situation, all of us (maybe even Mark Dowd) need some help sometimes.

There has been a lot of recent work on using SMT solvers to find boundary condition bugs. They are useful, but often limited. Every time you hit a branch, you have to add a constraint (or potentially double your terms, depending on the structure). Also, inferring the runtime contents of RAM is a separate and difficult problem.

We think the best approach for now is to use manual code review to identify potentially problematic sections, and then restrict the search space to that set of functions for automated verification. Despite some promising results, we’re still a long way from automated detection and exploitation of vulnerabilities. Even as program verification advances, mitigations such as ASLR, DEP, and software protection add constraints of their own, reducing the ease of exploitation.

Over the next few years, it will be interesting to see if attackers can maintain their early lead by using program verification techniques. Microsoft has applied the same approach to defense, and it would be good to see this become general practice elsewhere.