Fixing the SSL cert nightmare

Recently, there has been an uproar as Comodo failed to do the one thing their business is supposed to do — issue certificates only to authenticated parties. (But what do you expect for a $5, few-questions-asked cert?) I’m glad there’s renewed focus on fixing the current CA and SSL browser infrastructure because this is one of the largest and most obvious security flaws in an otherwise successful protocol.

In response to this compromise, many people are recommending drastic changes. One really bad idea is getting rid of root certs in favor of SSH-style host verification. There are also some good proposals though:

Paring down the number of root certs in common browsers

Root cert proliferation has gotten out of hand in all the major browsers. It would be great if someone analyzed which CAs are essential for all hostnames in the top 1000 sites. That info could be used to prune root certs to a smaller initial set.
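
For illustration, a rough sketch of what that survey might look like (my own sketch, not an existing tool; the hostname list is a stand-in for a real top-1000 list, and a thorough survey would walk each chain up to the actual root rather than just reading the leaf cert’s issuer):

    import socket
    import ssl
    from collections import Counter

    def issuer_org(hostname, port=443, timeout=5):
        ctx = ssl.create_default_context()
        with socket.create_connection((hostname, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=hostname) as tls:
                cert = tls.getpeercert()  # parsed dict; the handshake already validated the chain
        issuer = dict(pair[0] for pair in cert["issuer"])
        return issuer.get("organizationName", issuer.get("commonName", "unknown"))

    hosts = ["example.com", "example.org"]  # placeholder for a real top-1000 list
    counts = Counter()
    for host in hosts:
        try:
            counts[issuer_org(host)] += 1
        except (OSError, ssl.SSLError):
            pass  # unreachable host or failed handshake; skip it

    for org, n in counts.most_common():
        print(f"{n:4d}  {org}")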

It is unlikely the browser vendors will do this work themselves. They have clear financial and political incentives to add new CAs and few incentives to remove them. Even if you just consider the collateral damage to innocent sites when a root cert is removed, the costs can be huge.

The EFF had a promising start, but less appears to have been done recently. However, they did publish this article yesterday, including a list of CAs sorted by the number of times each CA signed an unqualified hostname. (Comodo is second only to Go Daddy, by the way, and Verisign is pretty high as well.)

Notifying the user when a site’s cert changes

This is a pretty simple idea that has some merit. Your browser would cache the cert the first time you connect to a given server. If it changes when you revisit the site, you would get a warning. (Bonus: maintain a worldwide cache of these certs and correlate observations from various locations, like Perspectives or Google’s DNS-based cert history.)
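
A minimal sketch of the mechanism (not a real browser integration; the JSON cache file and hostname are placeholders I made up):

    import hashlib
    import json
    import ssl
    from pathlib import Path

    CACHE = Path("cert_cache.json")  # hypothetical per-user fingerprint cache

    def fingerprint(host, port=443):
        pem = ssl.get_server_certificate((host, port))
        return hashlib.sha256(pem.encode()).hexdigest()

    def check(host):
        cache = json.loads(CACHE.read_text()) if CACHE.exists() else {}
        fp = fingerprint(host)
        if host not in cache:
            cache[host] = fp  # first visit: remember the cert (normal CA validation still applies)
        elif cache[host] != fp:
            print(f"WARNING: certificate for {host} changed since your last visit")
        CACHE.write_text(json.dumps(cache))

    check("example.com")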

This would have helped in the Comodo case but wouldn’t notify the user if the compromised CA were the same one that signed the server’s current cert. That scenario actually occurred in 2001, when Verisign issued Microsoft code-signing certs to someone posing as an employee.

One usability problem with persistent certs is that some sites use many different certs. For example, Citibank appears to have one cert per webserver, something we discuss more in our next post. This means users would get lots of spurious warnings when using such a site.

Keep a hit count of CAs previously seen

This is a simple idea I came up with the other day that may be worth investigating. I’d like to see a CA “hit count” displayed alongside the list of root certs in my browser. This would help in auditing which certs are actually used by sites I visit and which are dormant. This could include the hostnames themselves as a collapsible list under each CA cert.
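
To make the bookkeeping concrete, here is a tiny sketch of the idea, assuming the browser already logs the issuing CA for every TLS connection it makes (the observation list below is made up):

    from collections import defaultdict

    observations = [  # (hostname, root CA) pairs a browser might have logged
        ("mail.example.com", "VeriSign"),
        ("www.example.com", "VeriSign"),
        ("shop.example.net", "Comodo"),
    ]

    hits = defaultdict(set)
    for host, ca in observations:
        hits[ca].add(host)

    for ca, hosts in sorted(hits.items(), key=lambda kv: -len(kv[1])):
        print(f"{ca}: {len(hosts)} site(s)")
        for host in sorted(hosts):  # the collapsible hostname list under each CA cert
            print(f"    {host}")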

The important goal in considering all these proposals is to avoid making the problem worse. Nearly everyone agrees that the current situation has become untenable, and we need solutions to certificate problems now.

36 thoughts on “Fixing the SSL cert nightmare”

  1. Thanks, Joachim and Gavin. I added links to those in the article.

    Google’s approach seems pretty insecure, with only some vague hope that DNSSEC will save them in the future. Would be better if they used a real API over SSL.

    1. 1. Stricter rules are great but I don’t see remedies. When a list has “when you fail, we remove you permanently”, I’ll be happy.

      2. OCSP vs. CRLs is a debate I don’t care about much. If you’re constantly revoking large numbers of certs, both approaches break down. But in that case, the system of not generating bogus certs in the first place is broken, and that’s the real problem to solve.

      Personally, I prefer CRLs with a window of a day or so to get updated; if no update arrives in that time, stop allowing SSL connections. But like I said, both CRLs and OCSP have their issues, and neither can handle large numbers of revocations (nor should they).

      1. The problem with breaking on CRL failure is that it breaks a HUGE amount of deployed code, AND makes the functionality of my webserver dependent on the performance of the machine with the CRL. If the CRL gets DDOS’d, people get big scary error messages on my site, reflecting poorly on me.

        If we require a CRL before accepting a cert, we break almost every single wifi paywall/browser-auth appliance in existence, as well as many other kinds of walled gardens.

      2. CRLs can have a tunable parameter, which is how long you allow access to SSL sites without seeing a CRL update. OCSP vs. CRL is really just the old cache liveness question. Put another way, CRL with a 0-second expiration is OCSP. OCSP with caching of results for N days is CRL.

        The fact that browsers don’t implement revocation correctly is the problem here, not an inherent issue with CRLs.
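
        As a toy illustration of that tunable window (fetch_crl below is a placeholder callable, not real revocation code): with a max age of zero every check refetches the list, which behaves like OCSP, while a max age of a day behaves like a daily CRL download.

            import time

            class RevocationChecker:
                def __init__(self, fetch_crl, max_age_seconds):
                    self.fetch_crl = fetch_crl  # callable returning a set of revoked serials
                    self.max_age = max_age_seconds
                    self.revoked = None
                    self.fetched_at = 0.0

                def is_revoked(self, serial):
                    stale = (time.time() - self.fetched_at) > self.max_age
                    if self.revoked is None or stale:
                        self.revoked = self.fetch_crl()  # refresh once the cache is too old
                        self.fetched_at = time.time()
                    return serial in self.revoked

            # max_age=0 behaves like OCSP (always fresh); max_age=86400 like a daily CRL
            checker = RevocationChecker(lambda: {0x1234}, max_age_seconds=86400)
            print(checker.is_revoked(0x1234))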

  2. Hi Nate,

    It’s no longer my area, so I’m interested in why you think persistence/TOFU is a really bad idea?

    1. Persistence of leaf certs is fine. It’s helpful to have some memory of certs seen previously. The only caveat is mentioned in the article — some sites have a lot of certs so there will be spurious warnings. But I’m all for having the option of persistence.

      It’s when people want to throw out all root certs (TOFU) that I question them. Why force every user to be their own, insecure browser vendor? That’s just adding back the SSH problem to a system that already has a partial solution to it.

      To put it another way: a valid signature is helpful the first time you see a cert. Why force the user to verify fingerprints manually, which many won’t even do?

      I want to keep root certs in the browser and have the option to be notified if previously-seen certs change.

      1. I don’t see sites with many certs as a problem; they will just have to change, and that’s the end of it. I think we have to accept that any solution will involve some effort/pain on someone’s part, and ideally it should not fall on the browser users.

  3. Thanks for mentioning EFF. :) Note that the list of who signed unqualified domains most was actually done by George Macon, using the Observatory.

    I’ll be publishing another article along similar lines tomorrow. This time, it’s fully-qualified junk names.

    I am a TOFU advocate, but mostly for the persistence feature and for the simplicity of the guarantee to users, developers, and ops people. That is hugely important. For transitions, including the first use, I favor Perspectives and things like it. Persistence + Perspectives-verified transitions is way, way safer and simpler than what we have.

    And, we have been doing more with the Observatory recently. We have a decentralized Observatory baked into our HTTPS Everywhere Firefox plugin (beta version; will be rolled into the stable version soon), and we have been using the Observatory to help browser vendors understand the impact of un-trusting this or that CA, and other empirical questions. So, some visible stuff, some behind-the-scenes stuff. Hopefully the Decentralized Observatory will turn up some fun stuff…

  4. This seems obvious to me, but I don’t think I’ve ever seen it mentioned: accept certificates for sites we haven’t visited before if they are signed by a trusted CA, otherwise accept certificates only if they are signed by a trusted CA and the last seen certificate (with a timeout of, say, 365 days). Of course, externalizing the cache a la Perspectives might make sense.

    Did I miss something fundamental?

    1. Yes, this is exactly what I was advocating in the 2nd point in this post:

      Use CAs the same way as today for the first access to a site, then just cache the site’s cert and notify if it changes after that point. It’s exactly as secure/insecure as today’s approach for first access, but has some advantages for subsequent visits to the same site.

      1. Some advantages, but I’m not convinced that it would provide very much. In this case, valid warnings of change will be quite common, and users will learn to ignore them even more quickly than they’ve learned to click through the existing invalid-cert warning. You’ll have cert owners asking for extremely long-life certs, which is a security issue in itself. And you’ll still have a bump where users are warned on changeover.

        Perhaps the trick would be to warn if a new cert appears while the cached cert is still valid, with an overlap window of, say, a week for admin changeover.
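
        A quick sketch of that rule (the data shapes and the one-week window are just illustrative; this assumes the normal CA check has already passed):

            import time

            OVERLAP = 7 * 24 * 3600  # quiet changeover allowed in the last week of validity

            def check_pin(hostname, presented, pins):
                """presented and pins[hostname] are dicts: {"pem": str, "not_after": epoch seconds}."""
                pinned = pins.get(hostname)
                if pinned is None or pinned["pem"] == presented["pem"]:
                    pins[hostname] = presented  # first visit, or unchanged cert: (re)pin it
                    return "ok"
                if pinned["not_after"] - time.time() < OVERLAP:
                    pins[hostname] = presented  # old cert about to expire: quiet rollover
                    return "rolled over"
                return "warn: cert changed while the pinned cert was still valid"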

      2. I’m sorry, I was unclear. My comment should read “otherwise accept certificates only if they are signed by a trusted CA and by the last seen certificate” (you seem to have read it as “and are the last seen certificate”, which admittedly makes sense). This offers a “legitimate” upgrade path.

        I would still be surprised if I was the first to come up with this, though.

  5. One major way to reduce the noise and make special-purpose CAs more useful: restrict CAs to specific TLDs and subdomains. It’s fine that the government of Ghana wants to run its own CA for its sites and citizens, but no browser should accept such certificates for ebay.com or gates@minisoft.com.

    This would also remove a major drawback of self-signed certs and CAs: currently, if I import my employer’s CA cert into the browser to use SSL for our sites, it creates a potential backdoor for all other transactions as well.

    1. Yes, compartmentalizing CAs would be good but there are business problems with taking that too far. For example, you couldn’t lock Ebay into Verisign-only certs. That would prevent them from moving to another CA for pricing or security advantages.

      1. Well then, let’s not take it that far :-)

        For individual sites like ebay, key continuity management can be used. More or less a client-based compartmentalization. A hybrid may also be interesting, i.e. KCM with some central management server.

  6. I was reading the information Mozilla had on their list.

    And privacy seems to be a concern for them when dealing with revocation and network notary services.

    Would it not work if you had a service that did the following:

    where the client would send:
    – a hash of the domain
    – the serial number of the certificate (maybe fingerprints)
    – the rest of the certificate path (maybe also separately hashed?)

    And the server would just reply with statistics and dates (last seen by X people, X people seeing something else, and so on).
    Maybe the server should also keep some general GeoIP/network information and add that to the mix of statistics?

    1. You can protect the privacy of which sites a user has visited by just downloading the entire database (say with the browser install) and then updating it rsync-style periodically. All checks would be local. Asking for individual site stats on each browser access is just a premature performance optimization, just like OCSP vs. CRLs.

      Your proposal doesn’t solve the problem because the server can easily correlate the hash of the cert with the original, undoing the privacy.

  7. While I agree that the proliferation of trusted root certs is a problem, it is somewhat self-created. SSL certs used to be REALLY expensive, for a service that is largely automated. They still are pretty expensive in many cases.

    I think that the inclusion of StartSSL (a pretty recent addition) and their business model of charging when verifying a user (and free for basic single-domain certs), rather than charging for each cert, makes a lot of sense.

    Without a large, varied playing field, we would be back to the days of $100 minimum to get a functional cert, and an extra $100 per additional domain on that cert. We can’t get widespread adoption of a security measure that costs that much per domain, and by charging more for certs that work on multiple domains, we make the IPv4 problem worse.

    1. I think charging more is an excellent filter for bad actors and the process should not be “largely automated”. I’d like to see $500 SSL certs be the minimum. Couple this with real penalties for CAs issuing bad certs and you’d have the incentives properly aligned.

      What you can do is tie a site’s cert cost to their insurance policy. Citibank would have to pay a lot more for their cert since they would want a larger policy against compromises. This money would be used by the CA to do extensive verification and physical protection of keys (HSMs for Citi’s webservers).

      1. So then you’d prefer to see a larger swath of the web in plaintext? It’s clear that there’s no opportunistic encryption technology that will magically step in and replace SSL in the short run.

        As far as I can tell, EV certs are exactly what you propose – much more expensive, far less automated, and better verified. Their existence is supposed to do what SSL certs no longer effectively do – verify that the person holding the cert is the person you trust.

      2. If there were no other certs than EV certs, yes I’d be happy. Remember, before EV certs, the term “certificate” alone was sufficient to describe this process.

        You wouldn’t see large swaths of the web in plaintext. The threat of Firesheep is too obvious now to go back. Sure, the cost of business would be slightly higher but this would be more than compensated by the increased security to everyone.

        Even Honest Achmed…

        https://bugzilla.mozilla.org/show_bug.cgi?id=647959

      3. DNSSEC, anyone? DANE seems to be getting a lot of supporters right now. There is even an IETF working group: https://tools.ietf.org/wg/dane/

        It would eventually remove the need for domain-validated certificates, and only EV certs (the more expensive ones you talk about) would remain. The CAB Forum already has a lot more rules for EV certs than for domain-validated ones.

        DANE and similar proposals/RFCs are comparable to domain-validated certs because you store information in DNS along with the domain.

        But it will take many years to see enough adoption. You would probably want DNSSEC validation in the browser, but if your DSL router strips the request for DNSSEC information when relaying your query, blocks the larger DNS packets when you try to query a nameserver directly instead of going through the router, and does not allow a fallback over TCP, then you won’t be able to use it. It will even slow down your browsing.

        It is like IPv6 adoption.

      4. Hopefully, it will be like IPv6 adoption: very slow, allowing better solutions to take hold. DNSSEC sucks. I absolutely hate it, almost as much as browser-based JS crypto.

        Also, I fail to see how a PKI-based cert in DNS solves any of the CA problems we’re discussing in this post. A DNS PKI cert will have the exact same trust problems as an SSL PKI cert.

      5. Well, if your browser supports DNSSEC and you find a DNSSEC-signed hash of a certificate in DNS (and if you still care about the CAs for validation of the certificate), you can at least be sure the DNS admin intended this specific certificate to be used.

        DNSSEC should allow a validating client to determine, from the root down, that your domain is supposed to be DNSSEC-signed; if it is not, it is assumed an attacker has tried to insert false information.

        Some efforts allow the browser to determine which CA the domain owner has allowed to issue a usable certificate. Then it doesn’t matter as much anymore if there are hundreds of CAs in the browser, as long as there is certificate information in DNS and it is signed with DNSSEC. This means that (unless you are moving to a new CA or something) no more than one CA can issue validating certificates for your domain.
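
        Roughly, the check could look like this sketch (using the dnspython library and a DANE/TLSA-style record purely as an example; the hostname is a placeholder, and a real client would also have to validate the DNSSEC signatures on the DNS answer, which this does not do):

            import hashlib
            import ssl

            import dns.resolver

            def dns_pins_cert(hostname, port=443):
                pem = ssl.get_server_certificate((hostname, port))
                cert_hash = hashlib.sha256(ssl.PEM_cert_to_DER_cert(pem)).digest()
                try:
                    answers = dns.resolver.resolve(f"_{port}._tcp.{hostname}", "TLSA")
                except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
                    return False  # no certificate information published in DNS
                for rr in answers:
                    # selector 0 = full certificate, matching type 1 = SHA-256
                    if rr.selector == 0 and rr.mtype == 1 and rr.cert == cert_hash:
                        return True  # DNS says this is the intended certificate
                return False  # records exist but none match; treat as a possible MITM

            print(dns_pins_cert("example.com"))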

      6. You can get signed DNS responses with djb’s DNSCurve. It doesn’t use CAs and thus does not have the “hundreds of CAs” problem.

        I don’t see how your last paragraph helps. Making the browser determine which CA is allowed to issue DNS certs sounds exactly like today’s problem of determining which CA is allowed to issue SSL certs. I assume you know that it’s going to be the same set of CAs issuing DNS certs, not some new, magically secure CAs.

      7. @Nate I was afraid it wasn’t clear enough.

        What I mean is that if you have a cert (hash) in DNS, which is signed, it prevents any CA from creating valid certs for man-in-the-middle attacks, which the current CA system does allow. Most users wouldn’t notice if CNNIC or whoever created a valid cert for the website they are visiting (I don’t think any browser has a UI that directly displays the CA name without having to click for the details).

        While I like a lot of the ideas behind DNSCurve (simplicity is very high on my list, for starters), people in the industry have been pretty clear that they don’t see it scaling because responses cannot be cached at the provider. If you want some kind of signed DNS all the way to the browser/desktop/mobile device, that would not scale (at least, that is what many in the industry believe; I’ve personally not looked closely enough at the protocol itself yet to form my own opinion).

      8. I may be missing something here. What is it about a DNS cert that is more secure than an SSL cert, given that the same CAs will be issuing both?

        DNSSEC only makes the bad CA problem worse. For example, a user can manually review a cert in the browser because it has a UI to do so. This allows making more informed decisions, adding plugins to do things like popping a warning when they see CA mobility, etc.

        The DNS resolver library doesn’t have a UI. It has a return value. So this only makes cert management worse.

        Unfortunately, Matasano’s excellent list of the problems with DNSSEC is down. There’s a decent discussion here:

        http://www.isc.org/community/blog/201002/whither-dnscurve

        The fundamental difference is that DNSCurve protects individual queries/responses, while DNSSEC provides signatures over zone files. There are many security problems with DNSSEC’s approach. For example, old but valid replies can be cached and replayed by an attacker forever. So if you move IP ranges and someone squats on the old range, they can be you.

        On the other hand, if someone compromises a DNSCurve server, they can be that server for the duration of the time it takes you to repair the hack and change keys (hours? days? weeks?) But any new queries that come after that point are secured again.

        If you want to know more, the DNSCurve web pages for “Attackers” are pretty clear:

        http://dnscurve.org/forgery.html

      9. I was just arguing that with the current system, any CA can issue a cert for any domain (most users won’t check the CA details of the website they visit). So any CA can issue a cert for man-in-the-middle attacks.

        But a system like DNSSEC is signed from the root down and thus leads to just that one certificate (hash) in DNS (obviously you can have more than one RR in DNS). When the browser talks to the webserver, it will only accept that one certificate.

        If the validator is on the intended victim’s machine and has the root trust anchor to do the validation properly, then you can try to spoof DNS and create certificates with another CA all you want, but they won’t match the one in DNS. Thus the potential victim will never see the attacker’s website (assuming they go to HTTPS immediately without first trying HTTP, which is probably a much bigger problem than all the others combined).

        ___

        The currently most advanced browser extension for DNSSEC does have some (though still small) GUI elements:
        https://os3sec.org/

        And it is possible to do so, because the extension includes a full DNSSEC validator.
        ___

        While I really like forgery.html, I’m not so sure about the part that says “If an attacker forges a DNS packet, DNSCurve immediately detects the forgery, throws it away, and continues waiting for the legitimate packet, so the correct data gets through.” How would DNSSEC not be able to do this?

        By the time the recursive DNS server is waiting for the answer packet from the authoritative server, it would also have all the information needed to verify the answer. The math may be simpler; that could well be true (I’m not a cryptographer or mathematician). I guess with DNSCurve it is just one calculation to check the packet, while with DNSSEC you need to check each answer against the chain, but you are always checking it against the information from the previous packet.

  8. Thanks for the pointer to Chris Palmer’s interesting essay at eff.org. Amusingly, eff.org’s certificate is signed by Comodo.

  9. I’ve actually been running with only 9-10 trusted root CA certs for about a year now. I’ve documented the research and testing on my blog at netsekure.org. Ivan Ristic (ssllabs.com) has done a similar scan, along with the EFF Observatory, on a much larger scale than I did, but the data is pretty consistent :).
