Encrypted Google Docs done well

There’s a nice new paper out called “Private Editing Using Untrusted Cloud Services” by Yan Huang and David Evans. They also provide a Firefox extension that implements their scheme. I like their approach for a few reasons.

First, their core advancement is to implement incremental encryption efficiently. Incremental encryption is an often-overlooked method of performing insert, delete, and replace operations on ciphertext. It’s a useful branch of applied cryptography — one that should be used more.

However, a naive implementation of incremental encryption would encrypt each character separately, slowing client/server communication considerably. To get around this, they organize deltas in an Indexed Skip List. This makes it easy to group characters into variable-sized blocks and to update them quickly.
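
To make the block idea concrete, here is a minimal Python sketch. It is not the paper's implementation: a plain list with a linear scan stands in for the Indexed Skip List, the class and function names are made up for illustration, and the "cipher" is a toy HMAC-SHA256 keystream used only so the example runs with the standard library. It has no authentication and should not be used for real data.

```python
import hashlib
import hmac
import os

NONCE_LEN = 16  # assumed nonce size for this toy scheme


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 in counter mode (a stand-in for a real cipher)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt_block(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt one block under a fresh random nonce; the nonce is prepended."""
    nonce = os.urandom(NONCE_LEN)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt_block(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:NONCE_LEN], blob[NONCE_LEN:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


class IncrementalDoc:
    """Hypothetical document stored as a list of independently encrypted blocks."""

    def __init__(self, key: bytes, text: bytes, block_size: int = 8):
        self.key = key
        self.block_size = block_size
        self.blocks = [encrypt_block(key, piece) for piece in self._split(text)]

    def _split(self, data: bytes):
        size = self.block_size
        return [data[i:i + size] for i in range(0, len(data), size)] or [b""]

    def _locate(self, pos: int):
        """Map a character position to (block index, offset).

        This is a linear scan; an indexed skip list would answer it in O(log n).
        """
        if pos < 0:
            raise IndexError("position must be non-negative")
        for index, blob in enumerate(self.blocks):
            length = len(blob) - NONCE_LEN
            if pos <= length:
                return index, pos
            pos -= length
        raise IndexError("position is past the end of the document")

    def insert(self, pos: int, data: bytes) -> None:
        """Insert data at pos, re-encrypting only the block that contains pos."""
        index, offset = self._locate(pos)
        plain = decrypt_block(self.key, self.blocks[index])
        plain = plain[:offset] + data + plain[offset:]
        # An oversized block is split; all other blocks are left untouched.
        self.blocks[index:index + 1] = [encrypt_block(self.key, p) for p in self._split(plain)]

    def text(self) -> bytes:
        return b"".join(decrypt_block(self.key, blob) for blob in self.blocks)


key = os.urandom(32)
doc = IncrementalDoc(key, b"hello world, this is a test")
before = list(doc.blocks)
doc.insert(6, b"big ")
print(doc.text())                      # b'hello big world, this is a test'
print(doc.blocks[2:] == before[1:])    # True: only the first block was re-encrypted
```

Only the edited block (split in two here) is sent to the server again, which is the property that makes the scheme practical.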

I am also happy that they deployed their code as a browser extension instead of client-side JavaScript. As I have mentioned before, client-side JS crypto is a bad idea. There are fundamental integrity and trust problems that can’t be solved in that environment. However, except for the potential for side-channel attacks and lack of control of low-level details like key zeroization, JavaScript crypto in a browser extension is more acceptable, as long as it is properly reviewed. This is one use of the Stanford JS crypto library that is defensible.

For those of you implementing “secure” note-taking web services, this is the right way to do it.

6 thoughts on “Encrypted Google Docs done well”

  1. Correction: this is a pretty good way to do it. There are still some big information leaks; as the paper suggests, “our schemes reveal positional and timing information about edits”. There are some mitigating circumstances specific to Google Docs, like sending periodic updates at regular intervals instead of per keystroke, but you can still deduce quite a bit of information here. For example, a letter responding to a job applicant is likely to be shorter if they got rejected, and longer if they’re hired. And you can do much lower-level analysis: see what e.g. sshow does for encrypted ssh sessions, or the many papers that have been published on deducing passwords entirely from timing measurements. Having the document storage service store a single opaque blob of fixed large length, that changes completely with every edit no matter how minor, would be much more secure.
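
As a rough illustration of the fixed-size blob idea, the Python sketch below pads every document to the same length and encrypts it under a fresh nonce, so each save produces a blob of identical size that shares no visible structure with the previous one. The names `seal`, `unseal`, and `BLOB_LEN` are hypothetical, and the toy HMAC-SHA256 keystream is only a standard-library stand-in for a real authenticated cipher.

```python
import hashlib
import hmac
import os

NONCE_LEN = 16
BLOB_LEN = 4096  # assumed fixed capacity, including a 4-byte length header


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 in counter mode (a stand-in for a real cipher)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def seal(key: bytes, document: bytes) -> bytes:
    """Pad the document to BLOB_LEN and encrypt the whole thing under a fresh nonce."""
    if len(document) > BLOB_LEN - 4:
        raise ValueError("document does not fit in the fixed-size blob")
    padded = len(document).to_bytes(4, "big") + document
    padded += bytes(BLOB_LEN - len(padded))  # zero padding up to the fixed size
    nonce = os.urandom(NONCE_LEN)
    stream = _keystream(key, nonce, BLOB_LEN)
    return nonce + bytes(p ^ s for p, s in zip(padded, stream))


def unseal(key: bytes, blob: bytes) -> bytes:
    """Decrypt a sealed blob and strip the padding using the length header."""
    nonce, body = blob[:NONCE_LEN], blob[NONCE_LEN:]
    stream = _keystream(key, nonce, len(body))
    padded = bytes(c ^ s for c, s in zip(body, stream))
    length = int.from_bytes(padded[:4], "big")
    return padded[4:4 + length]


key = os.urandom(32)
rejection = seal(key, b"We regret to inform you...")
offer = seal(key, b"We are delighted to offer you the position, starting next month...")
print(len(rejection) == len(offer))  # True: length no longer leaks
```

The cost is that every edit uploads the full blob, which is exactly the traffic the incremental scheme was designed to avoid.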

    Remember that doing bulk statistical analysis of incremental ciphertext edits to glean as much information as possible is the kind of application that Google’s infrastructure is perfectly designed for!

    1. Jim, of course I agree. Your comments are relevant to any 3rd-party storage backend.

      However, if you had looked at the comments on my post about JS crypto, you’d see I’m dealing with developers who think it’s ok to distribute the JS code and encrypted doc from the same server, among other weaknesses. This paper is a definite advance over that.


      I stand by my conclusion: if you’re going to build an encrypted web service, this is the appropriate deployment model.

      1. I agree that the crypto should be done by the browser or an add-on, not in JS. Since the last discussion, I found the Unhosted project: http://www.unhosted.org/ [0]

        And I’ve come to the conclusion that not trusting the application developer(s) [1] [2] or their webserver is just not feasible in the short term.

        Let’s take Google Docs as an example, along with this browser extension. As described in the paper, the extension only encrypts the communication, which was reverse engineered. I think that won’t scale. What if Google updates their application? It will start sending the cleartext to the server again and probably overwrite the ciphertext. It might also just break, and you have a bunch of unhappy users who now depend not only on the Google developers but also on the extension developers.

        If they change the application (or might already have that in place), they could send your data anywhere, they can change your data without you knowing it, they can delete your data. Who knows?

        The Unhosted guys said: What we need is a protocol.

        The Unhosted guys have come up with the following:
        – trust the application developer
        – load the static HTML5/JS/CSS code from the webserver of the application developer (thus trusting the webserver as well :-/ )
        – create a protocol where the javascript code can talk to a remote server where the data is stored, I’ll call it the storage provider for now
        – the location is derived from the username, which is actually the email address [3].
        – the storage provider supports a standard protocol for storage over HTTP(S), WebDAV [4]
        – the data is encrypted when stored on the server
        – currently the javascript code for doing the encryption and communication is downloaded from the server of the application developer (personally I would like to see that be part of the browser or add-on, which should be possible if all the protocols are ready)
        – a small API is presented to the application developer and he/she can get/set the cleartext data it needs/wants to retrieve
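
The shape of the list above can be sketched in Python as follows. Everything here is hypothetical and is not Unhosted's actual code or API: the class name `EncryptedStore` is invented, a plain dict stands in for the remote WebDAV storage provider, the storage host is simply taken from the email domain rather than resolved via webfinger, and the toy HMAC-SHA256 keystream is a standard-library stand-in for a real cipher.

```python
import hashlib
import hmac
import os

NONCE_LEN = 16


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 in counter mode (a stand-in for a real cipher)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def _encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(NONCE_LEN)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def _decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:NONCE_LEN], blob[NONCE_LEN:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


class EncryptedStore:
    """Hypothetical get/set API: the application sees cleartext, storage sees ciphertext."""

    def __init__(self, email: str, key: bytes, backend: dict):
        # Simplification: use the email domain as the storage location.
        self.user, self.host = email.rsplit("@", 1)
        self.key = key
        self.backend = backend  # dict standing in for the remote storage provider

    def set(self, name: str, cleartext: str) -> None:
        blob = _encrypt(self.key, cleartext.encode("utf-8"))
        self.backend[(self.host, self.user, name)] = blob

    def get(self, name: str) -> str:
        blob = self.backend[(self.host, self.user, name)]
        return _decrypt(self.key, blob).decode("utf-8")


provider = {}
store = EncryptedStore("alice@example.org", os.urandom(32), provider)
store.set("note", "meet at noon")
print(store.get("note"))  # meet at noon
```

Note that the application code calling `get` still handles the plaintext, so this arrangement separates the data from the application's server without removing the need to trust the application itself.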

        I would really like to see a comment from you saying that it is possible to come up with something where the application developer does not have to be trusted, but I wouldn’t know how. I think most applications would need access to the plaintext, and thus the application can do pretty much anything with your data after that point.

        So if that isn’t possible, our best bet is a protocol/API that allows us to separate the application from its data.

        Are you by any chance coming to Europe any time soon? For a conference or something? I would love to have some beers with you and talk about this stuff. ;-) Discussing this stuff over the internet is tiring.

        [0] and because I made some suggestions, they made me a developer, although I don’t know if I’ll have time for that and I don’t know if our goals align yet.

        [1] unless the browser hashes all the code and presents the user with a good UI to check whether the update is OK, but how does the user know this? Mostly only the application developer(s) know why something was changed. So you are probably back at square one. The simplest solution is to have an open source project with code peer reviewed by different parties, who tell you what the correct hash is. The HTML5 manifest ( http://diveintohtml5.org/offline.html ) might help here; a simple extension to it could add hashing of the code.
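
A minimal Python sketch of that hash check, under stated assumptions: the function names, the reviewer labels, and the quorum rule are all invented for illustration, and how the reviewers' hashes would be published and authenticated is left out entirely.

```python
import hashlib
from typing import Mapping


def code_digest(code: bytes) -> str:
    """SHA-256 of the downloaded application code, as a hex string."""
    return hashlib.sha256(code).hexdigest()


def update_is_vouched_for(code: bytes, published: Mapping[str, str], quorum: int = 2) -> bool:
    """Accept the code only if at least `quorum` independent reviewers published its hash.

    `published` maps a (hypothetical) reviewer name to the hash that reviewer
    says is correct for the current release.
    """
    digest = code_digest(code)
    matches = sum(1 for claimed in published.values() if claimed == digest)
    return matches >= quorum


reviewed = b"function save(doc) { /* reviewed release */ }"
tampered = b"function save(doc) { leak(doc); }"

published_hashes = {
    "reviewer-a": code_digest(reviewed),
    "reviewer-b": code_digest(reviewed),
    "reviewer-c": "0" * 64,  # a reviewer who has not vouched for this release
}

print(update_is_vouched_for(reviewed, published_hashes))  # True
print(update_is_vouched_for(tampered, published_hashes))  # False
```

This only moves the trust question to the reviewers and to the channel that delivers their hashes, which is the difficulty the footnote points at.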

        [2] unless the browser has a text input field where everything is encrypted, so that the application code only has access to the ciphertext and the browser only gives the user access to the plaintext?

        [3] it uses the existing webfinger protocol for that, your identity on the web is mostly already directly or indirectly tied to your email address anyway (for example: password reminders)

        [4] I think there is JSON, base64 and so on involved as well

      2. I appreciate your thoughtful comment. Here are some brief responses:

        The paper I referenced has a good section on attacks. Google breaking their API would more likely result in corrupted data than plaintext disclosure. But I agree that it’s much better for the same party who maintains the backend to also maintain the browser extension. I could see Google taking over maintenance of the plugin.

        I disagree with Unhosted’s approach because browser-based JS has so many security problems. We discussed it a bit here:


        I’m rarely in Europe but if I attend a conference there, I’ll post about it. I’ve always wanted to attend CCC.

      3. Ohh, I see Michiel from Unhosted already asked you about it. :-) I completely missed that in the last discussion.

        “I disagree with Unhosted’s approach because browser-based JS has so many security problems.”

        That is why I mentioned: establish a protocol first, then build a browser extension (or a built-in standard like Geolocation and so on) after the fact. That way the browser only has to support one system and it would work for a whole range of applications.

        Or is there something I’m not seeing?


        I think CCC might be slightly complicated. As I understand it, a lot more people want to attend than they can handle, so they basically have a lottery system or something along those lines in place. So it could be that you wouldn’t get in. You could also try to get in as a speaker, of course. :-)

        I’m in the Netherlands, the country next to Germany, so I can travel that far. Interestingly enough, the Unhosted folks are based in Berlin (where CCC conferences are held, I think), so it would be possible to combine the two as well.

  2. I agree that such a thing would need to be designed in from the ground up. I think Google has the incentive and the best shot at that by owning one of the most popular web app sites and a browser. So watch and predict what they’re doing and you may be able to get ahead of that game.

    Brendan Eich has been talking about how to fix the same-page DOM policy as well but he’s not proposing an entire web app management scheme.
