rdist

January 31, 2012

Why stream ciphers shouldn’t be used for hashing

Filed under: Crypto,Protocols,Security — Nate Lawson @ 10:48 am

I recently saw a blog post that discussed using RC4 as an ad-hoc hash in order to show why CBC mode is better than ECB. While the author’s example is merely an attempt to create a graphic, it reminded me to explain why a stream cipher shouldn’t be used as a cryptographic hash.

A stream cipher like RC4 only has one input (the key) and one output, a variable-length keystream. During initialization, the key is expanded and stored in an internal buffer. When the user wants to encrypt or decrypt (both are the same operation), the buffer is updated in some way and keystream bits are output. It’s up to the caller to take that keystream data and XOR it with the plaintext to get the ciphertext (or vice versa). Very simple, right? You just initialize the stream cipher’s state with a key and then turn the crank whenever you want keystream bits.

A cryptographic hash algorithm like SHA-1 also has one input (the data) and one output, the digest. A variable-length stream of input data is crunched in blocks, giving a final output digest that should be difficult to invert, among other properties.

At first glance, it seems that a stream cipher can be used as a cryptographic hash by setting the data to hash as the key, turning the crank, and using some of the keystream as the digest. The reasoning goes, “since it should be difficult to recover the original stream cipher key merely by seeing some of the keystream, the output is usable as a hash”. While this may sound reasonable, it is often wrong, leading to various security problems.
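To make the construction concrete, here is a rough Python sketch using a textbook RC4 (for illustration only, not for real use): normal encryption XORs the keystream with the plaintext, and the ad-hoc “hash” simply keys RC4 with the data and takes the first keystream bytes as the digest.

    def rc4_keystream(key, n):
        # Key-scheduling algorithm (KSA): expand the key into the internal state
        S = list(range(256))
        j = 0
        for i in range(256):
            j = (j + S[i] + key[i % len(key)]) % 256
            S[i], S[j] = S[j], S[i]
        # Pseudo-random generation algorithm (PRGA): turn the crank for n bytes
        out, i, j = bytearray(), 0, 0
        for _ in range(n):
            i = (i + 1) % 256
            j = (j + S[i]) % 256
            S[i], S[j] = S[j], S[i]
            out.append(S[(S[i] + S[j]) % 256])
        return bytes(out)

    def rc4_encrypt(key, plaintext):
        # Normal use: XOR keystream with plaintext (decryption is the same operation)
        ks = rc4_keystream(key, len(plaintext))
        return bytes(p ^ k for p, k in zip(plaintext, ks))

    def naive_rc4_hash(data, digest_len=16):
        # The ad-hoc construction described above: key RC4 with the data and
        # use keystream bytes as a "digest". Don't do this.
        return rc4_keystream(data, digest_len)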

There are numerous, vital design distinctions between stream ciphers and hashes. First, a stream cipher is designed to output an extremely long keystream sequence while a hash digest is a relatively small, fixed-length output. There are design differences that arise from expanding a key vs. compressing input. Also, resistance against a chosen input attack is a requirement for a cryptographic hash, while it may not have been considered for a stream cipher. What could an attacker gain if they can choose the input keys? By definition, they already know the secret key in this case.

The RC4 weakness that led to WEP being broken was a related-key attack. Even though an attacker could not choose WEP keys, the RC4 key was the concatenation of a counter and the secret key. Thus, subsequent outputs of the keystream are derived from closely related input keys.

But to use RC4 for hashing, it would have to be resistant not only to related-key attacks, but to a chosen-key attack. In this case, the attacker can target weaknesses in your key schedule algorithm by maliciously choosing many keys, versus merely knowing that some relation exists between unknown keys that the attacker can’t choose. While chosen-IV attacks are part of the consideration for stream ciphers, I haven’t heard of full chosen-key resistance being an important design criterion. (Please correct me if I’m out of date on this, especially with eSTREAM).

In contrast, resistance to a chosen-input attack is the very definition of a cryptographic hash algorithm. This resistance comes at a performance cost. Turning a hash algorithm into a stream cipher can be done (say, an HMAC using a key and counter), but it’s slower than stream ciphers that were designed as such. Stream cipher designs are optimized for performance and are usually not focused on preventing chosen-key attacks. An interesting corollary is that analyzing a stream cipher’s key scheduling algorithm as a hash function (e.g., collision resistance) is often a good way to understand its possible weaknesses.
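As a rough illustration of the “HMAC using a key and counter” construction, here is a minimal Python sketch. It generates keystream by HMACing an incrementing counter under the key; a real design would also mix in a nonce so a key never produces the same keystream twice, and it will be noticeably slower than a dedicated stream cipher.

    import hashlib
    import hmac

    def hmac_keystream(key, nbytes):
        # Counter-mode keystream built from a hash: concatenated HMAC(key, counter) blocks
        out = bytearray()
        counter = 0
        while len(out) < nbytes:
            out += hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
            counter += 1
        return bytes(out[:nbytes])

    def hmac_stream_encrypt(key, plaintext):
        # XOR with the derived keystream, just like any other stream cipher
        ks = hmac_keystream(key, len(plaintext))
        return bytes(p ^ k for p, k in zip(plaintext, ks))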

To summarize, don’t use cryptographic primitives for non-standard purposes. There are often built-in assumptions based on the original intended application that could compromise your modified design.

December 30, 2011

The lost Van Jacobson paper that could save the Internet

Filed under: Network,Protocols — Nate Lawson @ 6:11 am

One of my heroes has always been Van Jacobson. His 1988 paper on solving TCP congestion is an enjoyable read, with cross-discipline appeal. The history of all this is fascinating, such as congestion control’s roots in hydrodynamics theory. (If you want to kill an afternoon, you can read my collection of the history of internetworking in the ’80s and ’90s. I especially like the notes on tuning Sun’s IP stack with hand-coded assembly.)

Since the old days, the IETF has taken over and our congestion problems are more or less solved, right? Well, not exactly. There’s a new congestion storm brewing with our endpoints that is largely the impetus for the network neutrality dispute.

Back in 2008, I wrote some articles about how Random Early Detection (RED) would be more effective than deep packet inspection in solving the congestion apparently caused by Bittorrent. At the time, some ISPs were terminating Bittorrent uploads, supposedly in order to manage their bandwidth. I thought network admins ignored RED because they were control freaks, and deep packet inspection gives you a lot of control over user behavior. But a lost Van Jacobson paper with a diagram of a toilet might be the key to the new congestion problem.

Jim Gettys of Bell Labs has been blogging for about a year on a phenomenon known as “bufferbloat”. This refers to the long queues created by the large buffers of routers, firewalls, cable modems, and other intermediate gateways. Because Moore’s Law has made RAM cheap and queues are rarely actively managed, packets are held for a long time during congestion instead of being dropped quickly. This misleads TCP congestion control and leads to even more congestion.

Back when RAM was expensive and networks were slow, packets were dropped immediately when congestion was encountered. This created a responsive control system. The transmitter could be sure a packet had been dropped if it didn’t get an ACK within a couple standard deviations of the average round-trip time.
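For reference, the kind of estimator involved looks roughly like the sketch below, in the spirit of Jacobson’s algorithm (constants as later standardized in RFC 6298): track a smoothed RTT plus its deviation, and assume a drop when no ACK arrives within that window.

    class RtoEstimator:
        ALPHA = 1 / 8   # gain for the smoothed RTT
        BETA = 1 / 4    # gain for the RTT deviation

        def __init__(self, first_rtt):
            self.srtt = first_rtt
            self.rttvar = first_rtt / 2

        def update(self, rtt_sample):
            # Update the deviation first, then the smoothed RTT
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt_sample)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt_sample

        def rto(self):
            # Assume a packet was dropped if no ACK within srtt plus a few deviations
            return self.srtt + 4 * self.rttvar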

Think of such a network as a stiff spring. As the transmitter “pushed” on one end of the spring, the response force was quickly “felt”, and the sender could back off when the network bandwidth was fully allocated.

Now, increase the bandwidth and intermediate router buffer sizes but maintain the same control system. More bandwidth means that it is normal to have many packets in flight (increased window size). Larger buffers mean more of those packets can be delayed without being dropped. If they are dropped, it happens long after the first congestion actually occurred and the buffer started filling up. Multiply this effect by each hop in the route to the destination.

This gives a control system more like a set of loose springs with gaps in the middle. The transmitter increases the window size until congestion is encountered, probing the available bandwidth. Instead of the first excess packet being dropped, it gets queued somewhere. This happens to many of the packets, until the intermediate buffer is full. Finally, a packet gets dropped but it’s too late — the sender has exceeded the network capacity by the available bandwidth plus the combined sizes of one or more of the intermediate buffers.

Network equipment manufacturers make this worse through a cycle of escalation. When a fast network meets a slower one, there has to be congestion. For example, a wireless router typically offers 50-100 Mbps speeds but is connected to a 5-10 Mbps Internet connection. If the manufacturer provides larger buffers, bursty traffic can be absorbed without packet loss, at least for a little while. But all packets experience higher latency during this period of congestion, and the delay between transmission and drop grows, making the sender oscillate between over- and under-utilization.

The congestion problem was solved long ago by RED. When a router starts to experience congestion, it immediately applies an algorithm to fairly drop packets from the queue, weighted by each sender’s portion of bandwidth used. For example, with a simple random algorithm, a sender who is transmitting 50% of the total bandwidth is twice as likely to be dropped as someone using 25%.
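The fairness property described here can be sketched in a few lines of Python; this shows only the weighting idea, not real RED, which works on the averaged queue length with tuned thresholds.

    import random

    def choose_drop_victim(bytes_queued_by_sender):
        # When congested, pick a sender to penalize with probability
        # proportional to its share of the queued traffic.
        senders = list(bytes_queued_by_sender)
        weights = [bytes_queued_by_sender[s] for s in senders]
        return random.choices(senders, weights=weights, k=1)[0]

    # A sender using 50% of the bandwidth is twice as likely to be picked
    # as one using 25%.
    queue = {"sender_a": 500_000, "sender_b": 250_000, "sender_c": 250_000}
    print(choose_drop_victim(queue))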

Besides dropping packets, the router can also set an explicit congestion notification (ECN) bit on a packet. This communicates a warning to the sender that future packets will be dropped if it keeps increasing the window size. This is better than just dropping the packet since it avoids discarding useful data that the packet is carrying.

It turns out that RED is not enabled on many Internet routers. Jim wrote a fascinating post on why. In short, ISPs avoided deploying RED due to some bugs in the original paper and the requirement to manually tune its parameters. ISPs don’t want to do that and haven’t. But years ago, Van Jacobson had begun writing a paper on how to fix RED.

The lost paper was never published. One roadblock was that the diagram of a toilet offended a reviewer. Also, Van changed jobs and never got around to properly finishing it. He lost the draft and the FrameMaker software for editing it. But recently, the original draft was found and converted into a usable format.

Much remains to be done. This is truly a hard problem. Jim Gettys and others have been building tools to analyze bufferbloat and writing new articles. They’re trying to raise visibility of this issue and come up with a new variant of RED that can be widely deployed. If you’re interested in helping, download the tools or check out Netalyzr.

There’s no single correct solution to eliminating bufferbloat, but I’m hoping a self-tuning algorithm based on RED can be widely deployed in the coming years.

April 15, 2011

More certs may indicate less security

Filed under: Crypto,Network,Protocols,Security — Nate Lawson @ 12:40 pm

In my last post, I mentioned how warning users when a previously-seen cert changes may generate false positives for some sites. If a website has multiple servers with different certs, the browser may often generate spurious errors for that site. But could this be a symptom of a genuine security problem?

Citibank appears to have one certificate per server. You can verify this yourself by going to their website multiple times, clearing your browser each time. Clicking on the SSL icon to the left of the URL will show a different cert each time.

Here are the first 4 bytes of three serial numbers of certs observed at Citibank:

  • 43:8e:67:66
  • 61:22:d4:81
  • 3e:f4:5b:7c

The Citibank certs are all identical except for a few fields. As you would expect, the domain name (CN) field is identical for each. The organizational unit (OU) differs (e.g., “olb-usmtprweb3” versus “…web1”), but this field is not interpreted by browsers and is more of a convenience. The web server’s public key is different in each cert. And, of course, the serial number and signature fields also differ, as they should for all certs.
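If you want to repeat the observation yourself, a small Python script is enough (the hostname here is just an example; each new connection may land on a different backend server):

    import socket
    import ssl

    def cert_summary(host, port=443):
        # Connect, complete the TLS handshake, and pull out a few of the
        # fields discussed above from the server's leaf certificate.
        ctx = ssl.create_default_context()
        with socket.create_connection((host, port)) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
        subject = dict(item[0] for item in cert["subject"])
        return {
            "serial": cert.get("serialNumber"),
            "CN": subject.get("commonName"),
            "notAfter": cert.get("notAfter"),
        }

    # Run this a few times and compare the serial numbers
    print(cert_summary("www.citibank.com"))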

On the other hand, Wells Fargo appears to have only one cert. This cert (serial 41:c5:cd:90) is the same even after accessing their site via a proxy to ensure some load-balancing magic isn’t getting in the way. It’s easy to ignore this difference, but there might be something else going on.

Protecting the web server’s private key is one of the most important operational security duties. If it is discovered, all past and present encrypted sessions are compromised. (Yes, I know about DHE but it’s not widely used). After cleaning up the mess, the organization needs to get a new certificate and revoke the old one. This is no easy task as CRLs and OCSP both have their downsides.

One key question to ask an opsec department is “have you ever done a live cert revocation?” It’s one of those things that has to be experienced to be understood. In the recent Comodo fiasco, leaf cert revocations were embedded in browser software updates because the existing revocation mechanisms weren’t reliable enough.

Since web servers run commodity operating systems, most big sites use a hardware security module (HSM) to protect the private key. This is a dedicated box with some physical tamper resistance that is optimized for doing private key operations. By limiting the API to the server, HSMs can be hardened to prevent compromise, even if the server is hacked. The main downsides are that HSMs are expensive and may not live up to the original security guarantees as the API surface area grows.

Now, back to the two banks. Why would one have multiple certs but not the other? Certificates cost money, so if you’re offloading SSL to a single accelerator, there’s no reason to give it multiple certs. If each server has a dedicated HSM, you could use separate certs or just generate one and export it to all the others. You need to do this anyway for backup purposes.

This is just supposition, but one thing this could indicate is a different approach to securing the private key. Instead of generating one cert and private key, you create one per server and store it without an HSM. If a server gets compromised, you revoke that server’s cert and move on. This might seem like a good idea to some since the cost of a cert must be lower than that of an HSM. However, the ineffectiveness of revocation today shows this to be a dangerous choice.

There may be other explanations for this. Perhaps Citi uses individual HSMs and Wells Fargo has a single SSL accelerator with plaintext HTTP in the backend. Perhaps they got a bargain on certs by buying in bulk. However, any time a system has more keys than necessary, it can lead to complicated key management. Or worse, it may indicate a weaker system design overall.

There’s no way to know the real story, but it’s good food for thought for anyone else who might be considering multiple certs as a substitute for strong private key protection. Cert revocation doesn’t currently work and should not be relied on.

April 6, 2011

Fixing the SSL cert nightmare

Filed under: Crypto,Protocols,Security — Nate Lawson @ 6:08 am

Recently, there has been an uproar as Comodo failed to do the one thing their business is supposed to do — issue certificates only to authenticated parties. (But what do you expect for a $5, few-questions-asked cert?) I’m glad there’s renewed focus on fixing the current CA and SSL browser infrastructure because this is one of the largest and most obvious security flaws in an otherwise successful protocol.

In response to this compromise, many people are recommending drastic changes. One really bad idea is getting rid of root certs in favor of SSH-style host verification. There are also some good proposals though:

Paring down the number of root certs in common browsers

Root cert proliferation has gotten out-of-hand in all the major browsers. It would be great if someone analyzed which CAs are essential for all hostnames in the top 1000 sites. That info could be used to prune root certs to a smaller initial set.

It is unlikely the browser vendors will do this work themselves. They have clear financial and political incentives to add new CAs and few incentives to remove them. Even if you just consider the collateral damage to innocent sites when a root cert is removed, the costs can be huge.

The EFF had a promising start, but not much appears to have been done recently. However, they did publish this article yesterday, including a list of CAs sorted by the number of times each CA signed an unqualified hostname. (Comodo is second only to Go Daddy, by the way, and Verisign is pretty high as well).

Notifying the user when a site’s cert changes

This is a pretty simple idea that has some merit. Your browser would cache the cert the first time you connect to a given server. If it changes when you revisit the site, you would get a warning. (Bonus: maintain a worldwide cache of these certs and correlate observations from various locations, like Perspectives or Google’s DNS-based cert history.)
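A trust-on-first-use check along these lines is only a few lines of code. This sketch caches a fingerprint per hostname in a local file (the file location is arbitrary) and warns when it changes, which is exactly where the multi-cert sites discussed below would cause false positives.

    import hashlib
    import json
    import os
    import ssl

    PIN_FILE = os.path.expanduser("~/.cert_pins.json")  # arbitrary cache location

    def check_pin(host, port=443):
        # Fetch the server cert (PEM) and fingerprint it
        pem = ssl.get_server_certificate((host, port))
        fingerprint = hashlib.sha256(pem.encode()).hexdigest()

        pins = {}
        if os.path.exists(PIN_FILE):
            with open(PIN_FILE) as f:
                pins = json.load(f)

        if host not in pins:
            pins[host] = fingerprint          # first visit: remember it
        elif pins[host] != fingerprint:
            print(f"WARNING: certificate for {host} changed since last visit")

        with open(PIN_FILE, "w") as f:
            json.dump(pins, f)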

This would have helped in the Comodo case but wouldn’t notify the user if the compromised CA were the same as the server’s current one. This scenario actually occurred in 2001 when Verisign issued another Microsoft code-signing cert to someone posing as an employee.

One usability problem of persistent cert chains is the fact that some sites use many different certs. For example, Citibank appears to have one cert per webserver, something we discuss more in our next post. This means users would get lots of spurious warnings when using their site.

Keep a hit count of CAs previously seen

This is a simple idea I came up with the other day that may be worth investigating. I’d like to see a CA “hit count” displayed alongside the list of root certs in my browser. This would help in auditing which certs are actually used by sites I visit and which are dormant. This could include the hostnames themselves as a collapsible list under each CA cert.

The important goal in considering all these proposals is to avoid making the problem worse. Nearly everyone agrees that the current situation has become untenable, and we need solutions to certificate problems now.

December 27, 2010

Building a USB protocol analyzer

Filed under: Hacking,Hardware,Protocols — Nate Lawson @ 1:58 pm

The recent effort by bushing‘s team to develop an open-source USB protocol analyzer reminded me of a quick hack I did previously. I was debugging a tricky USB problem but only had an oscilloscope.

If you’ve been following this blog, you know one of my hobby projects has been designing a USB interface for old Commodore floppy drives. The goal is to archive old data, including the copy-protection bits, before the media fails. Back in January 2009, I was debugging the first prototype board. Most of the commands succeeded but one would fail immediately every time I sent it.

I tried a software USB analyzer, but it didn’t show any more information. The command was returning almost immediately with no data. Debugging output on the device’s UART didn’t show anything abnormal, except it was never receiving the problem command. So the problem had to be between the host and target’s USB stacks, and possibly was in the AVR‘s hardware USB state machine. Only a bus analyzer could reveal what was going on.

Like other hobby developers, I couldn’t justify the cost of a dedicated USB analyzer just to troubleshoot this one problem, especially in a design I would be releasing for free. Since I did have an oscilloscope at work, I decided to build a USB decoding stack on top of it.

USB, like Ethernet and TCP/IP, is a combination of protocols. The lowest layer is the physical cabling and bit signalling. On top of this is packet framing and device addressing. Next, each device has a set of endpoints. These are analogous to TCP/UDP ports and support control, bulk, or interrupt message types. The standard control endpoint (address 0) handles a set of common configuration messages. Other endpoints are device-specific.

High-speed signalling (480 Mbit/s) is a bit different from full/low-speed, so I won’t describe it here. Suffice to say, you can just put a USB 1.1 hub between your device and host to force it to downgrade speeds. Unless you’re trying to debug a problem with high-speed signalling itself, this is sufficient to debug protocol-level issues.

The USB physical layer uses differential current flow to signal bits. This balances the charge, decreasing the latency for line transitions and increasing noise rejection.  I hooked up probes to the D+ and D- lines and saw a trace like this:

Each zero bit in USB is signalled by a transition, low to high or high to low. A one bit is signalled by no transition for the clock period. (This is called NRZI encoding.) Obviously, there’s a chance for the sender and receiver clocks to drift out of sync if there are too many one bits in a row, so a zero bit is stuffed into the frame after every 6 one bits. It is discarded by the receiver. An end-of-packet is signalled by a single-ended zero (SE0), which is both lines held low. You can see this at the beginning of the trace above.
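These two steps, NRZI decoding and bit unstuffing, are the bottom of the decoder and are easy to sketch in Python (assuming one sampled line level per bit time, with the first sample being the idle level before the packet):

    def nrzi_decode(levels):
        # A transition between adjacent bit cells is a 0, no transition is a 1
        bits, prev = [], levels[0]
        for level in levels[1:]:
            bits.append(0 if level != prev else 1)
            prev = level
        return bits

    def unstuff(bits):
        # Drop the 0 bit the transmitter inserts after every six consecutive 1s
        out, ones, i = [], 0, 0
        while i < len(bits):
            b = bits[i]
            out.append(b)
            if b == 1:
                ones += 1
                if ones == 6:
                    i += 1      # skip the stuffed 0 that must follow
                    ones = 0
            else:
                ones = 0
            i += 1
        return out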

To start each packet, USB sends a 0x80 byte, least-significant bit first. This is 7 transitions followed by a one bit, allowing the receiver to synchronize its clock on it. You can see this in the trace above, just after the end-of-packet from the previous frame. After the sync bits, the rest of the frame is byte-oriented.

The host initiates every transaction. In a control transfer, it sends the command packet, generates an optional data phase (in/out from device), and ends with a status phase. If the transaction failed, the device returns an error byte.
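Once a byte-aligned frame reaches the higher layers, the first thing to parse in a control transfer is the standard 8-byte SETUP packet (all fields little-endian). A quick sketch of its layout:

    import struct

    def setup_packet(bmRequestType, bRequest, wValue, wIndex, wLength):
        # Standard 8-byte SETUP packet that begins every control transfer
        return struct.pack("<BBHHH", bmRequestType, bRequest, wValue, wIndex, wLength)

    # Example: GET_DESCRIPTOR (bRequest 6) for the 18-byte device descriptor
    pkt = setup_packet(0x80, 0x06, 0x0100, 0x0000, 18)
    print(pkt.hex())   # 8006000100001200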

My decoding script implemented all the layers in the quickest way possible. After taking a scope trace, I’d dump the samples to a file. The script would then run through them, looking for the first edge. If this edge was part of a sync byte, it would begin byte-aligned decoding of a frame to pass up to higher-level functions. At the end of the packet, it would go back to scanning for the next edge. Using Python’s generators made this quite easy since it was just a series of nested loops instead of a complicated state machine.
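The overall shape was something like the sketch below (a simplified reconstruction, not the original script): the lower layers hand up a stream of decoded bits plus a fake end-of-packet symbol, and a generator scans for the sync pattern, assembles LSB-first bytes, and yields each frame to the higher-level parser.

    SYNC = (0, 0, 0, 0, 0, 0, 0, 1)   # 0x80 sent least-significant bit first
    EOP = "EOP"                        # fake end-of-packet symbol from the lower layer

    def frames(symbols):
        # 'symbols' is the NRZI-decoded, unstuffed bit stream with EOP markers
        it = iter(symbols)
        window = []
        for sym in it:
            window = (window + [sym])[-8:]
            if tuple(window) != SYNC:
                continue                       # still hunting for the next sync
            frame, bits = bytearray(), []
            for sym in it:                     # byte-aligned from here on
                if sym == EOP:
                    break
                bits.append(sym)
                if len(bits) == 8:
                    frame.append(sum(b << n for n, b in enumerate(bits)))
                    bits = []
            yield bytes(frame)                 # pass the frame up a layer
            window = []

    # Example: sync, one data byte (0xA5, LSB first), end of packet
    stream = list(SYNC) + [1, 0, 1, 0, 0, 1, 0, 1] + [EOP]
    print([f.hex() for f in frames(stream)])   # ['a5']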

Since this was a quick hack, I cut corners. To detect the SE0 end-of-packet, you really need to monitor both D+ and D-. At higher speeds, the peaks get lower since less current is exchanged. However, at lower speeds, you can ignore this and just put a scope probe on the D- line. Instead of proper decoding of the SE0, I’d just decode each frame until no more data was expected and then yield a fake EOP symbol to the upper layers.

After a few days of debugging, I found the problem. The LUFA USB stack I was using in my firmware had a bug. It had a filter for standard control messages (such as endpoint configuration) that it handled for you. Class-specific transactions were passed up to a handler in my firmware. The bug was that the filter was too permissive — all control transfers of type 6, even if they were class-specific, were captured by LUFA. This ended up returning an error without ever passing the message to my firmware. (By the way, the LUFA stack is excellent, and this bug has long since been fixed).

Back in the present, I’m glad to see the OpenVizsla project creating a cheaper USB analyzer. It should be a great product. Based on my experience, I have some questions about their approach I hope are helpful.

It seems kind of strange that they are going for high-speed support. Since the higher-level protocol messages you might want to reverse-engineer are the same regardless of speed, it would be cheaper to just handle low/full speed and use a hub to force devices to downgrade. I guess they might be dealing with proprietary devices, such as the Kinect, that refuse to operate at lower speeds. But if that isn’t the case, their namesake, the Beagle 12, is a great product for only $400.

I have used the Total Phase Beagle USB analyzers, and they’re really nice. As with most products these days, the software makes the difference. They support Windows, Mac, and Linux and have a useful API. They can output data in CSV or binary formats. They will be supporting USB 3.0 (5 Gbps) soon.

I am glad OpenVizsla will be driving down the price for USB analyzers and providing an option for hobbyists. At the same time, I have some concern that it will drive away business from a company that provides open APIs and well-supported software. Hopefully, Total Phase’s move upstream to USB 3.0 will keep them competitive for people doing commercial development and the OpenVizsla will fill an underserved niche.

November 29, 2010

Final post on Javascript crypto

Filed under: Crypto,Network,Protocols,Security — Nate Lawson @ 7:00 am

The talk I gave last year on common crypto flaws still seems to generate comments. The majority of the discussion is by defenders of Javascript crypto. I made JS crypto a very minor part of the talk because I thought it would be obvious why it is a bad idea. Apparently, I was wrong to underestimate the grip it seems to have on web developers.

Rather than repeat the same rebuttals over and over, this is my final post on this subject. It ends with a challenge — if you have an application where Javascript crypto is more secure than traditional implementation approaches, post it in the comments. I’ll write a post citing you and explaining how you changed my mind. But since I expect this to be my last post on the matter, read this article carefully before posting.

To illustrate the problems with JS crypto, let’s use a simplified example application: a secure note-taker. The user writes notes to themselves that they can access from multiple computers. The notes will be encrypted by a random key, which is itself encrypted with a key derived from a passphrase. There are three implementation approaches we will consider: traditional client-side app, server-side app, and Javascript crypto. We will ignore attacks that are common to all three implementations (e.g., weak passphrase, client-side keylogger) and focus on their differences.
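The key layout itself is the same no matter where it runs; here is a rough Python sketch using PBKDF2 and AES-GCM (requires the third-party cryptography package; the iteration count and sizes are illustrative, not a recommendation):

    import hashlib
    import os

    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    def wrap_new_note_key(passphrase: bytes):
        # Derive a key-encryption key from the passphrase, then use it to
        # wrap a freshly generated random data key for the notes.
        salt = os.urandom(16)
        kek = hashlib.pbkdf2_hmac("sha256", passphrase, salt, 200_000)
        data_key = AESGCM.generate_key(bit_length=256)
        nonce = os.urandom(12)
        wrapped_key = AESGCM(kek).encrypt(nonce, data_key, None)
        # salt, nonce, and wrapped_key are what gets stored on the server
        return salt, nonce, wrapped_key, data_key

    def encrypt_note(data_key: bytes, note: bytes) -> bytes:
        nonce = os.urandom(12)
        return nonce + AESGCM(data_key).encrypt(nonce, note, None)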

The traditional client-side approach offers the most security. For example, you could wrap PGP in a GUI with a notes field and store the encrypted files and key on the server. A client who is using the app is secure against future compromise of the server. However, they are still at risk from buggy or trojaned code each time they download it. If they are concerned about this kind of attack, they can store a local copy and have a cryptographer audit it before using it.

The main advantage to this approach is that PGP has been around almost 20 years. It is well-tested and the GUI author is unlikely to make a mistake in interfacing with it (especially if using GPGME). The code is open-source and available for review.

If you don’t want to install client-side code, a less-secure approach is a server-side app accessed via a web browser. To take advantage of existing crypto code, we’ll use PGP again but the passphrase will be sent to it via HTTP and SSL. The server-side code en/decrypts the notes using GPGME and pipes the results to the user.

Compared to client-side code, there are a number of obvious weaknesses. The passphrase can be grabbed from the memory of the webserver process each time it is entered. The PGP code can be trojaned, possibly in a subtle way. The server’s /dev/urandom can be biased, weakening any keys generated there.

The most important difference from a client-side attack is that it takes effect immediately. An attacker who trojans a client app has to wait until users download and start using it. They can copy the ciphertext from the server, but it isn’t accessible until someone runs their trojan, exposing their passphrase or key. However, a server-side trojan takes effect immediately and all users who access their notes during this time period are compromised.

Another difference is that the password is exposed to a longer chain of software. With a client-side app, the passphrase is entered into the GUI app and passed over local IPC to PGP. It can be wiped from RAM after use, protected from being swapped to disk via mlock(), and generally remains under the user’s control. With the server-side app, it is entered into a web browser (which can cache it), sent over HTTPS (which involves trusting hundreds of CAs and a complex software stack), hits a webserver, and is finally passed over local IPC to PGP. A compromise of any component of that chain exposes the password.

The last difference is that the user cannot audit the server to see if an attack has occurred. With client-side code, the user can take charge of change management, refusing to update to new code until it can be audited. With a transport-level attack (e.g., sslstrip), there is nothing to audit after the fact.

The final implementation approach is Javascript crypto. The trust model is similar to server-side crypto except the code executes in the user’s browser instead of on the server. For our note-taker app, the browser would receive a JS crypto library over HTTPS. The first time it is used, it generates the user’s encryption key and encrypts it with the passphrase (say, derived via PBKDF2). This encrypted key is persisted on the server. The notes files are en/decrypted by the JS code before being sent to the server.

Javascript crypto has all the same disadvantages as server-side crypto, plus more. A slightly modified version of all the server-side attacks still works. Instead of trojaning the server app, an attacker can trojan the JS that is sent to the user. Any changes to the code immediately take effect for all active users. There’s the same long chain of software having access to critical data (JS code and the password processed by it).

So what additional problems make JS crypto worse than the server-side approach?

  1. Numerous libraries not maintained by cryptographers — With a little searching, I found: clipperz, etherhack, Titaniumcore, Dojo, crypto-js, jsSHA, jscryptolib, pidCrypt, van Everdingen’s library, and Movable Type’s AES. None of these was written or is maintained by cryptographers. One exception is Stanford SJCL, although that was written by grad students 6 months ago, so it’s too soon to tell how actively it will be tested and maintained.
  2. New code has not been properly reviewed and there are no clear “best practices” for implementers — the oldest library I can find is 2 years old. Major platform-level questions still need to be resolved by even the better ones.
  3. Low-level primitives only — grab bag of AES, Serpent, RC4, and Caesar ciphers (yes, in same library). No high-level operations like GPGME. Now everyone can (and has to) be a crypto protocol designer.
  4. Browser is low-assurance environment — same-origin policy is not a replacement for ACLs, privilege separation, memory protection, mlock(), etc. JS DOM allows arbitrary eval on each element and language allows rebinding most operations (too much flexibility for crypto).
  5. Poor crypto support — JS has no secure PRNG such as /dev/urandom, side channel resistance is much more difficult if not impossible
  6. Too many platforms — IE, Firefox, Netscape, Opera, WebKit, Konqueror, and all versions of each. Crypto code tends to fail catastrophically in the face of platform bugs.
  7. Auditability — each user is served a potentially differing copy of the code. Old code may be running due to browser cache issues. Impossible for server maintainers to audit clients.

JS crypto is not even better for client-side auditability. Since JS is quite lenient in allowing page elements to rebind DOM nodes, even “View Source” does not reveal the actual code running in the browser. You’re only as secure as the worst script run from a given page or any other pages it allows via document.domain.

I have only heard of one application of JS crypto that made sense, but it wasn’t from a security perspective. A web firm processes credit card numbers. For cost reasons, they wanted to avoid PCI audits of their webservers, but PCI required any server that handled plaintext credit card numbers to be audited. So, their webservers send a JS crypto app to the browser client to encrypt the credit card number with an RSA public key. The corresponding private key is accessible only to the backend database. So based on the wording of PCI, only the database server requires an audit.

Of course, this is a ludicrous argument from a security perspective. The webserver is a critical part of the chain of trust in protecting the credit card numbers. There are many subtle ways to trojan RSA encryption code to disclose the plaintext. To detect trojans, the web firm has a client machine that repeatedly downloads and checksums the JS code from each webserver. But an attacker can serve the original JS to that machine while sending trojaned code to other users.

While I agree this is a clever way to avoid PCI audits, it does not increase actual security in any way. It is still subject to the above drawbacks of JS crypto.

If you’ve read this article and still think JS crypto has security advantages over server-side crypto for some particular application, describe it in a comment below. But the burden of proof is on you to explain why the above list of drawbacks is addressed or not relevant to your system. Until then, I am certain JS crypto does not make security sense.

Just because something can be done doesn’t mean it should be.

Epilogue

Auditability of client-side Javascript

I had overstated the auditability of JS in the browser environment by saying the code was accessible via “View Source”. It turns out the browser environment is even more malleable than I first thought. There is no user-accessible menu that tells what code is actually executing on a given page since DOM events can cause rebinding of page elements, including your crypto code. Thanks to Thomas Ptacek for pointing this out. I updated the corresponding paragraph above.

JS libraries such as jQuery, Prototype, and YUI all have APIs for loading additional page elements, which can be HTML or JS. These elements can rebind DOM nodes, meaning each AJAX query can result in the code of a page changing, not just the data displayed. The APIs don’t make a special effort to filter out page elements, and instead trust that you know what you’re doing.

The same origin policy is the only protection against this modification. However, this policy is applied at the page level, not script level. So if any script on a given page sets document.domain to a “safe” value like “example.net”, this would still allow JS code served from “ads.example.net” to override your crypto code on “www.example.net”. Your page is only as secure as the worst script loaded from it.

Brendan Eich made an informative comment on how document.domain is not the worst issue; the lack of privilege separation for cross-site scripts is:

Scripts can be sourced cross-site, so you could get jacked without document.domain entering the picture just by <script src=”evil.ads.com”>. This threat is real but it is independent of document.domain and it doesn’t make document.domain more hazardous. It does not matter where the scripts come from. They need not come from ads.example.net — if http://www.example.net HTML loads them, they’re #include’d into http://www.example.net’s origin (whether it has been modified by document.domain or not).

In other words, if you have communicating pages that set document.domain to join a common superdomain, they have to be as careful with cross-site scripts as a single page loaded from that superdomain would. This suggests that document.domain is not the problem — cross-site scripts having full rights is the problem. See my W2SP 2009 slides.

“Proof of work” systems

Daniel Franke suggested one potentially-useful application for JS crypto: “proof of work” systems. These systems require the client to compute some difficult function to increase the effort required to send spam, cause denial of service, or bruteforce passwords. While I agree this application would not be subject to the security flaws listed in this article, it would have other problems.

Javascript is many times slower than native code and much worse for crypto functions than general computation. This means the advantage an attacker has in creating a native C plus GPU execution environment will likely far outstrip any slowness legitimate users will accept. If the performance ratio between attacker and legitimate users is too great, Javascript can’t be used for this purpose.

He recognized this problem and also suggested two ways to address it: increase the difficulty of the work function only when an attack is going on or only for guesses with weak passphrases. The problem with the first is that an attacker can scale up their guessing rate until the server slows down and then stay just below that threshold. Additionally, she can parallelize guesses for multiple users, depending on what the server uses for rate-limiting. One problem with the second is that it adds a round-trip where the server has to see the length of the attacker’s guess before selecting a difficulty for the proof-of-work function. In general, it’s better to select a one-size-fits-all parameter than to try to dynamically scale.
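For completeness, a hashcash-style proof of work is simple to state; the sketch below finds a nonce whose SHA-256 output falls below a difficulty target. The point above is that a native or GPU implementation of this loop will be vastly faster than any Javascript version a browser runs.

    import hashlib
    import os
    from itertools import count

    def solve(challenge: bytes, difficulty_bits: int) -> int:
        # Find a nonce so that SHA-256(challenge || nonce) has 'difficulty_bits'
        # leading zero bits (about 2**difficulty_bits tries on average)
        target = 1 << (256 - difficulty_bits)
        for nonce in count():
            digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
            if int.from_bytes(digest, "big") < target:
                return nonce

    def verify(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

    challenge = os.urandom(16)
    nonce = solve(challenge, 20)       # roughly a million hashes on average
    assert verify(challenge, nonce, 20)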

Browser plugin can checksum JS crypto code

This idea helps my argument, not hurts it. If you can deploy a custom plugin to clients, why not run the crypto there? If it can access the host environment, it has a real PRNG, crypto library (Mozilla NSS or Microsoft CryptoAPI), etc. Because of Javascript’s dynamism, no one knows a secure way to verify signatures on all page elements and DOM updates, so a checksumming plugin would not live up to its promise.

