Digging Into the NSA Revelations

Last year was a momentous one in revelations about the NSA, technical espionage, and exploitation. I’ve been meaning for a while to write about the information that has been revealed by Snowden and what it means for the public crypto and security world.

Part of the problem has been the slow release of documents and their high-level nature. We’ve now seen about 6 months of releases, each covering a small facet of the NSA. Each article attempts to draw broad conclusions about the purpose, intent, and implementation of complex systems based on leaked, codeword-laden PowerPoint slides. I commend the journalists who have combed through this material, as it is both vague and obfuscated, but I often cringe at the resulting articles.

My rule of thumb whenever a new “earth shattering” release appears is to skip the article and go straight for the backing materials. (Journalists, please post your slide deck sources to a publicly accessible location in addition to burying them in your own site’s labyrinth of links.) By doing so, I’ve found that some of the articles are accurate, but there are always a number of unwarranted conclusions as well. Because of the piecemeal release process, there often aren’t enough additional sources to interpret each slide deck properly.

I’m going to try to address the revelations we’ve seen by category: cryptanalysis, computer exploitation, software backdoors, network monitoring, etc. There have been multiple revelations in each category over the past 6 months, but examining them in isolation has resulted in reversals and loose ends.

For example, the first conclusion upon the revelation of PRISM was that the NSA could directly control equipment on a participating service’s network in order to retrieve emails or other communications. Later, the possibility of this being an electronic “drop box” system emerged. As of today, I’m unaware of any conclusive proof as to which of these vastly differing implementations (or others) were referred to by PRISM.

However, this implementation difference has huge ramifications for what the participating services were doing. Did they provide wholesale access to their networks? Or were they providing court-ordered information via a convenient transfer method after reviewing the requests? We still don’t know for sure, but additional releases seem to confirm that at least many Internet providers did not intentionally provide wholesale access to the NSA.

Unwarranted jumping to conclusions has created a new sport: the vendor witch hunt. For example, the revelation of DROPOUTJEEP, an iPhone rootkit, was accompanied by allegations that Apple cooperated with the NSA to create it. It’s great that Jacob Appelbaum worked with Der Spiegel, applying his technical background, but he is overreaching here.

Jacob said, “either they [NSA] have a huge collection of exploits that work against Apple products … or Apple sabotaged it themselves.” This ignores a third option, which is that reliable exploitation against a limited number of product versions can be achieved with only a small collection of exploits.

Two critical pieces of information were underplayed here: the DROPOUTJEEP description was dated October 1, 2008, and it says both “the initial release will focus on installing the implant via close access methods” (i.e., physical access) and “status: in development”.

What was October 2008 like? Well, there were two iPhones, the original and the just-released 3G model. There were iOS versions 1.0 – 1.1.4 and 2.0 – 2.1 available as well. Were there public exploits for this hardware and software? Yes! The jailbreak community had reliable exploitation (Pwnage and Pwnage 2.0) on all of these combinations via physical access. In fact, these exploits were in the boot ROM and thus unpatchable and reliable. Meanwhile, ex-NSA TAO researcher Charlie Miller had publicly exploited iOS 1.x remotely in the summer of 2007.

So in October 2008, the NSA was in the process of porting a rootkit to iOS, with the advantage of a publicly-developed exploit in the lowest levels of all models of the hardware, and it was targeting physical installation. Is it any wonder that such an approach would be 100% reliable? This is a much simpler explanation, and it is not particularly flattering to the NSA.

One thing we should do immediately is stop the witch hunts based on incomplete information. Some vendors and service providers have assisted the NSA and some haven’t. Some had full knowledge of what they were doing, some should have known, and others were justifiably unaware. Each of their stories is unique and should be considered separately before assuming the worst.

Next time, I’ll continue with some background on the NSA that is essential to interpreting the Snowden materials.

History of memory corruption vulnerabilities and exploits

I came across a great paper, “Memory Errors: The Past, the Present, and the Future” by van der Veen et al. The authors cover the history of memory corruption errors as well as exploitation and countermeasures. I think there are a number of interesting conclusions to draw from it.

It seems that the number of flaws in common software is still much too high. Consider what’s required to compromise today’s most hardened consumer platforms, iOS and Chrome. You need a useful, remotely accessible flaw in the default install, a memory disclosure bug, a sandbox bypass (or several), and often a kernel or other privilege escalation flaw.

Given a sufficiently small trusted computing base, it should be impossible to find this confluence of flaws. We clearly have too large a TCB today, since this combination of flaws has been found not once but multiple times in these hardened products. Other products that haven’t been hardened require even fewer flaws to compromise, making them more vulnerable even if they have the same rate of bug occurrence.

The paper’s conclusion shows that if you want to prevent exploitation, your priority should be preventing stack, heap, and integer overflows (in that order). Stack overflows are by far still the most commonly exploited class of memory corruption flaws, out of proportion to their prevalence.

We’re clearly not smart enough as a species to stop creating software bugs. It takes a Dan Bernstein to reason accurately about software in bite-sized chunks such as in qmail. It’s important to face this fact and make fundamental changes to process and architecture that will make the next 18 years better than the last.

Has HTML5 made us more secure?

Brad Hill recently wrote an article claiming that HTML5 has made us more secure, not less. His essential claim is that over the last 10 years, browsers have become more secure. He compares IE6, ActiveX, and Flash in 2002 (when he started in infosec) with HTML5 in order to make this point. While I think his analysis is true for general consumers, it doesn’t apply to more valuable targets, who are indeed less secure with the spread of HTML5.

HTML5 is a broad grouping of features, and there are two parts that I think are important to increasing vulnerability. First, there is the growing flexibility in parsing elements for JavaScript, CSS, SVG, etc., including the interpretation of relationships between them. Second, there’s the exposure of complex decoders for images, video, audio, storage, 3D graphics, etc. to untrusted sources.

If you look at the vulnerability history for these two groups, both are common culprits in flaws that lead to untrusted code execution. They still regularly exhibit “game over” vulnerabilities in Firefox and Chrome. Displaying a PNG has been exploitable as recently as 2012. Selecting a font via CSS was exploitable in 2010. In many cases, these types of bugs are interrelated; a flaw in a codec could require heap grooming via JavaScript to be reliably exploitable. By standardizing more parsing and more complex decoders, HTML5 gives remote, untrusted content access to components that are still the biggest source of code execution vulnerabilities in the browser, despite attempts to audit and harden them.

Additionally, it exposes elements that have not had this kind of attention. WebGL hands over access to your 3D graphics stack, something which even CERT thinks is worth disabling. If you want to know the future of exploitation, you need to keep an eye on the console and iPhone/Android hacking groups. 3D shaders were the first software exploit of the Xbox 360, a platform that is much more secure than any browser. And Windows GDI was remotely exploitable in 2009. Firefox WebGL is built on top of Mesa, which is software from the bad old days of 1993. How is it going to do any better than Microsoft’s most secure platform?

As an aside, a rather poor PR battle about WebGL is worth addressing here. An article by a group called Context in 2011 raised some of these same issues, but their exploit was only a DoS. Mozilla devs jumped on this right away. Their solution is a whitelist and blacklist for graphics drivers. A blacklist is great for everyone after a 0-day has been discovered and fixed and deployed, but not so good before then.

Call me a luddite, but I measure security by what I can easily disable or route around and ignore. Flash is easily blocked and can be uninstalled. JavaScript can be disabled with a browser setting or filtered. But HTML5? Well, that’s knit into pretty much every area of the browser. You want to disable WebGL? No checkbox, but at least there’s about:config. Just make sure no one set “webgl.force-enabled” or whatever the next software update adds to your settings. Want to disable parts of CSS but not page layout? Want a no-codec browser? Get out the compiler.

Browser vendors don’t care about the individual target getting compromised; they care about the masses. The cost/benefit tradeoffs for these two groups are completely opposite. Otherwise, we’d see vendors competing for who could remove as many features as possible to produce the qmail of browsers.

Security happens in waves. If you’re an ordinary user, the work of Microsoft and Google in particular has paid off for you over the past 10 years. But woe to you if you manage high-value targets. The game of whack-a-mole with the browser vendors has been getting worse, not better. The more confident they get from their bug bounties and hardening, the more likely they are to add complex, deeply intertwined features. And so the pendulum begins swinging back the other way for everyone.

Cyber-weapon authors catch up on blog reading

One of the more popular posts on this blog was the one pointing out how Stuxnet was unsophisticated. Its use of traditional malware methods and lack of protection for the payload indicated that the authors were either “Team B” or in a big hurry. The post was intended to counteract the breathless praise in the press for the advent of sophisticated “cyber-weapons”.

This year, the New York Times published more information that supports both theories. The authors may not have had a lot of time due to political pressure and concern about Iran’s progress. The uneasy partnership between the US and Israel may have led to both parties keeping their best tricks in their back pockets.

A lot of people seemed skeptical about the software protection method I described called “secure triggers”. (I had written about this before also, calling it “hash-and-decrypt”.) The general idea is to gather information about the environment in order to generate a cryptographic key, which is used to decrypt the payload. If even one bit of info is incorrect, the payload can’t be decrypted. The analyst has to brute-force the proper environment, which can be made infeasible if there’s enough entropy and/or the validation method is too slow.
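
To make the mechanism concrete, here is a minimal sketch of hash-and-decrypt in Python. The environment attributes, the SHA-256 key derivation, and the counter-mode keystream are all illustrative choices of mine, not details taken from any actual malware:

    # Minimal "secure trigger" (hash-and-decrypt) sketch. The environment
    # attributes and SHA-256 keystream are illustrative, not from a real sample.
    import hashlib
    import itertools
    import platform

    def environment_fingerprint():
        # A real trigger would use higher-entropy, target-specific values
        # (volume serials, directory names, locale settings, etc.).
        return "|".join([platform.node(), platform.system(), platform.release()]).encode()

    def keystream(key, length):
        # SHA-256 in counter mode, purely for illustration.
        out = bytearray()
        for counter in itertools.count():
            out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
            if len(out) >= length:
                return bytes(out[:length])

    def decrypt_payload(blob, expected_digest):
        key = hashlib.sha256(environment_fingerprint()).digest()
        plaintext = bytes(a ^ b for a, b in zip(blob, keystream(key, len(blob))))
        # If even one environment bit differs, the digest check fails and the
        # analyst is left guessing fingerprints.
        if hashlib.sha256(plaintext).digest() != expected_digest:
            return None
        return plaintext

An analyst who captures the binary off-target gets the encrypted blob and the derivation routine, but not the inputs; recovering the payload means brute-forcing the fingerprint.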

The critics claimed that secure triggers were too complicated or unable to withstand malware analyst scrutiny. However, this approach had been used successfully in everything from Core Impact to Blu-ray to Team Twiizers exploits, so it was feasible. Either the malware developers were not aware of this technique or there were other constraints, such as time, preventing it from being used.

Now we’ve got Gauss, which uses (surprise!) this exact technique. And, it turns out to be somewhat effective in preventing Kaspersky from analyzing the payload. We either predicted or caused the future, take your pick.

Is this the endgame? Not even, but it does mean we’re ready for the next stage.

The malware industry has had a stable environment for a while. Targeted attacks were rare, and most new malware authors hadn’t spent a lot of effort building in custom protection for their payloads. Honeypots and local analysis methods assume the code and behavior remain stable between the malware analyst’s environment and the intended target.

In the next stage, proper use of mechanisms like secure triggers will divide malware analysis into two phases: infection and payload. The infection stage can be analyzed with traditional techniques in order to find the security flaws exploited, propagation method, etc. The payload stage will change drastically, with more effort being spent on in situ analysis.

When the payload only decrypts and runs on a single target system, the malware analyst will need direct access to the compromised host. There are several forms this might take. The obvious one is providing a remote shell to the analyst to log in, attach a debugger, try to get a memory dump of the process, etc. This is dangerous because it involves giving an outsider access to a trusted system, and one that might be critical to other operations. Even if a whole-system memory dump is generated, say by physical access or a cold-boot attack, there is still going to be a lot of sensitive information there.

Another approach is emulation. The analyst uses a VM that turns all local syscalls into remote ones. This is connected to the compromised target host (or a clone of it), which runs a daemon to answer the API queries. The malware sample or relevant portions of it (such as the hash-and-decrypt routine) are run in the analyst’s VM, but the information the routine gathers comes directly from the compromised host. This allows the analyst to gather the relevant information while not having full access to the compromised machine.
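
A toy version of that split might look like the following sketch, where the wire format, port, and the set of exposed queries are all invented for illustration:

    # Sketch of in situ analysis: the decryption routine runs in the analyst's
    # sandbox, but environment lookups are answered by a small daemon on the
    # compromised host. Wire format, port, and query names are invented here.
    import json
    import platform
    import socket
    import socketserver

    QUERIES = {
        # Expose only the specific attributes the trigger routine asks for,
        # rather than giving the analyst a shell on the target.
        "hostname": platform.node,
        "os_release": platform.release,
    }

    class EnvDaemon(socketserver.StreamRequestHandler):
        # Runs on the compromised host (or a clone of it).
        def handle(self):
            for line in self.rfile:
                name = json.loads(line)["query"]
                value = QUERIES[name]() if name in QUERIES else None
                self.wfile.write((json.dumps({"value": value}) + "\n").encode())

    class RemoteEnvironment:
        # Runs in the analyst's VM; the trigger routine is pointed at this
        # object instead of the local OS.
        def __init__(self, host, port=4444):
            self.sock = socket.create_connection((host, port))
            self.reader = self.sock.makefile("r")

        def query(self, name):
            self.sock.sendall((json.dumps({"query": name}) + "\n").encode())
            return json.loads(self.reader.readline())["value"]

    if __name__ == "__main__":
        # On the target: run this file to serve environment queries.
        socketserver.TCPServer(("0.0.0.0", 4444), EnvDaemon).serve_forever()

The trigger routine’s lookups are answered with the target’s real values, but the analyst never holds a shell or a full memory image of the compromised machine.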

In the next phase after this, malware authors add more anti-emulation checks to their payload decryption routine. They try to prevent this routine from being run in isolation, in an emulator. Eventually, you end up in a cat-and-mouse game of Core Wars on the live hardware. Malware keeps a closely-synchronized global heartbeat so that any attempt to dump and restart it on a single host corrupts its state irrecoverably. The payload, its triggers, and encryption keys evolve in coordination with the other hosts on the network and are tied closely to each machine’s identity.

Is this where we’re headed? I’m not sure, but I do know that software protection measures are becoming more, not less, relevant.

More on the evolution of password security

Last time, we covered three factors that affect actual security of a password:

  1. Entropy — How many possibilities does the attacker need to consider?
  2. Guess rate — How quickly can the attacker try guesses? This is often determined by vantage point.
  3. Responses — What can the admin do about guessing attempts?

There’s another factor that will soon come into play, if it hasn’t already — the ongoing exposure of actual passwords as more sites are compromised. We’ve seen the simplest form of this when password reuse on an unimportant account leads to elevated access of a more important one. But that’s only the tip of the iceberg.

With massive compromises of plaintext passwords, attackers now have a growing source of wordlists derived from actual usage. Not only can you add the most common passwords to a wordlist, but you can even sort them in decreasing order of frequency. An astute attacker could even apply machine learning techniques like clustering and classification to determine which other words are missing. This could be used to identify popular memes (such as Korean pop stars), and lead to new words that are likely to be used in the future.
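
Building that kind of frequency-ordered wordlist takes only a few lines once a plaintext dump is in hand. A sketch (the file names are placeholders):

    # Turn a leaked plaintext dump (one password per line) into a wordlist
    # ordered by popularity. File names are placeholders.
    from collections import Counter

    with open("leaked_passwords.txt", encoding="utf-8", errors="ignore") as f:
        counts = Counter(line.strip() for line in f if line.strip())

    with open("wordlist_by_frequency.txt", "w", encoding="utf-8") as out:
        for password, _count in counts.most_common():
            out.write(password + "\n")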

Hashed passwords posted after compromises are increasing attacker knowledge as well. Sure, your password hasn’t immediately been exposed, but it remains available to anyone with the right wordlist or enough computing power, forever. As more of these are cracked, the global picture gets clearer, and you may be vulnerable to a targeted attack long after the original site is gone.

At a higher level, compromised passwords are useful not only for identifying missing words within a group, but also for identifying the templates people use to construct passwords. After a compromise, not only are your password and close variants vulnerable, but so are the passwords of anyone using the same scheme to choose theirs. For example, automated analysis could determine that more users put the site name after the base word than before it. Or that the number 4 is more common as the first numeric value, but only with English speakers.
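
One crude way to recover those templates is to reduce each leaked password to a character-class mask and count the masks. The class labels and sample passwords below are just for illustration:

    # Reduce each password to a structural mask (l=lower, u=upper, d=digit,
    # s=symbol) and count which templates are most common.
    from collections import Counter

    def mask(password):
        out = []
        for ch in password:
            if ch.islower():
                out.append("l")
            elif ch.isupper():
                out.append("u")
            elif ch.isdigit():
                out.append("d")
            else:
                out.append("s")
        return "".join(out)

    samples = ["Summer2012!", "chelsea4", "P@ssw0rd"]   # stand-ins for a real dump
    print(Counter(mask(p) for p in samples).most_common())
    # mask("Summer2012!") == "ullllldddds"; real dumps cluster around a few masks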

All of these factors mean that attackers face less entropy as more passwords are revealed. Site compromises not only reveal passwords themselves, but the thinking of the users behind the passwords. Trends in word choice give a more optimal order for cracking. Higher-level templates used to generate passwords are also revealed. Even your joke passwords on useless sites reveal something of your thought patterns.

The only answer may be to take password selection out of the hands of users. Truly random but memorable passwords don’t reveal anything beyond the password itself. And where possible, passwords can be avoided completely. For example, tokens or out-of-band communication can often be used for authentication. Since most devices are connected, such tokens can be shared between paired devices.
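
Doing the choosing for the user is straightforward. Here is a sketch using a diceware-style wordlist; the file name is a placeholder and the six-word length is just one reasonable choice:

    # Pick a random but memorable passphrase on the user's behalf.
    # With a 7,776-word diceware-style list, six words gives ~77 bits.
    import secrets

    with open("wordlist.txt", encoding="utf-8") as f:   # placeholder wordlist
        words = [line.strip() for line in f if line.strip()]

    passphrase = " ".join(secrets.choice(words) for _ in range(6))
    print(passphrase)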

All that’s certain is that attackers will be winning the password game for years to come, and there are still many rich patterns to be mined from previous compromises.

On the evolving security of password schemes

Passwords have been around for millennia. The oldest use was to prove to sentries whether you were friend or foe. (I use “password” in this article to include a wide variety of schemes, including passphrases and numerical PINs.)

Choosing a proper password has always required considering the attacker’s capabilities and limitations. There are several factors at play:

  1. Entropy — How many possibilities does the attacker need to consider? It may be lower than you’d think, since it depends on how much the attacker already knows about you. If she has retrieved paper from your shredder, there may only be a few possibilities to try, even though the password itself is complex and impossible for someone else to guess.
  2. Guess rate — How quickly can the attacker try guesses? This is often determined by the attacker’s vantage point. If they have hashes from your server, an offline attack is many times faster than an online attack (a short worked example follows this list). It can also be limited by the attacker’s own hardware and by the hashing algorithm.
  3. Responses — What can you do about guessing? Can you disable a user’s account if there are too many bad attempts? Require secondary authentication? Can you shut down the entire system?
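
To make the first two factors concrete, here is the back-of-the-envelope arithmetic involved. The guess rates are illustrative round numbers, not benchmarks:

    # Time to exhaust the keyspace of a random 8-character password drawn
    # from ~72 symbols, online vs. offline. Rates are illustrative only.
    import math

    keyspace = 72 ** 8                        # ~7.2e14 possibilities
    print(math.log2(keyspace))                # ~49.4 bits of entropy

    online_rate = 10                          # guesses/sec against a login form
    offline_rate = 10**10                     # guesses/sec on GPUs vs. a fast hash
    seconds_per_year = 365 * 24 * 3600

    print(keyspace / online_rate / seconds_per_year)    # ~2.3 million years
    print(keyspace / offline_rate / seconds_per_year)   # ~0.002 years (about 20 hours)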

Unix systems in the 1980’s and 90’s limited passwords to 8 characters because of the underlying DES cipher key size. Because this is too short to use multiple words, most recommendations from that time tried to maximize the entropy of every character while remaining memorable. A common suggestion was to take the first letter of each word in a phrase and mix in some punctuation and numerals. This kind of scheme persists to this day, with many websites enforcing a minimum (and sometimes maximum) length and the use of an uppercase letter or numeral.

Password cracking programs such as Crack (Unix) and Cracker Jack (DOS) targeted this scheme. To mirror user behavior, they would take a dictionary (wordlist) and append numerals or change case. A useful strategy would be to start with a common wordlist and add in local terms such as sports teams or city names. After a few passwords were cracked, you could identify patterns (such as user nationality or college major) and add similar terms to your set. But as long as the user didn’t use too short of a password or an actual word or close variant as the base string, they would usually be secure against Crack.
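
A stripped-down version of those mangling rules looks something like this; the rules and wordlist are deliberately simplified for illustration:

    # Simplified Crack-style candidate generation: a base wordlist plus a few
    # mangling rules (case changes, appended digits, punctuation).
    def candidates(words):
        for word in words:
            for base in {word.lower(), word.capitalize(), word.upper()}:
                yield base
                yield base + "!"
                for n in range(100):
                    yield f"{base}{n}"

    wordlist = ["seahawks", "portland", "admin"]   # add local terms: teams, cities
    for guess in candidates(wordlist):
        pass   # hash each guess and compare against the stolen password hashes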

With the advent of the FreeBSD-MD5 scheme in the early 90’s, passwords could now be arbitrarily long. This brought login systems in line with PGP, which had supported long passwords for a while. The recommended scheme then changed to “use a difficult-to-guess passphrase.” However, not many concrete recommendations were made for what makes a passphrase difficult enough.

Many users thought that just having any passphrase was difficult enough. Who could guess all the letters and spaces among multiple words? While this might have been true if attackers stuck to Cracker Jack, it ignores the fact that attackers can change strategies. Each word can be treated like a single character as before. As long as the words were in a dictionary, multi-word passphrases might have less entropy than a password constructed the old way. Newer tools like John the Ripper help target passphrases.
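
The arithmetic behind that claim is worth spelling out. The dictionary and symbol-set sizes below are assumptions for the example, not measurements:

    # Entropy comparison: four words drawn from a 2,000-word everyday
    # vocabulary vs. eight random characters over ~72 symbols.
    import math

    passphrase_bits = 4 * math.log2(2000)    # ~43.9 bits
    old_style_bits = 8 * math.log2(72)       # ~49.4 bits
    print(passphrase_bits, old_style_bits)

If the attacker guesses whole dictionary words instead of characters, the “long” passphrase can end up the weaker of the two.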

In choosing a password, consider the entropy for multiple attacker vantage points. How much advantage would a co-worker have over a random stranger? Do they realize you like good Scotch and might use those names in your passphrase? Know you like Will Ferrell movies and might use a quote from one? A good passphrase is one where even your spouse would not have an advantage over a stranger.

Additional entropy can be gained by varying your passphrase. Misspell or make up words, Dr. Seuss style (but don’t use words from his books!). Ever heard of an “omliyeti”? Me neither, but it might be memorable. Don’t capitalize the first word or put the punctuation (if any) at the end. Put spaces in the middle of words but run the beginnings and ends together.

Admins can suggest schemes to help users pick good passwords, and they can attempt to crack their choices to establish password strength. But a user might still pick a low-entropy password that happens to pass this check. Fortunately, the latter two factors above (guess rate and responses) are independent of entropy yet still have a big impact on actual password security.

The bcrypt and scrypt password hashing algorithms have greatly slowed the attacker’s guess rate. They use hash functions that are intentionally slow (and in the case of scrypt, memory intensive). More importantly, they have a tunable difficulty parameter that allows the admin to keep pace with Moore’s Law.
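
Here is what that tunable work factor looks like with Python’s built-in scrypt; the cost parameters shown are common interactive-login starting points, not a universal recommendation:

    # Password hashing with scrypt (Python 3.6+, requires OpenSSL 1.1+).
    # n, r, p are the knobs an admin can raise as hardware gets faster.
    import hashlib
    import hmac
    import os

    def hash_password(password, n=2**14, r=8, p=1):
        # Store the salt and cost parameters next to the digest so they can
        # be raised later without breaking existing entries.
        salt = os.urandom(16)
        digest = hashlib.scrypt(password.encode(), salt=salt, n=n, r=r, p=p)
        return {"salt": salt, "n": n, "r": r, "p": p, "digest": digest}

    def verify(password, record):
        candidate = hashlib.scrypt(password.encode(), salt=record["salt"],
                                   n=record["n"], r=record["r"], p=record["p"])
        return hmac.compare_digest(candidate, record["digest"])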

Responses can be very important as well. PINs can be numeric and short because access is usually limited to online guessing with lockout after a few tries. One approach I’ve used before is to seed the password file with fake accounts that have easier passwords than the rest (but still hard enough to prevent online guessing). If anyone logs in to them, we know the password file has been retrieved and someone is cracking it.
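
A sketch of that canary idea, with the account names and the alerting hook made up for illustration:

    # Fake accounts seeded into the password file with easier (but not
    # online-guessable) passwords. Any successful login on one means the
    # password database has been stolen and is being cracked offline.
    CANARY_ACCOUNTS = {"jsmith2", "backup_svc", "mwilliams"}   # invented names

    def on_successful_login(username):
        if username in CANARY_ACCOUNTS:
            alert_security_team("canary account %r logged in; assume the "
                                "password file has been exfiltrated" % username)

    def alert_security_team(message):
        # Placeholder: page the on-call, force password resets, etc.
        print("ALERT:", message)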

Another response would be to require secondary authentication. Google does this with their text message authentication. Duo Security provides a phone app. This can be required all the time or activated when the user logs in from a new IP address or doesn’t have the prerequisite cookie.
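
The trigger logic for that “step up only when something looks new” policy is simple. The request and user objects and the SMS delivery function here are placeholders for the sketch:

    # Require a second factor only when the login comes from an unfamiliar IP
    # or lacks the device cookie set on a previous successful login.
    import hmac
    import secrets

    def needs_second_factor(user, request):
        return (request.ip not in user.known_ips
                or "device_id" not in request.cookies)

    def start_challenge(user, send_sms):
        code = f"{secrets.randbelow(10**6):06d}"   # 6-digit one-time code
        send_sms(user.phone, f"Your login code is {code}")
        return code

    def check_challenge(expected_code, submitted_code):
        # Constant-time comparison so partial matches leak nothing.
        return hmac.compare_digest(expected_code, submitted_code)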

Password security is a difficult problem, especially with a varied user base. However, most admins focus too much on increasing entropy of user choices and not enough on decreasing the attacker’s guess rate and implementing responses to limit their access when they do get a hit.