root labs rdist

September 23, 2013

20 Years of Internet

Filed under: Protocols,Security — Nate Lawson @ 5:00 am

This month marks my 20th anniversary of first connecting to the Internet. It seems like a good time to look back on the changes and where we can go from here.

I grew up in a rural area, suspecting but never fully realizing the isolation from the rest of the world, technology or otherwise. Computers and robots of the future lived in the ephemeral world of Sears catalogs and Byte magazines rescued from a dumpster. However, the amateur radio and remote-controlled plane hobbies of my father’s friends brought the world of computing and electronics to our house.

Still, communications were highly local. The VIC-20 could connect to a few BBS systems and to my father’s industrial controls for warehouse refrigeration systems (way before anyone called it SCADA). Anything beyond that incurred long-distance charges and thus was irrelevant. Only the strange messages and terminology in cracked games, distributed from faraway places like Sweden, hinted at a much broader world out there.

Towards the end of high school, our local BBS finally got a FidoNet connection. Text files started trickling in about hacking COSMOS to change your “friend’s” phone service and building colored boxes to get free calls. One of those articles described how to use the Internet. I’d spend hours trying to remember all the protocol acronyms, TCP port numbers, etc. The Internet of my imagination was a strange amalgamation of X.25, ARPA protocols, TCP/IP, and the futuristic OSI protocols that were going to replace TCP/IP.

Once I arrived at college, I was one of the first in line to register for an Internet account. Our dorm room had an always-on serial connection to the campus terminal server, and Ethernet was coming in a few weeks. It took some encouragement from my friends to make the jump to Ethernet (it was expensive, and 10BASE-T was barely standardized, so it was hard to tell if a given NIC would even work). Add in the free cable TV, and you’ve got to wonder, “what were they thinking?”

The dorm Ethernet experiment soon became a glorious free-for-all. There was a lot of Windows 3.1 and Linux, but also a few NeXTSTEP and Sun systems. The campus network admins had their hands full, bungling rushed policy changes intended to stem the flood of warez servers, IPX broadcast storms from Doom games, IRC battles, sniffing, hacking, and even a student running a commercial ISP on the side. Life on the dorm network was like a 24/7 Defcon CTF, except that if you failed, you were reinstalling your OS from 25 floppies before you could do your homework.

There were three eras I got to see: Usenet (ending in 1994), early Web (1994-1997), and commercial Web (1998 to present). The Usenet era involved major changes in distributed protocols and operating systems, including the advent of Linux and other free Unixes. The early Web era transitioned to centralized servers with HTTP, with much experimentation in how to standardize access to information (remember image maps? AltaVista vs. Lycos?). The commercial Web finally gave the non-technical world a reason to get online, to buy and sell stuff. It continues to be characterized by experimentation in business models, starting with companies like eBay.

One of my constant annoyances with technological progress is when we don’t benefit from history. Oftentimes, what comes along later is not better than what came before. This leads to gaps in progress, where you spend time recapitulating the past before you can truly move on to the predicted future.

Today, I mourn the abandonment of the end-to-end principle. I don’t mean networking equipment has gotten too smart for its own good (though it has). I mean that we’re neglecting a wealth of intelligence at the endpoints and restricting them to a star-topology, client/server communication model.

Multicast is one example of how things could be different. Much of the Internet’s data today is video streams or OS updates. Multicast allows a single transmission to be received by multiple listeners, building a dynamic tree of routes so that it traverses a minimal set of networks. Now add in forward error correction (which lets you tune in to a rotating transmission at any point in time and reconstruct the data) and distributed hash tables (which let you look up information without a central directory), and you have something very powerful.
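
For a sense of how little code basic multicast takes at the endpoint, here is a minimal receiver sketch in Python using only the standard library; the group address and port are arbitrary examples, not anything from a real deployment.

    # Minimal IP multicast receiver (group and port are arbitrary examples).
    import socket

    MCAST_GROUP = "239.1.2.3"   # administratively-scoped example group
    MCAST_PORT = 5007           # example port

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", MCAST_PORT))

    # Join the group on the default interface (struct ip_mreq: group + interface).
    mreq = socket.inet_aton(MCAST_GROUP) + socket.inet_aton("0.0.0.0")
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

    while True:
        data, sender = sock.recvfrom(4096)
        print(sender, data)

A sender is even simpler: it calls sendto() on a UDP socket aimed at the group address, and the network, not a central server, handles fan-out to every listener that joined.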

BitTorrent is a hack that leverages an oversight in the ISP pricing model. Since upload bandwidth on home broadband was paid for but underutilized, BitTorrent could reduce the load on centralized servers by augmenting them with users’ connections. This was a clever way to improve on the existing star topology of HTTP downloads, but it would have been unnecessary if proper distributed systems using multicast were available.

We have had the technology for 20 years but a number of players have kept it from being widely deployed. Rapid growth in backbone bandwidth meant there wasn’t enough pricing pressure to reduce wastefulness. The domination of Windows and its closed TCP/IP stack meant it was difficult to innovate in a meaningful way. (I had invented a TCP NAT traversal protocol in 1999 that employed TCP simultaneous connect, but Windows had a bug that caused such connections to fail so I had to scrap it.) There have been bugs in core router stacks, and so multicast is mostly disabled there.
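
For illustration, here is a rough Python sketch of the general trick, not the 1999 protocol itself: in a TCP simultaneous open, both peers connect() to each other’s agreed-upon port at roughly the same time, the SYNs cross, and a single connection is established without either side ever calling listen(). The addresses and ports are placeholders, and in practice each side retries until the attempts line up.

    # One half of a TCP simultaneous-open attempt. The other peer runs the
    # mirror image at the same time; addresses and ports are placeholders.
    import socket

    LOCAL_PORT = 45000                    # port this peer told the other to aim at
    PEER_ADDR = ("198.51.100.7", 45001)   # public address/port of the other peer

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", LOCAL_PORT))

    # No listen()/accept(): if the outgoing SYNs cross, both connect() calls
    # succeed and each side holds the same established connection.
    sock.settimeout(10)
    sock.connect(PEER_ADDR)
    sock.sendall(b"hello from a simultaneous open\n")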

Firewalls are another symptom of the problem. If you had a standardized way to control endpoint communications, there would be no need for firewalls. You’d simply set policies for the group of computers you controlled and the OS on each would figure out how to apply them. However, closed platforms and a lack of standardization mean that not only do we still have network firewalls, but numerous variants of host-based firewalls as well.

Since the late ’90s, money has driven an intense focus on web-based businesses. In this latest round of tech froth, San Francisco is the epicenter instead of San Jose. Nobody cares what router they’re using, and there’s a race to be the most “meta”. Not only did EC2 mean you don’t touch the servers, but now Heroku means you don’t touch the software. But as you build higher, the architectures get narrower. There is no HTTP multicast, and the same-origin policy means you can’t even implement BitTorrent in browser JavaScript.

It seems like decentralized protocols only appear in the presence of external pressure. Financial pressure hasn’t been enough so far, but legal pressure led to Tor, magnet links, etc. Apple has done the most of anyone commercially to build distributed systems into their products (Bonjour service discovery, AirDrop direct file sharing), but these capabilities are not used by many applications. Instead, we get simulated distributed systems like Dropbox, which are still based on the star topology.
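
As a small taste of what application-level use of this could look like, here is a service-discovery sketch using the third-party python-zeroconf package, which speaks the same mDNS/DNS-SD protocols as Bonjour. The service type shown is just an example; any advertised type works the same way.

    # Browse for HTTP services advertised over mDNS/DNS-SD (the protocols behind
    # Bonjour). Requires the third-party "zeroconf" package (pip install zeroconf).
    import time
    from zeroconf import Zeroconf, ServiceBrowser

    class Listener:
        def add_service(self, zc, type_, name):
            print("found:", name, zc.get_service_info(type_, name))

        def remove_service(self, zc, type_, name):
            print("gone:", name)

        def update_service(self, zc, type_, name):
            pass

    zc = Zeroconf()
    browser = ServiceBrowser(zc, "_http._tcp.local.", Listener())
    time.sleep(10)   # give nearby services a moment to respond
    zc.close()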

I hope that the prevailing trend changes, and that we see more innovations in smart endpoints, chatting with each other in a diversity of decentralized, standardized, and secure protocols. Make this kind of software stack available on every popular platform, and we could see much more innovation in the next 20 years.

May 6, 2013

Keeping skills current in a changing world

Filed under: Misc — Nate Lawson @ 10:00 am

I came across this article on how older tech workers are having trouble finding work. I’m sure many others have written about whether this is true, whose fault it is, and whether H-1B visas should be increased or not. I haven’t done the research, so I can’t comment on those questions, but I do know a solution to the problem of out-of-date skills.

The Internet makes developing new skills extremely accessible. With a PC and free software, you can teach yourself almost any “hot skill” in a week. Take this quote, for example:

“Some areas are so new, like cloud stuff, very few people have any experience in that,” Wade said. “So whether they hire me, or a new citizen grad, or bring in an H-1B visa, they will have to train them all.”

Here’s a quick way to learn “cloud stuff”. Get a free Amazon AWS account. Download their Java-based tools or boto for Python. Launch a VM, ssh in, and you’re in the cloud. Install Ubuntu, PostgreSQL, and other free and open-source software. Read the manuals and run a couple of experiments, building on your current background. If you were a DBA, try SimpleDB. If you were a sysadmin, try EC2 and monitoring.
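
To make that concrete, here is a minimal boto sketch (the Python AWS library of the time) for launching a single instance. The region, AMI ID, and key pair name are placeholders you would replace with values from your own account.

    # Launch one EC2 instance with boto. The AMI ID, key pair, and region are
    # placeholders; boto reads your AWS credentials from the environment or config.
    import time
    import boto.ec2

    conn = boto.ec2.connect_to_region("us-east-1")
    reservation = conn.run_instances(
        "ami-xxxxxxxx",            # placeholder AMI, e.g. an Ubuntu image
        instance_type="t1.micro",  # free-tier size at the time
        key_name="my-keypair",     # an SSH key pair created in your account
    )
    instance = reservation.instances[0]

    # Poll until the instance is up, then print the ssh command to reach it.
    while instance.state != "running":
        time.sleep(5)
        instance.update()
    print("ssh ubuntu@%s" % instance.public_dns_name)

An afternoon of poking at an instance like this, plus the manuals, covers a surprising amount of the “cloud stuff” in the quote above.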

Another quote from a different engineer:

‘If a developer has experience in Android 2.0, “the company would be hiring only someone who had at least 6 months of the 4.0 experience,” he said. “And you cannot get that experience unless you are hired. And you cannot get hired unless you provably have that experience. It is the chicken-and-the-egg situation.”’

Android is also free and includes a usable emulator for testing out your code. Or buy a cheap, Wi-Fi-only Android phone and start working from there. Within a week, you can have experience with the latest Android version. Again, at no cost other than your time.

I suspect that it’s not lack of time that is keeping unemployed engineers from developing skills. It’s a mindset problem.

‘He may not pick the right area to focus on, he said. “The only way to know for sure is if a company will pay you to take the training,” he said. “That means it has value to them.”’

In startups, there’s an approach called customer development. Essentially, it involves putting together different, lightweight experiments based on theories about what a customer might buy. If more than one customer pays you, that’s confirmation. If not, you move on to the next idea.

Compare this to the traditional monolithic startup, where you have an idea, spend millions building it into a product, and then try to get customers to buy it. You only have so much runway before you run out of money, so it’s better to try several different ideas in that time.

There’s an obvious application to job hunting. Take a variety of hypotheses (mobile, cloud, etc.) and put together a short test. Take one week to learn the skill to the point you have a “hello world” demo. Float a custom resume to your targets, leaving out irrelevant experience that would confuse a recruiter. Measure the responses and repeat as necessary. If you get a bite, spend another week learning that skill in-depth before the interview.

There are biases against older workers, and some companies are too focused on keywords on resumes. Those are real problems that need to change. However, when it comes to learning new skills, there has never been a better time to hunt for a job using simple experiments based on free resources. The only barrier is the mindset that skills come through a monolithic process of degrees, certification, or training instead of a self-directed, agile process.

January 28, 2013

History of memory corruption vulnerabilities and exploits

Filed under: Hacking,Security,Software engineering — Nate Lawson @ 5:12 am

I came across a great paper, “Memory Errors: The Past, the Present, and the Future” by van der Veen et al. The authors cover the history of memory corruption errors as well as exploitation and countermeasures. I think there are a number of interesting conclusions to draw from it.

It seems that the number of flaws in common software is still much too high. Consider what’s required to compromise today’s most hardened consumer platforms, iOS and Chrome. You need a flaw in the default install that is useful and remotely accessible, a memory disclosure bug, one or more sandbox bypasses, and often a kernel or other privilege escalation flaw.

Given a sufficiently small trusted computing base, it should be impossible to find this confluence of flaws. We clearly have too large a TCB today, since this combination of flaws has been found not once but multiple times in these hardened products. Products that haven’t been hardened require even fewer flaws to compromise, making them more vulnerable even if they have the same rate of bug occurrence.

The paper’s conclusion shows that if you want to prevent exploitation, your priority should be preventing stack, heap, and integer overflows (in that order). Stack overflows are by far still the most commonly exploited class of memory corruption flaws, out of proportion to their prevalence.

We’re clearly not smart enough as a species to stop creating software bugs. It takes a Dan Bernstein to reason accurately about software in bite-sized chunks such as in qmail. It’s important to face this fact and make fundamental changes to process and architecture that will make the next 18 years better than the last.

December 4, 2012

Has HTML5 made us more secure?

Filed under: Hacking,Network,Security — Nate Lawson @ 4:19 am

Brad Hill recently wrote an article claiming that HTML5 has made us more secure, not less. His essential claim is that over the last 10 years, browsers have become more secure. He compares IE6, ActiveX, and Flash in 2002 (when he started in infosec) with HTML5 in order to make this point. While I think his analysis is true for general consumers, it doesn’t apply to more valuable targets, who are indeed less secure with the spread of HTML5.

HTML5 is a broad grouping of features, and there are two parts that I think are important to increasing vulnerability. First, there is the growing flexibility in parsing elements for JavaScript, CSS, SVG, etc., including the interpretation of relationships between them. Second, there’s the exposure of complex decoders for images, video, audio, storage, 3D graphics, etc. to untrusted sources.

If you look at the vulnerability history for these two groups, both are common culprits in flaws that lead to untrusted code execution. They still regularly exhibit “game over” vulnerabilities in Firefox and Chrome. Displaying a PNG was exploitable as recently as 2012. Selecting a font via CSS was exploitable in 2010. In many cases, these types of bugs are interrelated; a flaw in a codec could require heap grooming via JavaScript to be reliably exploitable. By standardizing more parsing and more complex decoders exposed to remote, untrusted content, HTML5 increases the attack surface of exactly the components that are still the biggest source of code execution vulnerabilities in the browser, despite attempts to audit and harden them.

Additionally, it exposes elements that have not had this kind of attention. WebGL hands over access to your 3D graphics stack, something which even CERT thinks is worth disabling. If you want to know the future of exploitation, you need to keep an eye on the console and iPhone/Android hacking groups. 3D shaders were the first software exploit of the Xbox 360, a platform that is much more secure than any browser. And Windows GDI was remotely exploitable in 2009. Firefox WebGL is built on top of Mesa, which is software from the bad old days of 1993. How is it going to do any better than Microsoft’s most secure platform?

As an aside, a rather poor PR battle about WebGL is worth addressing here. An article by a group called Context in 2011 raised some of these same issues, but their exploit was only a DoS. Mozilla devs jumped on this right away. Their solution is a whitelist and blacklist for graphics drivers. A blacklist is great for everyone after a 0-day has been discovered and fixed and deployed, but not so good before then.

Call me a Luddite, but I measure security by what I can easily disable, route around, or ignore. Flash is easily blocked and can be uninstalled. JavaScript can be disabled with a browser setting or filtered. But HTML5? Well, that’s knit into pretty much every area of the browser. You want to disable WebGL? No checkbox, but at least there’s about:config. Just make sure no one set “webgl.force-enabled” or whatever the next software update adds to your settings. Want to disable parts of CSS but not page layout? Want a no-codec browser? Get out the compiler.

Browser vendors don’t care about the individual target getting compromised; they care about the masses. The cost/benefit tradeoffs for these two groups are completely opposite. Otherwise, we’d see vendors competing to remove as many features as possible and produce the qmail of browsers.

Security happens in waves. If you’re an ordinary user, the work of Microsoft and Google in particular has paid off for you over the past 10 years. But woe to you if you manage high-value targets. The game of whack-a-mole with the browser vendors has been getting worse, not better. The more confident they get from their bug bounties and hardening, the more likely they are to add complex, deeply intertwined features. And so the pendulum begins swinging back the other way for everyone.

September 7, 2012

Toggl time-tracking service failures

Filed under: Security — Nate Lawson @ 10:33 am

A while ago, we investigated using various time-tracking services. Making this quick and easy for employees is helpful in a consulting company. Our experience with one service should serve as a cautionary note for web 2.0 companies that want to sell to businesses.

Time tracking is a service that seems both boring and easy to implement, yet it is quite critical to most companies. Everyone from freelance writers to international manufacturers needs some system, ranging from paper and spreadsheets to sophisticated ERP systems. For a consulting company, lost time entries mean lost money.

As we added employees last year, we began evaluating various web-based time tracking services in comparison to our home-grown system (flat text files).

We considered using Toggl, and our experience with it started out ok but got worse as the company grew. We found that some test entries would occasionally disappear. The desktop widget was really an HTML app wrapped in a full browser (they started with Qt and moved to Chrome).

Then one day, we found that their servers had screwed up access control and everyone was being logged into a single account. You can see the messages in the attached screenshots. This was at once hilarious and horrifying, and we terminated our account immediately. No customer or sensitive data was lost, but there was no way we were ever going to use this service. There were various complaints about all this on the Toggl forums, but those posts were deleted when they updated their support website.

Dealing with distributed systems and the CAP theorem is particularly hard. Startups ignore this at their own (or their customers’) peril. Time tracking seems simple enough, but you can never lose data. For customers that don’t use code names for clients and projects, a data compromise could be devastating.

I debated whether to write this post since I don’t have a grudge against Toggl. However, I am hoping our experience will be informative as to what can happen when you outsource ownership of data that is directly tied to your revenue.

August 14, 2012

Cyber-weapon authors catch up on blog reading

One of the more popular posts on this blog was the one pointing out how Stuxnet was unsophisticated. Its use of traditional malware methods and lack of protection for the payload indicated that the authors were either “Team B” or in a big hurry. The post was intended to counteract the breathless praise in the press for the advent of sophisticated “cyber-weapons”.

This year, the New York Times published more information that supported both theories. The authors may not have had a lot of time, given political pressure and concern about Iran’s progress. And the uneasy partnership between the US and Israel may have led both parties to keep their best tricks in their back pockets.

A lot of people seemed skeptical about the software protection method I described called “secure triggers”. (I had written about this before also, calling it “hash-and-decrypt”.) The general idea is to gather information about the environment in order to generate a cryptographic key, which is used to decrypt the payload. If even one bit of info is incorrect, the payload can’t be decrypted. The analyst has to brute-force the proper environment, which can be made infeasible if there’s enough entropy and/or the validation method is too slow.
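
A minimal sketch of the idea in Python, using only the standard library: the environment attributes, the stored check value, and the XOR “decryption” are all placeholders standing in for whatever a real implementation would choose.

    # Sketch of hash-and-decrypt (secure triggers): derive the key from the
    # target environment so the payload only decrypts on the intended machine.
    # Everything here is a placeholder; a real design would use a real cipher.
    import hashlib
    import platform
    import socket

    def derive_key():
        # Environment attributes picked by the author at build time (examples).
        fingerprint = "|".join([
            socket.gethostname(),
            platform.system(),
            platform.release(),
        ])
        return hashlib.sha256(fingerprint.encode()).digest()

    def try_decrypt(payload, expected_key_digest):
        key = derive_key()
        # Only proceed if this is the intended environment. An analyst without
        # that environment is left brute-forcing the fingerprint.
        if hashlib.sha256(key).hexdigest() != expected_key_digest:
            return None
        # Repeating-key XOR as a stand-in for real decryption of the payload.
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(payload))

Storing a hash of the key, rather than attempting decryption blindly, only tells the sample when it has found the right host; it does not help the analyst skip the search.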

The critics claimed that secure triggers were too complicated or unable to withstand malware analyst scrutiny. However, this approach had been used successfully in everything from Core Impact to Blu-ray to Team Twiizers exploits, so it was feasible. Either the malware developers were not aware of this technique or there were other constraints, such as time, preventing it from being used.

Now we’ve got Gauss, which uses (surprise!) this exact technique. And, it turns out to be somewhat effective in preventing Kaspersky from analyzing the payload. We either predicted or caused the future, take your pick.

Is this the endgame? Not even close, but it does mean we’re ready for the next stage.

The malware industry has had a stable environment for a while. Targeted attacks were rare, and most malware authors didn’t spend much effort building custom protection into their payloads. Honeypots and local analysis methods assume the code and behavior remain stable between the malware analyst’s environment and the intended target.

In the next stage, proper use of mechanisms like secure triggers will divide malware analysis into two phases: infection and payload. The infection stage can be analyzed with traditional techniques in order to find the security flaws exploited, propagation method, etc. The payload stage will change drastically, with more effort being spent on in situ analysis.

When the payload only decrypts and runs on a single target system, the malware analyst will need direct access to the compromised host. There are several forms this might take. The obvious one is providing a remote shell to the analyst to log in, attach a debugger, try to get a memory dump of the process, etc. This is dangerous because it involves giving an outsider access to a trusted system, and one that might be critical to other operations. Even if a whole-system memory dump is generated, say by physical access or a cold-boot attack, there is still going to be a lot of sensitive information there.

Another approach is emulation. The analyst uses a VM that turns all local syscalls into remote ones. This is connected to the compromised target host (or a clone of it), which runs a daemon to answer the API queries. The malware sample or relevant portions of it (such as the hash-and-decrypt routine) are run in the analyst’s VM, but the information the routine gathers comes directly from the compromised host. This allows the analyst to gather the relevant information while not having full access to the compromised machine.
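
Here is a toy version of that split, using the standard library’s XML-RPC modules purely for illustration; the host address, port, and the single exposed query are assumptions, not anything from a real tool. A small daemon on the (cloned) compromised host answers environment queries, and the trigger routine running in the analyst’s sandbox asks the daemon instead of its own OS.

    # Toy remote-environment emulation. Run with "server" on the compromised
    # host (or a clone) and "client" in the analyst sandbox. Address, port, and
    # the exposed query are placeholders.
    import socket
    import sys
    from xmlrpc.client import ServerProxy
    from xmlrpc.server import SimpleXMLRPCServer

    def run_server():
        # Daemon on the target: answers environment queries over RPC.
        server = SimpleXMLRPCServer(("0.0.0.0", 8000), allow_none=True)
        server.register_function(socket.gethostname, "gethostname")
        server.serve_forever()

    def run_client():
        # Analyst side: the trigger routine sees the live target environment,
        # not the sandbox it is actually running in.
        target = ServerProxy("http://192.0.2.10:8000")  # placeholder address
        print("target hostname:", target.gethostname())

    if __name__ == "__main__":
        run_server() if sys.argv[1:] == ["server"] else run_client()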

In the next phase after this, malware authors add more anti-emulation checks to their payload decryption routine. They try to prevent this routine from being run in isolation, in an emulator. Eventually, you end up in a cat-and-mouse game of Core Wars on the live hardware. Malware keeps a closely-synchronized global heartbeat so that any attempt to dump and restart it on a single host corrupts its state irrecoverably. The payload, its triggers, and encryption keys evolve in coordination with the other hosts on the network and are tied closely to each machine’s identity.

Is this where we’re headed? I’m not sure, but I do know that software protection measures are becoming more, not less, relevant.
