Last time, I asked the question, “why did it take 24 years for buffer overflow exploits to mature?” The factors relevant to answering this question are particular to three eras: academic computing (early 1970’s), the rise of the Internet (1988), and x86 unification (1996 to present).
In the first era, academic computing, few systems were networked, so buffer overflow flaws could only be used for local privilege escalation attacks. (The main type of networking was dialup terminal access, a simpler interface that was usually not vulnerable to buffer overflows.) Gaining control of an application was a concern for the designers of the trusted computing criteria (the Rainbow Books), where objects were tagged with a persistent security level enforced by the TCB, not by the program that accessed them. Except for the most secure military systems, computers were mainly used by academia and businesses, which focused more on password security and preventing access by unsophisticated attackers. Why write a complex exploit when there are weak or no passwords on system accounts?
In 1988, the wide availability of Unix source code and a common CPU architecture (the VAX) led to the first malicious buffer overflow exploit. Before the rise of the minicomputer, there were many different CPU architectures on the Arpanet, and the software stacks were also changing rapidly (TOPS-20, RSX-11, VMS). By the late 1980’s, however, the popularity of DEC, Unix, and BSD in particular had put a significant number of BSD VAX systems on the Internet, making it probably the most homogeneous it had been up to that point.
I think the widespread knowledge of the VAX architecture, along with the number of BSD systems on the Internet, led to it being the target of the first malicious buffer overflow exploit. Building a worm to target a handful of systems wouldn’t have been nearly as much fun, and BSD and the VAX had hit a critical mass on the Internet. More general Unix flaws (e.g., the trust relationships used by the “r” utilities) were also exploited by the worm, but the VAX buffer overflow was the only such flaw the author chose to target, or perhaps the only one he was familiar enough with to exploit.
With this lone exception, further exploitation of buffer overflows remained quite secret. Sun workstations and servers running 68k and then SPARC processors became the Internet hosts of choice by the 1990’s. However, there was significant fragmentation, with many hosts running IRIX on SGI MIPS, IBM AIX on the RT and RS, and HP-UX on PA-RISC. Even though Linux and the various BSD distributions had been launched, they were mostly used by hobbyists as client PCs, not servers.
Meanwhile, a network security community had developed around mailing lists such as Bugtraq and Zardoz. Most members were sysadmins and hackers, and few vendors actively participated (Sun was a notable exception). Exploits and fixes/workarounds were regularly published, most of them targeting logic flaws such as race conditions or file descriptor leakage across trust domains. (On a side note, I find it amusing that any debate about full disclosure back then usually involved how many days someone should wait between notifying the vendor and publishing an exploit, not an advisory. Perhaps we should be talking in those same terms today.)
In 1995, the various buffer overflow exploits being published were only for non-x86 systems (SunOS and HP-UX, in particular). Such systems were still the most prevalent servers on the Internet. The network security community was focused on this kind of “big iron,” with few exploits published for open-source Unix systems or Windows, which was almost exclusively a client OS. This changed with the splitvt exploit for Linux on x86, although it took a year for the real impact of this change to appear.
One of the reasons buffer overflows were so rarely exploited on Unix workstations in the early 1990’s was the difficulty of getting CPU and OS architecture information. Vendors kept a lot of this information private or only provided it in costly books. Sure, you could read the binutils source code, but without an existing understanding of the assembly behavior, it was difficult to gain this kind of low-level knowledge. Also, common logic flaws were easier to exploit for the Unix-centric sysadmin community that populated Bugtraq.
In 1996, several worlds collided and buffer overflow exploitation grew rapidly. First, Linux and BSD x86 systems had finally become common enough on the Internet that people cared about privilege escalation exploits specific to that CPU architecture. Second, open-source Linux and BSD code made it easy to understand stack layout, trap frames, and other system details. Finally, the virus community had become interested in network security and brought years of low-level x86 experience.
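To see what that stack-layout knowledge buys an attacker, here is a minimal sketch in C (with hypothetical names, not taken from any historical exploit) of the classic flaw: a fixed-size buffer on the stack is filled from attacker-controlled input with no bounds check, so a long enough input overwrites the saved return address and the function “returns” into code of the attacker’s choosing.

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical vulnerable function: copies attacker-controlled input
       into a fixed-size stack buffer with no bounds check. Input longer
       than the buffer overwrites the saved frame pointer and return
       address stored above it on the stack, redirecting execution when
       the function returns. */
    void greet(const char *name)
    {
        char buf[64];
        strcpy(buf, name);              /* no length check: classic overflow */
        printf("Hello, %s\n", buf);
    }

    int main(int argc, char **argv)
    {
        if (argc > 1)
            greet(argv[1]);             /* argv[1] is attacker-controlled */
        return 0;
    }

Turning a flaw like this into a reliable exploit is exactly where the architecture details mattered: the attacker needed to know how far the buffer sat from the saved return address and what address the injected shellcode would land at, which is why open source code and years of x86 familiarity lowered the bar so dramatically.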
The last factor is the one I haven’t seen discussed elsewhere. Virus enthusiasts (virus authors, defenders, and various interested onlookers) had been engaged in a cat-and-mouse game since the 1980’s. The game was played out almost exclusively on x86 DOS systems. During this period, attackers created polymorphic code generators (MtE) and defenders created a VM-based system for automated unpacking (TBAV). Even today, the impact of all this innovation is still slowly trickling into mainstream security products.
Once all three of these components were in place, buffer overflow exploits became common, and the field has advanced rapidly ever since. CPU architecture information is now freely available, and shellcode and sample exploits exist for all major systems. New techniques for mitigating buffer overflows, and for subverting those mitigations, are constantly appearing with no end in sight.
The advent of the Code Red and Slammer worms in the early 2000’s can be seen as a second coming of the 1988 worm. At that point, Windows servers were common enough on the Internet to be worth exploiting but did not yet hold data valuable enough to make keeping the flaws secret worthwhile. That period quickly ended as banking and virtual goods increased the value of compromising Internet hosts, and client-side exploits also became valuable. While some researchers still publish buffer overflow exploits, the difficulty of producing a reliable exploit and its associated commercial value mean that many are now sold instead. The race that started in the 1970’s continues to this day.