rdist

February 21, 2011

Memory address layout vulnerabilities

Filed under: Embedded,Hacking,Security — Nate Lawson @ 7:23 am

This post is about a programming mistake we have seen a few times in the field. If you live by TAOSSA (The Art of Software Security Assessment), you probably already avoid this, but it’s a surprisingly tricky and persistent bug.

Assume you’d like to exploit the function below on a 32-bit system. You control len and the contents of src, and they can be up to about 1 MB in size before malloc() or prior input checks start to error out early without calling this function.

int target_fn(char *src, int len)
{
    char buf[32];
    char *end;

    if (len < 0) return -1;
    end = buf + len;
    if (end > buf + sizeof(buf)) return -1;
    memcpy(buf, src, len);
    return 0;
}

Is there a flaw? If so, what conditions are required to exploit it? Hint: the obvious integer overflow using len is caught by the first if statement.

The bug is an ordinary integer overflow, but it is only exploitable under certain conditions. Everything depends on the address of buf in memory. If the stack is located at the bottom of the address space on your system, or if buf was located on the heap, it is probably not exploitable. If buf is near the top of the address space, as with most stacks, it may be exploitable. In other words, exploitability hinges on the runtime location of buf, and thus on the containing program itself and how it uses other memory.

The issue is that buf + len may wrap the end pointer to memory below buf. This may happen even for small values of len, if buf is close enough to the top of memory. For example, if buf is at 0xffff0000, a len as small as 64KB can be enough to wrap the end pointer. This allows the memcpy() to become unbounded, up to the end of RAM. If you’re on a microcontroller or other system that allows accesses to low memory, memcpy() could wrap internally after hitting the top of memory and continue storing data at low addresses.
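
The fix, as the comments below also point out, is to compare lengths rather than pointers. Here is a minimal sketch of the same function with that check (illustrative only):

#include <string.h>

int target_fn(char *src, int len)
{
    char buf[32];

    if (len < 0) return -1;
    /* sizeof(buf) is a compile-time constant, so this comparison
       cannot wrap no matter where buf sits in the address space. */
    if ((size_t)len > sizeof(buf)) return -1;
    memcpy(buf, src, len);
    return 0;
}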

Of course, these kinds of functions are never neatly packaged in a small wrapper and easy to find. There’s usually a sea of them and the copy happens many function calls later, based on stored values. In this kind of situation, all of us (maybe even Mark Dowd) need some help sometimes.

There has been a lot of recent work on using SMT solvers to find boundary condition bugs. They are useful, but often limited. Every time you hit a branch, you have to add a constraint (or potentially double your terms, depending on the structure). Also, inferring the runtime contents of RAM is a separate and difficult problem.

We think the best approach for now is to use manual code review to identify potentially problematic sections, then restrict the search space to that set of functions for automated verification. Despite some promising results, we’re still a long way from automated detection and exploitation of vulnerabilities. And even as the program verification field advances, additional constraints from ASLR, DEP, and software protection measures reduce the ease of exploitation.

Over the next few years, it will be interesting to see if attackers can maintain their early lead by using program verification techniques. Microsoft has applied the same approach to defense, and it would be good to see this become general practice elsewhere.

14 Comments

  1. Interesting set of slides. Another approach to 0-byte allocs that I really like is to give back a pointer to a page that has neither read nor write access. You get a pointer that can be used for a realloc (realloc is evil, but you have to provide for it), but if you attempt to use it, you get a non-exploitable crash.
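
    (Illustrative sketch of the idea using POSIX mmap; the mmap call, PROT_NONE, and the 4 KB page size are assumptions for the example, not how any particular allocator implements it.)

    #include <stddef.h>
    #include <sys/mman.h>

    /* One shared inaccessible page backs all zero-byte allocations.
       Reading or writing through the returned pointer faults immediately. */
    static void *zero_size_ptr(void)
    {
        static void *page;
        if (page == NULL)
            page = mmap(NULL, 4096, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return (page == MAP_FAILED) ? NULL : page;
    }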

    Comment by David LeBlanc — February 21, 2011 @ 11:27 am

    • David, I think that’s a great idea :) It would certainly make exploiting these types of bugs more interesting :)

      Comment by jduck — February 21, 2011 @ 3:01 pm

      • PaX has had this chunk for something like 2.5 years now… ;)


        @@ -87,10 +88,13 @@
        * ZERO_SIZE_PTR can be passed to kfree though in the same way that NULL can.
        * Both make kfree a no-op.
        */
        -#define ZERO_SIZE_PTR ((void *)16)
        +#define ZERO_SIZE_PTR \
        +({ \
        + BUILD_BUG_ON(!(MAX_ERRNO & ~PAGE_MASK));\
        + (void *)(-MAX_ERRNO-1L); \
        +})

        -#define ZERO_OR_NULL_PTR(x) ((unsigned long)(x) <= (unsigned long)ZERO_SIZE_PTR)
        +#define ZERO_OR_NULL_PTR(x) ((unsigned long)(x) - 1 >= (unsigned long)ZERO_SIZE_PTR - 1)

        Comment by PaX Team — February 22, 2011 @ 2:55 am

  2. OBTW, SafeInt tends to help with this sort of problem, and it is supported on gcc – http://www.codeplex.com/Safeint. Another thing that helps is to use the newer forms of memcpy, namely memcpy_s, which takes the size of the target and the size of the source, which also tends to reduce potential for mayhem.
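
    (Illustrative sketch: memcpy_s here is the bounds-checked copy from Microsoft’s secure CRT / ISO TR 24731-1, so this only compiles where that interface is available.)

    #include <string.h>

    int target_fn(char *src, int len)
    {
        char buf[32];

        if (len < 0) return -1;
        /* memcpy_s fails if len exceeds sizeof(buf), so the copy
           can never run past the end of buf. */
        if (memcpy_s(buf, sizeof(buf), src, (size_t)len) != 0)
            return -1;
        return 0;
    }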

    Comment by David LeBlanc — February 21, 2011 @ 11:30 am

    • This seems useful. For environments without C++ or memory protection, it might be good to have a RangedInt set of macros. Instead of writing lots of if/then statements, just do:

      RANGED_INT(val, 300, error_out);

      The macro would expand to the appropriate “if (val > 300) goto error_out” statement. If val were signed, it could add the < 0 comparison at compile time as well. If a msg string were included, it could be printed on the way out. There might be GE, LE, LT, GT, and SPAN variants for more complicated scenarios.
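
      A rough sketch of the expansion (purely illustrative, not an existing library):

      /* Hypothetical sketch: jump to err_label if val is outside [0, max].
         A smarter version could drop the < 0 test for unsigned types. */
      #define RANGED_INT(val, max, err_label)        \
          do {                                       \
              if ((val) < 0 || (val) > (max))        \
                  goto err_label;                    \
          } while (0)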

      If something like this exists, let me know.

      Comment by Nate Lawson — February 21, 2011 @ 11:51 am

      • Nate: http://gcc.gnu.org/ml/gcc/2009-07/msg00459.html

        Comment by anonymous — February 22, 2011 @ 11:10 am

      • Thanks for the reference to the AIR (“as-if infinitely ranged”) paper.

        http://www.sei.cmu.edu/library/abstracts/reports/10tn008.cfm

        My read on it is they spend a lot of time talking about how certain kinds of integer operations do not need to be checked, but what they actually implemented was “always check”.

        Here’s the link to the clang diff:
        http://article.gmane.org/gmane.comp.compilers.clang.devel/4469

        That patch just adds a range check to every add, sub, and multiply. It does not do any fancy code analysis to see where a check should be inserted or where it can be safely ignored.
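
        For a sense of what that means, each signed add ends up guarded by roughly this kind of check (my sketch, not the patch’s actual code):

        #include <limits.h>
        #include <stdlib.h>

        /* Trap instead of wrapping, per the AIR ("as-if infinitely ranged") model. */
        static int checked_add(int a, int b)
        {
            if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN + b))
                abort();
            return a + b;
        }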

        While that’s somewhat useful, it’s too bad that the optimizations they discuss were not implemented. This will hinder wider adoption due to the performance loss.

        Comment by Nate Lawson — March 4, 2011 @ 11:38 am

      • I wonder if saturating integers, which are more easily implemented in hardware, would be a good compromise? C allows that behaviour for signed types and pointers.

        They would seem to eliminate at least some of these vulnerabilities – perhaps even most?
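
        For concreteness, an unsigned saturating add is just (sketch):

        #include <stdint.h>

        /* Clamp to the maximum value instead of wrapping around. */
        static uint32_t sat_add_u32(uint32_t a, uint32_t b)
        {
            return (a > UINT32_MAX - b) ? UINT32_MAX : a + b;
        }

        If buf + len saturated rather than wrapping, end could never land below buf, so the bounds check in the original function would correctly reject the oversized copy.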

        Comment by caf — March 24, 2011 @ 7:32 pm

    • SafeInt certainly transforms the EOP bug into a lesser one, but there is no reason this function should ever overflow. Comparing either ( len > sizeof(buf) ) or ( end - buf > sizeof(buf) ) can’t overflow.

      Comment by Aaron — February 22, 2011 @ 10:01 am

  3. Interesting exploitation technique, never considered this when auditing.

    Comment by xpn — February 21, 2011 @ 11:47 am

  4. This is why you always, ALWAYS, compute buffer length, and not buffer bounds. Not foolproof, but generally you’re safe if you compare a buffer size to a buffer size:


    if (end - buf > sizeof(buf)) return -1;

    We’ve now mixed signed and unsigned arithmetic, but the compiler will coerce both to unsigned — this is safe, as we’ve already ensured end >= buf logically: even if the addition wrapped around, pointer subtraction on conventional platforms wraps the same way, so end - buf gives back the original len.

    So mechanically, we should be able to identify “ptr OP ptr + size” as inherently dangerous, and prefer to write “ptr - ptr OP size”, which is already safe on conventional platforms.

    One level higher, semantically, we should never have performed the addition and, thus, never risked wrap around.


    if (len > sizeof(buf)) return -1;

    Comment by Aaron — February 22, 2011 @ 9:59 am

  5. My favorite comment on integer overflows is on page 38 of http://www.cs.princeton.edu/~dpw/popl/06/Tim-POPL.ppt :
    “C# exposes more than 10 integer-like data types, none of which are those defined by (Pythagoras, 500BC). In the future can we get integers right?”

    Programmers clearly aren’t up to the task of modular arithmetic, so why do we force them to do it? Let the compiler optimize integers as fixed-size binary numbers when it can prove that doing so is safe.

    Comment by newsham — February 28, 2011 @ 10:13 am

  6. Note that this is more likely to be a problem in microcontroller / embedded or kernel settings than plain old userspace on the desktop or server, because the operating system typically reserves the upper portion of the virtual address space for the kernel.

    Comment by caf — March 16, 2011 @ 7:47 pm

    • Yep. You just named our primary market. :)

      Comment by Nate Lawson — March 21, 2011 @ 10:35 am

