After my last post claiming that anti-debugging techniques are overly depended on, various people asked how I would improve them. My first recommendation is to go back to the drawing board and make sure all the components of your software protection design hang together in a mesh. In other words, the system as a whole should fail closed if any one component is circumvented in isolation. However, anti-debugging does have a place (alongside other components including obfuscation, integrity checks, etc.), so let’s consider how to implement it better.
The first principle of implementing anti-debugging is to use up a resource instead of checking it. The initial idea most novices have is to check a resource to see if a debugger is present. This might be calling IsDebuggerPresent() or if more advanced, using GetThreadContext() to read the values of hardware debug registers DR0-7. Then, the implementation is often as simple as “if (debugged): FailNoisily()” or if more advanced, a few inline versions of that check function throughout the code.
The problem with this approach is that it gives the attacker a choke point: a single bottleneck where one change circumvents all instances of a check, no matter where they appear in the code or how obfuscated they are. For example, IsDebuggerPresent() reads a value from the PEB structure in the process’s own memory. So attaching a debugger and then setting that variable to zero circumvents all uses of IsDebuggerPresent(). Looking for suspicious values in DR0-7 means the attacker merely has to hook GetThreadContext() and return 0 for those registers. In both cases, the attacker could also patch the check itself so that it always returns “OK”.
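To make the choke point concrete, here is a toy model in Python rather than real Windows code. `ToyProcess`, `license_check`, and `unpack_check` are hypothetical names invented for this sketch, and the `being_debugged` attribute only stands in for the PEB byte. It is meant to show the shape of the problem, not an actual bypass.

```python
class ToyProcess:
    """Stands in for a process; being_debugged models the PEB!BeingDebugged byte."""

    def __init__(self):
        self.being_debugged = 0

    def is_debugger_present(self):
        # Like the real API, this only reports whether the byte is non-zero.
        return self.being_debugged != 0


def license_check(proc):
    # A check that goes through the API.
    return "fail" if proc.is_debugger_present() else "ok"


def unpack_check(proc):
    # An "inlined" copy of the check: it reads the byte directly,
    # but still depends on the same single location.
    return "fail" if proc.being_debugged != 0 else "ok"


def run_all_checks(proc):
    return [license_check(proc), unpack_check(proc)]


proc = ToyProcess()
proc.being_debugged = 1          # the OS sets the flag on attach
detected = run_all_checks(proc)  # every check fires
proc.being_debugged = 0          # the attacker's one-byte patch
bypassed = run_all_checks(proc)  # every check is now blind
```

However many checks are scattered through the code, they all read the same byte, so a single write silences all of them.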
Besides the API bottleneck, anti-debugging via checking is fundamentally flawed because it suffers from “time of check, time of use” problems, aka race conditions. An attacker could quickly swap out the values or virtualize accesses to the particular resources in order to lie about their state. For example, they could keep using the debug registers for their own purposes while stubbing out GetThreadContext().
Contrast this with anti-debugging by using up a resource. For example, a timer callback could be set up to vector through PEB!BeingDebugged. (Astute readers will notice this is a byte-wide field, so it would actually have to be an offset into a jump table.) This timer would fire normally until a debugger was attached. Then the OS would store a non-zero value in that location, overwriting the timer callback vector. The next time the timer fired, the process would crash. Even if an attacker poked the value back to zero, it would still crash. They would have to find another route to read the magic value in that location to know what to store there after attaching the debugger. To prevent the attacker from merely disabling the timer, the callback should perform some essential operation that keeps the process running.
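The timer idea can be sketched as a toy model, again in Python rather than real Windows code. It assumes the byte holds a randomly chosen jump-table offset that avoids 0 and 1. `ToyProcess`, `Crash`, `timer_tick`, and `survives` are all hypothetical names for this sketch.

```python
import random


class Crash(Exception):
    """Models the process faulting on a bad jump-table entry."""


class ToyProcess:
    def __init__(self, rng):
        self.work_done = 0
        # Assumption: the secret offset is random and avoids 0 and 1,
        # the values an attacker or the OS would store.
        magic = rng.randrange(2, 256)
        self.jump_table = [self._bad_target] * 256
        self.jump_table[magic] = self._essential_work
        # The only copy of the secret lives in the contested byte.
        self.being_debugged = magic

    def _essential_work(self):
        # Something the program cannot run without.
        self.work_done += 1

    def _bad_target(self):
        raise Crash("vectored through a clobbered offset")

    def timer_tick(self):
        # No comparison anywhere: the byte is simply used as an index.
        self.jump_table[self.being_debugged]()


def survives(proc):
    """Return True if one timer tick completes without crashing."""
    try:
        proc.timer_tick()
        return True
    except Crash:
        return False
```

In this model, storing 1 (as the OS would on attach) or 0 (as an attacker hiding the debugger would) both make `survives` return False. There is no branch to patch out, only a value that has to be right.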
Now this is a simple example and there are numerous ways around it, but it illustrates my point. If an application uses a resource that the attacker also wants to use, it’s harder to circumvent than a simple check of the resource state.
An old example of this form of anti-debugging on the C64 was storing loader code in screen memory. As soon as the attacker broke into a debugger, the prompt would overwrite the loader code. Resetting the machine would also overwrite screen memory, leaving no trace of the loader. Attackers had to realize this and jump to a separate routine elsewhere in memory that made a backup copy of screen memory first.
For x86 hardware debug registers, a program could keep an internal table of functions but do no compile-time linkage. Each site of a function call would be replaced with code to store an execute watchpoint for EIP+4 in a debug register and a “JMP SlowlyCrash” following that. Before executing the JMP, the CPU would trigger a debug exception (INT1) since the EIP matched the watchpoint. The exception handler would look up the calling EIP in a table, identify the real target function, set up the stack, and return directly there. The target could then return back to the caller. All four debug registers should be utilized to avoid leaving room for the attacker’s breakpoints.
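A rough model of that dispatch path, in Python rather than x86. `ToyCPU` and its methods are hypothetical, the four-element list only stands in for DR0-DR3, and the `interfere` hook is a simplification that lets an attacker clobber a slot between the store and the JMP. Real debug-register handling is considerably more involved.

```python
class ToyCPU:
    def __init__(self, dispatch_table):
        self.dr = [None, None, None, None]    # models DR0-DR3
        self.dispatch_table = dispatch_table  # trap address -> real target
        self.log = []

    def debug_exception(self, trap_addr, args):
        # The handler looks up the trapping address and transfers
        # control to the real target function.
        target = self.dispatch_table[trap_addr]
        return target(*args)

    def slowly_crash(self):
        # Stand-in for "JMP SlowlyCrash": record that state is now bad.
        self.log.append("corrupted")
        return None

    def call_site(self, site_addr, slot, args, interfere=None):
        trap_addr = site_addr + 4  # address of the JMP, as in the text
        self.dr[slot] = trap_addr  # store the execute watchpoint
        if interfere is not None:
            interfere(self)        # attacker touches the debug registers
        if trap_addr in self.dr:   # watchpoint intact: exception fires
            return self.debug_exception(trap_addr, args)
        return self.slowly_crash() # watchpoint gone: the JMP executes


def add(a, b):
    return a + b


def attacker_breakpoint(cpu):
    # The attacker's own hardware breakpoint takes over slot 0.
    cpu.dr[0] = 0xDEAD0000


cpu = ToyCPU({0x1004: add})
normal = cpu.call_site(0x1000, 0, (2, 3))
broken = cpu.call_site(0x1000, 0, (2, 3), interfere=attacker_breakpoint)
```

The point of the sketch is that the call only reaches `add` while the program's watchpoint is the one sitting in the register. Once the attacker's breakpoint replaces it, the fall-through path runs instead.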
There are a number of desirable properties for this approach. If the attacker executes the JMP, the program goes off into the weeds, but not immediately. The same thing happens as soon as they set a hardware breakpoint, since the attacker’s breakpoint overwrites the function’s watchpoint and the JMP is executed. The attacker would have to write code that virtualized calls to get and set DR0-7 and performed virtual breakpoints based on what the program thought should be the correct values. Such an approach would work, but would slow down the program enough that timing checks could detect it.
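A timing check of the kind mentioned here might look like the following minimal sketch. The function name `within_budget`, the budget values, and the `FakeClock` helper are assumptions made for illustration. A real check would need a budget calibrated to the machine and a clock source the attacker cannot easily virtualize.

```python
import time


def within_budget(work, budget_seconds, clock=time.perf_counter):
    """Run work() and report whether it finished inside the time budget."""
    start = clock()
    work()
    return (clock() - start) <= budget_seconds


class FakeClock:
    """Deterministic clock for demonstration: advances by `step` per read."""

    def __init__(self, step):
        self.now = 0.0
        self.step = step

    def __call__(self):
        self.now += self.step
        return self.now


def do_nothing():
    pass


# With the fake clock, the elapsed time between two reads equals `step`.
fast = within_budget(do_nothing, 0.01, clock=FakeClock(0.001))
slow = within_budget(do_nothing, 0.01, clock=FakeClock(0.5))
```

Here `fast` models a native run and `slow` models a run where every debug-register access is being emulated. As with the other pieces, such a check should not stand on its own, since an explicit comparison like this is exactly the kind of thing an attacker looks for.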
I hope these examples make the case for anti-debugging via using up a resource versus checking it. This is one implementation approach that can make anti-debugging more effective.
[Followup: Russ Osterlund has pointed out in the comments section why my first example is deficient. There’s a window of time between CreateProcess(DEBUG_PROCESS) and the first instruction execution where the kernel has already set PEB!BeingDebugged but the process hasn’t been able to store its jump table offset there yet. So this example is only valid for detecting a debugger attach after startup and another countermeasure is needed to respond to attaching beforehand. Thanks, Russ!]
Though only an example, the PEB idea would not stop my debugger (or any other designed with a similar technique, e.g. setting a pending breakpoint in WinDbg). If the debugger is able to set a one-time breakpoint on NTDLL!LdrpInitializeProcess (or earlier), the debugger also has the opportunity to reset the PEB!BeingDebugged flag (think ReadProcessMemory/WriteProcessMemory but at a different address). Clearing this flag while in the loader code, before any application code has a chance to execute, will also stop other anti-debugging tricks (e.g., checking heap flags), in addition to the obvious technique of stepping through the actual code itself.
Russell, thanks for commenting.
1. Yes, I am using this as an example of how an anti-debugging implementation can be strengthened relative to the simple approach. Depending on PEB!BeingDebugged in any way is extremely weak, even when strengthened via this approach, but the fact that it becomes slightly better than the simple checking approach means it’s still an appropriate example.
2. Allowing any one part of the protection to stand on its own makes it easy to remove, so even if this approach were used as part of some scheme, it would have to be covered with multiple layers of protection (timing and integrity checks, for example).
That being said, simply clearing the flag will not bypass this particular implementation. See the article again. It’s using something like stack cookies, where PEB!BeingDebugged would be set to a random value between 2 and 255 and used as an offset into a jump table filled with some invalid targets, as well as one valid one. So even though you could attach early, the OS would overwrite PEB!BeingDebugged with 1 as soon as you attached. Then when you tried to read the actual value, you wouldn’t find it.
So my point that you’d have to use another way to read the original value of PEB!BeingDebugged is still valid. Since it’s only a byte, it could be brute-forced if you knew this was being done, but it is still stronger than a simple comparison. That’s the main point of this article.
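To put a number on the brute-force remark: the search space is at most 254 values. The sketch below assumes the attacker has some oracle that says whether the process survives with a given byte, for example by restoring a snapshot and retrying. `brute_force_offset`, `toy_oracle`, and the example secret are hypothetical.

```python
def brute_force_offset(survives_with, candidates=range(2, 256)):
    """Try each candidate byte; return (value, attempts) or (None, attempts)."""
    attempts = 0
    for guess in candidates:
        attempts += 1
        if survives_with(guess):
            return guess, attempts
    return None, attempts


SECRET = 145  # example value only; a real one would be random per run


def toy_oracle(guess):
    # Stand-in for "restore a snapshot, poke the byte, see if it crashes".
    return guess == SECRET


found, attempts = brute_force_offset(toy_oracle)
```

Each wrong guess costs the attacker a crash and a restart, which is still more work than flipping a single comparison.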
There’s still an open question about how to initialize PEB!BeingDebugged before a debugger could attach. Rather than spending a lot of time on this one, consider other ways to accomplish the same thing. Of course it’s better to start with a stronger anti-debug check, then strengthen it further by using up a resource instead of a simple comparison. Still, this is a fun thought experiment.
In case it wasn’t clear in my previous comment, this all requires figuring out how to store the random offset in PEB!BeingDebugged before the kernel stores a “1” there. If there isn’t a window to do this via early instruction execution or static table initialization, then this countermeasure is only effective for detecting attach after process startup.
Since I never use PEB!BeingDebugged in my own protection designs, you’ll have to forgive my improvements for being extremely limited. I still hope the principle comes across clearly.
Nate, a question here. From my understanding, IsDebuggerPresent() returns TRUE when the PEB!BeingDebugged field is a non-zero value. By storing a value there (e.g. a non-zero offset into a jump table), wouldn’t it make IsDebuggerPresent() also return TRUE? Say, before an anti-anti-debugging method replaces it with zero and before our timer has a chance to reach it and crash, any IsDebuggerPresent() calls from outside (I believe some system APIs make them) would be given a false detection (caused by the “resource” value set by us). Such false alarms may happen and cause problems. Am I missing something here?
JB, you’re missing the point. The hypothetical protection would never call IsDebuggerPresent(). It would implicitly depend on the value stored in PEB!BeingDebugged to be some number (say “145”), since it would use that as an offset in a jump table. When PEB!BeingDebugged was overwritten (with 1 by the OS or 0 by the attacker trying to hide), both values would cause the program to go down a path that eventually leads to the program stopping. Specifically, the entries at offset 0, 1, and maybe 0xff in your jump table would lead to failures.
There are problems with this specific example (due to an attacker being able to attach early) but the general principle is what I’m trying to illustrate: implicit (use of a resource) rather than explicit (checking it) protection.
Nate, thank you for the response. Please correct me if I am wrong. Yes, the hypothetical (our) protection would not call IsDebuggerPresent() (only using PEB!BeingDebugged implicitly as a resource, as you emphasized), but other functions not written by the user, or even Windows system APIs within the same process (thus the same PEB), would. And when those calls happen, wouldn’t non-zero values give unintended results (e.g. debugger found, do something bad…)? To avoid that, PEB!BeingDebugged has to stay at 0. Then, using 0 as the offset in a jump table would not make any sense, as that’s what an attacker might try changing the value to.
Yes, if other parts of the same app used IsDebuggerPresent() in the traditional way, they would get incorrect results. However, since the PEB is local to process memory, it would only affect that single application. So the programmer would know not to use this API the traditional way.