Most protection schemes include various anti-debugger techniques. They can be as simple as IsDebuggerPresent() or as complex as attempting to detect or crash a particular version of SoftICE. The promise of these techniques is that they will prevent attackers from using their favorite tools. The reality is that they are either too simple, and thus easy to bypass, or too specific to a particular type or version of debugger. When designing software protection, it’s best to build a core that is resistant to reverse engineering of all kinds and not rely on anti-debugger techniques.
One key point that is often overlooked is that anti-debugger techniques, at best, increase the difficulty of the first break. This is a characteristic they share with other approaches, including obfuscation. Such techniques do nothing to prevent an attacker from optimizing the first attack or packaging it for distribution.
In any protection system, there are two kinds of primitives: checks and landmines. Checks produce a changing value or code execution path based on the status of the item checked. Landmines crash, hang, scramble, or otherwise interfere with the attacker’s tools themselves. Anti-debugger techniques come in both flavors.
IsDebuggerPresent() is an example of a simple check. It is extremely general but can be easily bypassed with a breakpoint script or by traditional approaches like patching the import address table (IAT). Since the implementation of this function merely returns a value from process memory, the flag it reads can even be overwritten so the call always returns 0 (FALSE). The approach is to find the TIB (Thread Information Block) via the %fs segment register, dereference its pointer to the PEB (Process Environment Block), and overwrite the BeingDebugged byte at offset 2. Since the check is so general, it has little security value.
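A small sketch may make this concrete. The Python below simulates the PEB as a bytearray (a stand-in for writing target-process memory with something like WriteProcessMemory); the only real fact it relies on is that IsDebuggerPresent() reads the BeingDebugged byte at PEB offset 2.

```python
# Sketch: how a debugger-side tool could defeat IsDebuggerPresent().
# IsDebuggerPresent() just returns the BeingDebugged byte at offset 2
# of the PEB, so zeroing that byte hides the debugger from this check.
# The PEB and memory writes are simulated here with a bytearray.

BEING_DEBUGGED_OFFSET = 2  # PEB.BeingDebugged (public PEB layout)

def is_debugger_present(peb: bytearray) -> bool:
    """What IsDebuggerPresent() effectively does: read one PEB byte."""
    return peb[BEING_DEBUGGED_OFFSET] != 0

def clear_being_debugged(peb: bytearray) -> None:
    """Overwrite PEB.BeingDebugged so the check always reports False."""
    peb[BEING_DEBUGGED_OFFSET] = 0

# Fake PEB for a process under a debugger: BeingDebugged == 1.
peb = bytearray(0x10)
peb[BEING_DEBUGGED_OFFSET] = 1

assert is_debugger_present(peb)
clear_being_debugged(peb)
assert not is_debugger_present(peb)
```

In a real attack the same one-byte write would be done once, early, through the debugger itself.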
More targeted checks or landmines have been used before, against SoftICE for example. (Note that the SIDT method listed is very similar to the later Red Pill approach — everything old is new again.) To the protection author, targeted anti-debugger techniques are like using a 0day exploit. Once you are detected and the hole disabled or patched, you have to find a new one. That may be a reasonable risk if you are an attacker secretly trying to compromise a few valuable servers. But as a protection author, you’re publishing your technique to the entire world of reverse engineers, people especially adept at figuring it out. You may slow down the first one for a little while, but nothing more than that.
Anti-debugging techniques do have a small place if their limits are recognized. IsDebuggerPresent() can be used to provide a courtesy notice reminding the user of their license agreement (i.e., the “no reverse engineering” clause). However, since it is so easily bypassed, it should not be used as part of any protection scheme. Debugger-specific checks and landmines can be sprinkled throughout the codebase and woven into the overall scheme via mesh techniques. However, their individual reliability should be considered very low due to constantly improving reversing tools and the ease with which attackers can adapt to any specific technique.
49 thoughts on “Anti-debugger techniques are overrated”
I wish to add that it is easy to bypass anti-debugging techniques at a place that is generally not known (at least on Windows systems). A sophisticated debugger receives notification that a process is starting via the CREATE_PROCESS_DEBUG_EVENT. What may not be apparent with this notification is that NTDLL.DLL has already been mapped into the process’ address space (or, otherwise, how would the loader continue?) With this bit of knowledge, one can then set an INT3 breakpoint on the exported API, LdrInitializeThunk, and after the breakpoint is hit, step through the entire loading process. This technique allows one to watch for patches in the loader itself or tricks involving static constructors and other code that runs before the program’s entry point is called. One can even take this opportunity to “reset” the IsDebuggerPresent flag. You can even see the “initial breakpoint” call where debuggers normally stop. Unless the system has been patched with a driver that interferes with the Windows loader, this will always work.
Russell, excellent comment on how to get in early with a debugger. Perhaps we should have a small contest on who can get in earliest without resorting to kernel mode.
That’s a good outline, but you had the wrong expansion for the PEB. It is the Process Environment Block, in a system context: it contains all the user-mode parameters the system associates with the current process. “Process Event Block” is not the name of the structure; I don’t think of it that way. The rest of the article is quite good.
Nice post Nate. Anti-debugger tricks are one way to help make the reversing process harder, but there’s more than one way to reverse an executable object. The program author has to evaluate all of the potential threats to reversing his application (this is where your ‘mesh techniques’ come into play). This includes static disassemblers. I’ve seen good protection code stop me from using a debugger before, but when I open the object from disk, it’s completely unprotected. Code paths are a good way of trying to confuse both kinds of analysis tools, but my personal favorite is twisting and turning the file format itself. This way it takes a lot more than just flipping a return value in memory to see the correct portions of code, and in the end you’re not quite sure whether you’re analyzing the right stuff or not. Keep these posts coming please :)
I’ve always wondered (but have never tried) why more people don’t use a CPU emulator/simulator to record all of the instructions that are actually executed. These can be used to reconstruct the executable, including any encrypted code, runtime-generated code, and patches. Some obvious shortcomings are code coverage, speed, and richness of the simulated environment (i.e., Bochs doesn’t have SSE or an HDMI display adapter).
Anyway, the approach always seemed like an obvious win to me but I’ve not heard of it being used.
Tim, I was thinking the same thing. I just posted about it, including the idea I talked with you about re: using a VM to assist in open source driver development.
zeroknock, thanks for pointing out that typo. I corrected it in the post.
The IsDebuggerPresent override is trivial, but even if you intercept the loader, how do you get past NtQuerySystemInformation? IsDebuggerPresent is a courtesy function, a warmup; the rest of the game consists of (a) ways that NTOSKRNL betrays that there is a debugger present (ie, NQSI), and (b) ways that program behavior betrays the debugger (ie, the UEF).
There’s a nice driver I came across for hiding SoftICE called IceExt. It even includes source. It filters these SoftICE detection approaches: NtCreateFile, NtQuerySystemInformation, NtQueryDirectoryObject, NtContinue, int 3, int 1 (single step and EIP + 2), int 41, int 0e, and UEF.
To do the same thing in user mode against a program that doesn’t have heavy obfuscation, why not just sweep memory looking for int 2e instructions (0xcd 0x2e)? Set a breakpoint on each and when triggered, print the syscall associated with the value in %eax. (For WinXP, that would be sysenter, 0x0f 0x34). You can easily virtualize NQSI by allowing it to continue one instruction and breaking after it returns. Then just overwrite the area of the struct for the debugger IPC port with NULL.
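A rough sketch of that sweep, assuming the code region has already been read into a buffer. The sample bytes and base address below are invented for illustration; a real tool would pull the bytes out of the target via the debugger.

```python
# Sketch of the syscall-stub sweep described above: scan a code region
# for "int 0x2e" (0xCD 0x2E) and report each hit so a breakpoint could
# be placed on it. The XP-era fast path, sysenter (0x0F 0x34), is
# swept for as well. Note this is a naive byte scan: it can false-
# positive on data bytes, which is fine for breakpoint candidates.

INT_2E = b"\xcd\x2e"    # int 0x2e: legacy NT syscall gate
SYSENTER = b"\x0f\x34"  # sysenter: fast syscall entry on WinXP

def find_syscall_stubs(code: bytes, base: int = 0) -> list:
    """Return the addresses of every int 2e / sysenter pattern found."""
    hits = []
    for pattern in (INT_2E, SYSENTER):
        start = 0
        while (idx := code.find(pattern, start)) != -1:
            hits.append(base + idx)
            start = idx + 1
    return sorted(hits)

# Toy code region: mov eax, 0x9A / int 2e, two nops, then a sysenter.
code = b"\xb8\x9a\x00\x00\x00" + INT_2E + b"\x90\x90" + SYSENTER
for addr in find_syscall_stubs(code, base=0x77F00000):
    print(hex(addr))  # candidate breakpoint addresses
```

At each hit you would set a breakpoint, and on trigger read %eax to identify the syscall being made.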
To get around this, protection authors have to use both obfuscation and anti-debugger techniques. It’s too easy to get around just the latter.
Just thought I would chime in a bit here.
There are a bunch of anti-debugging tricks that can be done in both user and kernel land. In user mode, it will be tricky to protect against a well-written kernel debugger, and most times you have to focus on the communication between the debugger’s kernel and user-land parts (if it has any). A driver that surveys kernel land will also be a great help.
When it comes to protecting against user-land debuggers, there are some really creative things you can play with:
* spawning a child, which debugs the parent and fires exceptions to drive execution. Armadillo does something similar to this
* when a debugger attaches, a thread is created inside the debugged process to call functions like DbgBreakpoint. Overwriting these functions will allow you to run code in your process when the debugger attaches, such as trashing or killing the process
* ollydbg (windbg is much more stable against these attacks) has a bunch of issues parsing info in the PEB and loader block. You can trigger anything from 100% CPU usage to code execution in olly this way (or at least could; I haven’t tested the latest version).
* trigger some less common exceptions such as stack overflow and invalid handle. The user-land debug API (windbg) has some issues recovering nicely from some of these
* tls callback functions (allows you to run code before the initial bp)
* variants of well known tricks, such as int 1’s, int 3’s, GetTickCount, device checks, registry checks, service checks and so on
* enumerate windows and see if there is a debugger window with your PID in the caption
* create a port (NtCreatePort) and set the process DebugPort to this handle. The debug API can’t attach to the process after this since it thinks a debugger is already attached
* alter the process sid to deny access to the process space
As a side note though, it will always be much harder to protect against a debugger that attaches before you can run any code. It is quite obvious why :) Also, none of these will really be any help unless they are combined with code obfuscation, code encryption, and code integrity checks. All checks should also be randomized and executed at randomized events to make discovery harder. Running the same check 3000 times won’t help you: if you didn’t discover the debugger the first time you ran a check, you won’t discover it 2999 times later. Additionally, just killing the process when a debugger is discovered isn’t the best response either. It is better to trash something that makes it crash at a later stage instead, since this will be harder to track down.
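A minimal sketch of that scheduling idea, with toy stand-ins for the actual checks and for the state being corrupted (everything here is invented for illustration):

```python
# Sketch: scatter different checks, fire them at randomized events,
# and on detection silently corrupt a key so the crash happens later,
# far away from the check itself, instead of exiting immediately.

import random

def check_flag(state):      # stand-in for e.g. PEB BeingDebugged
    return state["being_debugged"]

def check_timing(state):    # stand-in for e.g. a GetTickCount delta
    return state["tick_delta"] > 500

CHECKS = [check_flag, check_timing]

def run_event(state, rng):
    """Called at randomized points in the program. Sometimes runs one
    randomly chosen check; on detection, trashes a decryption key.
    Returns True if corruption occurred (for illustration only)."""
    if rng.random() < 0.3:  # only check some of the time
        if rng.choice(CHECKS)(state):
            state["key"] = (state["key"] + 0x5A) & 0xFFFFFFFF
            return True
    return False

state = {"being_debugged": True, "tick_delta": 10, "key": 0xC3}
rng = random.Random(1234)
hits = sum(run_event(state, rng) for _ in range(200))
print("detections:", hits, "key still valid:", state["key"] == 0xC3)
```

The point of the additive corruption is that the program keeps running normally until the key is next used, so the eventual failure gives no hint of which check fired.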
In response to Thomas’s post: Just hook NtQuerySystemInformation. I’ve written a little program that works like a user-mode debugger, hooks a bunch of system functions (further into each function, to avoid simple hook checks) and then hides itself so it can function as an anti-debugging tracer. Works like a charm against some DRM’ed programs ;)
In response to Tim’s post: There are ways of finding VMs as well. Bochs is a bad example, since I haven’t really played with it. But when it comes to VMware, you can trash the host quite severely through the host/guest API used by VMware Tools.
Andreas, thanks for the helpful tips. There are some open questions in the overall approach you raise. For example, randomizing checks may be a good thing sometimes, and other times it may be harmful to security. You might write a tricky function that depends on side effects of a check that appears to be in the open, regular, and easy to bypass. You definitely want to tie all the individual security functions together (checks, obfuscation, encryption, and integrity).
Various people have used the hypervisor (VT, Pacifica) features to implement an invisible debugger. The question is, what method do you use to communicate with the guest from the debugger? A simple option is to use Firewire to DMA to physical pages that are shared with the hypervisor stub as a comms port. Others?
Perhaps we should have a small contest on who can get in earliest without resorting to kernel mode.
Well, Russell Osterlund is a few instructions behind this:
Log data, item 0
Message=Break-on-access when executing [77FB4D83]
77FB4D83 KiUserApcDispatcher 8D7C24 10 LEA EDI,DWORD PTR SS:[ESP+10]
77FB4D87 58 POP EAX ; LdrInitializeThunk
77FB4D88 FFD0 CALL EAX
Actually, three instructions, to be specific :) That’s the earliest break I’ve hit with OllyDbg and guard-memory breakpoints. Since I hadn’t enumerated process memory on CREATE_PROCESS_DEBUG_EVENT but used a guard-memory breakpoint on LOAD_DLL_DEBUG_EVENT, I think I can break even before this, but the situation never arose :) Breaking NtQueryInformationProcess isn’t a problem: it returns the details in a buffer one provides, and tweaking the buffer contents to one’s liking isn’t a big deal, especially if one is sitting in front of a debugger.
You haven’t looked at a protection in years, have you? Try any recent protector (SecuROM 7.x, StarForce, TheMida) and then tell the entire cracking scene how overrated anti-debugging is…
Anti-debugger techniques have a small place in protection schemes, but they can’t stand alone, as I said in the article.
SecuROM, SF, etc. all use a lot more than just anti-debugger techniques. They carefully hide their anti-debugger schemes with obfuscation, encryption, and integrity checking. They certainly don’t rely on anti-debugger as the only protection mechanism. On the whole, I’d argue that their overall protection level is mostly derived from integrity checks (i.e. verifying the code and environment haven’t been patched) and very little from anti-debugger techniques.
Finally, anonymous insults are boring. If you’d like to insult me, fine, just add some technical info to the discussion also. Otherwise, I may have to reconsider my policy of not deleting any comments.
Back to the technical discussion: what’s the earliest someone has hooked the Windows boot process? NTLDR plus an additional serial stub? BIOS? Obviously, ICE gives you access from the reset vector on, but what about software-only techniques?
Not sure this counts, but I added instrumentation to Bochs that would log all kernel-to-userland transitions and fetch the process name from the NT kernel, and used that to trace process execution during startup and shutdown of Windows XP. You can also do “normal” debugging from Bochs (breakpoints, single-stepping, etc.).
Nate, it depends if you go kernel or stay in user-land. With kernel access, you can trace all the way from boot if that is to your liking. Me, I’m lazy. I want to trace as little as possible, so I rather start tracing as late as possible but still without missing anything.. :)
“SecuROM, SF, etc. all use a lot more than just anti-debugger techniques. They carefully hide their anti-debugger schemes with obfuscation, encryption, and integrity checking. They certainly don’t rely on anti-debugger as the only protection mechanism. On the whole, I’d argue that their overall protection level is mostly derived from integrity checks (i.e. verifying the code and environment haven’t been patched) and very little from anti-debugger techniques.”
As the backbone of its protection, StarForce hooks int1 and int3 from kernel mode, and uses these as regular interrupts. E.g. mov eax, 3 / int 3 is meaningful code under StarForce, and this type of thing is used very frequently in StarForce code. The result is incompatibility with any existing r0 debugger.
In order to circumvent this, people had to get creative and develop new forms of debuggers (not just a new debugger). While this wasn’t the only protection in StarForce, it certainly was one of the major two techniques used, and was very hard to bypass. This is not something that the StarForce crackers dismissed by hand-waving, so neither should you.
Surely you’re not arguing that reusing int3 alone was all it took to keep you out? Given that this was a known difficulty since the SoftICE days?
I’m not saying any particular protection scheme is easy to crack. I’m saying that anti-debugger techniques are like putting a couple huge posts in the ground in front of your house. If someone runs head-first into the post, it stops them. But as soon as they step off the sidewalk and go around, that particular post isn’t doing any good any more.
Where by “very hard to bypass”, you seem to mean “less than 1000 lines of Python code”, which is what it took for us to implement a debugger with breakpointing, remote variable access, and remote function calls without touching the Win32 IPC port.
The thing I think software protection people need to get through their head is that the game has changed. I know, I know, if you look back at what the C64 hackers did, or what the Xbox hackers did, or what the DirecTV hackers did, it’s all considerably more hardcore than what software crackers are doing now. But many of those techniques are now mainstream — programmatic disassembly is now a single Python function call, runtime code generation is a small Python module. “User mode single stepping” was a very clever idea, but it doesn’t look like it took a lot of code.
Revise your expectations about what “very hard” means.
Thomas: Does your 1,000-lines-of-python debugger cover kernel mode, where much of StarForce’s code executes? If not, how is it relevant? I also strongly dispute your viewpoint that other sectors of reverse engineering are more advanced than cracking in 2007. Pick up a recent protector and compare it to the work that you do before deciding what’s more advanced.
Nate: Your original point was that mixing in tricks like OutputDebugString(“%s%s”) and IsDebuggerPresent() are not strong deterrents to reverse engineering. I agree with this. My counter-argument was that this does not cover the entire spectrum of anti-debugging techniques.
Despite the int1/int3 interfaces being documented, subverting them is guaranteed to screw up conventional kernel debugging (and unhooking them isn’t the answer in this case). If an anti-debugging technique forces you to develop new techniques in kernel debugging, then it is not merely an overrated gimmick, and it has done its job.
You keep making the point that anti-debug tricks are only good once. What simple counter-measure would you use on a protection that used the processor-level debug interface for its own purposes? If your answer involves coding something substantial, then you defeat your own point.
There is one more thing to add to the discussion. Anti-debugging / anti-reversing / anti-disassembly is done to protect IP. What you as an IP owner want to create is a scenario where, if an attacker breaks one instance of the program, that break can’t be reused to break another instance. This is called a break one / break one scenario. There are only a handful of implementations of this on the market today, since it requires individually compiled binaries for each instance. There is also a limitation on the number of permutations compared to the number of instances released. Almost everything else implements anti-re in a break one / break all scenario, which basically means that if you create a break for one instance (for example, a patched binary that removes the anti-debugging), it can be reused on all instances without modification.
With that in mind, if you know your anti-re techniques and implement them in a correct manner, it will be hard to break. Individual instances will be broken, but if the break can’t be reused on other instances, it is not a concern for you as an IP owner.
Thomas: It’s quite easy to take out the windows debug API, making your debug API fueled python code quite useless. Unless you implement your own debug API that is ;)
anonymous Starforce advocate: the answer is yes, the point of my comment is that you can implement the program controls a debugger provides using nothing but read/write access to memory, and so I challenge your contention that “an anti-debugging feature that requires me to implement a new kind of debugger” has “done its job”, since the only job it’s done is forcing me to write 1000 lines of Python.
Or, not write 1000 lines of Python, as the case may be, because we’re simply going to publish this code.
You seem confident about Starforce. I’d love to take a crack at it. Do you know their team? Let’s set something up; if our code allows us to evade Starforce protections, we can publish the actual results for everyone to see, instead of throwing around assertions.
Darn, lost a sentence in my post :)
With that in mind, if you know your anti-re techniques and implement them in a correct manner, it will be hard to break. Individual instances will be broken, but if the break can’t be reused on other instances, it is not a concern for you as an IP owner. However, it will require that you provide individual binary releases for each instance, so you might want to get started on building your compiler farm.. :)
Andreas: yeah, I’m not communicating well. I’m not using the Windows debug API. I don’t even need to use the Windows process manipulation APIs, which you can’t easily disable.
Incidentally, I keep seeing this argument that “circumventing anti-debugging will break conventional debugging”, as if that’s a real loss for an attacker.
What the heck do I care if your anti-debugging breaks WinDbg? I’m not using WinDbg. I’m not trying to track down random bugs in your code. I’m trying to make your code do what I want it to do, and for that, a high-level programming language is way, way more useful than a debugger. Even for remote “debugging”.
A major difference between attacker tools today and attacker tools ten years ago (correct me if I’m wrong, though) is that debuggers are scripted. This is what Dave Hanson was talking about in the late 90’s with “Deet”. Detours and PaiMei offer an attacker more than WinDbg does. Attackers need programming tools, not fancy UIs.
Thomas, I wrote something similar a few years ago while working on a DRM (anti-re) enabled application. Basically, it allows you to do most things you can do through a debugger but only relying on OpenProcess / ReadProcessMemory / WriteProcessMemory. Of course, there are ways to find these as well or just make it really annoying to work with: Checksum code and data, use different memory protections to trigger exception on access, randomly move data around, keep track of number of threads in process, keep track of open handles, change process SID and so on.
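A minimal sketch of that read/write-only debugger idea, with a bytearray standing in for the target’s address space (the class and names are invented; a real version would go through OpenProcess / ReadProcessMemory / WriteProcessMemory and handle the resulting exceptions):

```python
# Sketch: with nothing more than the moral equivalents of
# ReadProcessMemory / WriteProcessMemory, you can plant int3 (0xCC)
# breakpoints and restore the original bytes, without ever touching
# the Win32 debug API. Target memory is simulated with a bytearray.

class MemDebugger:
    INT3 = 0xCC  # x86 int3 opcode: one byte, easy to plant anywhere

    def __init__(self, memory: bytearray):
        self.memory = memory  # stand-in for the target process
        self.saved = {}       # addr -> original byte under each bp

    def read(self, addr: int, size: int) -> bytes:
        return bytes(self.memory[addr:addr + size])

    def write(self, addr: int, data: bytes) -> None:
        self.memory[addr:addr + len(data)] = data

    def set_breakpoint(self, addr: int) -> None:
        self.saved[addr] = self.memory[addr]
        self.write(addr, bytes([self.INT3]))

    def clear_breakpoint(self, addr: int) -> None:
        self.write(addr, bytes([self.saved.pop(addr)]))

mem = bytearray(b"\x55\x89\xe5\x90\xc3")  # toy function prologue
dbg = MemDebugger(mem)
dbg.set_breakpoint(0)
print(hex(mem[0]))      # 0xcc: first byte is now the int3 opcode
dbg.clear_breakpoint(0)
print(hex(mem[0]))      # 0x55: original push ebp restored
```

The hard part omitted here is catching the resulting exception in the target, which is exactly where the tricks Andreas lists (checksums, guard pages, moving data) come in.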
Another fun thing to play with is to heavily rely on multi-threading. Even a “nice” process can be a bitch to debug when there are 10+ threads doing different things. Another not-so-well-known fact is how single stepping works by default in a debugger. It will only single-step the currently selected thread, while the rest run normally during their time-slice. So you can modify the context of the single-stepped thread between steps. That was at least the way it worked when I checked a few years back; it might have changed now.
Andreas, totally agreed, which brings us back to Nate’s point: “antidebugging” is not as valuable as integrity controls, and even integrity controls are troublesome if you can’t trust the hardware, which is where we’ve landed with SVM and VTX.
Thomas: In the back end there is no real difference between windbg / ollydbg / PaiMei; they all rely on the Win32 debug API. So if I take that out, all that fancy scripting won’t really do any good.. :) Detours hooks are easy to find and defeat as well.
And before this turns into a flame war: I’m on your side. I break anti-re for fun. I know this is a cat and mouse game where one side will have a slight advantage once in a while.
Yeah, we’re talking past each other, because I’m making two points but not being clear about which is which.
On the one hand: leaving aside integrity controls (and the integrity of the platform, which you can compromise through emulation, virtualization, and DMA), it doesn’t take a lot of code to make the “formal” debugging interface (ie, Win32’s debugging API) irrelevant. We seem to agree that you can get 80-100% of the value of a debugger without ever attaching to a remote program (in fact, without even creating a remote thread).
On the other: “formal” debugging tools are getting more advanced as well, and look less and less like a window with a block of assembly code and a cursor and a button for “step into” and a button for “step over”. That’s orthogonal to my first point, though =).
anon: There are two types of anti-debugger techniques I included here: IsDebuggerPresent() (very general, applies to all debuggers) and, say, running SoftICE commands via int3 and the FGJM magic value (so specific it is tied to particular versions of SoftICE and won’t work against windbg or others).
Reusing int3 for your own purposes is closer to the former in that it interferes with “all debugger programs/cracking methods that depend on int3 breakpoints”. Just because this worked against all existing kernel debugger implementations doesn’t make it any harder to avoid. There are so many other ways to hook program flow (overwriting instructions, debug registers, page protection/exception handlers, NMI, SMI). What I meant by saying that anti-debugger techniques only work once is that once someone has written the new non-int3 hooking approach, that code can be reused by everyone out there (hence things like IceExt).
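As a sketch of the first alternative in that list, here is how an inline "jmp rel32" patch could be computed to divert control flow without any int3 at all (the addresses are invented; the patch bytes would be written over the start of the hooked function):

```python
# Sketch: build a 5-byte "jmp rel32" (opcode 0xE9) that diverts a
# function at src to a handler at dst. On x86, rel32 is measured from
# the end of the jmp instruction, i.e. src + 5.

import struct

def make_jmp_patch(src: int, dst: int) -> bytes:
    """Return the 5 bytes that, written at src, jump to dst."""
    rel32 = dst - (src + 5)
    return b"\xe9" + struct.pack("<i", rel32)  # signed little-endian

func_addr = 0x401000  # hypothetical function to hook
hook_addr = 0x405000  # hypothetical handler
patch = make_jmp_patch(func_addr, hook_addr)
print(patch.hex())    # e9fb3f0000: jmp +0x3FFB
```

The clobbered original bytes would be saved and replayed in a trampoline so the hooked function still runs; that bookkeeping is omitted here.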
The part that probably made it hard to bypass is integrity protection. Otherwise you would just hook int3 after SF and call SF’s int3 handlers after your code ran. As I said in the post, anti-debugger methods have some value if they are tied together with integrity protection, obfuscation, and other mechanisms. When we talk about individual protection methods in isolation, of course it’s obvious how to bypass them. But first you have to examine them individually to see their strengths and weaknesses before you can combine them into a strong protection system.
Thomas: 1. Yes 2. Yes. Isn’t it nice when you argue about the same thing? :)
Nate: A tip on finding integrity-check code. Get your debugger attached (or just set up a post-mortem debugger and hope the exception won’t get caught). Make the page where the checked code resides a guard page (or remove read/execute access) and watch it break nicely when the integrity code touches it :)
Andreas, if you can’t detect thread suspension, does tracking open handles help you? (Can’t I just get in, do what I need, and get out?).
Obviously a moot point for DMA debugging, but the userland scenario is more interesting to me.
Correct, if you can’t detect suspension, keeping track of open handles will be hard. Unless you keep the handle open, that is. On the other hand, hiding suspension could prove to be quite hard.
Joanna had some interesting findings on hardware-aided debugging (in her case it was forensics, though). Some really nifty “side effects” can be created by altering the memory-map table in the northbridge.
Btw, I would like to see someone reverse engineer some small Haskell programs. The compilation techniques are totally foreign to anyone familiar with standard imperative languages, and there are no tools designed specifically for the task.
Thomas: “If our code allows us to evade Starforce protections, we can publish the actual results for everyone to see, instead of throwing around assertions.”
To do this legally, buy yourself a legit copy of Tom Clancy’s “Splinter Cell: Chaos Theory” and install it on XP SP2. This is StarForce 3.6 Advanced with drivers. I picked one up at a local GameStop for $20. The box has to say “Notice: This game contains technology intended to prevent copying that may conflict with some CD-RW, DVD-RW, and virtual drives” in a yellow box on the back and reference the date 2005, or else it’s probably using SafeDisc instead. For this reason I recommend buying it from a brick-and-mortar establishment, so you can verify that it has those markings. Looking forward to you putting or shutting up.
Newsham: I’ve done compiled ML, does that count?
Ok, I’m game. That’s how my last re project worked. What’s my objective?
My apologies to Nate for getting off topic here.
“Newsham: I’ve done compiled ML, does that count?”
I don’t know entirely. I’m not familiar with ML compiler implementation. They could use similar compilation techniques, but might not. ML is not “pure” (and additionally is strict) so the compilation techniques might be different. I put up some samples in case you want to check it out:
For more details see:
Anonymous StarForce advocate: we need to agree on what it is I’m trying to demonstrate against SF before I’m going to spend time on this. SF has lots of features and I’m not even slightly interested in most of them. Remember, we’re debating whether SF’s *antidebugging features* are a real impediment.
You’ve done this before, apparently, so, what are some things we should try making this game do? How do I demonstrate a break?
Yes, I have the perfect thing in mind, which is relevant to what we’ve been discussing (antidebugging, specifically the int1/int3 takeover) and does not involve breaking the entire protection. Once you have the game installed, I shall tell you.
Newsham: I took a look. The compiled Haskell is definitely different from the compiled ML I looked at. Roughly the same order of magnitude as to how terrible it was, though. Mine actually used Peano arithmetic on lists for simple arithmetic operations. What was funny was the authors of that program bragging about how algorithmically fast their technology was. I couldn’t help but think, after examining some entire functions and finding that all of the code was dead except for a tiny fraction of the instructions, how much a decent back-end (something with constant propagation and dead-code elimination) could have improved the runtime performance …
Which ML? Not to descend into minutiae irrationally, but SML/NJ with MLRISC looks demonstrably different than OCaml, and Moscow ML code is likely to look even slower, generally.
Re: functional programming languages and reverse engineering
I’ve created a new post just for that discussion, please continue it here:
Status: bought it, it should get here mid-week. I may lose a day to getting my hardware configured properly (I’ll have to reinstall XP and probably replace my video card; thanks a LOT, anonymous StarForce advocate).
The game is old: even the GeForce 3 that came with my Dell is compatible with it. Hopefully you don’t need to upgrade.
Well? Are you ready yet?
Slammed, but haven’t walked away from this challenge. I have the game, but haven’t set it up yet.
Really, the effectiveness of anti-reversing techniques depends on how, what, and how much of them are applied. Many fail horribly, but then consider an application like Skype, which actually does a fairly decent job of it.
jf: If you read the rest of the blog, you will find I agree with you. This post was about the weakness of anti-debugging techniques used in isolation. They’re only helpful as part of a larger strategy.
Hi Nate. Great article!