During a conversation with Thomas Ptacek about bug-hunting techniques, I came up with an interesting question. Do patches for bugs found through fuzzing or other automated techniques look any different from those found manually? Of course, the bugs themselves will likely be similar, but will the patches also carry some signature?
I have a hunch that bugs found via fuzzing show up at the perimeter of the code, whereas those found manually may be deeper down the call stack. Or both may turn out to be the same class of header-based integer overflow, fixed by similar range checks.
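A patch of that shape often amounts to one added bounds check before a copy. Here is a minimal hypothetical sketch in C; the parser, field layout, and limit are invented for illustration and not taken from any real codebase:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PAYLOAD 4096

/* Hypothetical parser: reads a 32-bit length field from a packet
 * header and copies that many payload bytes into a fixed buffer.
 * Returns the payload length on success, -1 on a malformed packet. */
int parse_packet(const uint8_t *pkt, size_t pkt_len, uint8_t *out)
{
    if (pkt_len < 4)
        return -1;

    uint32_t declared = (uint32_t)pkt[0] << 24 | (uint32_t)pkt[1] << 16 |
                        (uint32_t)pkt[2] << 8  | (uint32_t)pkt[3];

    /* The "range check" fix: reject lengths that exceed the output
     * buffer or the bytes actually present, rather than trusting the
     * attacker-controlled header value. */
    if (declared > MAX_PAYLOAD || declared > pkt_len - 4)
        return -1;

    memcpy(out, pkt + 4, declared);
    return (int)declared;
}
```

The signature of such a patch is typically a small diff: a new `if` guarding an existing `memcpy` or allocation.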
Can anyone who has more experience in this area enlighten me? Halvar and Ero, got some neat BinDiff stats to show?
3 thoughts on “Do fuzzed bugs look different?”
Wouldn’t this be largely dependent on the person(s)/team(s) responsible for patching these classes of bugs?
I think this would not only be specific to the company, product, and/or individual, but could also depend on additional factors.
If there are patch characteristics that distinguish automated fuzz testing from data-flow testing (i.e., tracing variables from where they are defined to where they are used by following program control through a CFG), I'm not sure they would be consistent. My guess is that there wouldn't be a "patch signature" specific to any type of testing.
However, I’m not sure what you mean when you say “manual testing”. Are you talking about a manual DFA or perhaps a different technique? What do you mean by “other automated techniques”?
Is the fuzz testing assumed to be black-box? What about white-box robustness testing that uses a decision table, state/transition information, etc.? What about static analysis, which can combine CFA and DFA and classify information such as modularity, decision control, data-flow structure, and control structure?
We’ve all heard the joke about the developer who, as the bug fix, writes a conditional statement checking for exactly 257 A’s. That strongly suggests the developer learned about the bug through a PoC exploit and has no real understanding of how buffer overflows work. Characteristics like this may swing a good guess toward one method of testing vs. another, but I wouldn’t count on anything scientific or statistically provable.
Andre, thanks for the in-depth comment. This was just a thought experiment and I don’t think it has much value. For a specific coder/tester, they may have a similar signature, as you mention. So if a patch is sourced from an external researcher, it may look different from one developed completely internally. However, it’s not clear yet if any of this brainstorm is useful. :-)
Well, the bugs in Debian certainly look different when you comment out a line of code containing all the entropy from an uninitialized portion of memory :\ Yay, Valgrind.