Reverse engineering with a VM

April 21, 2007 (updated April 22, 2007) ~ Nate Lawson

In a previous comment, Tim Newsham mentions reverse engineering an application by running it in a VM. As it happens, I gave a talk on building and breaking systems using VMs a couple of years ago. One very nice approach is ReVirt, which records the state of a VM, allowing debugging to go forwards or backwards. That is, you can actually rewind past interrupts, IO, and other system events to examine the state of the software at any arbitrary point. Obviously, this would be great for reverse engineering, though, as Tim points out, there haven’t been many public instances of people doing this. (If there have, can you please point them out to me?)

An idea I had a few years back was to design a VM-based system to assist in developing Linux or FreeBSD drivers when only Windows drivers are available. The VM would be patched with a “wedge” device that records the data associated with all IO instructions (inb, outb, etc.), PCI config space accesses, and memory-mapped IO. It would pass those accesses through to a single real hardware device. To the guest OS, it would appear to be a normal VM with one non-virtual device.

To reverse engineer a device, you would configure the VM with the bus:slot:function of the device to pass through. Boot Windows in the VM with the vendor driver installed. Use the device normally, marking the log at various points (“boot probe”, “associating with an AP”). Pass that log on to the open source developer to assist in implementing or improving a driver.
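To make the log idea concrete, here is a minimal sketch of what the wedge's trace might look like. Everything in it (the `WedgeLog` class, the record fields, the text format) is hypothetical and only illustrates interleaving recorded accesses with user-supplied markers; it is not taken from any existing tool.

```python
from dataclasses import dataclass


@dataclass
class Access:
    kind: str    # "pio", "mmio", or "pcicfg" (hypothetical labels)
    write: bool  # True for a write, False for a read
    addr: int    # port number, MMIO address, or config-space offset
    size: int    # access width in bytes
    value: int   # data written, or data returned by the read


@dataclass
class Marker:
    label: str   # free-form note such as "boot probe"


class WedgeLog:
    """In-memory trace of device accesses with user-placed markers."""

    def __init__(self):
        self.entries = []

    def record(self, kind, write, addr, size, value):
        self.entries.append(Access(kind, write, addr, size, value))

    def mark(self, label):
        self.entries.append(Marker(label))

    def section(self, label):
        """Return the accesses between marker `label` and the next marker."""
        out = []
        inside = False
        for entry in self.entries:
            if isinstance(entry, Marker):
                if inside:
                    break
                inside = entry.label == label
            elif inside:
                out.append(entry)
        return out

    def dump(self):
        """Render the trace as text suitable for mailing to a developer."""
        lines = []
        for entry in self.entries:
            if isinstance(entry, Marker):
                lines.append(f"# --- {entry.label} ---")
            else:
                op = "W" if entry.write else "R"
                lines.append(
                    f"{entry.kind:<6} {op} {entry.addr:#010x} "
                    f"{entry.size} {entry.value:#x}"
                )
        return "\n".join(lines)
```

A driver author could then pull out just the accesses made during, say, `log.section("boot probe")` and compare them across runs.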

A similar approach without involving a VM would be to write a Windows service that loads early, hooks HAL.DLL, and sets protection on any memory mappings of the target device. As with copy-on-write, an access to that memory would trigger an exception that the service could handle by recording the data and then permitting the access. This could be distributed to end users to help in remote debugging of proprietary hardware.
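The fault-record-permit cycle can be modeled in a few lines. This is only a user-space simulation of the control flow, assuming a hypothetical `GuardedRegion` standing in for the protected mapping and a `TracingService` standing in for the exception handler. A real implementation would live in the kernel and would have to single-step the faulting instruction before restoring protection.

```python
class AccessFault(Exception):
    """Raised when a protected region is touched (stands in for a page fault)."""

    def __init__(self, offset, write, value=None):
        super().__init__(f"fault at offset {offset:#x}")
        self.offset = offset
        self.write = write
        self.value = value


class GuardedRegion:
    """Simulated device memory mapping with a protection bit."""

    def __init__(self, size):
        self._mem = bytearray(size)
        self.protected = True

    def read(self, offset):
        if self.protected:
            raise AccessFault(offset, write=False)
        return self._mem[offset]

    def write(self, offset, value):
        if self.protected:
            raise AccessFault(offset, write=True, value=value)
        self._mem[offset] = value


class TracingService:
    """Handles faults: lets the access through once, then logs it."""

    def __init__(self, region):
        self.region = region
        self.log = []

    def _permit(self, fault):
        # Lift protection for exactly one access, then restore it.
        self.region.protected = False
        try:
            if fault.write:
                self.region.write(fault.offset, fault.value)
                return None
            return self.region.read(fault.offset)
        finally:
            self.region.protected = True

    def access(self, offset, value=None):
        """Driver-side access; value=None means a read."""
        try:
            if value is None:
                return self.region.read(offset)
            return self.region.write(offset, value)
        except AccessFault as fault:
            result = self._permit(fault)
            data = fault.value if fault.write else result
            self.log.append(("W" if fault.write else "R", fault.offset, data))
            return result
```

Because protection is restored after every access, each later read or write faults again and is logged too, which is the property the tracing depends on.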

11 thoughts on “Reverse engineering with a VM”

Anonymous says:

April 22, 2007 at 2:42 am

See http://stackframe.blogspot.com/2007/04/debugging-linux-kernels-with.html
“””
Debugging Linux kernels with Workstation 6.0

We just quietly added an exciting feature to Workstation 6.0. I believe it will make WS6 a great tool for Linux kernel development. You can now debug kernel of Linux VM with gdb running on the Host without changing anything in the Guest VM. No kdb, no recompiling and no need for second machine. All you need is a single line in VM’s configuration file.
“””
Nate Lawson says:

April 22, 2007 at 8:11 am

Anon, that seems nice. I wonder what mechanism they use for guest breakpoints. Just TF and debug registers or something intended to be more invisible?
newsham says:

April 22, 2007 at 9:58 am

Doesn’t VMware do just-in-time code rewriting of ring zero code before executing it [mentioned in http://www.vmware.com/pdf/asplos235_adams.pdf]? If so, injecting breakpoints should be trivial and (for the most part) unobservable.
nolan says:

April 22, 2007 at 12:15 pm

You’ll need an IOMMU (and support in the VMM for it) to do this with any device that does DMA, since the physical addresses the driver is giving the hardware are not the actual physical addresses.

Strangely, your second suggestion actually exists for Linux, called MMIOTrace:
http://nouveau.freedesktop.org/wiki/MmioTrace
Nate Lawson says:

April 22, 2007 at 3:35 pm

No, you don’t need any special IOMMU. You’re right that if you wanted to log bulk data transferred via DMA, you’d need to be aware of the physical addresses allocated to the driver. However, I don’t know of any hardware that is configured via DMA. Instead, devices are configured solely through IO instructions (easily trapped) and accesses to memory mapped to virtual addresses (i.e. via PCI BARs). Knowing these tells you all you need to know to access a device in the same way.

MMIOTrace is ok for debugging bugs in existing Linux drivers, but not as useful as a version for Windows. That’s where you have the most driver support for proprietary hardware and the need for reveng support to implement open source drivers.
nolan says:

April 22, 2007 at 5:58 pm

Nate,

When you’re running an OS in a VMM, the virtual memory mapping has 3 layers instead of the usual 2. To use VMware’s terminology, you have your guest kernel mapping from Virtual Addresses (VAs) to Physical Addresses (PAs). The monitor then maps the guest’s PAs to Machine Addresses (MAs), which are what you would call Physical Addresses in the non-virtualized case. This mapping is usually done using shadow page tables, but Intel and AMD chips coming real soon now will support the extra mapping in hardware (called Extended Page Tables and Nested Page Tables respectively). If you’re more used to Xen’s terminology, simply s/PA/Guest PA/g and s/MA/Host PA/g.

With that background out of the way, you have a driver in your guest poking PAs into the hardware, telling it to do a DMA to or from them. Without an IOMMU to do the PA->MA mapping, the hardware will happily use the PA as an MA, and clobber whatever memory had the bad luck to be at that MA. You certainly won’t get correct results; more likely you’ll get total flaming death.
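A toy model of the two translations shows why the DMA goes wrong. The page numbers, the dictionaries used as page tables, and the `dma_write` helper are all made up for illustration; real page tables and IOMMUs are far more involved.

```python
PAGE = 4096


def translate(table, addr):
    """Translate addr through a page-number -> page-number mapping."""
    page, offset = divmod(addr, PAGE)
    return table[page] * PAGE + offset


# Guest kernel page tables: VA page -> PA page.
guest_pt = {0x10: 0x2}
# Monitor's mapping: guest PA page -> machine (MA) page.
pa_to_ma = {0x2: 0x7}
# Who really owns each machine page.
machine_owner = {0x2: "host kernel", 0x7: "guest buffer"}


def dma_write(target_addr, iommu=None):
    """Return the owner of the machine page a device DMA write lands in.

    The device treats target_addr as a machine address unless an
    IOMMU mapping is supplied to translate it first.
    """
    if iommu is not None:
        target_addr = translate(iommu, target_addr)
    return machine_owner[target_addr // PAGE]


va = 0x10 * PAGE + 0x20
pa = translate(guest_pt, va)   # what the guest driver hands to the device
ma = translate(pa_to_ma, pa)   # where the guest's buffer really lives

clobbered = dma_write(pa)                  # no IOMMU: PA is used as an MA
correct = dma_write(pa, iommu=pa_to_ma)    # IOMMU applies the PA->MA map
```

In this model `clobbered` names a page belonging to someone else, while `correct` is the guest's own buffer.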

Xen can do PCI device passthrough with paravirtualized guests without an IOMMU because it changes the guest kernel’s DMA mapping functions to translate PAs to MAs. This doesn’t help you with Windows, and while it works correctly, it is obviously completely insecure; any guest with direct access to a DMA capable device can simply program it directly to DMA anywhere in memory, including memory owned by other guests or the hypervisor/VMM.
Nate Lawson says:

April 22, 2007 at 7:56 pm

nolan, thanks for the explanation. It’s been a while since I looked at VMware-type approaches. Is there any reason why the host OS can’t reserve MAs that equal PAs requested by the guest OS? That way the DMA will proceed normally without overwriting anything. The host could even spoof an overly restrictive E820h BIOS map to convince the guest OS not to use anything other than a strict range of PAs.

What do you think? Remember, the goal is to provide full device passthrough without modifying the guest OS. It doesn’t matter if the host OS has to reserve 90% of its RAM or open itself up security-wise during the reverse engineering.
nolan says:

April 23, 2007 at 9:42 am

Nate,

What you describe can work. One (and only one) VM can be linearly mapped with PA 0 being MA 0.
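As a rough sketch of the idea under discussion, the monitor could hand the guest a memory map in which only the reserved range is usable RAM, so that every guest PA is also a valid MA. The helper names are hypothetical, and the sketch ignores legacy holes and firmware regions that a real map would need. The type codes 1 (usable RAM) and 2 (reserved) are the standard E820 values.

```python
E820_RAM = 1
E820_RESERVED = 2


def restricted_e820(total_size, guest_start, guest_size):
    """Return (base, length, type) entries exposing only the guest range as RAM."""
    entries = []
    if guest_start > 0:
        entries.append((0, guest_start, E820_RESERVED))
    entries.append((guest_start, guest_size, E820_RAM))
    end = guest_start + guest_size
    if end < total_size:
        entries.append((end, total_size - end, E820_RESERVED))
    return entries


def guest_pa_to_ma(pa, entries):
    """Identity mapping: a PA is its own MA, but only inside a RAM entry."""
    for base, length, kind in entries:
        if kind == E820_RAM and base <= pa < base + length:
            return pa
    raise ValueError(f"PA {pa:#x} is outside the range reserved for the guest")
```

With such a map, an address the guest driver programs into the device for DMA is already the correct machine address, provided the guest honors the map.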

VMware doesn’t do this, nor does Xen or KVM out of the box. It would be very difficult to make this happen on a hosted VMM, as most OSes do not provide an easy way to allocate large contiguous chunks of memory, much less ones that start at PA 0.

There is a company called “Neocleus” that has patches for Xen for this purpose, but I don’t think they’ve released them publicly yet. I believe it is their intention to do so eventually, so you might try contacting Guy Zana there if you’re interested.
Jordan Wiens says:

April 23, 2007 at 10:04 am

Also in VMware 6.0, Record and Replay:

http://blogs.vmware.com/sherrod/2007/04/the_amazing_vm_.html

Sounds pretty similar to what you’re suggesting. When I first heard about it, I was pretty excited. VMware 6.0 is impressive.
Nate Lawson says:

April 23, 2007 at 5:37 pm

Jordan, that’s great that they added that feature. I’m a little disappointed it doesn’t include “backstep” capability though since they obviously have the full log of the instruction trace. It’s really helpful to be able to back up the state one op at a time to do a binary search for a bug. You could do that with the VMware approach by dividing the recorded trace into smaller and smaller chunks, but it’s not as easy to use for this. It’s still good that they are going that direction.
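The chunk-halving workaround amounts to a binary search over replay length. Here is a minimal sketch, assuming a deterministic replay, a state that starts good and ends bad, and a bug that stays visible once it appears. The `replay` and `first_bad_step` helpers are hypothetical and not part of any VMware interface.

```python
def replay(initial_state, trace, steps, apply_event):
    """Deterministically re-run the first `steps` events of a recorded trace.

    apply_event must return a new state rather than mutate its argument,
    so that every replay starts from the same initial state.
    """
    state = initial_state
    for event in trace[:steps]:
        state = apply_event(state, event)
    return state


def first_bad_step(initial_state, trace, apply_event, is_bad):
    """Return the smallest n such that the state after n events is bad."""
    lo, hi = 0, len(trace)  # invariant: good after lo events, bad after hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(replay(initial_state, trace, mid, apply_event)):
            hi = mid
        else:
            lo = mid
    return hi


# Toy example: the "bug" is a counter exceeding 100.
trace = [10, 20, 30, 50, 5]
n = first_bad_step(0, trace, lambda s, e: s + e, lambda s: s > 100)
culprit = trace[n - 1]  # the event whose replay first produced a bad state
```

Each probe replays from the start, so the search costs roughly log2(len(trace)) replays, whereas a true backstep would undo one operation in place.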
newsham says:

April 28, 2007 at 11:19 pm

re: “backstep,” I just came across simics’ “hindsight”:
http://www.virtutech.com/products/simics-hindsight.html
“Simics Hindsight is the first complete, general-purpose tool for reverse execution and debugging of arbitrary electronic systems.”