Protecting memory from DMA

September 28, 2007 / September 27, 2007 ~ Nate Lawson

Previously, we discussed how DMA works in the PC architecture. The northbridge is only aware of physical addresses and directs transactions to the appropriate devices or RAM based solely on that address.

Unlike within the CPU where there is virtual memory translation and protection, the chipset previously did not perform any translation of physical addresses or place restrictions on which addresses can be accessed. From the RAM’s perspective, a memory access that originated from the CPU or from the integrated Ethernet is exactly the same. Stability and security depended on the device driver properly programming the device to DMA only to physical addresses that were within the buffer assigned by the OS. This was fine except for device driver or hardware bugs, since ring 0 code was trusted.

With system-wide virtualization and DRM becoming more common, ring 0 code is no longer trusted. To avoid DMA corruption between guests or the host, the hypervisor previously would create a fake device for each OS instance. The guest talked to the fake device, and the host would multiplex the transactions over the real device. This has a lot of overhead, so it would be preferable to let the guest talk directly to the real device.

An IOMMU provides translation and protection for physical addresses. The hypervisor sets up a page table within the northbridge that groups page table entries by their device IDs. Then, when a DMA request arrives at the northbridge from a device, it is looked up by its ID, translated into the actual destination physical address, and allowed or denied based on the protection settings. If a write is denied, no data is transferred to RAM. If it’s a read, all bits are set to 1 in the response. Either way, an abort error is returned to the device as well.
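The lookup, translation, and allow/deny steps described above can be sketched as a toy model. This is only an illustration, not how any real chipset lays out its tables: the class `ToyIOMMU`, the 4 KB page size, the per-device dictionary, and the rule that a transfer may not cross a page are all assumptions made for the example.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size for this sketch


@dataclass(frozen=True)
class Mapping:
    phys_page: int   # destination physical page number
    readable: bool
    writable: bool


class ToyIOMMU:
    """Toy model: per-device page tables plus a byte array standing in for RAM."""

    def __init__(self, ram_size):
        self.ram = bytearray(ram_size)
        self.tables = {}  # device_id -> {device-visible page number: Mapping}

    def map_page(self, device_id, dev_page, phys_page, readable=True, writable=True):
        """Hypervisor side: install one page table entry for one device."""
        self.tables.setdefault(device_id, {})[dev_page] = Mapping(phys_page, readable, writable)

    def _translate(self, device_id, addr, length, want_write):
        # Sketch limitation (assumption): a single transfer may not cross a page.
        if addr % PAGE_SIZE + length > PAGE_SIZE:
            return None
        entry = self.tables.get(device_id, {}).get(addr // PAGE_SIZE)
        if entry is None:
            return None  # no mapping for this device: deny
        if want_write and not entry.writable:
            return None
        if not want_write and not entry.readable:
            return None
        return entry.phys_page * PAGE_SIZE + addr % PAGE_SIZE

    def dma_write(self, device_id, addr, data):
        """Return True on success; on denial RAM is untouched and False models the abort."""
        phys = self._translate(device_id, addr, len(data), want_write=True)
        if phys is None:
            return False
        self.ram[phys:phys + len(data)] = data
        return True

    def dma_read(self, device_id, addr, length):
        """Return (data, ok); a denied read yields all-ones bytes and ok=False."""
        phys = self._translate(device_id, addr, length, want_write=False)
        if phys is None:
            return b"\xff" * length, False
        return bytes(self.ram[phys:phys + length]), True
```

In this model, a device with no table entry (say, one assigned to a different guest) gets `False` from `dma_write` and a buffer of `0xff` bytes from `dma_read`, mirroring the denied-write and denied-read behavior described above.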

DMA protection (AMD: DEV, Intel: NoDMA table) is currently available in shipping products and physical address translation (AMD: IOMMU, Intel: VT-d) is coming very soon. While these features were implemented separately, it is expected that they will usually be used together.

There have been a few surprising studies of IOMMU performance. The first paper, by IBM researchers, shows that setting up and tearing down mappings consumed up to 60% more CPU than running without an IOMMU. They discuss various mapping allocation strategies to address this, but each has its disadvantages. One of the strategies, setting up the mappings at guest startup and never changing them, interferes with the hypervisor strategy called “ballooning”, where resources are only allocated to a guest as it uses them. This is what allows VMware to run guests with more RAM available to them than the host actually has. Read the paper for more analysis of their other strategies.

Another paper, by Rice University researchers, proposes virtualization support built into the devices themselves (“CDNA”). They build a NIC that maintains a unique set of registers for each guest. Each guest believes it has direct access to the NIC, although requests to set up DMA go through the hypervisor. The NIC hardware manages the fair scheduling of DMA among all the register contexts, so actual packets going out on the wire will be balanced between the various guests sending them. This approach requires no IOMMU, but each device needs to be capable of maintaining multiple register contexts. Again, read this paper for a different take on device virtualization.
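To make the idea of per-guest register contexts with fair scheduling concrete, here is a small sketch. The paper's actual scheduling algorithm is not described here, so plain round-robin is assumed purely for illustration, and the function name `schedule_round_robin` is hypothetical.

```python
from collections import deque


def schedule_round_robin(contexts):
    """contexts: dict of guest name -> list of queued packets.

    Returns the order in which packets would go out on the wire if the NIC
    served one packet per non-empty context in turn (assumed policy).
    """
    queues = {guest: deque(packets) for guest, packets in contexts.items()}
    wire = []
    while any(queues.values()):          # a deque is truthy while non-empty
        for guest, queue in queues.items():
            if queue:
                wire.append((guest, queue.popleft()))
    return wire
```

With this policy, a guest with a long queue cannot starve a guest that has only one packet waiting: the single packet goes out in the first pass.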

This research shows that an IOMMU is not the only way to achieve DMA protection, and it’s important to carefully design how a hypervisor uses an IOMMU to prevent a loss of performance. Next time, we’ll examine some usage scenarios for IOMMUs, both in virtualization and DRM.

PC memory architecture overview

September 27, 2007 / February 14, 2011 ~ Nate Lawson

The topics of DMA protection and a new Intel/AMD feature called an IOMMU (or VT-d) are becoming more prevalent. I believe this is due to two trends: increased use of virtualization and hardware protection for DRM. It’s important to first understand how memory works in a traditional PC before discussing the benefits and issues with using an IOMMU.

DMA (direct memory access) is a general term for architectures where devices can talk directly to RAM, without the CPU being involved. In PCs, the CPU is not even notified when DMA is in progress, although some chipsets do report a little information (e.g., the bus mastering status bit or BM_STS). DMA was conceived to provide higher performance than the alternative, which is for the CPU to copy each byte of data from the device to memory (aka programmed IO). To write data to a hard drive controller via DMA, the driver running on the CPU writes the memory address of the data to the hardware and then goes on to doing other tasks. The drive controller finishes reading the data via DMA and generates an interrupt to notify the CPU that the write is complete.

DMA can actually be slower than programmed IO if the overhead in talking to the DMA controller to initiate the transaction takes longer than the transaction itself. This may be true for very short data. That’s why the original PC parallel port (LPT) doesn’t support DMA. When there are only 8 bits of data per transaction, it doesn’t make sense to spend time telling the hardware where to put the data. It is faster to just read it yourself.
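The trade-off can be put into a rough cost model. The cycle counts below are made-up placeholders, not measurements of any real hardware, and the model only counts CPU time: programmed IO costs the CPU something per byte, while DMA costs a fixed setup plus the completion interrupt.

```python
def pio_cost(nbytes, per_byte_cycles):
    """CPU cycles spent copying nbytes one at a time (programmed IO)."""
    return nbytes * per_byte_cycles


def dma_cost(setup_cycles, interrupt_cycles):
    """CPU cycles spent on a DMA transfer; independent of size in this model."""
    return setup_cycles + interrupt_cycles


def break_even_bytes(setup_cycles, interrupt_cycles, per_byte_cycles):
    """Smallest transfer size at which DMA costs the CPU no more than PIO."""
    total = setup_cycles + interrupt_cycles
    return -(-total // per_byte_cycles)  # ceiling division


# Hypothetical numbers: 200 cycles to program the controller, 300 to take
# the interrupt, 10 cycles per byte for programmed IO.
SETUP, IRQ, PER_BYTE = 200, 300, 10
```

With these assumed numbers, a 1-byte transfer costs 10 cycles by programmed IO versus 500 by DMA, and DMA only starts to pay off at 50 bytes or more.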

Beyond this concept of DMA, common to nearly all modern architectures, the PC has a particular breakdown of responsibilities between the various chips. The CPU executes code and talks to the northbridge (Intel MCH). Integrated devices like USB and Ethernet are all located in the southbridge (Intel ICH), with the exception of on-board video, which is located in the northbridge. Between each of these chips is an Intel or AMD proprietary bus, which is why your Intel CPU won’t work with your AMD chipset, even if you were to rework the socket to fit it. Your RAM is accessed only via the northbridge (Intel) or via a bus shared with the northbridge (AMD).

Interfacing with the CPU is very simple. All complexities (privilege level, paging, task management, segmentation, MSRs) are handled completely internally. On the external bus shared with the northbridge, a CPU has a set of address and data lines and a few control/status lines. Besides power supply, the address and data pins are the most numerous. In the Intel quad-core spec, there are only about 60 types of pins. Only three pins (LINT[0:1], SMI#) are used to signal all interrupts, even on systems with dozens of devices.

Remember, these addresses are purely physical addresses as all virtual memory translation is internal to the CPU. There are two types of addresses known to the northbridge: memory and IO space. The latter are generated by the in/out asm instructions and merely result in a special value being written to the address lines on the next clock cycle after the address is sent. IO space addresses are typically used for device configuration or legacy devices.

The northbridge is relatively dumb compared to the CPU. It is like a traffic cop, directing the CPU’s accesses to devices or RAM. Likewise, when a device on the southbridge wants to access RAM via DMA, the northbridge merely routes the request to the correct location. It maintains a map, set during PCI configuration, which says something like “these address ranges go to the southbridge, these others go to the integrated video”.
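The traffic-cop map can be pictured as a list of address ranges with a target for each. The ranges and target names below are invented for illustration; real maps are set by firmware and the OS during PCI configuration and differ per system.

```python
# (start, end inclusive, target): all values are hypothetical
ADDRESS_MAP = [
    (0x0000_0000, 0x7FFF_FFFF, "RAM"),
    (0xC000_0000, 0xCFFF_FFFF, "integrated video"),
    (0xD000_0000, 0xDFFF_FFFF, "southbridge"),
]


def route(phys_addr, address_map=ADDRESS_MAP):
    """Return the target that decodes phys_addr, or None if nothing claims it."""
    for start, end, target in address_map:
        if start <= phys_addr <= end:
            return target
    return None
```

Note that the decision uses nothing but the physical address. Nothing in this lookup asks who issued the request, which is exactly the property that leaves RAM unprotected from a misbehaving device.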

With integrated peripherals, PCI is no longer a bus; it’s merely a protocol. There is no set of PCI bus lines within your southbridge that are hooked to the USB and Ethernet components of the chip. Instead, only PCI configuration remains in common with external devices on a PCI bus. PCI configuration is merely a set of IO port reads/writes to walk the logical device hierarchy, telling the northbridge which address regions decode to which device. It’s setting up the table for the traffic cop.
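As a concrete example of those IO port accesses, the conventional PC mechanism (not spelled out in this post) writes a 32-bit address to IO port 0xCF8 and then reads or writes the selected register through port 0xCFC. The sketch below only builds the address value; it does not touch any ports, and the helper name `pci_config_address` is made up for this example.

```python
CONFIG_ADDRESS_PORT = 0xCF8  # software writes the encoded address here
CONFIG_DATA_PORT = 0xCFC     # then reads/writes the selected register here


def pci_config_address(bus, device, function, offset):
    """Encode bus/device/function/register into a CONFIG_ADDRESS value.

    Layout: bit 31 = enable, bits 23-16 = bus, bits 15-11 = device,
    bits 10-8 = function, bits 7-2 = dword-aligned register offset.
    """
    if not (0 <= bus < 256 and 0 <= device < 32
            and 0 <= function < 8 and 0 <= offset < 256):
        raise ValueError("bus/device/function/offset out of range")
    return (0x8000_0000 | (bus << 16) | (device << 11)
            | (function << 8) | (offset & 0xFC))
```

Walking the device hierarchy amounts to looping over bus, device, and function numbers, encoding each one this way, and reading the vendor ID register at offset 0 to see whether anything responds.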

Next time, we’ll examine the advent of IOMMUs and DEVs/NoDMA tables.

Next Baysec: Sept 17 at O’Neills

September 9, 2007 ~ Nate Lawson

The next Baysec meeting is at O’Neills again. We’re happy to also welcome some WASC people that are in town. As always, this is not a sponsored meeting, there is no agenda or speakers, and no RSVP is needed.

See you on Monday, September 17th, 7-11 pm.

O’Neills Irish Pub
747 3rd St (at King), San Francisco