What is DMA (Part 9): I/O MMUs and why you’ll wish you’d used the DMA DDI in 3 or 4 years

Perhaps you’ve read through all of my posts about DMA and still think that using the DMA DDIs is optional.  After all, you’ve built a 64-bit card, and you’re not using DAC, so you don’t have to worry about buses being downgraded on you (or maybe you don’t believe in the boogeyman either).  Sure, someone might build an x86 machine that has actual map registers (companies have done such things before), but what are the chances you’ll ever have to run on one of them?  A better reason is coming in the next few years.

There are certain problems I hate debugging.  Any problem that involves a mis-programmed DMA controller is high on my list.  These look like random memory corruption and usually don’t reproduce well.  If you’re lucky the corruption is identifiable (like a text file, or WAV data).  If not, you sit back and collect repros until the pool of affected machines is large enough to notice that they all have the same network controller.

And that’s just DMA that’s accidentally gone wrong.  Since DMA gets around all page protections, it could be used to steal data as well.  Of course that’s a moot point for a kernel-mode driver, since it can steal anything it wants anyway.  But if we ever want to allow direct hardware access from a guest OS (on a Virtual PC system) or from user mode, DMA is going to be the killer issue.  It’s hard to build a secure system when one of the restricted components can tell its device to read any page on the computer.

Enter the IOMMU*.  Just like the regular MMU creates virtual address spaces from physical memory, the IOMMU creates logical address spaces for each device (each function, actually) on your PCI-X bus.  Smart people tell me these should be in “every” new system by the end of the decade, and I for one am very excited.

In a nutshell – the IOMMU has page tables for each bus/device/function that describe a logical address space (similar to the CPU page tables).  When your device attempts to read a logical address L, the IOMMU does a lookup, finds the appropriate physical page, and returns that page’s contents.  If there’s no mapping, or if the protections only allow writes, then the DMA is “logged” and isn’t allowed to happen.  It’s much more complex than that, involving lots of electrons and transistors and such, and I’ve only seen a glimmer of it so I can’t talk in detail about how it works.

In short, the IOMMU lets us control which pages each device can access through DMA.  We could, for example, block access to core kernel information (interrupt & system call tables, non-paged kernel code, etc…) by leaving them out of the device page maps.  We could allow a Virtual PC guest to directly access a device since we could now virtualize the DMA operations as well – ensuring the device can’t access any pages that aren’t in the guest’s physical address pool.  We could coalesce fragmented transfers or make 32-bit devices work on 64-bit systems without any copies (if there are still 32-bit devices in three years :).

Of course we’d want to enable this in a way which didn’t break existing drivers, or at least “well-behaved” ones.  To do this, we’d need to add code to some function calls the driver makes before and after every DMA operation.  Maybe something like GetScatterGatherList and PutScatterGatherList?

We’re only just starting to make decisions about what we’ll do with this in the base OS.  It’s clearly interesting, but how much of it we can use is up in the air.  However even now a couple things are clear:

  1. This can help with some of the reliability problems around DMA

  2. This helps most if we can tie it into every DMA operation a driver initiates

  3. The most logical place to do that is to build it into the DMA DDI (whatever that may look like in the future).

So to summarize: even if you don’t see a reason you need to use the DMA DDI today, that reason is looming on the horizon.  I think we’ll see demand from system administrators, IT departments & OEMs with high support costs to start using the IOMMU as a protection mechanism.  It’s going to see usage in the virtualization space to allow direct hardware access (and on an enlightened OS I’m sure it will leak into the DMA functions).

If you’re using the DMA DDIs already, you’re probably in good shape.  If you’re not, you should start thinking about how you’ll integrate them during one of your future design changes.

* IOMMU is AMD’s term for it.  Intel uses some different term that starts with V, which is less catchy and which I can’t remember.  So for now assume IOMMU applies to both.

Comments (3)

  1. rmlexi says:

    Are you speaking of the Intel Vanderpool tech? AMD’s codename was Pacifica iirc for their IOMMU / VT project.

    Sounds like a great deal for us that dip into almost all environments.

  2. Jeremiah says:

    See the DART on PPC Macintosh for an example of a simple IOMMU that is working on quasi-mainstream systems today.  


    Also, I believe that Linux supports using the Athlon64’s built-in mini IO-MMU (I think it was intended for graphics, a newer GART?) to avoid the buffer copies that Windows performs.

    I wish there was an official WDM DMA DDI that supported allocating CommonBuffers that were logically contiguous but not physically contiguous (for scatter-gather).  Until that time, I will use the MmAllocatePagesForMdl() backdoor and spend much time in meditation.  My other option is to *abuse* the packet-based WDM DMA DDI (hold onto *many* map registers for long durations).