What Is DMA?
DMA is a way for you to offload the work of transferring data between main memory and the device onto your device. This is in contrast to programmed I/O (PIO) where you have the processor copying data between main memory and the device.
PIO can achieve high data rates (processors are fairly good at moving data from A to B), but you're effectively running memcpy() for every transfer, so the processor is tied up for as long as the copy takes. For larger transfers, it's better to offload this to some other unit which can move the data from A to B and then signal (preferably with an interrupt) when the transfer is done.
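To make the contrast concrete, here is a minimal sketch in C that simulates both paths in ordinary memory. Everything in it is hypothetical: `struct sim_device`, its "registers", and the function names are invented for illustration and don't correspond to any real hardware or driver API. The `dma_done` flag stands in for the completion interrupt.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical simulated device: a small on-board buffer plus DMA "registers". */
enum { DEV_BUF_SIZE = 256 };

struct sim_device {
    uint8_t        buf[DEV_BUF_SIZE]; /* device-side memory */
    const uint8_t *dma_src;           /* "register": source address in main memory */
    size_t         dma_len;           /* "register": transfer length in bytes */
    int            dma_done;          /* set by the device; stands in for an interrupt */
};

/* PIO: the CPU moves every byte itself and is busy for the whole transfer. */
static void pio_write(struct sim_device *dev, const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len && i < DEV_BUF_SIZE; i++)
        dev->buf[i] = src[i];
}

/* DMA, CPU side: describe the transfer and return immediately. */
static void dma_start(struct sim_device *dev, const uint8_t *src, size_t len)
{
    dev->dma_src  = src;
    dev->dma_len  = len < DEV_BUF_SIZE ? len : DEV_BUF_SIZE;
    dev->dma_done = 0;
}

/* DMA, device side: this function plays the role of the hardware.
 * It moves the data without CPU involvement and then signals completion. */
static void device_run_dma(struct sim_device *dev)
{
    memcpy(dev->buf, dev->dma_src, dev->dma_len);
    dev->dma_done = 1;
}
```

In the PIO path the cost of the copy lands on the caller. In the DMA path the caller only pays for `dma_start()`, and is free to do other work until `dma_done` is set.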
Someone paid a lot of money for a CPU that can do complicated things - math, comparisons, branches, etc... It's better to leave it free to do this complex stuff and offload the mundane work of moving data between A and B to a cheaper component dedicated to that task.
Flavors of DMA
There are two flavors of DMA - slave-mode and bus-mastering. In bus-mastering DMA your device initiates the bus cycles that read from or write to main memory, just like a processor might. In slave-mode your device depends on some system component to do the transfers.
Slave-mode transfers make a device cheaper because it doesn't have to include a DMA controller of its own. However, they're very limiting: you have to share this separate controller across all devices, and your device can't do much to slow the rate at which data is transferred. Finally, (I believe) the PC DMA controller was limited to a 24-bit address space and required contiguous buffers, so it really doesn't scale well for modern PCs.
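To see why a 24-bit limit hurts, note that 24 address bits can only reach the first 2^24 bytes (16 MiB) of physical memory. The sketch below checks whether a buffer could be handed to such a controller. It takes the 24-bit figure above at face value, and `dma24_reachable` and `DMA24_LIMIT` are hypothetical names, not part of any real API. Passing a single address and length also assumes the buffer is physically contiguous, which is the other restriction mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed limit: a controller that can only drive 24 address bits (16 MiB). */
#define DMA24_LIMIT ((uint64_t)1 << 24)

/* Returns true if the physically contiguous range [phys_addr, phys_addr + len)
 * lies entirely below the 24-bit limit. */
static bool dma24_reachable(uint64_t phys_addr, size_t len)
{
    if (len == 0)
        return false;                 /* nothing to transfer */
    if (phys_addr >= DMA24_LIMIT)
        return false;                 /* buffer starts above the limit */
    /* Compare against the remaining room so the addition can't overflow. */
    return (uint64_t)len <= DMA24_LIMIT - phys_addr;
}
```

On a machine with gigabytes of RAM, almost every buffer fails this check, so the data has to be staged through a low-memory buffer first, which costs an extra copy.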
For bus-mastering DMA you place a "DMA controller" on your device to run the DMA cycles for you. The device will steal some bus time and initiate a memory transfer as if it were another CPU. Data is transferred directly between main memory and the device's memory ranges. You can have multiple bus-masters running independently of each other - they share the memory bus using some common protocol. This is more efficient than having all your devices fight over a single transfer agent (whether it's the CPU (PIO) or a separate DMA controller (slave-mode)).
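As a rough illustration of several bus-masters sharing one bus, the sketch below simulates three masters that each own a pending transfer and take turns moving one byte per bus cycle. The round-robin policy is an assumption chosen for simplicity, since real buses define their own arbitration rules. All names (`struct bus_master`, `arbitrate`, `run_bus`) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum { NUM_MASTERS = 3 };

/* One pending transfer owned by one bus-master. */
struct bus_master {
    const uint8_t *src;        /* next byte to read from main memory */
    uint8_t       *dst;        /* next byte to write in device memory */
    size_t         remaining;  /* bytes still to move */
};

/* Assumed policy: round-robin. Grant the bus to the first master after
 * `last` that still has work. Returns its index, or -1 if all are idle. */
static int arbitrate(const struct bus_master m[], int last)
{
    for (int step = 1; step <= NUM_MASTERS; step++) {
        int idx = (last + step) % NUM_MASTERS;
        if (m[idx].remaining > 0)
            return idx;
    }
    return -1;
}

/* Run bus cycles until every master is idle. Each granted cycle moves one
 * byte for the winning master. Returns the number of cycles used. */
static size_t run_bus(struct bus_master m[])
{
    size_t cycles = 0;
    int last = NUM_MASTERS - 1;   /* so master 0 is considered first */
    int idx;

    while ((idx = arbitrate(m, last)) >= 0) {
        *m[idx].dst++ = *m[idx].src++;
        m[idx].remaining--;
        last = idx;
        cycles++;
    }
    return cycles;
}
```

Each master makes progress on its own transfer without waiting for a shared agent to get around to it. The only thing they contend for is the bus itself.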
Part 2 will talk about the various models I've seen for using DMA when programming a controller.