x86 Linear Address Space Paging Revisited

Last time we revisited x86 segment addressing, which translates a logical address into a linear address. As mentioned earlier, two stages of address translation are used to arrive at a physical address: logical-address translation and linear address space paging.

Paging in x86 is optional and is controlled by CR0.PG. If paging is disabled (CR0.PG = 0), the linear address is used directly as the physical address. Paging can only be turned on (CR0.PG = 1) when protection is enabled (CR0.PE = 1).
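
As a small illustration, the two control bits can be tested with plain bit masks. This is only a sketch: it assumes the CR0 value has already been obtained some other way (for example from a kernel debugger), and the function name is made up for this post. CR0.PE is bit 0 and CR0.PG is bit 31.

```python
CR0_PE = 1 << 0    # Protection Enable
CR0_PG = 1 << 31   # Paging

def paging_enabled(cr0: int) -> bool:
    """Return True when both protection and paging are switched on."""
    # PG = 1 is only valid together with PE = 1, so both bits are checked.
    return (cr0 & CR0_PE) != 0 and (cr0 & CR0_PG) != 0

print(paging_enabled(0x80050033))  # True: PE and PG are both set
print(paging_enabled(0x00000011))  # False: PG is clear
```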

Since paging is optional, why do we need it? I think there are several reasons:

  1. Paging can be used to implement virtual memory. In the good old days virtual memory was a crucial component of operating systems, because physical memory was small and expensive.
  2. Paging allows a contiguous range of linear addresses to be backed by noncontiguous physical pages, so the system has fewer physical memory fragmentation problems to deal with.
  3. Paging can be used to implement some tricky algorithms, such as a high-performance ring buffer with a faked contiguous address space.
  4. Paging makes it possible to implement features such as copy-on-write, on-demand commit, and fine-grained access control.

When paging was first introduced to the x86 family with the 80386 processor, there was only one paging mode, and the page size was always 4KB. Many features have been added since, such as PAE (Physical Address Extension), PSE (Page Size Extension) and 64-bit support. Based on which of these features are enabled, the CPU determines which paging mode and page size to use.

No matter which paging mode is used, the concept is the same: hierarchical paging structures are used. Translation always starts from the CR3 register, which holds the physical address of the first paging structure. Each paging structure is 4KB in size (the first paging structure in PAE paging is the one exception). At each step, a portion of the linear address is used to select an entry from the current paging structure; this repeats until an entry maps a page instead of referencing another paging structure.
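
The walk described above can be sketched as a short loop. This is a toy model, not how any real OS or CPU implements it: "memory" is a Python dict keyed by the physical address of each entry, entries carry only a Present flag (bit 0), a PS flag (bit 7) and an address, and the names walk, fields and entry_size are invented for the example.

```python
PRESENT = 1 << 0     # entry is valid
PAGE_SIZE = 1 << 7   # entry maps a large page instead of another structure

def walk(memory, cr3, linear, fields, entry_size):
    """Translate `linear` through toy paging structures held in `memory`.

    memory     -- dict: physical address of an entry -> entry value
    fields     -- list of (shift, width) index fields, top level first
    entry_size -- bytes per entry (4 or 8)
    """
    table = cr3 & ~0xFFF
    for level, (shift, width) in enumerate(fields):
        index = (linear >> shift) & ((1 << width) - 1)
        entry = memory.get(table + index * entry_size, 0)
        if not entry & PRESENT:
            raise LookupError(f"page fault at level {level}")
        is_last = level == len(fields) - 1
        if is_last or entry & PAGE_SIZE:
            # The entry maps a page: all remaining low bits are the offset.
            offset_mask = (1 << shift) - 1
            return (entry & ~offset_mask) | (linear & offset_mask)
        table = entry & ~0xFFF   # the entry references the next structure
    raise ValueError("fields must not be empty")

FIELDS_32BIT = [(22, 10), (12, 10)]   # 32-bit paging: two 10-bit indexes
memory = {
    0x1000 + 1 * 4: 0x2000 | PRESENT,   # entry 1 of the first structure
    0x2000 + 2 * 4: 0x5000 | PRESENT,   # entry 2 maps the frame at 0x5000
}
linear = (1 << 22) | (2 << 12) | 0x34
print(hex(walk(memory, 0x1000, linear, FIELDS_32BIT, 4)))  # 0x5034
```

Passing a different fields list and entry_size = 8 gives the same loop the shape of the other paging modes, which is the point: only the bit layout changes.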

According to the "Intel 64 and IA-32 Architectures Software Developer's Manual", there are three paging modes:

  • 32-Bit Paging
  • PAE Paging
  • IA-32e Paging
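
Which of the three modes is active depends on a handful of control bits. According to the manual these are CR0.PG, CR4.PAE and IA32_EFER.LME; the sketch below encodes that decision as a plain function. The function name and the string labels are made up for illustration.

```python
def paging_mode(cr0_pg: bool, cr4_pae: bool, efer_lme: bool) -> str:
    """Return the paging mode selected by the three control bits."""
    if not cr0_pg:
        return "none"        # linear addresses are used as physical addresses
    if not cr4_pae:
        if efer_lme:
            # The processor refuses to enable paging in this combination.
            raise ValueError("CR0.PG = 1 with LME = 1 requires CR4.PAE = 1")
        return "32-bit"
    return "IA-32e" if efer_lme else "PAE"

print(paging_mode(True, False, False))  # 32-bit
print(paging_mode(True, True, False))   # PAE
print(paging_mode(True, True, True))    # IA-32e
```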

32-Bit Paging

Each paging entry is 4 bytes in size, so there are 1024 entries in each 4KB paging structure.

The translation process uses 10 bits at a time from a 32-bit linear address:

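  1. Bits 31:22 select an entry in the first paging structure (the page directory); this entry is known as a PDE.
  2. Bits 21:12 select an entry in the second paging structure (the page table); this entry is known as a PTE, and it maps the page frame.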
  3. Bits 11:0 are the page offset within the 4-KByte page frame.
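
The three fields listed above can be pulled out of a linear address with shifts and masks. A minimal sketch (the function name is hypothetical):

```python
def split_32bit(linear: int):
    """Split a 32-bit linear address into (PDE index, PTE index, offset)."""
    pde_index = (linear >> 22) & 0x3FF   # bits 31:22
    pte_index = (linear >> 12) & 0x3FF   # bits 21:12
    offset = linear & 0xFFF              # bits 11:0
    return pde_index, pte_index, offset

print(split_32bit(0xC0100ABC))  # (768, 256, 2748), where 2748 == 0xABC
```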

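If PSE is enabled (CR4.PSE = 1) and a PDE has its PS flag set, that PDE maps a 4-MByte page directly. This removes one level of indirection, which in turn reduces TLB pressure. Bits 21:0 of the linear address then form the page offset.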
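
For a 4-MByte page only one index is left, and the offset grows from 12 to 22 bits. A small sketch with a hypothetical helper name:

```python
def split_32bit_large(linear: int):
    """Split a 32-bit linear address for a 4-MByte page: (PDE index, offset)."""
    pde_index = (linear >> 22) & 0x3FF   # bits 31:22
    offset = linear & 0x3FFFFF           # bits 21:0
    return pde_index, offset

pde_index, offset = split_32bit_large(0xC0100ABC)
print(pde_index, hex(offset))  # 768 0x100abc
```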

PAE Paging

Each paging entry is 8 bytes in size, so there are 512 entries in each 4KB paging structure.

The first paging structure (the page-directory-pointer table) is an exception: it is 32 bytes in size and contains four 64-bit entries.

The translation process uses 9 bits at a time from a 32-bit linear address, except for the first paging structure:

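  1. Bits 31:30 select one of the four entries (PDPTEs) in the first paging structure.
  2. Bits 29:21 select an entry (PDE) in the second paging structure, the page directory.
  3. Bits 20:12 select an entry (PTE) in the third paging structure, the page table; this entry maps the page frame.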
  4. Bits 11:0 are the page offset within the 4-KByte page frame.
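
The same shift-and-mask approach applies here, with a 2-bit first index and two 9-bit indexes. A minimal sketch (the function name is made up):

```python
def split_pae(linear: int):
    """Split a 32-bit linear address into (PDPTE, PDE, PTE indexes, offset)."""
    pdpte_index = (linear >> 30) & 0x3     # bits 31:30, only 4 entries
    pde_index = (linear >> 21) & 0x1FF     # bits 29:21
    pte_index = (linear >> 12) & 0x1FF     # bits 20:12
    offset = linear & 0xFFF                # bits 11:0
    return pdpte_index, pde_index, pte_index, offset

print(split_pae(0xC0100ABC))  # (3, 0, 256, 2748), where 2748 == 0xABC
```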

If a PDE has its PS flag set, that PDE maps a 2-MByte page directly. In PAE paging this works regardless of the value of CR4.PSE.

IA-32e Paging

Each paging entry is 8 bytes in size, so there are 512 entries in each 4KB paging structure.

The translation process uses 9 bits at a time from a 48-bit linear address:

  1. Bits 47:39 select an entry (PML4E) in the first paging structure.
  2. Bits 38:30 select an entry (PDPTE) in the second paging structure.
  3. Bits 29:21 select an entry (PDE) in the third paging structure.
  4. Bits 20:12 select an entry (PTE) in the fourth paging structure; this entry maps the page frame.
  5. Bits 11:0 are the page offset within the 4-KByte page frame.
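
With four equally sized index fields, the split is easy to write as a loop over the shift amounts. Again this is just an illustrative sketch with a hypothetical function name:

```python
def split_ia32e(linear: int):
    """Split a linear address into four 9-bit indexes plus the page offset."""
    linear &= (1 << 48) - 1   # only bits 47:0 take part in the translation
    indexes = tuple((linear >> shift) & 0x1FF for shift in (39, 30, 21, 12))
    return indexes + (linear & 0xFFF,)

print(split_ia32e((1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5))
# (1, 2, 3, 4, 5)
```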

If the PS flag is set in a PDE, that PDE maps a 2-MByte page.

If the PS flag is set in a PDPTE, that PDPTE maps a 1-GByte page (on processors that support 1-GByte pages).
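
The two rules above can be written down as a tiny decision function. This sketch assumes the raw entry values are already at hand; the PS flag is bit 7 in both a PDE and a PDPTE, and the helper names are invented for the example.

```python
PS = 1 << 7   # page-size flag, bit 7 of a PDPTE or PDE

def ia32e_page_size(pdpte: int, pde: int = 0) -> int:
    """Return the page size in bytes implied by the PS flags."""
    if pdpte & PS:
        return 1 << 30   # 1-GByte page; the PDE is never consulted
    if pde & PS:
        return 1 << 21   # 2-MByte page
    return 1 << 12       # ordinary 4-KByte page

def page_offset(linear: int, page_size: int) -> int:
    """Offset of `linear` within a page of the given power-of-two size."""
    return linear & (page_size - 1)

size = ia32e_page_size(0x03, 0x83)               # PS set in the PDE only
print(size, hex(page_offset(0x40201234, size)))  # 2097152 0x1234
```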

The main reason for having PSE and large pages is to reduce the pressure on the Translation Lookaside Buffer (TLB). However, a large page requires contiguous physical memory, which becomes a problem once physical memory is fragmented (in Windows NT the memory manager defragments physical memory in kernel mode when contiguous physical memory is required, which is very time consuming).
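
To put rough numbers on the TLB argument: each TLB entry caches the translation for one page, so the number of pages needed to cover a region is also the number of entries needed to keep the whole region cached. The helper below is plain arithmetic with a made-up name; it says nothing about how many entries a particular CPU actually has.

```python
def pages_needed(region_bytes: int, page_size: int) -> int:
    """Number of pages (and so TLB entries) needed to cover a region."""
    return -(-region_bytes // page_size)   # ceiling division

GIB = 1 << 30
print(pages_needed(GIB, 4 << 10))   # 262144 pages of 4 KBytes
print(pages_needed(GIB, 2 << 20))   # 512 pages of 2 MBytes
print(pages_needed(GIB, 1 << 30))   # 1 page of 1 GByte
```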

That concludes the introduction. I would recommend some exercises:

  1. Let’s get physical
  2. Flags and Large Pages
  3. Non-PAE and X64