Understanding Linux 32-bit Memory Layout and Page Table Storage
Exploring Page Tables and Memory Layout
Introduction
In modern operating systems, memory management is a critical component that ensures efficient and secure access to physical memory. Central to this management is the page table, a data structure that maps virtual addresses used by processes to physical addresses in RAM.
This blog post explores where page tables are stored in Linux, focusing on 32-bit systems, and provides a hands-on approach to visualizing the entire memory layout, including both user and kernel spaces.
Virtual Memory in 32-bit Linux
On a 32-bit system, the total addressable virtual memory space is 4GB. Linux divides this space into two parts:
User Space: The lower 3GB (0x00000000 to 0xBFFFFFFF) is reserved for user processes.
Kernel Space: The upper 1GB (0xC0000000 to 0xFFFFFFFF) is reserved for the kernel.
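The kernel's code and data live in the upper region and are marked supervisor-only in the page tables, which protects them from user processes, while the kernel itself can still reach the current process's memory and the rest of the system's resources.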
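As a small illustration of the split, the following user-space sketch classifies a 32-bit virtual address and computes which page directory entry is the first one covering kernel space. The names KERNEL_BASE, is_kernel_address, and first_kernel_pde are made up for this example, and the 0xC0000000 boundary assumes the default 3GB/1GB configuration described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed boundary for the default 3GB/1GB split. */
#define KERNEL_BASE 0xC0000000u

/* Each page directory entry covers 4MB, i.e. 2^22 bytes. */
#define PDE_SHIFT 22

/* True if the address falls in the upper 1GB reserved for the kernel. */
static inline bool is_kernel_address(uint32_t vaddr)
{
    return vaddr >= KERNEL_BASE;
}

/* Index of the first page directory entry that maps kernel space. */
static inline unsigned first_kernel_pde(void)
{
    return KERNEL_BASE >> PDE_SHIFT;
}
```

With these definitions, first_kernel_pde() evaluates to 768, which leaves 1024 - 768 = 256 directory entries for the kernel.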
Page Tables and Their Role
Page tables translate virtual addresses to physical addresses. On 32-bit x86 systems without Physical Address Extension (PAE):
The page directory (one per process) contains 1024 entries (PDEs), each pointing to a page table.
Each page table contains 1024 entries (PTEs), each mapping a 4KB page. With Page Size Extension (PSE), a PDE can instead map a 4MB page directly, with no page table underneath.
The kernel's page tables map the upper 1GB and are shared across all processes. Each process has its own page directory, but the kernel portion (the upper 256 PDEs) is the same in every one of them.
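To make the two-level structure concrete, here is a minimal user-space sketch that splits a 32-bit virtual address into the three fields the hardware uses for 4KB pages: a 10-bit page directory index, a 10-bit page table index, and a 12-bit offset. The struct and function names are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* The three fields of a 32-bit virtual address under non-PAE paging. */
struct vaddr_parts {
    uint32_t pde_index; /* bits 31..22: which page directory entry */
    uint32_t pte_index; /* bits 21..12: which page table entry     */
    uint32_t offset;    /* bits 11..0:  byte offset inside the page */
};

static inline struct vaddr_parts split_vaddr(uint32_t vaddr)
{
    struct vaddr_parts p;

    p.pde_index = vaddr >> 22;
    p.pte_index = (vaddr >> 12) & 0x3FFu;
    p.offset    = vaddr & 0xFFFu;
    return p;
}
```

For example, split_vaddr(0x08048000) yields PDE index 32, PTE index 72, and offset 0, while split_vaddr(0xC0100ABC) yields PDE index 768, PTE index 256, and offset 0xABC.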
Viewing the Memory Layout
While tools like /proc/pid/maps show user-space mappings, they omit kernel space. To inspect the full layout, we need to:
Access the page directory via the CR3 register (which holds the physical address of the current process's page directory).
Traverse the page tables to determine memory regions and their attributes.
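The traversal can be modeled in user space before touching the kernel. The sketch below is a simplified simulation, not kernel code: phys_mem is a hypothetical array standing in for RAM, and translate walks it the way the MMU walks real page tables, starting from a CR3 value. It assumes non-PAE paging and ignores the extra address bits that PSE-36 stores in 4MB entries.

```c
#include <stdbool.h>
#include <stdint.h>

#define ENTRY_PRESENT 0x001u /* bit 0: entry is valid          */
#define PDE_PAGE_SIZE 0x080u /* bit 7: PDE maps a 4MB page     */

/* Read one 32-bit word from the simulated physical memory. */
static inline uint32_t read_phys32(const uint32_t *phys_mem, uint32_t paddr)
{
    return phys_mem[paddr / 4];
}

/*
 * Translate vaddr using the page directory whose physical address is in cr3.
 * Returns false if the address is not mapped; otherwise stores the physical
 * address in *paddr and returns true.
 */
static inline bool translate(const uint32_t *phys_mem, uint32_t cr3,
                             uint32_t vaddr, uint32_t *paddr)
{
    uint32_t pde_addr = (cr3 & 0xFFFFF000u) + (vaddr >> 22) * 4;
    uint32_t pde = read_phys32(phys_mem, pde_addr);
    uint32_t pte_addr;
    uint32_t pte;

    if (!(pde & ENTRY_PRESENT))
        return false;

    if (pde & PDE_PAGE_SIZE) {
        /* 4MB page: the PDE supplies the top 10 bits directly. */
        *paddr = (pde & 0xFFC00000u) | (vaddr & 0x003FFFFFu);
        return true;
    }

    pte_addr = (pde & 0xFFFFF000u) + ((vaddr >> 12) & 0x3FFu) * 4;
    pte = read_phys32(phys_mem, pte_addr);
    if (!(pte & ENTRY_PRESENT))
        return false;

    *paddr = (pte & 0xFFFFF000u) | (vaddr & 0xFFFu);
    return true;
}
```

The kernel module in the next section performs the same walk over the real tables, except that it records the attributes of each entry instead of translating a single address.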
Kernel Module for Memory Inspection
We’ll write a kernel module to:
Read the CR3 register.
Obtain a kernel virtual address for the page directory.
Traverse PDEs and PTEs to generate a memory map.
Code: Kernel Module to Dump Memory Layout
Prerequisites
A 32-bit Linux system (PAE disabled for simplicity).
Kernel headers installed.
Basic understanding of kernel module development.
Module Code
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/mm.h>

typedef unsigned int pd_entry_t; /* Page Directory Entry (32-bit) */
typedef unsigned int pt_entry_t; /* Page Table Entry (32-bit) */

static int __init memmap_init(void)
{
	/*
	 * CR3 holds the physical address of the page directory; the low
	 * 12 bits are flags. On kernels 4.13 and later this accessor is
	 * named __read_cr3().
	 */
	unsigned long cr3 = read_cr3() & 0xFFFFF000UL;
	/*
	 * x86 kernels refuse to ioremap() ordinary RAM, so go through the
	 * kernel's direct mapping instead. The page directory always lives
	 * in low memory, so __va() is safe here.
	 */
	pd_entry_t *page_dir = __va(cr3);
	char row[65]; /* One output line: 64 symbols plus a terminator */
	int i, j;

	printk(KERN_INFO "Memory map:\n");
	for (i = 0; i < 1024; i++) { /* Iterate all 1024 PDEs */
		pd_entry_t pde = page_dir[i];
		char symbol = '.'; /* Default: nothing present */

		if (!(pde & 0x001)) { /* PDE not present */
			symbol = '.';
		} else if (pde & 0x080) { /* 4MB page */
			if (pde & 0x004) /* User page */
				symbol = (pde & 0x002) ? '*' : 'R';
			else /* Supervisor */
				symbol = 'X';
		} else { /* Page table (4KB pages) */
			unsigned long pt_phys = pde & 0xFFFFF000UL;

			if (pt_phys > __pa(high_memory - 1)) {
				/* Table is in high memory (CONFIG_HIGHPTE); not handled */
				symbol = '?';
			} else {
				pt_entry_t *page_table = __va(pt_phys);

				for (j = 0; j < 1024; j++) { /* Check PTEs */
					pt_entry_t pte = page_table[j];

					if (!(pte & 0x001))
						continue;
					if (pte & 0x004) /* User page */
						symbol = (pte & 0x002) ? '+' : 'r';
					else /* Supervisor */
						symbol = 'x';
					break; /* Classify by the first present PTE only */
				}
			}
		}

		/* Emit one complete line per 64 entries so printk does not split it */
		row[i % 64] = symbol;
		if (i % 64 == 63) {
			row[64] = '\0';
			printk(KERN_INFO "%s\n", row);
		}
	}
	return 0;
}

static void __exit memmap_exit(void)
{
	printk(KERN_INFO "Module unloaded\n");
}

module_init(memmap_init);
module_exit(memmap_exit);
MODULE_LICENSE("GPL");
Compiling and Running the Module
1. Save the Code
Save the code as memmap.c.
2. Create a Makefile
obj-m += memmap.o

all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
3. Compile the Module
$ make
4. Insert the Module
$ sudo insmod memmap.ko
5. View Output
$ dmesg | tail -n 30
Example Output and Interpretation
Sample Output
Memory map:
................................r...............................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
...............................+..............................+.
xXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXxX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX..x...........................xx
Key to Symbols
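Each symbol describes one page directory entry, which covers 4MB of virtual address space:
.: Unmapped region (the PDE is not present, or its page table has no present entries).
X: 4MB kernel (supervisor) page.
R: 4MB user read-only page.
*: 4MB user read/write page.
x: Page table whose first present entry is a supervisor page.
r: Page table whose first present entry is a user read-only page.
+: Page table whose first present entry is a user read/write page.
?: Page table that could not be mapped for inspection.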
Analysis
User Space (First 768 Entries): Mostly unmapped (.) with small regions for code (r), data (+), and libraries.
Kernel Space (Last 256 Entries): Dominated by X (4MB kernel pages) and x (page tables for kernel data).
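When reading a dump like this, it helps to convert a symbol's position back into the virtual address range it covers. The helper below is a sketch that assumes the module's layout of 64 symbols per row, with rows and columns counted from zero. The names dump_cell_start and dump_cell_end are invented for this example.

```c
#include <stdint.h>

#define ROW_WIDTH 64u       /* Symbols printed per output row       */
#define PDE_SPAN  0x400000u /* Each symbol (one PDE) covers 4MB     */

/* First virtual address covered by the symbol at (row, col). */
static inline uint32_t dump_cell_start(unsigned row, unsigned col)
{
    return (row * ROW_WIDTH + col) * PDE_SPAN;
}

/* Last virtual address covered by the symbol at (row, col). */
static inline uint32_t dump_cell_end(unsigned row, unsigned col)
{
    return dump_cell_start(row, col) + (PDE_SPAN - 1u);
}
```

For instance, row 0, column 32 is PDE 32, which covers 0x08000000 to 0x083FFFFF. That range contains 0x08048000, the traditional load address for i386 executables. Row 12, column 0 is PDE 768, where the range begins at 0xC0000000, the start of kernel space under the 3GB/1GB split.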
Conclusion
The page tables in 32-bit Linux are stored in physical memory, and the CR3 register holds the physical address of the current process's page directory. By traversing the page directory and tables, we can visualize the memory layout, revealing how user and kernel spaces coexist. The provided module offers a practical way to inspect this layout, which is useful for understanding system internals and for debugging.