Reading:
Virtual Memory
Translation Lookaside Buffer
Memory Management Unit
The kernel can see all memory in the system, including memory from devices and I/O ports, which may be mapped at different non-contiguous physical addresses. A user-space process on the other hand does not directly access real memory addresses, but instead through a Translation Lookaside Buffer (TLB) which maps virtual addresses into real memory addresses. The TLB is usually integrated into the CPU as part of the processors Memory Management Unit (MMU) which translates virtual addresses into physical addresses.
In virtual memory, memory is divided into pages, typically 4K (4096 bytes) in size, however newer architectures can divide memory into huge pages which may range in size up to 1GB in size. Huge pages make for a more efficient TLB (increasing processor efficiency,) but reduce memory efficiency as more memory will be wasted if the user-space process does not use it all.
Assuming a 4K page, the lower 12 bits of the address represent the page offset within the page, and the upper bits represent the virtual page number. The upper bits may be further subdivided in hardware into offsets into multiple levels of page directories in the TLB but we are not concerned with such a low level view of things.
Address bits: 63 - 12 11 - 0 ┌─────────────────────────────────────────────────────┬────────────────────────┐ │ Virtual Page Number │ Page offset │ └─────────────────────────────────────────────────────┴────────────────────────┘
Virtual memory Real Memory Pages Pages ... ... ┌─────┐ ┌─────┐ │ 3 │─────┐ TLB ┌───────▷│ 3 │ │ │ │ ┌─────────┬─────────┐ │ │ │ ├─────┤ └────▷│ VM_3 ▷ RM_1 │──│─┐ ├─────┤ │ 2 │─────┐ ├─────────┼─────────┤ │ │ ┌───▷│ 2 │ │ │ └────▷│ VM_2 ▷ RM_0 │──│─│─│─┐ │ │ ├─────┤ ├─────────┼─────────┤ │ │ │ │ ├─────┤ │ 1 │──────────▷│ VM_1 ▷ RM_3 │──┘ └─│─│─▷│ 1 │ │ │ ├─────────┼─────────┤ │ │ │ │ ├─────┤ ┌────▷│ VM_0 ▷ RM_2 │──────┘ │ ├─────┤ │ 0 │─────┘ └─────────┴─────────┘ └─▷│ 0 │ │ │ │ │ └─────┘ └─────┘ ... ...
When a virtual address is accessed, the virtual page is extracted from the upper bits of the address and searched for in the TLB and replaced with the real memories page number. The lower bits of the address (the page offset) remain unmodified.
Virtual address before TLB translation:
┌────────────────────────┬─────────────┐ │ VM_1 │ Page offset │ └────────────────────────┴─────────────┘
After translation:
┌────────────────────────┬─────────────┐ │ RM_3 │ Page offset │ └────────────────────────┴─────────────┘
The kernel manages the TLB for each process. When a process needs memory, the kernel picks a real page of memory from it's free memory pool and allocates it to the process and adds it to the processes virtual address space by adding an entry for it in the processes TLB.
When a processes memory is extended the memory isn't doesn't mean that real memory has immediately been allocated to the process. The kernel will instead give the process one or more TLB entries either to non-existent pages or to a blank page that is copy on write, only giving the process real memory when it attempts to use the memory. If memory over-committing is enabled in the kernel, then it becomes possible for a process to request more memory than can ever be made available to it.
When free memory is unavailable, the kernel may consider taking away memory from a process to use elsewhere. When doing so, it will save the contents of the memory in on-disk storage called swap and when the process attempts to access the saved memory, it will cause a page fault that interrupts the process and tells the kernel to attempt to re-load the saved data back into memory for the process. This shuffling of memory will continue until memory is exhausted or the kernel attempts to prevent excessive swapping (called thrashing) and call the Out Of Memory Killer which will pick a process (usually the one most like to be causing the memory pressure) and terminate the process, freeing the memory it was using.
The heap is the area of a processes memory that is used for dynamic memory allocations using the malloc()/calloc()/realloc() and free() C library calls. It begins at some random offset above the programs Text/Data/BSS memory segments and grows upward toward the Memory Mapping Segment (the location of shared libraries and other mmap()ed memory. The ending address of the heap is called the program break.
malloc()/calloc()/realloc()
free()
mmap()
The heap is increased in size via the brk() system call, sbrk() is an additional C library function with some additional functionality to make easier to extend the size of the heap w/o having to remember just where the program break is.
brk()
sbrk()
void *sbrk(intptr_t increment);
void *sbrk(intptr_t
);
Given increment, sbrk will attempt to increase the size of the heap by increment bytes, and if successful, returns the address of the old program break. On failure it returns (void *)-1
sbrk
malloc(), free(), calloc(), realloc()
The dynamic memory functions malloc, free, calloc, and realloc (and reallocarray) are part of the C runtime (CRT), which among other things, initializes the heap and sets up the free list for the heap prior to passing control over to the main function.
void *malloc(size_t size);
void *malloc(size_t
void free(void *ptr);
void free(void *
Malloc and free form the basis of the dynamic memory system. Memory is allocated from the heap via malloc() and returned to the heap via free(). The other functions are modifications or combinations of malloc() and free().
malloc()
malloc() returns the starting address of a memory region of size size. In C the size_t type represents a data type that can hold the size of the largest possible in-memory object, probably an unsigned 64 bit value for a 64 bit architecture.
free() should be given the same address that is given by malloc/calloc/realloc to return the memory region back to the heap, potentially to be re-allocated again. Attempting to give it any other address will likely lead to corruption of the heaps free list structures. Attempting to free the same memory twice is also ill advised.
void *calloc(size_t nmemb , size_t size);
void *calloc(size_t
, size_t
calloc() is essentially malloc(nmemb * size); with one key difference in that the memory that is allocated is zero'ed (i.e. filled with 0 bytes) i.e. memset(ptr, 0, size); before being returned by calloc. Using calloc is recommended when allocating pointers.
calloc()
malloc(
*
memset(
, 0,
void *realloc(void *ptr, size_t size);
void *realloc(void *
Given a pointer to old memory (ptr) and a new size (size), realloc will malloc memory of the new size, copy data from the old allocation to the new allocation (up to the lesser of the two sizes), then free the old allocation and return the new allocation as malloc does. If given NULL as ptr, then realloc acts just as malloc does. Given a pointer and a size of 0, realloc will free() the given memory. There is nothing that realloc cannot do!
void *reallocarray(void *ptr, size_t nmemb, size_t size);
void *reallocarray(void *
Essentially realloc(ptr, nmemb * size); however returns NULL if the multiplication would overflow (unlike realloc).
realloc(
,
The following code has been modified to compile with a modern compiler, use modern data types and to rename malloc/free to my_malloc/my_free to avoid name-space collisions with the C standard library and simplify the struct header data structure as alignment should not be a problem as both elements in the structure are 64 bits in size.
malloc
free
my_malloc
my_free
struct header
#include <stdint.h> #include <unistd.h> /* ┌──────────────────────────────────────────────────────────────────┐ free list ─────┐|┌──┐┌────────────────────┐┌──────────────────────┐┌────┐┌────┐┌───┘ ────────┬────────┬▼▼│┬─▼│┬────────┬────────┬─▼│─┬─────────┬────────┬─▼│─┬──▼│──┬─▼│─┬──────── ....... │ In use | | | In use | In use | | ....... | In use | | | | ────────┴────────┴───┴───┴────────┴────────┴────┴─────────┴────────┴────┴──────┴────┴──────── ┌────────┐ ┌────────┐ | | Free, owned by malloc | In use | In use, owned by malloc └────────┘ └────────┘ ┌────────┐ | ...... | Not owned by malloc └────────┘ */ /* Header: ┌───────► (ptr) Points to next free block ┌─|─┬──────┬──────── │ ● | size | └───┴──────┴▲─────── └──── Address returned to user */ struct header { /* Block header: */ struct header *ptr; /* Next block if on free list */ uint64_t size; /* Size of this block (in header units) */ }; typedef struct header Header; static Header base; /* Empty list to get started */ static Header *freep = NULL; /* Start of free list */ void *my_malloc(uint64_t nbytes); static Header *morecore(uint64_t nu); void my_free(void *ap); /* malloc: General-purpose storage allocator */ void *my_malloc(uint64_t nbytes) { Header *p, *prevp; Header *morecore(uint64_t); uint64_t nunits; nunits = (nbytes + sizeof(Header) -1) / sizeof(Header) + 1; if ((prevp = freep) == NULL) { /* No free list yet */ base.ptr = freep = prevp = &base; base.size = 0; } for(p = prevp->ptr; ; prevp = p, p = p->ptr) { if (p->size >= nunits) { /* Big enough */ if (p->size == nunits) /* Exactly */ prevp->ptr = p->ptr; else { /* Allocate tail end */ p->size -= nunits; p += p->size; p->size = nunits; } freep = prevp; return (void *)(p+1); } if (p == freep) /* Wrapped around free list */ if ((p = morecore(nunits)) == NULL) return NULL; /* None left */ } } #define NALLOC 1024 /* morecore: Ask system for more memory */ static Header *morecore(uint64_t nu) { char *cp; Header *up; if (nu < NALLOC) nu = NALLOC; cp = sbrk(nu * sizeof(Header)); if (cp == (char *) -1) /* No space at all. */ return NULL; up = (Header *)cp; up->size = nu; my_free((void *)(up+1)); return freep; } /* free: Put block ap in free list */ void my_free(void *ap) { Header *bp, *p; bp = (Header *)ap - 1; /* Point to block header */ for(p = freep; !(bp > p && bp < p->ptr); p = p->ptr) if (p >= p->ptr && (bp > p || bp < p->ptr)) break; /* Freed block at start or end of arena */ if (bp + bp->size == p->ptr) { /* Join to upper nbr */ bp->size += p->ptr->size; bp->ptr = p->ptr->ptr; } else bp->ptr = p->ptr; if (p + p->size == bp) { /* Join to lower nbr */ p->size += bp->size; p->ptr = bp->ptr; } else p->ptr = bp; freep = p; }