,

Contents · Garbage collectors (mark-sweep, copying, generational, G1)


Roots, reachability, and tri-color invariant

  • Roots: stacks, registers, globals; tracing finds reachable objects.
  • Tri-color: white (unreached), gray (reached, not scanned), black (scanned).
  • Write/read barriers preserve invariants during mutator execution.
Invariant: no black points to white (Dijkstra) or no white accessible from black (Yuasa)

Mark–sweep GC

  • Mark phase: trace from roots; Sweep phase: return unmarked objects to free lists.
  • Non-moving; can fragment; simple to integrate with free-list allocators.
  • Incremental variants rely on barriers and bitmaps.
mark(rootset)
sweep(heap)

Copying GC (Cheney)

  • Semi-space: from-space → to-space using a scan pointer; compacts and improves locality.
  • Requires forwarding pointers; doubles memory reservation for semi-spaces.
  • Great for short-lived objects (nursery/young gen).
while scan < to_end: for each ref in obj(scan): if in from-space then copy & forward

Generational GC and remembered sets

  • Empirical: most objects die young; collect young frequently, old rarely.
  • Card marking/remembered sets track old→young pointers (write barriers).
  • Minor vs major GCs; promotion thresholds and survivor spaces.
write barrier: card_table[addr >> card_shift] = DIRTY

G1/region-based collectors

  • Heap split into regions; collect sets of regions to meet pause-time goals.
  • Remembered sets per region; evacuation moves live objects and compacts.
  • Pause prediction models guide region selection (CSet).

Concurrent/incremental GC and barriers

  • Concurrent marking with SATB (Snapshot-At-The-Beginning) or incremental update.
  • Barriers: write (post/pre), read barriers (e.g., Brooks, ZGC load barriers).
  • Compacting collectors: sliding compaction vs evacuations.

Tuning, trade-offs, and pauses

  • Balance throughput vs latency: gen sizes, pause targets, parallelism.
  • Monitor promotion failures, fragmentation, and humongous object behavior.
  • Use logs/telemetry to validate pause-time SLOs.

Exercises

  1. Build a toy copying collector (semi-space) for a simple object graph.
  2. Add a generational layer with a card table; observe minor vs major GC behavior.
  3. Experiment with pause targets and region sizes to mimic G1-style choices.
Different collectors optimize different objectives—pick the one that matches your latency and throughput goals.