Outline of Garbage Collection

For a very detailed discussion of garbage collection, I highly recommend this survey paper: Paul Wilson, "Uniprocessor Garbage Collection Techniques", International Workshop on Memory Management, St. Malo, France, September 1992.
  • Introduction
  • General Compiler Runtime Support
  • Type checking.
  • Bounds checking.
  • Memory checking.
  • The Problem of Memory Management.
  • Keeping track of allocated memory.
  • Usually part of the language runtime: Lisp, Java, .NET (CLR).
  • Also possible as a library: C and C++.
  • Manual Approaches to Garbage Collection:
  • Do nothing!
  • Arena memory management.
  • Manual reference counting.
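To make the arena idea concrete, here is a minimal sketch of a bump allocator, assuming one fixed-size buffer and no per-object free. The `Arena` class and its method names are hypothetical, chosen only for illustration.

```python
class Arena:
    """Hypothetical bump allocator: hands out offsets into one big buffer."""

    def __init__(self, size):
        self.buf = bytearray(size)  # one large region obtained up front
        self.top = 0                # next free offset

    def alloc(self, nbytes):
        # Allocation is just a pointer bump; there is no per-object free.
        if self.top + nbytes > len(self.buf):
            raise MemoryError("arena exhausted")
        offset = self.top
        self.top += nbytes
        return offset

    def reset(self):
        # Release everything allocated from the arena in one step.
        self.top = 0


arena = Arena(64)
a = arena.alloc(16)
b = arena.alloc(32)
used_before_reset = arena.top
arena.reset()
c = arena.alloc(8)  # reuses the space that a once occupied
```

The appeal is that cleanup costs one assignment no matter how many objects were allocated; the price is that nothing can be freed individually.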
  • Automatic Garbage Collection
  • Preliminaries
  • Memory layout: code - static - stack - heap
  • What? compiler tracking / conservative
  • When? stop-the-world / incremental
  • How? refcount / mark-sweep / copying / generational
  • Reference Counting
  • Increment when a reference is created; decrement when one is dropped; free at zero.
  • When do references come and go?
  • Small but repetitive cost.
  • Optimization: Skip counting stack references; only examine the stack at cleanup time.
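The counting rule above can be sketched as follows. This is an illustrative model, not any particular runtime's implementation: `RcHeap` and its methods are hypothetical names, and objects are plain string ids.

```python
class RcHeap:
    """Hypothetical heap that tracks a reference count per object id."""

    def __init__(self):
        self.count = {}     # object id -> number of references to it
        self.children = {}  # object id -> ids this object points to
        self.freed = []     # order in which objects were reclaimed

    def new(self, oid):
        self.count[oid] = 0
        self.children[oid] = []

    def add_ref(self, oid):
        self.count[oid] += 1

    def link(self, parent, child):
        # Storing a pointer inside `parent` creates a reference to `child`.
        self.children[parent].append(child)
        self.add_ref(child)

    def drop_ref(self, oid):
        self.count[oid] -= 1
        if self.count[oid] == 0:
            # Last reference gone: release what the object held, then free it.
            for child in self.children.pop(oid):
                self.drop_ref(child)
            del self.count[oid]
            self.freed.append(oid)


heap = RcHeap()
heap.new("a")
heap.new("b")
heap.add_ref("a")    # a root variable now points at a
heap.link("a", "b")  # a holds the only reference to b
heap.drop_ref("a")   # the root goes away, so a and then b are reclaimed
```

Every pointer store pays for a count update, which is the small but repeated cost noted above. Two objects that point at each other would never reach zero in this sketch.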
  • Mark and Sweep
  • Start with set of base pointers.
  • Chase them to find live data.
  • When all done, iterate and remove dead data.
  • Fragmentation and poor locality.
  • Optimization: mark-compact collection.
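The two phases above can be sketched as a pair of functions. This is a simplified model that assumes the heap is a dictionary from object name to the objects it points to; the names `mark`, `sweep`, and the sample heap are illustrative only.

```python
def mark(roots, edges):
    """Return the set of objects reachable from the base pointers."""
    marked = set()
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if obj in marked:
            continue
        marked.add(obj)
        worklist.extend(edges.get(obj, []))
    return marked


def sweep(heap, marked):
    """Iterate over the whole heap and drop everything left unmarked."""
    dead = [obj for obj in heap if obj not in marked]
    for obj in dead:
        del heap[obj]
    return dead


# Hypothetical heap: object name -> list of objects it points to.
heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"], "e": []}
roots = ["a", "e"]
live = mark(roots, heap)
dead = sweep(heap, live)
```

Here `c` and `d` point at each other but are unreachable from the roots, so the sweep removes them. Survivors stay where they were, which is where the fragmentation comes from.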
  • Copying
  • Keep two areas in memory: fromspace and tospace.
  • Allocate as normal in fromspace.
  • Copy live items from fromspace to tospace.
  • Throw away the fromspace.
  • Swap fromspace and tospace.
  • Optimization: Special-case very large objects.
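A minimal sketch of the copy step, assuming each space is a list and each object is a `(payload, children)` pair whose children are indices into the same space. The breadth-first scan is in the style of Cheney's algorithm; the function name `collect` and the data layout are assumptions made for illustration.

```python
def collect(fromspace, roots):
    """Copy objects reachable from `roots` into a fresh tospace."""
    tospace = []
    forward = {}  # fromspace index -> tospace index (forwarding pointers)

    def copy(index):
        # Copy an object once; later visits reuse its forwarding address.
        if index not in forward:
            forward[index] = len(tospace)
            payload, children = fromspace[index]
            tospace.append((payload, list(children)))
        return forward[index]

    new_roots = [copy(r) for r in roots]

    # Scan tospace left to right, copying whatever each object points to
    # and rewriting its child indices to the new locations.
    scan = 0
    while scan < len(tospace):
        payload, children = tospace[scan]
        tospace[scan] = (payload, [copy(c) for c in children])
        scan += 1

    return tospace, new_roots


fromspace = [("garbage", []), ("x", [3]), ("junk", [0]), ("y", [1])]
tospace, roots = collect(fromspace, roots=[1])
fromspace, tospace = tospace, []  # swap roles; the old fromspace is dropped
```

Only live objects are ever touched, and they end up packed together at the start of the new space. Dead objects cost nothing beyond discarding the old space.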
  • Generational
  • Short-lived vs. long-lived data.
  • Move survivors to an "older" generation.
  • Collect the older generation less frequently.
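The promotion policy above might look like the following sketch. It sidesteps tracing entirely by taking the live set as an input, and it ignores pointers from old objects into young ones, which a real collector has to track. The `GenerationalHeap` class and the `major_every` policy are hypothetical.

```python
class GenerationalHeap:
    """Hypothetical two-generation heap; liveness is supplied by the caller."""

    def __init__(self, major_every=3):
        self.young = set()
        self.old = set()
        self.major_every = major_every  # assumed policy: 1 major per N minors
        self.minor_runs = 0

    def alloc(self, obj):
        self.young.add(obj)  # new objects always start in the young generation

    def minor_collect(self, live):
        # Survivors of a young collection are promoted; the rest is dropped.
        self.old |= self.young & live
        self.young = set()
        self.minor_runs += 1
        if self.minor_runs % self.major_every == 0:
            self.major_collect(live)

    def major_collect(self, live):
        # The older generation is examined far less often.
        self.old &= live


heap = GenerationalHeap(major_every=3)
heap.alloc("config")
heap.alloc("tmp1")
heap.minor_collect(live={"config"})  # tmp1 dies young, config is promoted
heap.alloc("tmp2")
heap.minor_collect(live=set())       # config is dead now but not yet examined
old_after_two = set(heap.old)
heap.minor_collect(live=set())       # third minor run triggers a major one
```

Dead data in the older generation lingers until the next major collection. That is the trade accepted in exchange for cheap, frequent young collections.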
  • Garbage Collection in the Real World
  • Garbage collection in Java:
  • Pointers are explicitly identified in the JVM.
  • Fixed-size heap allocated at startup.
  • Optionally call System.gc() to request a collection.
  • finalize() method used to clean up external state.
  • Garbage collection in C:
  • Interpose on malloc/free/realloc/etc.
  • Conservative identification.
  • Any integer in memory could be a pointer.
  • Which methods are possible with conservative identification?
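Conservative identification can be illustrated with a small sketch, assuming the collector knows the start address and size of every block it has handed out. The function name, the addresses, and the sample stack contents are made up for the example.

```python
def conservative_roots(words, blocks):
    """Return the blocks that some word in memory might point into.

    `blocks` maps a block's start address to its size in bytes.
    """
    hits = set()
    for word in words:
        for start, size in blocks.items():
            # Any integer that lands inside a block is treated as a pointer.
            if start <= word < start + size:
                hits.add(start)
    return hits


blocks = {0x1000: 64, 0x2000: 32, 0x3000: 16}
stack_words = [7, 0x1010, 0x2FFF, 0x3000]
maybe_live = conservative_roots(stack_words, blocks)
```

The word `0x1010` may be a pointer or just an integer that happens to fall inside a block; either way the block is kept. Because a word that merely looks like a pointer cannot safely be rewritten, techniques that move objects are the hard case here, while mark and sweep leaves objects in place.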
  • Garbage collection in the OS:
  • File system after a crash (fsck)
  • Which data blocks are in use: mark and sweep.
  • Which inodes are in use: reference counting.
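The two file system checks above can be sketched together. This is a heavily simplified model, not how any real fsck is structured: directories are dictionaries from name to inode number, and `check` and all of its parameters are hypothetical.

```python
def check(directories, inode_blocks, all_blocks, stored_links):
    """Recompute block usage and inode link counts from the directory tree."""
    # Reference counting: count how many directory entries name each inode.
    links = {}
    for entries in directories.values():
        for inode in entries.values():
            links[inode] = links.get(inode, 0) + 1

    # Mark: every block owned by a referenced inode is in use.
    in_use = set()
    for inode in links:
        in_use.update(inode_blocks.get(inode, []))

    # Sweep: everything else can go back on the free list.
    free = sorted(set(all_blocks) - in_use)

    # Report inodes whose stored link count disagrees with the recount.
    wrong = {i: n for i, n in links.items() if stored_links.get(i) != n}
    return free, wrong


directories = {"/": {"a.txt": 2, "b.txt": 3}, "/sub": {"hard": 2}}
inode_blocks = {2: [10, 11], 3: [12], 4: [13]}  # inode 4 has no directory entry
all_blocks = range(10, 15)
stored_links = {2: 1, 3: 1}                     # inode 2's stored count is stale
free, wrong = check(directories, inode_blocks, all_blocks, stored_links)
```

Block 13 belongs to an inode that nothing names, so it is swept along with the never-used block 14. Inode 2 is flagged because two directory entries refer to it while its stored count says one.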