Contents · JIT vs AOT, tiered compilation


Execution models: JIT vs AOT

  • AOT compiles ahead of time to native code; predictable deploys, no runtime compiler.
  • JIT compiles at runtime; adapts to actual workload and hardware.
  • Hybrid models: AOT-compiled baseline plus JIT for hot paths (e.g., .NET ReadyToRun, the JDK's experimental jaotc).

Tiered compilation

  • Multiple tiers: interpreter → baseline JIT → optimizing JIT.
  • Hotness counters and sampling guide upgrades/downgrades between tiers.
  • On-stack replacement (OSR) moves executing frames between tiers.

    if hot(f): recompile(f, opt_level=2)  # then OSR into the optimized loop
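
The promotion rule above can be simulated in a few lines. This is a minimal sketch, not how any particular VM does it: the thresholds, the TieredFunction class, and the three sum_to_* stand-ins are all hypothetical, each "tier" is just a different Python function, and OSR is not modeled (promotion only takes effect at the next call).

```python
from dataclasses import dataclass

# Hypothetical thresholds; real runtimes tune these and usually combine
# invocation counts with loop back-edge counts.
BASELINE_THRESHOLD = 3
OPTIMIZING_THRESHOLD = 10

TIER_NAMES = ("interpreter", "baseline", "optimizing")


@dataclass
class TieredFunction:
    """A function with one stand-in implementation per tier."""
    name: str
    impls: tuple  # (interpreter, baseline, optimizing) callables
    calls: int = 0
    tier: int = 0

    def __call__(self, *args):
        self.calls += 1
        self._maybe_promote()
        return self.impls[self.tier](*args)

    def _maybe_promote(self):
        # Promote at most one tier per call once the hotness counter
        # crosses the next threshold.
        if self.tier == 0 and self.calls >= BASELINE_THRESHOLD:
            self.tier = 1
        elif self.tier == 1 and self.calls >= OPTIMIZING_THRESHOLD:
            self.tier = 2


def sum_to_interp(n):
    # Stand-in for the interpreter: slow, general loop.
    total = 0
    for i in range(n + 1):
        total += i
    return total


def sum_to_baseline(n):
    # Stand-in for baseline JIT output: same algorithm, less overhead.
    return sum(range(n + 1))


def sum_to_opt(n):
    # Stand-in for optimizing JIT output: the loop reduced to a closed form.
    return n * (n + 1) // 2


sum_to = TieredFunction("sum_to", (sum_to_interp, sum_to_baseline, sum_to_opt))
history = []
for _ in range(12):
    result = sum_to(100)
    history.append(TIER_NAMES[sum_to.tier])
```

With these thresholds, history records two interpreted calls, seven baseline calls, and three optimized calls, and every tier returns the same result.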

Profiling and feedback (PGO, PICs, counters)

  • Collect edge counters and type feedback; inline caches, whether monomorphic, polymorphic (PICs), or megamorphic, record the receiver types seen at each call site.
  • Use static PGO (offline) or dynamic profiling (online) to inform inlining and layout.
  • Guarded optimizations rely on deopt paths when feedback changes.

    PIC at call site: cache target by receiver class; miss → update cache
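
A polymorphic inline cache can be sketched as a small per-call-site table. The CallSite class, the MAX_PIC_ENTRIES limit, and the Square/Rect classes below are hypothetical; a real PIC is a chain of patched machine-code stubs, not a dictionary, so treat this as an illustration of the lookup policy only.

```python
MAX_PIC_ENTRIES = 4  # hypothetical limit before the site is treated as megamorphic


class CallSite:
    """One dynamic dispatch site with a per-site cache keyed by receiver class."""

    def __init__(self, method_name):
        self.method_name = method_name
        self.cache = {}            # receiver class -> resolved function
        self.megamorphic = False
        self.hits = 0
        self.misses = 0

    def invoke(self, receiver, *args):
        cls = type(receiver)
        target = self.cache.get(cls)
        if target is not None:
            self.hits += 1
        else:
            self.misses += 1
            target = getattr(cls, self.method_name)  # slow path: full lookup
            if not self.megamorphic:
                if len(self.cache) < MAX_PIC_ENTRIES:
                    self.cache[cls] = target         # miss -> update cache
                else:
                    # Too many receiver classes: stop caching at this site.
                    self.megamorphic = True
                    self.cache.clear()
        return target(receiver, *args)


class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


class Rect:
    def __init__(self, w, h):
        self.w, self.h = w, h

    def area(self):
        return self.w * self.h


site = CallSite("area")
shapes = [Square(2), Rect(2, 3), Square(3), Rect(1, 5)]
areas = [site.invoke(s) for s in shapes]
```

Here the first Square and the first Rect each miss once and populate the cache; the later calls hit. The hits and misses counters are the raw data for a hit-rate measurement.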

Devirtualization, speculation, and guards

  • Speculate on likely types/targets; insert guards to validate assumptions.
  • Failed guard triggers deoptimization to a safe tier/state.
  • Enable aggressive inlining, vectorization, and strength reduction on the hot path.
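
As an illustration of guarded devirtualization, the sketch below speculates that every receiver is a Square, inlines Square.area behind a type guard, and falls back to generic dispatch when the guard fails. All names (Deopt, total_area_speculative, the stats dictionary) are hypothetical. One simplification to note: a real deoptimization transfers the live state of the running frame to the safe tier, whereas this sketch simply restarts the pure function from the beginning.

```python
class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


class Rect:
    def __init__(self, w, h):
        self.w, self.h = w, h

    def area(self):
        return self.w * self.h


class Deopt(Exception):
    """Raised when a guard in speculatively optimized code fails."""


def total_area_generic(shapes):
    # Safe tier: dynamic dispatch on every element.
    return sum(s.area() for s in shapes)


def total_area_speculative(shapes):
    # Optimized tier: assumes every receiver is a Square, so the body of
    # Square.area is inlined and no dispatch happens on the hot path.
    total = 0
    for s in shapes:
        if type(s) is not Square:      # guard validating the speculation
            raise Deopt(type(s).__name__)
        total += s.side * s.side       # inlined Square.area
    return total


def total_area(shapes, stats):
    try:
        return total_area_speculative(shapes)
    except Deopt:
        stats["deopts"] += 1
        # Simplification: restart in the safe tier instead of resuming mid-loop.
        return total_area_generic(shapes)


stats = {"deopts": 0}
only_squares = total_area([Square(2), Square(3)], stats)
mixed = total_area([Square(2), Rect(2, 3)], stats)
```

The first call stays on the speculative path; the second trips the guard on the Rect, records one deopt, and still returns the correct sum.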

Code cache, invalidation, and deoptimization

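  • Manage a code cache with an eviction policy; patching or freeing live code is constrained by instruction-cache coherence and by threads that may still be executing it.
  • Invalidate compiled code when class loading or another event breaks an assumption it was compiled under.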
  • Deoptimization rebuilds interpreter frames from metadata (stack maps).
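
Eviction and assumption-based invalidation can be sketched together. The CodeCache class below is hypothetical: it uses least-recently-used eviction as one possible policy, stores strings in place of machine code, represents assumptions as plain strings, and expects capacity to be at least 1. Rebuilding interpreter frames from stack maps is not modeled.

```python
from collections import OrderedDict


class CodeCache:
    """Bounded cache of compiled code with LRU eviction and dependency tracking."""

    def __init__(self, capacity):
        self.capacity = capacity       # assumed >= 1
        self.entries = OrderedDict()   # name -> (code, frozenset of assumptions)
        self.dependents = {}           # assumption -> set of dependent names

    def install(self, name, code, assumptions=()):
        if name in self.entries:
            self._remove(name)
        while len(self.entries) >= self.capacity:
            oldest = next(iter(self.entries))
            self._remove(oldest)       # evict the least recently used entry
        self.entries[name] = (code, frozenset(assumptions))
        for a in assumptions:
            self.dependents.setdefault(a, set()).add(name)

    def lookup(self, name):
        entry = self.entries.get(name)
        if entry is None:
            return None
        self.entries.move_to_end(name)  # mark as most recently used
        return entry[0]

    def invalidate(self, assumption):
        # Drop every entry compiled under the now-broken assumption.
        victims = self.dependents.pop(assumption, set())
        for name in list(victims):
            self._remove(name)
        return sorted(victims)

    def _remove(self, name):
        _code, assumptions = self.entries.pop(name)
        for a in assumptions:
            deps = self.dependents.get(a)
            if deps is not None:
                deps.discard(name)
                if not deps:
                    del self.dependents[a]


NO_SUBCLASSES = "Shape has no subclasses"  # hypothetical class-hierarchy assumption

cache = CodeCache(capacity=2)
cache.install("f", "code_f", [NO_SUBCLASSES])
cache.install("g", "code_g")
cache.lookup("f")                               # f becomes most recently used
cache.install("h", "code_h", [NO_SUBCLASSES])   # cache is full, so g is evicted
g_evicted = cache.lookup("g") is None
invalidated = cache.invalidate(NO_SUBCLASSES)   # e.g. a subclass was just loaded
```

After the invalidation both f and h are gone; in a real runtime any frames still executing them would be deoptimized at that point.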

Startup, warmup, and steady state

  • Cold start cost vs peak throughput; tiering mitigates warmup time.
  • AOT helps startup (e.g., native images); JIT recovers peak performance.
  • Profile-guided AOT reduces warmup by embedding feedback.
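
The cold-start versus peak-throughput trade-off can be put into a back-of-the-envelope model. The sketch below treats startup plus warmup as one fixed cost and steady-state work as a constant per-request cost, which is a deliberate simplification; the function names and all four numbers are made up for illustration, not measurements.

```python
import math


def total_time_ms(fixed_cost_ms, per_request_ms, requests):
    """Fixed startup/warmup cost plus steady-state cost for all requests."""
    return fixed_cost_ms + per_request_ms * requests


def break_even_requests(fixed_a, per_a, fixed_b, per_b):
    """Smallest request count at which B's total time is <= A's, else None."""
    if fixed_b <= fixed_a and per_b <= per_a:
        return 0                        # B is never slower
    if per_b >= per_a:
        return None                     # B starts behind and never catches up
    if fixed_b <= fixed_a:
        return 0
    return math.ceil((fixed_b - fixed_a) / (per_a - per_b))


# Hypothetical figures: AOT starts quickly but runs slower at steady state;
# JIT pays for startup and warmup, then runs faster.
AOT_FIXED_MS, AOT_PER_REQUEST_MS = 50.0, 1.0
JIT_FIXED_MS, JIT_PER_REQUEST_MS = 2000.0, 0.5

n_star = break_even_requests(AOT_FIXED_MS, AOT_PER_REQUEST_MS,
                             JIT_FIXED_MS, JIT_PER_REQUEST_MS)
aot_at_n = total_time_ms(AOT_FIXED_MS, AOT_PER_REQUEST_MS, n_star)
jit_at_n = total_time_ms(JIT_FIXED_MS, JIT_PER_REQUEST_MS, n_star)
```

Under these assumed numbers the JIT configuration catches up after 3900 requests; a process that exits before then is better served by AOT, and tiering or profile-guided AOT shifts the point by shrinking the fixed cost.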

Trade-offs and deployment considerations

  • Environment: servers (long-running) favor JIT+tiered; CLIs and FaaS favor AOT.
  • Security and platform policy may restrict or forbid JIT (W^X memory protection, iOS); use AOT or cache precompiled code ahead of time.
  • Diagnostics: perf profiles, GC logs, and deopt logs are essential for tuning.

Exercises

  1. Implement a tiny tiered counter: interpret, then JIT a hot loop into simplified native IR.
  2. Add inline caches to a dynamic dispatch site and measure hit rates.
  3. Experiment with PGO-guided inlining thresholds on a sample program.

Tiered compilation blends fast startup with peak throughput: use feedback to move code to the right tier at the right time.