,

Contents · Dataflow and control-flow analysis


Control Flow Graph (CFG)

  • Nodes are basic blocks; edges represent control transfers.
  • Construct CFG from AST by introducing blocks at branches/loops.
  • Compute dominators, post-dominators, loops (natural loops via back-edges).
Entry → B1 → B2 → Exit; edges from if/while form the graph

Dataflow framework: lattices, transfer, meet

  • Each program point has a fact from a lattice (partial order with meet ⊓/join ⊔).
  • Transfer functions map input facts to output facts for each node.
  • Monotonicity ensures convergence on finite-height lattices.
OUT[n] = f_n(IN[n]); IN[n] = ⋂_{p ∈ preds(n)} OUT[p]

Worklist algorithms and convergence

  • Initialize facts, push nodes to worklist, iterate until fixed point.
  • Reverse postorder (forward) or postorder (backward) traversals accelerate convergence.
  • Widening/narrowing for infinite lattices (abstract interpretation).
while worklist not empty: pick n; recompute IN/OUT; if changed, enqueue succ/pred

Classic analyses: reaching defs, liveness, available exprs

  • Reaching definitions (forward, may): where can a def reach a use?
  • Liveness (backward, may): which vars needed in the future? drives register allocation.
  • Available expressions (forward, must): common subexpression elimination.
GEN/ KILL sets per block; OUT = GEN ∪ (IN − KILL)

SSA and sparse dataflow

  • Exploit SSA to run analyses only at defs/uses, not every program point.
  • Sparse conditional constant propagation (SCCP) merges dataflow + CFG reachability.
  • Phi nodes act as merge points; use def-use chains for propagation.
Worklist over SSA uses; lattice {undef, const c, overdefined}

Interprocedural analysis (call graphs, context)

  • Build call graph (CHA/RTA/points-to); handle recursion and virtual calls.
  • Contexts: k-CFA, summaries; trade precision for scalability.
  • Whole-program vs module-level analyses.

Practical tips and precision vs cost

  • Choose lattice and flow direction to match optimization goals.
  • Beware of path sensitivity and aliasing; use SSA and points-to info.
  • Profile and cache results; incremental analysis helps IDEs and JITs.

Exercises

  1. Implement reaching definitions and liveness on a small IR; visualize IN/OUT sets.
  2. Implement SCCP and compare with classic constant propagation.
  3. Add available expressions and perform common subexpression elimination.
Dataflow turns program structure into facts. With good lattices and transfer functions, powerful optimizations emerge.