,

Contents · Selection (Quickselect, Median of Medians)


Overview and Problem Statement

  • Selection asks for the element with rank k (0-index or 1-index) in an array, e.g., the median (k = ⌊n/2⌋).
  • Goals: linear time O(n), in-place, and ideally small constant factors.
  • Two canonical approaches: randomized pivot (Quickselect) and deterministic pivot (Median of Medians).

Quickselect (Average-case Linear)

// Quickselect: expected O(n) using a random or median-of-3 pivot
function partition(arr, lo, hi, pivotIndex) {
  const pivot = arr[pivotIndex];
  [arr[pivotIndex], arr[hi]] = [arr[hi], arr[pivotIndex]]; // move pivot to end
  let store = lo;
  for (let i = lo; i < hi; i++) if (arr[i] < pivot) {
    [arr[store], arr[i]] = [arr[i], arr[store]]; store++;
  }
  [arr[store], arr[hi]] = [arr[hi], arr[store]];
  return store;
}

function quickselect(arr, k) {
  let lo = 0, hi = arr.length - 1;
  while (lo <= hi) {
    const pivotIndex = lo + Math.floor(Math.random() * (hi - lo + 1));
    const p = partition(arr, lo, hi, pivotIndex);
    if (p === k) return arr[p];
    if (k < p) hi = p - 1; else lo = p + 1;
  }
  return undefined; // k out of bounds
}
  • Expected time: O(n) with random pivot; worst case O(n^2) if partitions are highly unbalanced.
  • Pivot choices: random, median-of-3/5, Tukey ninther improve robustness.
  • Stability: selection is not stable; partitions are in-place to save memory.

Median of Medians (Deterministic Linear)

// Deterministic pivot in O(n): groups of 5, recurse on medians
function medianOfMedians(arr, lo = 0, hi = arr.length - 1) {
  const n = hi - lo + 1;
  if (n <= 5) {
    const slice = arr.slice(lo, hi + 1).sort((a, b) => a - b);
    return slice[Math.floor(slice.length / 2)];
  }
  const meds = [];
  for (let i = lo; i <= hi; i += 5) {
    const r = Math.min(i + 4, hi);
    const g = arr.slice(i, r + 1).sort((a, b) => a - b);
    meds.push(g[Math.floor(g.length / 2)]);
  }
  return medianOfMedians(meds, 0, meds.length - 1);
}

function selectDeterministic(arr, k, lo = 0, hi = arr.length - 1) {
  while (true) {
    if (lo === hi) return arr[lo];
    const pivot = medianOfMedians(arr, lo, hi);
    // partition around pivot value (not index), using 3-way partition
    let lt = lo, i = lo, gt = hi;
    while (i <= gt) {
      if (arr[i] < pivot) { [arr[lt], arr[i]] = [arr[i], arr[lt]]; lt++; i++; }
      else if (arr[i] > pivot) { [arr[i], arr[gt]] = [arr[gt], arr[i]]; gt--; }
      else { i++; }
    }
    if (k < lt) hi = lt - 1;
    else if (k > gt) lo = gt + 1;
    else return arr[k];
  }
}
  • Guarantee: O(n) worst case by ensuring a constant fraction is discarded each iteration.
  • Why groups of 5? Ensures at least 3/10 fraction on each side after partitioning (tight recurrence).
  • Practice: constant factors are higher than randomized Quickselect; often combine heuristics.

Engineering Tips and Variants

  • Use 3-way partitioning for inputs with many duplicates to avoid O(n^2) behavior.
  • Switch to insertion sort for tiny partitions (threshold ~16) to reduce overhead.
  • For top-k, use partial selection with a max-heap of size k or nth_element-style selection plus sort the top portion.
  • Streaming/online: maintain two heaps to track running median in O(log n) per update.

Exercises

  1. Implement randomized Quickselect and measure partition sizes across many runs; plot distribution.
  2. Implement deterministic Median of Medians, counting comparisons; compare runtime vs. Quickselect on adversarial arrays.
  3. Extend Quickselect to support selection by a custom comparator and test stability with many equal keys.
Randomized quickselect is often fastest in practice; Median of Medians guarantees linear worst-case time.