,

Contents · Optimization basics (convexity, gradients)


Overview

Optimization studies minimizing or maximizing functions subject to constraints. Convex problems have no spurious local minima. Gradients and first-order methods are the workhorses for large-scale optimization.


Details

  • Convex sets and functions: Jensen's inequality; first/second-order characterizations.
  • Strong convexity and smoothness (L-Lipschitz gradients); condition numbers.
  • Unconstrained optima: ∇f(x*) = 0; Hessian tests; saddle points.
  • Gradient descent: step size, line search, convergence rates for convex/smooth/strongly convex.
  • Variants: momentum, Nesterov acceleration, adaptive methods (AdaGrad/Adam) basics.
  • Constraints preview: Lagrangian, KKT conditions (see separate topic).

Exercises

  1. Prove that a function with L-Lipschitz gradient satisfies the descent lemma.
  2. Show that for μ-strongly convex and L-smooth f, gradient descent with step 1/L converges linearly.
  3. Implement gradient descent on a quadratic and compare convergence with and without momentum.