The course will cover techniques for unconstrained and constrained
convex optimization, together with a practical introduction to convex
duality. The focus will be on (1) formulating convex optimization
problems and studying their properties; (2) presenting and
understanding optimization approaches; and (3) understanding the dual
problem. A limited amount of theoretical analysis of the convergence
properties of these methods will be presented. Examples will be drawn
mostly from data fitting, statistics, and machine learning.
Specific Topics:
Formalization of optimization problems
Smooth unconstrained optimization: Gradient Descent, Conjugate Gradient Descent, Newton and Quasi-Newton methods; Line Search methods
Standard formulations of constrained optimization: Linear, Quadratic, Conic and Semi-Definite Programming
KKT optimality conditions
Lagrangian duality, constraint qualification, weak and strong duality
Fenchel conjugacy and its relationship to Lagrangian duality
Multi-objective optimization
Equality constrained Newton method
Log Barrier (Central Path) methods, and Primal-Dual optimization methods
Overview of the simplex method as an example of an active set method
Overview of the first-order oracle model, subgradient and separation oracles, and the Ellipsoid algorithm
There will be roughly bi-weekly homework assignments, counting
toward 30% of the grade. Assignments must be typed (not handwritten)
and submitted electronically in PDF. Collaboration on the homework is
encouraged, but each student must write up the final solutions
independently.
There will also be several MATLAB programming and experimentation
assignments, counting toward another 20% of the grade.
The remaining 50% of the grade will be based on a final exam.
Lectures and Required Reading:
Lecture I: Monday March 26th
Intro to Course
Formulation of Optimization Problems
Feasibility, Optimality, Sub-optimality
Boyd and Vandenberghe Sections 1.1-1.4
Lecture II: Wednesday March 28th
Affine & Convex Sets, and operations that preserve convex sets
Affine & Convex Functions
Separating and Supporting Hyperplanes, Halfspaces, Polyhedra
Notions of Separation
Alpha-sublevel sets, epigraphs
Boyd and Vandenberghe Sections 2.1-2.3, 2.5, 3.1-3.2
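For reference, the two definitions that this lecture builds on, stated in standard notation consistent with the Boyd and Vandenberghe reading: a set C is convex if

\[ \theta x + (1-\theta)\, y \in C \quad \text{for all } x, y \in C,\ \theta \in [0,1], \]

and a function f is convex if its domain is convex and

\[ f(\theta x + (1-\theta)\, y) \;\le\; \theta f(x) + (1-\theta) f(y) \quad \text{for all such } x, y, \theta. \]

The alpha-sublevel sets and the epigraph of a convex function are themselves convex sets, which is what ties the set and function material of this lecture together.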
Lecture III: Friday March 30th
Unconstrained Optimization: Descent Methods; Descent Direction
Gradient Descent
Line Search: Exact Line Search and Backtracking Line Search
Analysis of Gradient Descent with Exact Line Search
Boyd and Vandenberghe Sections 9.1-9.3
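To make the descent framework of Lecture III concrete, here is a minimal MATLAB sketch of gradient descent with backtracking line search, in the spirit of the algorithm in B&V Section 9.3. The function handles f and grad and the parameters alpha, beta, tol, and maxit are illustrative placeholders, not code provided by the course.

function x = gradient_descent(f, grad, x, alpha, beta, tol, maxit)
% f, grad: handles returning the objective value and gradient at x
% alpha, beta: backtracking parameters (e.g. alpha = 0.3, beta = 0.8)
for k = 1:maxit
    g = grad(x);
    if norm(g) <= tol, break; end      % stop when the gradient is small
    t = 1;                             % backtracking line search
    while f(x - t*g) > f(x) - alpha*t*(g'*g)
        t = beta*t;
    end
    x = x - t*g;                       % damped gradient step
end
end

With exact line search, t would instead be chosen to minimize f(x - t*g) exactly; that is the variant analyzed in this lecture.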
Lecture IV: Monday April 2nd
Analysis of Gradient Descent with Backtracking Line Search
Problems with badly conditioned functions; Preconditioning
Preconditioned Gradient Descent
Newton's Method
Analysis of Newton's Method
Self-Concordance analysis and Newton's Method
Boyd and Vandenberghe Sections 9.5-9.6
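A corresponding MATLAB sketch of the damped Newton method analyzed in Lecture IV, using the Newton-decrement stopping rule of B&V Section 9.5; the handles and parameters are again placeholders rather than course-provided code.

function x = newton_method(f, grad, hess, x, alpha, beta, tol, maxit)
% grad, hess: handles returning the gradient vector and Hessian matrix
for k = 1:maxit
    g  = grad(x);
    H  = hess(x);
    dx = -H \ g;                       % Newton direction: solve H*dx = -g
    lambda2 = -g' * dx;                % squared Newton decrement g'*inv(H)*g
    if lambda2/2 <= tol, break; end    % B&V stopping criterion
    t = 1;                             % backtracking line search
    while f(x + t*dx) > f(x) + alpha*t*(g'*dx)
        t = beta*t;
    end
    x = x + t*dx;
end
end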
Lecture V: Wednesday April 4th
Derivative, Differential.
Directional derivative, Jacobian, gradient.
Hessian.
First and Second Order Taylor Expansion.
Steepest Descent Direction.
Boyd and Vandenberghe Section A.4
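The calculus objects of Lecture V come together in the second-order Taylor expansion used throughout the unconstrained-optimization lectures:

\[ f(x + v) \;\approx\; f(x) + \nabla f(x)^T v + \tfrac{1}{2}\, v^T \nabla^2 f(x)\, v. \]

The directional derivative of f at x in the direction v is \nabla f(x)^T v, and the steepest descent direction with respect to the Euclidean norm is along -\nabla f(x), the direction minimizing this inner product over the unit ball.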
Lecture VI: Friday April 6th
Conjugate Gradient Descent
Quasi-Newton Methods: BFGS
Summary of Unconstrained Optimization
Constrained Optimization: Intro
Nocedal and Wright Sections 5.1, 6.1
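As a concrete reference point for Lecture VI, here is a minimal MATLAB sketch of the linear conjugate gradient method for the quadratic objective (1/2) x'Ax - b'x with A symmetric positive definite; the nonlinear conjugate gradient and BFGS methods in the Nocedal and Wright reading extend the same conjugate-direction idea to general smooth objectives. The interface is illustrative only.

function x = conjgrad(A, b, x, tol, maxit)
% Linear conjugate gradient: minimize (1/2)*x'*A*x - b'*x, i.e. solve A*x = b,
% for symmetric positive definite A.
r  = b - A*x;                          % residual = negative gradient
p  = r;                                % first search direction
rs = r'*r;
for k = 1:maxit
    Ap = A*p;
    a  = rs / (p'*Ap);                 % exact minimizer along p
    x  = x + a*p;
    r  = r - a*Ap;
    rs_new = r'*r;
    if sqrt(rs_new) <= tol, break; end
    p  = r + (rs_new/rs)*p;            % next A-conjugate direction
    rs = rs_new;
end
end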
Lecture VII: Wednesday April 11th
Constrained Optimization: Formulation and Optimality Condition
Linear Programming
The Lagrangian, the dual problem, weak duality
Slater's condition and strong duality
Boyd and Vandenberghe Sections 4.2-4.3, 5.1, 5.2, 5.4, 5.5.1
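The central objects of Lecture VII, in the notation of B&V Chapter 5: for the problem

\[ \min_x \; f_0(x) \quad \text{s.t.} \quad f_i(x) \le 0,\ i=1,\dots,m, \qquad h_j(x) = 0,\ j=1,\dots,p, \]

the Lagrangian and the dual function are

\[ L(x,\lambda,\nu) = f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) + \sum_{j=1}^p \nu_j h_j(x), \qquad g(\lambda,\nu) = \inf_x L(x,\lambda,\nu). \]

Weak duality states that g(\lambda,\nu) \le p^\star for every \lambda \ge 0; Slater's condition (strict feasibility of a convex problem) guarantees strong duality, i.e. the optimal dual value equals p^\star.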
Lecture VIII: Friday April 13th
Dual of a linear program
Dual of a non-convex problem: max-cut clustering
Dual of least-squares solution to under-determined linear system
Least-squares solution: recovering the primal optimum from the dual optimum
Complementary slackness (5.5.2)
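A worked instance of this lecture's least-squares example (B&V Section 5.1.3): for the under-determined problem

\[ \min_x \; x^T x \quad \text{s.t.} \quad Ax = b, \]

the Lagrangian is L(x,\nu) = x^T x + \nu^T(Ax - b); minimizing over x gives x = -\tfrac{1}{2} A^T \nu, so the dual problem is

\[ \max_\nu \; g(\nu) = -\tfrac{1}{4}\, \nu^T A A^T \nu - b^T \nu, \]

and the primal optimum is recovered from the dual optimum as x^\star = -\tfrac{1}{2} A^T \nu^\star.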
Lecture IX: Wednesday April 18th
KKT conditions (5.5.3)
Complementary slackness (5.5.2)
Examples: Max-flow/min-cut, large-margin linear discrimination
Least-squares solution: recovering the primal optimum from the dual optimum
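For reference, the KKT conditions of Lecture IX, for a problem written as in Lecture VII: for a convex problem with differentiable objective and constraint functions for which strong duality holds, x^\star and (\lambda^\star, \nu^\star) are primal and dual optimal if and only if

\[ f_i(x^\star) \le 0, \quad h_j(x^\star) = 0, \quad \lambda_i^\star \ge 0, \quad \lambda_i^\star f_i(x^\star) = 0, \]
\[ \nabla f_0(x^\star) + \sum_i \lambda_i^\star \nabla f_i(x^\star) + \sum_j \nu_j^\star \nabla h_j(x^\star) = 0. \]

The condition \lambda_i^\star f_i(x^\star) = 0 is complementary slackness; the last condition is stationarity of the Lagrangian at x^\star.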
Lecture X: Friday April 20th
Fenchel conjugate, properties of Fenchel conjugate
Examples on norm, linear functions
Expressing the dual in terms of Fenchel conjugates (Sections 3.3, 5.1.6, 5.7.1)
Example: Logistic regression
Minimum volume covering ellipsoid and the log-det function
Matrix inequalities, semi-definite constraints, and semi-definite programming (Section 4.6)
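The Fenchel conjugate at the center of Lecture X, with the norm example mentioned above (B&V Section 3.3):

\[ f^*(y) = \sup_x \bigl( y^T x - f(x) \bigr), \]

which is convex in y regardless of whether f is. For f(x) = \|x\|, the conjugate is the indicator of the dual-norm unit ball:

\[ f^*(y) = \begin{cases} 0 & \|y\|_* \le 1, \\ +\infty & \text{otherwise.} \end{cases} \]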
Lecture XI: Wednesday April 25th
Cones, Dual Cones
K-convexity, generalized inequalities
Generalized inequality optimization problems
Semidefinite programming
Example: Distance metric learning
KKT conditions with generalized inequalities
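The two standard semidefinite programming forms referred to in Lectures X-XII, in B&V's notation (Section 4.6.2): the inequality form with a linear matrix inequality,

\[ \min_x \; c^T x \quad \text{s.t.} \quad x_1 F_1 + \cdots + x_n F_n + G \preceq 0, \]

and the standard form over the positive semidefinite cone,

\[ \min_X \; \mathrm{tr}(C X) \quad \text{s.t.} \quad \mathrm{tr}(A_i X) = b_i,\ i = 1,\dots,p, \qquad X \succeq 0. \]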
Lecture XII: Friday April 27th
Complementary slackness and KKT optimality conditions for semi-definite constraints (Section 5.9)
Norm-constrained matrix factorization as an example of an SDP and its dual.
Equality constrained Newton's method (10.1-10.2).
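For the equality-constrained Newton method of Sections 10.1-10.2, applied to minimizing f(x) subject to Ax = b from a feasible point, each iteration solves the KKT system

\[ \begin{bmatrix} \nabla^2 f(x) & A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} \Delta x_{\mathrm{nt}} \\ w \end{bmatrix} = \begin{bmatrix} -\nabla f(x) \\ 0 \end{bmatrix}, \]

where w is the associated dual variable; the second block row ensures that the Newton step \Delta x_{\mathrm{nt}} keeps the iterate feasible.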
Lecture XIII: Monday April 30th
Optimization of problems with implicit constraints
Log-barrier problems and the central path interior point method (Sections 11.1, 11.2, 11.3.1)
Analysis of the log-barrier central path method (11.5)
Log-barrier method for semi-definite constraints (11.6).
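In the notation of B&V Chapter 11, the central path of Lecture XIII is built from the logarithmic barrier for the inequality constraints f_i(x) \le 0:

\[ \phi(x) = -\sum_{i=1}^m \log\bigl(-f_i(x)\bigr), \qquad x^\star(t) = \arg\min_x \; \bigl( t f_0(x) + \phi(x) \bigr), \]

with the duality-gap guarantee f_0(x^\star(t)) - p^\star \le m/t. The barrier method therefore alternates a Newton centering step with the update t \leftarrow \mu t for some \mu > 1, stopping once m/t \le \epsilon.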
Lecture XIV: Friday May 4th
Feasibility problems and Phase I methods (Sections 11.4.1, 11.4.3).
Reducing Optimization to feasibility.
Log-barrier and weakened KKT conditions (10.3.4).
Primal-Dual Interior Point Methods, including infeasible start (11.7).
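The basic Phase I problem of Section 11.4.1, which reduces testing feasibility of the inequalities f_i(x) \le 0 to an optimization problem:

\[ \min_{x,\,s} \; s \quad \text{s.t.} \quad f_i(x) \le s, \; i = 1,\dots,m. \]

The original inequalities are strictly feasible if and only if the optimal value is negative, and any feasible (x, s) with s < 0 provides a strictly feasible starting point for the barrier method.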
Lecture XV: Wednesday May 9th
Multi-objective problems and Pareto optimality (Boyd and Vandenberghe Section 4.7)
The Simplex Method (Nocedal and Wright Chapter 13)
Lecture XVI: Friday May 11th
The first-order oracle model: subgradients and separation oracles.
Classical first-order methods: the center-of-mass method, the oracle lower bound, and the Ellipsoid method (Chapters 1-3 of Nemirovski's "Efficient Methods in Convex Programming").
Lecture XVII: Wednesday May 16th
Review and limitations of methods with polynomial convergence guarantees. Dual norm, Lipschitz continuity.
Bundle method, Trust Region Newton.
Subgradient descent with projection, step size, and analysis for Lipschitz functions over a bounded domain (Section 5.3.1 of Nemirovski's "Lectures on Modern Convex Optimization").
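A minimal MATLAB sketch of the projected subgradient method analyzed in that reading; the handles subgrad and proj and the constants R (a bound on the domain radius), L (a Lipschitz constant), and T (the number of iterations) are assumptions introduced here for illustration.

function xbar = proj_subgradient(subgrad, proj, x, R, L, T)
% Projected subgradient method for an L-Lipschitz convex function over a
% closed convex set of radius R. subgrad(x) returns any subgradient at x;
% proj(y) is the Euclidean projection onto the feasible set.
t = R / (L * sqrt(T));                 % fixed step size
xsum = zeros(size(x));
for k = 1:T
    g = subgrad(x);
    x = proj(x - t*g);                 % subgradient step, then project
    xsum = xsum + x;
end
xbar = xsum / T;                       % averaged iterate
end

With this step size and iterate averaging, the accuracy after T steps is O(RL/sqrt(T)), the standard guarantee for Lipschitz functions over a bounded domain.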
Lecture XVIII: Friday May 18th
More on subgradient descent, step-size selection, and handling constraints.
Mirror Descent formulation and basic concepts: strong convexity with respect to a norm, dual norm, distance-generating function, and Bregman divergence (Section 5.4.1 of Nemirovski's "Lectures on Modern Convex Optimization").
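In terms of the objects introduced in this lecture: given a distance-generating function \omega that is strongly convex with respect to the chosen norm, the Bregman divergence and the mirror descent step are

\[ D_\omega(x, y) = \omega(x) - \omega(y) - \nabla\omega(y)^T (x - y), \qquad x_{k+1} = \arg\min_{x \in X} \Bigl\{ \gamma_k\, g_k^T x + D_\omega(x, x_k) \Bigr\}, \]

where g_k is a subgradient at x_k and \gamma_k a step size; with \omega(x) = \tfrac{1}{2}\|x\|_2^2 this reduces to the projected subgradient method of the previous lecture.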