EE 227C (Spring 2018)
Convex Optimization and Approximation
University of California, Berkeley
Time: TuTh 12:30PM - 1:59PM, Location: Etcheverry 3106
Instructor: Moritz Hardt (Email: hardt+ee227c@berk…edu)
Graduate Instructor: Max Simchowitz (Email: msimchow@berk…edu).
Office hours: Max on Mondays 3-4pm, Soda 310 (starting 1/29); Moritz on Fridays 9-9:50am, SDH 722
Summary
This course will explore theory and algorithms for nonlinear optimization. We will focus on problems that arise in machine learning and modern data analysis, paying attention to concerns about complexity, robustness, and implementation in these domains. We will also see how tools from convex optimization can help tackle non-convex optimization problems common in practice.
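To give a concrete flavor of the algorithms in Part I, here is a minimal gradient descent sketch in Python, one of the suggested course languages. It is an illustrative example rather than official course material; the least-squares objective, random data, and step-size choice below are our own assumptions.

```python
import numpy as np

def gradient_descent(grad, x0, step_size, num_steps):
    """Fixed-step gradient descent: x_{t+1} = x_t - eta * grad(x_t)."""
    x = x0
    for _ in range(num_steps):
        x = x - step_size * grad(x)
    return x

# Toy smooth convex problem (illustrative): minimize f(x) = 0.5 * ||Ax - b||^2.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
b = rng.standard_normal(50)

grad = lambda x: A.T @ (A @ x - b)  # gradient of f
L = np.linalg.norm(A, 2) ** 2       # smoothness constant: largest eigenvalue of A^T A
x_hat = gradient_descent(grad, np.zeros(10), 1.0 / L, 500)

print(np.linalg.norm(grad(x_hat)))  # near zero once the iterates approach the minimizer
```

The fixed step size 1/L for an L-smooth objective is the classical choice; the early lectures derive the corresponding convergence rates for smooth and strongly convex functions.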
Course notes
Course notes will be publicly available. Participants will collaboratively create and maintain notes over the course of the semester using git. See this repository for source files.
All course materials are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. Feel free to use, share, and adapt for your own non-commercial purposes.
All available lecture notes (pdf)
See individual lectures below. These notes likely contain mistakes. If you spot any, please send an email or open a pull request.
Schedule
# | Date | Topic | Code
---|---|---|---
Part I: Basic gradient methods | | |
1 | 1/16 | Convexity | ipynb
2 | 1/18 | Gradient method (non-smooth and smooth) | —
3 | 1/23 | Gradient method (strongly convex) | —
4 | 1/25 | Some applications of gradient methods | ipynb
5 | 1/30 | Conditional gradient (Frank-Wolfe algorithm) | ipynb
Part II: Krylov methods and acceleration | | |
6 | 2/1 | Discovering acceleration with Chebyshev polynomials | ipynb
7 | 2/6 | Eigenvalue intermezzo | —
8 | 2/8 | Nesterov’s accelerated gradient descent | —
9 | 2/13 | Lower bounds, robustness vs. acceleration | py
Part III: Stochastic optimization | | |
10 | 2/15 | Stochastic optimization | —
11 | 2/20 | Learning, regularization, and generalization | —
12 | 2/22 | Coordinate descent (guest lecture by Max Simchowitz) | —
Part IV: Dual methods | | |
13 | 2/27 | Duality theory | —
14 | 3/1 | Dual decomposition, method of multipliers | —
15 | 3/6 | Stochastic dual coordinate ascent | —
16 | 3/8 | Backpropagation and adjoints | —
Part V: Non-convex problems | | |
17 | 3/13 | Non-convex problems | —
18 | 3/15 | Saddle points | —
19 | 3/20 | Alternating minimization and expectation maximization | ipynb
20 | 3/22 | Derivative-free optimization, policy gradient, controls | ipynb
21 | 4/3 | Non-convex constraints I (guest lecture by Ludwig Schmidt) | —
22 | 4/5 | Non-convex constraints II (guest lecture by Ludwig Schmidt) | ipynb
Part VI: Higher-order and interior point methods | | |
23 | 4/10 | Newton’s method | —
24 | 4/12 | Experimenting with second-order methods | ipynb
25 | 4/17 | Enter interior point methods | —
26 | 4/19 | Primal-dual interior point methods | —
27 | 4/24 | Ellipsoid method | —
28 | 4/26 | Submodular functions, Lovász extension | —
29 | 5/1 | Reading, review, recitation | —
30 | 5/3 | Reading, review, recitation | —
Sign up for scribing here
All three scribes should collaborate to produce a single .tex file, as seen here. Students are required to closely follow these instructions.
We suggest that each scribe take notes during lecture, and that all three meet after class to consolidate.
Assignments
Assignments will be posted on Piazza. If you haven’t already, sign up here. Homework will be assigned roughly every two weeks, and 2–3 problems from each set will be selected for grading (we will not tell you which ones in advance). Assignments should be submitted through Gradescope; the course is listed as EE227C, which you may join with entry code 9P5NDV. All homework should be typeset in LaTeX. Students are permitted two unexcused late assignments (up to a week late); students requesting additional extensions should email Max.
Grading
Grading policy: 50% homeworks, 10% scribing, 20% midterm exam, 20% final exam.
Background
The prerequisites are previous coursework in linear algebra, multivariable calculus, and probability and statistics. Coursework or background in optimization theory, as covered in EE227BT, is highly recommended. The class will involve some basic programming. Students are encouraged to use either Julia or Python; we discourage the use of MATLAB.
Material
- Nonlinear Programming (3rd edition). D. P. Bertsekas. Athena Scientific, Belmont, Massachusetts.
- Numerical Optimization (2nd edition). J. Nocedal and S. J. Wright. Springer Series in Operations Research, Springer-Verlag, New York, 2006.
- Convex Optimization. S. Boyd and L. Vandenberghe. Cambridge University Press, Cambridge, 2004. PDF available here.
- Introductory Lectures on Convex Optimization: A Basic Course. Y. Nesterov. Kluwer, 2004.
- Convex Optimization: Algorithms and Complexity. S. Bubeck. PDF available here.
- Efficient Methods in Convex Programming. A. Nemirovski. Lecture notes; PDF available here.