A novel, simple interpretation of Nesterov’s accelerated method as a combination of gradient and mirror descent
Citation (published version): L Orecchia. "A Novel, Simple Interpretation of Nesterov’s Accelerated Method as a Combination of Gradient and Mirror Descent."
First-order methods play a central role in large-scale convex optimization. Despite their many formulations and applications, such methods fundamentally rely on two basic types of analysis: gradient-descent analysis, which yields primal progress, and mirror-descent analysis, which yields dual progress. In this paper, we observe that the performance of these two analyses is complementary, so that faster algorithms can be designed by coupling the two analyses and their corresponding descent steps. In particular, we show how to obtain a conceptually simple reinterpretation of Nesterov's accelerated gradient method [Nes83, Nes04, Nes05]. Nesterov's method is the optimal first-order method for the class of smooth convex optimization problems. However, the proof of the fast convergence of Nesterov's method has no clear interpretation and is regarded by some as relying on an "algebraic trick". We apply our novel insights to express Nesterov's algorithm as a coupling of gradient descent and mirror descent; as a result, the convergence proof can be understood as a natural combination of the two underlying convergence analyses. We believe that this complementary view of the two types of analysis may not only facilitate the study of Nesterov's method in a white-box manner, so that it can be applied to problems outside its original scope, but also let us design better first-order methods in a conceptually easier way.
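To make the coupling idea concrete, here is a minimal Python sketch, not the paper's own code. It assumes an unconstrained problem, a Euclidean mirror map (so the mirror step reduces to a plain gradient step with a longer step size), and the step schedule alpha = (k + 2) / (2L), tau = 1 / (alpha * L), which is one commonly stated choice for this kind of scheme. The function names and the toy quadratic are hypothetical, chosen only for illustration.

```python
def linear_coupling(grad, x0, L, num_iters):
    """Couple a gradient step (sequence y) with a mirror step (sequence z).

    Assumptions: unconstrained domain, Euclidean mirror map, L-smooth convex f.
    """
    y = list(x0)  # gradient-descent sequence (primal progress)
    z = list(x0)  # mirror-descent sequence (dual progress)
    for k in range(num_iters):
        alpha = (k + 2) / (2.0 * L)  # assumed mirror step size schedule
        tau = 1.0 / (alpha * L)      # coupling weight, equals 2 / (k + 2)
        # Query point: convex combination of the two sequences.
        x = [tau * zi + (1.0 - tau) * yi for zi, yi in zip(z, y)]
        g = grad(x)
        # Gradient step with step size 1/L from the query point.
        y = [xi - gi / L for xi, gi in zip(x, g)]
        # Mirror step; with the Euclidean map this is z - alpha * g.
        z = [zi - alpha * gi for zi, gi in zip(z, g)]
    return y


def gradient_descent(grad, x0, L, num_iters):
    """Plain gradient descent with step size 1/L, used as a baseline."""
    x = list(x0)
    for _ in range(num_iters):
        g = grad(x)
        x = [xi - gi / L for xi, gi in zip(x, g)]
    return x


# Hypothetical toy problem: f(x) = 0.5 * sum(d_i * x_i^2), minimized at 0.
d = [1.0, 0.1, 0.01, 0.001]
L = max(d)  # smoothness constant of this quadratic


def f(x):
    return 0.5 * sum(di * xi * xi for di, xi in zip(d, x))


def grad_f(x):
    return [di * xi for di, xi in zip(d, x)]


x0 = [1.0, 1.0, 1.0, 1.0]
T = 100
y_coupled = linear_coupling(grad_f, x0, L, T)
y_plain = gradient_descent(grad_f, x0, L, T)
print("coupled:", f(y_coupled), "plain:", f(y_plain))
```

Under these assumptions the coupled iterate should satisfy a bound of order L * ||x0 - x*||^2 / T^2 on the objective gap, compared with the 1/T rate of plain gradient descent; on this toy problem the difference is visible after 100 iterations.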