Show simple item record

dc.contributor.author	Orecchia, L.	en_US
dc.date.accessioned	2018-06-07T15:00:09Z
dc.date.available	2018-06-07T15:00:09Z
dc.identifier.citation	L Orecchia. "A Novel, Simple Interpretation of Nesterov’s Accelerated Method as a Combination of Gradient and Mirror Descent."	en_US
dc.identifier.uri	https://hdl.handle.net/2144/29257
dc.description.abstract	First-order methods play a central role in large-scale convex optimization. Despite their many forms and applications, such methods fundamentally rely on two basic types of analysis: gradient-descent analysis, which yields primal progress, and mirror-descent analysis, which yields dual progress. In this paper, we observe that the performances of these two analyses are complementary, so that faster algorithms can be designed by coupling the two analyses, and their corresponding descent steps. In particular, we show how to obtain a conceptually simple reinterpretation of Nesterov's accelerated gradient method [Nes83, Nes04, Nes05]. Nesterov's method is the optimal first-order method for the class of smooth convex optimization problems. However, the proof of its fast convergence has no clear interpretation and is regarded by some as relying on an "algebraic trick". We apply our insights to express Nesterov's algorithm as a coupling of gradient descent and mirror descent, so that its convergence proof can be understood as a natural combination of the two underlying convergence analyses. We believe that this complementary view of the two types of analysis may not only facilitate the study of Nesterov's method in a white-box manner, so as to apply it to problems outside its original scope, but also let us design better first-order methods in a conceptually simpler way.	en_US
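The coupling described in the abstract can be sketched, in the simplest Euclidean setting, roughly as follows. This is an illustrative sketch, not the paper's algorithm verbatim: the step-size schedule alpha_k = (k+2)/(2L), the coupling weight tau_k = 1/(alpha_k * L), and the diagonal-quadratic test objective are all assumptions made here for demonstration.

```python
def accelerated_coupling(grad, L, x0, iters):
    """Minimize an L-smooth convex f by coupling a gradient-descent step
    (primal progress, iterate y) with a mirror-descent step (dual
    progress, iterate z), in the spirit of the gradient/mirror coupling
    described above. Euclidean mirror map assumed; a general mirror map
    would replace the z-update with a Bregman projection."""
    y = list(x0)  # iterate driven by gradient-descent steps
    z = list(x0)  # iterate driven by mirror-descent (dual-averaging) steps
    for k in range(iters):
        alpha = (k + 2) / (2.0 * L)  # assumed mirror step-size schedule
        tau = 1.0 / (alpha * L)      # coupling weight in [0, 1]
        # Query the gradient at a convex combination of the two iterates.
        x = [tau * zi + (1 - tau) * yi for zi, yi in zip(z, y)]
        g = grad(x)
        # Gradient-descent step: guaranteed primal (objective) progress.
        y = [xi - gi / L for xi, gi in zip(x, g)]
        # Mirror-descent step: accumulates dual progress (regret bound).
        z = [zi - alpha * gi for zi, gi in zip(z, g)]
    return y

# Usage on a hypothetical diagonal quadratic f(x) = 0.5 * sum(a_i * x_i^2),
# which is L-smooth with L = max(a):
a = [1.0, 10.0]
grad = lambda x: [ai * xi for ai, xi in zip(a, x)]
y = accelerated_coupling(grad, L=10.0, x0=[5.0, 5.0], iters=500)
fval = 0.5 * sum(ai * yi * yi for ai, yi in zip(a, y))
```

The point of the sketch is the structure, not the constants: one sequence makes the guaranteed per-step objective decrease of gradient descent, the other accumulates the dual guarantee of mirror descent, and querying the gradient at their convex combination is what yields the accelerated O(L/k^2) rate.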
dc.subject	Data structures and algorithms	en_US
dc.subject	Numerical analysis	en_US
dc.subject	Optimization and control	en_US
dc.subject	Machine learning	en_US
dc.title	A novel, simple interpretation of Nesterov’s accelerated method as a combination of gradient and mirror descent	en_US
dc.type	Presentation	en_US
pubs.elements-source	manual-entry	en_US
pubs.notes	Embargo: Not known	en_US
pubs.organisational-group	Boston University	en_US
pubs.organisational-group	Boston University, College of Arts & Sciences	en_US
pubs.organisational-group	Boston University, College of Arts & Sciences, Department of Computer Science	en_US

