Boston University Libraries OpenBU

    Accelerated extra-gradient descent: a novel accelerated first-order method

    Date Issued
    2018
    Publisher Version
    https://doi.org/10.4230/LIPIcs.ITCS.2018.23
    Author(s)
    Orecchia, Lorenzo
    Diakonikolas, Jelena
    Permanent Link
    https://hdl.handle.net/2144/38507
    Version
    Published version
    Citation (published version)
    Lorenzo Orecchia, Jelena Diakonikolas. 2018. "Accelerated Extra-Gradient Descent: A Novel Accelerated First-Order Method." 9th Innovations in Theoretical Computer Science Conference (ITCS 2018). https://doi.org/10.4230/LIPIcs.ITCS.2018.23
    Abstract
    We provide a novel accelerated first-order method that achieves the asymptotically optimal convergence rate for smooth functions in the first-order oracle model. To date, Nesterov’s Accelerated Gradient Descent (AGD) and variations thereof have been the only methods achieving acceleration in this standard black-box model. In contrast, our algorithm differs significantly from AGD: it relies on a predictor-corrector approach similar to that used by Mirror-Prox [18] and ExtraGradient Descent [14] for solving convex-concave saddle-point problems. For this reason, we dub our algorithm Accelerated Extra-Gradient Descent (AXGD). Its construction is motivated by discretizing an accelerated continuous-time dynamics [15] using the classical implicit Euler method. Our analysis explicitly shows the effects of discretization through a conceptually novel primal-dual viewpoint. Moreover, we show that the method is quite general: with appropriate choices of step lengths, it attains optimal convergence rates for other classes of objectives, e.g., those with generalized smoothness properties or those that are non-smooth and Lipschitz-continuous. Finally, we present experiments showing that our algorithm matches the performance of Nesterov’s method while appearing more robust to noise in some cases.
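    The abstract describes AXGD only at a high level. As a rough illustration of the predictor-corrector template it refers to, the following is a minimal sketch in the unconstrained Euclidean setting (identity mirror map). The function name axgd_euclidean, the step-size schedule a_k = (k + 1)/(2L), and all implementation details are assumptions chosen for this sketch, not specifics taken from the paper; consult the published version for the actual algorithm and its analysis.

    import numpy as np

    def axgd_euclidean(grad_f, x0, L, num_iters):
        # Hypothetical sketch of an accelerated extra-gradient
        # (predictor-corrector) loop for minimizing an L-smooth convex f.
        #   grad_f    : gradient oracle, x -> grad f(x)
        #   x0        : starting point (NumPy array)
        #   L         : smoothness constant of f
        # The illustrative step sizes a_k = (k + 1)/(2L) satisfy
        # L * a_k**2 <= A_k for all k >= 1, the kind of condition that
        # yields an O(L / k^2) rate for accelerated methods of this family.
        x = x0.copy()   # primal (query) iterate
        z = x0.copy()   # dual iterate (identity mirror map here)
        A = 0.0         # running sum of step sizes a_1 + ... + a_k
        for k in range(1, num_iters + 1):
            a = (k + 1) / (2.0 * L)
            A_next = A + a
            # Prediction: form a trial point and take a gradient step
            # in the dual variable using the gradient at that point.
            x_tilde = (A * x + a * z) / A_next
            z_tilde = z - a * grad_f(x_tilde)
            # Correction: recombine with the predicted dual point, then
            # take the actual dual step at the corrected primal point.
            x = (A * x + a * z_tilde) / A_next
            z = z - a * grad_f(x)
            A = A_next
        return x

    # Usage on a least-squares objective f(x) = 0.5 * ||Q x - b||^2,
    # whose smoothness constant is the largest eigenvalue of Q^T Q.
    rng = np.random.default_rng(0)
    Q = rng.standard_normal((20, 10))
    b = rng.standard_normal(20)
    L = np.linalg.norm(Q, 2) ** 2
    x_best = axgd_euclidean(lambda x: Q.T @ (Q @ x - b), np.zeros(10), L, 500)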
    Rights
    Copyright © Jelena Diakonikolas and Lorenzo Orecchia 2018; licensed under Creative Commons Attribution License (CC-BY)
    Collections
    • CAS: Computer Science: Scholarly Papers [186]
    • BU Open Access Articles [3664]

