Optimizing First-Order Method Parameters via Differentiation of the Performance Estimation Problem

University essay from Lund University, Department of Automatic Control (Institutionen för reglerteknik)

Author: Anton Åkerman; [2023]

Keywords: Technology and Engineering

Abstract: This thesis treats the problem of finding optimal parameters for first-order optimization methods. One ingredient is the Performance Estimation Problem (PEP), a framework for convergence analysis of first-order optimization methods; the fundamental idea of the PEP is to formulate the problem of finding the worst-case convergence rate of a first-order algorithm as an optimization problem. The other ingredient is recent methods for differentiating convex optimization problems. The goal is to explore the use of gradient-based methods for finding optimal parameters of first-order methods within the context of the Performance Estimation Problem. By differentiating the PEP, we obtain gradients that can be used to search for optimal method parameters. We consider the state-space representation of first-order methods, a class that includes many well-known first-order operator splitting methods. We propose a gradient-based algorithm for optimizing first-order method parameters, based on the differentiation algorithm from [Agrawal et al., 2020] and the PEP representations from [Upadhyaya et al., 2023], and obtain decent results. This is a heuristic approach to a non-convex optimization problem, but it works well for the Douglas–Rachford and Davis–Yin operator splitting methods. The results appear to agree with the theoretically optimal parameters for the Douglas–Rachford method, and the convergence rates obtained for the Davis–Yin method improve on those found in [Upadhyaya et al., 2023] using fixed parameters. The presented results concern only these two methods, but the proposed algorithm is general. Based on limited testing, the problem appears sensitive to numerical inaccuracy; as a consequence, our approach, which uses more exact gradients, appears to outperform the built-in solver from SciPy, which uses approximate gradients, terminating faster with comparable or better accuracy.
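To illustrate the overall structure (solve an inner worst-case problem, differentiate through the solver, take a gradient step on the method parameter), here is a minimal sketch. It is not the thesis's algorithm: instead of the PEP semidefinite program for an operator splitting method, it uses a toy stand-in, tuning the step size of a plain gradient step on mu-strongly convex, L-smooth quadratics, whose worst-case squared contraction factor is the optimal value of a small convex program. It assumes cvxpylayers (a library built on the cone-program differentiation of Agrawal et al.) and PyTorch are installed; the names mu, L, and theta are illustrative.

```python
# A minimal sketch, assuming cvxpylayers and PyTorch are available.
import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

mu, L = 1.0, 10.0          # curvature bounds of the toy function class

# Toy stand-in for a PEP: the worst-case squared contraction factor of
# x+ = x - t*grad f(x) over quadratics with curvature in [mu, L] is the
# optimal value of this small convex program in (s, u).
t = cp.Parameter()         # the method parameter (step size) being tuned
s = cp.Variable()          # worst-case squared contraction factor
u = cp.Variable(2)         # residuals 1 - t*lambda at the extreme curvatures

constraints = [
    u[0] == 1 - mu * t,    # affine in the parameter, so the problem is DPP
    u[1] == 1 - L * t,
    cp.square(u[0]) <= s,
    cp.square(u[1]) <= s,
]
problem = cp.Problem(cp.Minimize(s), constraints)
layer = CvxpyLayer(problem, parameters=[t], variables=[s])

# Outer loop: gradient-based search over the step size.
theta = torch.tensor(0.05, requires_grad=True)
opt = torch.optim.Adam([theta], lr=0.01)
for _ in range(300):
    opt.zero_grad()
    (rho,) = layer(theta)  # solve the inner problem, keeping it differentiable
    rho.backward()         # gradient of the worst-case rate w.r.t. the step size
    opt.step()

print(theta.item())        # drifts toward the known optimum 2/(mu + L) ≈ 0.182
```

In the thesis the inner problem is instead the PEP for the state-space representation of the method, with several parameters tuned at once, but the outer loop has this same solve-differentiate-step structure.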
