Stochastic Frank-Wolfe Algorithm : Uniform Sampling Without Replacement

University essay from Umeå universitet/Institutionen för matematik och matematisk statistik

Abstract: The Frank-Wolfe (FW) optimization algorithm has gained popularity in recent years due to its projection-free property, with typical applications in machine learning. In the stochastic setting, it remains relatively understudied compared to the more expensive, projection-based method of Stochastic Gradient Descent (SGD). We propose a novel Stochastic Frank-Wolfe (SFW) algorithm, inspired by SGDo, in which uniform random sampling is performed without replacement. In analogy to that abbreviation, we call the proposed algorithm SFWo. We consider a convex setting of gradient-Lipschitz (smooth) functions over compact domains. Depending on the experiment design (LASSO, Matrix Sensing, Matrix Completion), the SFWo algorithm exhibits faster or matching empirical convergence, with a tendency toward suboptimality bounded by that of its precursor SFW. Benchmarks on both synthetic and real-world data show that SFWo improves on the number of stochastic gradient evaluations needed to achieve the same guarantee as SFW.
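
To illustrate the core idea summarized above, here is a minimal Python sketch of a stochastic Frank-Wolfe iteration with uniform minibatch sampling without replacement, applied to a LASSO problem over an l1-ball. The function name, the reshuffle-per-epoch batching, the 2/(t+2) step size, and the problem instance are illustrative assumptions, not the thesis's exact formulation or schedule.

```python
import numpy as np

def sfw_without_replacement(A, b, tau, epochs=50, batch_size=32, seed=0):
    """Sketch of a stochastic Frank-Wolfe method with uniform sampling
    WITHOUT replacement (SFWo-style), applied to the LASSO problem
        min 0.5/n * ||Ax - b||^2   s.t.   ||x||_1 <= tau.
    Step-size schedule and batching are assumptions for illustration."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    t = 0
    for _ in range(epochs):
        perm = rng.permutation(n)  # reshuffle once per epoch
        for start in range(0, n, batch_size):
            # Disjoint minibatches: no data point repeats within an epoch,
            # which is the "without replacement" ingredient.
            idx = perm[start:start + batch_size]
            Ab, bb = A[idx], b[idx]
            g = Ab.T @ (Ab @ x - bb) / len(idx)  # stochastic gradient estimate
            # Linear minimization oracle over the l1-ball of radius tau:
            # the minimizer of <g, s> is a signed, scaled coordinate vertex.
            i = np.argmax(np.abs(g))
            s = np.zeros(d)
            s[i] = -tau * np.sign(g[i])
            gamma = 2.0 / (t + 2)            # classic FW step size
            x = (1 - gamma) * x + gamma * s  # convex combination stays feasible
            t += 1
    return x

# Example usage on synthetic data (hypothetical instance):
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 50))
x_true = np.zeros(50)
x_true[:5] = 1.0
b = A @ x_true + 0.01 * rng.standard_normal(200)
x_hat = sfw_without_replacement(A, b, tau=5.0)
```

Note that no projection is ever computed: each update is a convex combination of the current iterate and a vertex of the feasible set, which is what makes FW-type methods attractive on constrained problems where projections are expensive.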
