Conjugate Gradient Algorithms in Nonconvex Optimization


Book Description

This book details algorithms for large-scale unconstrained and bound-constrained optimization. It presents optimization techniques from the perspective of conjugate gradient algorithms, as well as the methods of shortest residuals developed by the author.




Conjugate Gradient Algorithms and Finite Element Methods


Book Description

The position taken in this collection of pedagogically written essays is that conjugate gradient algorithms and finite element methods complement each other extremely well. By combining them, practitioners have been able to solve complicated direct and inverse multidimensional problems modeled by ordinary or partial differential equations and inequalities, not necessarily linear, with optimal control and optimal design among these problems. The aim of this book is to present both methods in the context of complicated problems modeled by linear and nonlinear partial differential equations, and to provide an in-depth discussion of their implementation aspects. The authors show that conjugate gradient methods and finite element methods apply to the solution of real-life problems. They address graduate students as well as experts in scientific computing.




Algorithms for Smooth Nonconvex Optimization with Worst-case Guarantees


Book Description

The nature of global convergence guarantees for nonconvex optimization algorithms has changed significantly in recent years. New results characterize the maximum computational cost required for algorithms to satisfy approximate optimality conditions, instead of focusing on the limiting behavior of the iterates. In many contexts, such as those arising from machine learning, convergence to approximate second-order points is desired. Algorithms designed for these problems must avoid saddle points efficiently to achieve optimal worst-case guarantees. In this dissertation, we develop and analyze a number of nonconvex optimization algorithms. First, we focus on accelerated gradient algorithms and provide results related to the avoidance of "strict saddle points". In addition, the rate of divergence these accelerated gradient algorithms exhibit when in a neighborhood of strict saddle points is proven. Subsequently, we propose three new algorithms for smooth, nonconvex optimization with worst-case complexity guarantees. The first algorithm is developed for unconstrained optimization and is based on the classical Newton Conjugate Gradient method. This approach is then extended to bound constrained optimization by modifying the primal-log barrier method. Finally, we present a method for a special class of "strict saddle functions" which does not require knowledge of the parameters defining the optimization landscape. These algorithms converge to approximate second-order points in the best known computational complexity for their respective problem classes.




Nonlinear Conjugate Gradient Methods for Unconstrained Optimization


Book Description

Two approaches are known for solving large-scale unconstrained optimization problems: the limited-memory quasi-Newton method (truncated Newton method) and the conjugate gradient method. This is the first book to detail conjugate gradient methods, showing their properties and convergence characteristics as well as their performance in solving large-scale unconstrained optimization problems and applications. Comparisons to the limited-memory and truncated Newton methods are also discussed. Topics studied in detail include: linear conjugate gradient methods, standard conjugate gradient methods, acceleration of conjugate gradient methods, hybrid methods, modifications of the standard scheme, memoryless BFGS preconditioned methods, and three-term methods. Other conjugate gradient methods, based on clustering the eigenvalues or on minimizing the condition number of the iteration matrix, are also treated. For each method, the convergence analysis, the computational performance, and comparisons with other conjugate gradient methods are given. The theory behind the conjugate gradient algorithms, presented as a methodology, is developed with a clear, rigorous, and friendly exposition; the reader will gain an understanding of their properties and their convergence and will learn to develop and prove the convergence of his/her own methods. Numerous numerical studies are supplied with comparisons and comments on the behavior of conjugate gradient algorithms for solving a collection of 800 unconstrained optimization problems of different structures and complexities with the number of variables in the range [1000,10000]. The book is addressed to all those interested in developing and using new advanced techniques for solving complex unconstrained optimization problems. Mathematical programming researchers, theoreticians and practitioners in operations research, practitioners in engineering and industry researchers, as well as graduate students in mathematics and Ph.D. and master students in mathematical programming, will find plenty of information and practical applications for solving large-scale unconstrained optimization problems and applications by conjugate gradient methods.




On the Relationship Between Conjugate Gradient and Optimal First-Order Methods for Convex Optimization


Book Description

In a series of works initiated by Nemirovsky and Yudin, and later extended by Nesterov, first-order algorithms for unconstrained minimization with optimal theoretical complexity bounds have been proposed. On the other hand, conjugate gradient algorithms, among the most widely used first-order techniques, suffer from the lack of a finite complexity bound. In fact, their performance can be quite poor. This dissertation is partly about narrowing the gap between these two classes of algorithms, namely the traditional conjugate gradient methods and optimal first-order techniques. We derive conditions under which conjugate gradient methods attain the same complexity bound as Nemirovsky-Yudin's and Nesterov's methods. Moreover, we propose a conjugate gradient-type algorithm named CGSO, for Conjugate Gradient with Subspace Optimization, which achieves the optimal complexity bound at the price of a little extra computational cost. We extend the theory of CGSO to convex problems with linear constraints. In particular, we focus on solving the $\ell_1$-regularized least-squares problem, often referred to as the Basis Pursuit Denoising (BPDN) problem in the optimization community. BPDN arises in many practical fields including sparse signal recovery, machine learning, and statistics. Solving BPDN is fairly challenging because the size of the involved signals can be quite large; therefore first-order methods are of particular interest for these problems. We propose a quasi-Newton proximal method for solving BPDN. Our numerical results suggest that our technique is computationally effective, and can compete favourably with the other state-of-the-art solvers.




Integer and Nonlinear Programming


Book Description

A NATO Summer School held in Bandol, France, sponsored by the Scientific Affairs Division of NATO.




Evaluation Complexity of Algorithms for Nonconvex Optimization


Book Description

A popular way to assess the “effort” needed to solve a problem is to count how many evaluations of the problem functions (and their derivatives) are required. In many cases, this is the dominating computational cost. Given an optimization problem satisfying reasonable assumptions, and given access to problem-function values and derivatives of various degrees, how many evaluations might be required to approximately solve the problem? Evaluation Complexity of Algorithms for Nonconvex Optimization: Theory, Computation, and Perspectives addresses this question for nonconvex optimization problems, those that may have local minimizers and appear most often in practice. This is the first book on complexity to cover topics such as composite and constrained optimization, derivative-free optimization, subproblem solution, and optimal (lower and sharpness) bounds for nonconvex problems. It is also the first to address the disadvantages of traditional optimality measures and propose useful surrogates leading to algorithms that compute approximate high-order critical points, and to compare traditional and new methods, highlighting the advantages of the latter from a complexity point of view. This is the go-to book for those interested in solving nonconvex optimization problems. It is suitable for advanced undergraduate and graduate students in courses on advanced numerical analysis, data science, numerical optimization, and approximation theory.




Introduction to Methods for Nonlinear Optimization


Book Description

This book has two main objectives:
• to provide a concise introduction to nonlinear optimization methods, which can be used as a textbook at a graduate or upper undergraduate level;
• to collect and organize selected important topics on optimization algorithms, not easily found in textbooks, which can provide material for advanced courses or can serve as a reference text for self-study and research.
The basic material on unconstrained and constrained optimization is organized into two blocks of chapters:
• basic theory and optimality conditions
• unconstrained and constrained algorithms.
These topics are treated in short chapters that contain the most important results in theory and algorithms, in a way that, in the authors’ experience, is suitable for introductory courses. A third block of chapters addresses methods that are of increasing interest for solving difficult optimization problems. Difficulty is typically due to the high nonlinearity of the objective function, ill-conditioning of the Hessian matrix, lack of information on first-order derivatives, or the need to solve large-scale problems. Key subjects addressed in the book include exact penalty functions and exact augmented Lagrangian functions, nonmonotone methods, decomposition algorithms, and derivative-free methods for nonlinear equations and optimization problems. The appendices at the end of the book offer a review of the essential mathematical background, including an introduction to convex analysis that can form part of an introductory course.




Non-convex Optimization for Machine Learning


Book Description

Non-convex Optimization for Machine Learning takes an in-depth look at the basics of non-convex optimization with applications to machine learning. It introduces the rich literature in this area and equips the reader with the tools and techniques needed to apply and analyze simple but powerful procedures for non-convex problems. Non-convex Optimization for Machine Learning is as self-contained as possible while not losing focus on the main topic of non-convex optimization techniques. The monograph initiates the discussion with entire chapters devoted to presenting a tutorial-like treatment of basic concepts in convex analysis and optimization, as well as their non-convex counterparts. The monograph concludes with a look at four interesting applications in the areas of machine learning and signal processing, and explores how the non-convex optimization techniques introduced earlier can be used to solve these problems. The monograph also contains, for each of the topics discussed, exercises and figures designed to engage the reader, as well as extensive bibliographic notes pointing towards classical works and recent advances. Non-convex Optimization for Machine Learning can be used for a semester-length course on the basics of non-convex optimization with applications to machine learning. On the other hand, it is also possible to cherry-pick individual portions, such as the chapter on sparse recovery or the EM algorithm, for inclusion in a broader course. Several courses such as those in machine learning, optimization, and signal processing may benefit from the inclusion of such topics.




Handbook Of Machine Learning - Volume 2: Optimization And Decision Making


Book Description

Building on the first volume, this volume on Optimization and Decision Making covers a range of algorithms and their applications. Like its predecessor, it provides a starting point for machine learning enthusiasts as a comprehensive guide on classical optimization methods. It also provides an in-depth overview of how artificial intelligence can be used to define, disprove or validate economic modeling and decision making concepts.