Integral and Inverse Reinforcement Learning for Optimal Control Systems and Games


Book Description

Integral and Inverse Reinforcement Learning for Optimal Control Systems and Games develops its learning techniques, motivated by applications to autonomous driving and microgrid systems, with breadth and depth. Integral reinforcement learning (RL) achieves model-free control without system estimation, avoiding the inevitable estimation errors of system identification methods, while novel inverse RL methods fill a gap for readers seeking data-driven, model-free solutions to inverse optimization and optimal control, imitation learning, and autonomous driving, among other areas. Graduate students will find that this book offers a thorough introduction to integral and inverse RL for feedback control, covering optimal regulation and tracking, disturbance rejection, and multiplayer and multiagent systems. For researchers, it provides a combination of theoretical analysis, rigorous algorithms, and a wide-ranging selection of examples. The book equips practitioners working in various domains – aircraft, robotics, power systems, and communication networks among them – with theoretical insights valuable in tackling the real-world challenges they face.




Handbook of Reinforcement Learning and Control


Book Description

This handbook presents state-of-the-art research in reinforcement learning, focusing on its applications in the control and game theory of dynamic systems and on future directions for related research and technology. The contributions gathered in this book deal with challenges faced when using learning and adaptation methods to solve academic and industrial problems, such as optimization in dynamic environments with single and multiple agents, convergence and performance analysis, and online implementation. They explore means by which these difficulties can be solved, and cover a wide range of related topics including: deep learning; artificial intelligence; applications of game theory; mixed modality learning; and multi-agent reinforcement learning. Practicing engineers and scholars in the fields of machine learning, game theory, and autonomous control will find the Handbook of Reinforcement Learning and Control to be thought-provoking, instructive, and informative.




Reinforcement Learning for Optimal Feedback Control


Book Description

Reinforcement Learning for Optimal Feedback Control develops model-based and data-driven reinforcement learning methods for solving optimal control problems in nonlinear deterministic dynamical systems. To achieve learning under uncertainty, data-driven methods for identifying system models in real time are also developed. Through simulations and experiments, the book illustrates the advantages gained from the use of a model and the use of previous experience in the form of recorded data. Its focus on deterministic systems allows an in-depth Lyapunov-based analysis of the performance of the methods described, both during the learning phase and during execution. To yield an approximate optimal controller, the authors focus on theories and methods that fall under the umbrella of actor–critic methods for machine learning, concentrating on establishing stability during both the learning and execution phases for adaptive model-based and data-driven reinforcement learning, which typically relies on instantaneous input-output measurements. This monograph offers academic researchers from diverse disciplines, from aerospace engineering to computer science, who are interested in optimal reinforcement learning, functional analysis, and function approximation theory, a good introduction to the use of model-based methods. Its thorough, advanced treatment of control will also interest practitioners working in the chemical-process and power-supply industries.




Reinforcement Learning


Book Description

This book offers a thorough introduction to the basics and to the scientific and technological innovations involved in the modern study of reinforcement-learning-based feedback control. The authors address a wide variety of systems, including nonlinear, networked, multi-agent, and multi-player systems. A concise description of classical reinforcement learning (RL), the basics of optimal control with dynamic programming and network control architectures, and a brief introduction to typical algorithms build the foundation for the remainder of the book. Extensive research on data-driven robust control for nonlinear systems with unknown dynamics and on multi-player systems follows. Data-driven optimal control of networked single- and multi-player systems leads readers into the development of novel RL algorithms with increased learning efficiency. The book concludes with a treatment of how these RL algorithms can achieve optimal synchronization policies for multi-agent systems with unknown model parameters and how game RL can solve problems of optimal operation in various process industries. Illustrative numerical examples and complex process control applications emphasize the practical usefulness of the algorithms discussed. The combination of practical algorithms, theoretical analysis, and comprehensive examples presented in Reinforcement Learning will interest researchers and practitioners studying or using optimal and adaptive control, machine learning, artificial intelligence, and operations research, whether advancing the theory or applying it in the mineral-process, chemical-process, power-supply, or other industries.




Inverse Dynamic Game Methods for Identification of Cooperative System Behavior


Book Description

This work addresses inverse dynamic games, which generalize the inverse problem of optimal control: the aim is to identify cost functions from observed optimal trajectories. The identified cost functions can describe individual behavior in cooperative systems, e.g. human behavior in human-machine haptic shared control scenarios. A minimal illustration of the underlying inverse-optimal-control idea follows.
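
On the linear-quadratic special case, the inverse problem can be made concrete: given known dynamics and a feedback gain estimated from observed optimal trajectories, search for cost weights whose optimal controller reproduces that gain. The Python sketch below is illustrative only and is not taken from this work; the double-integrator dynamics, the diagonal parametrization of Q, and the fixed control weight R (which removes the inherent scale ambiguity of inverse optimal control) are all assumptions.

```python
# Hypothetical inverse-LQR sketch: recover cost weights from an observed
# optimal feedback gain. Dynamics and parametrization are assumptions.
import numpy as np
from scipy.linalg import solve_discrete_are
from scipy.optimize import minimize

A = np.array([[1.0, 0.1], [0.0, 1.0]])  # discrete-time double integrator
B = np.array([[0.0], [0.1]])
R = np.eye(1)                            # fixed to remove scale ambiguity

def lqr_gain(Q):
    """Optimal gain for x+ = Ax + Bu with stage cost x'Qx + u'Ru."""
    P = solve_discrete_are(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Stand-in for a gain identified from demonstrations (u = -Kx); here it is
# generated from a hidden "true" cost so the recovery can be checked.
K_obs = lqr_gain(np.diag([5.0, 1.0]))

def residual(log_q):
    """Mismatch between the gain induced by a candidate Q and the observed gain."""
    return np.sum((lqr_gain(np.diag(np.exp(log_q))) - K_obs) ** 2)

sol = minimize(residual, x0=np.zeros(2), method="Nelder-Mead")
print("recovered Q diagonal:", np.exp(sol.x))  # close to [5., 1.]
```

The dynamic-game setting treated in this work generalizes this picture to several interacting agents, each with its own cost, so one would form residuals from coupled optimality conditions rather than from a single Riccati equation.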







Reinforcement Learning and Approximate Dynamic Programming for Feedback Control


Book Description

Reinforcement learning (RL) and adaptive dynamic programming (ADP) have been among the most critical research fields in science and engineering for modern complex systems. This book describes the latest RL and ADP techniques for decision and control in human-engineered systems, covering both single-player decision and control and multi-player games. Edited by pioneers of RL and ADP research, the book brings together ideas and methods from many fields and provides important and timely guidance on controlling a wide variety of systems, such as robots, industrial processes, and economic decision-making.




Output Feedback Reinforcement Learning Control for Linear Systems


Book Description

This monograph explores the analysis and design of model-free optimal control systems based on reinforcement learning (RL) theory, presenting new methods that overcome recent challenges faced by RL. New developments in the design of sensor-data-efficient RL algorithms are demonstrated that not only reduce sensor requirements by means of output feedback, but also ensure optimality and stability guarantees. A variety of practical challenges are considered, including disturbance rejection, control constraints, and communication delays. Ideas from game theory are incorporated to solve output feedback disturbance rejection problems, and concepts from low-gain feedback control are employed to develop RL controllers that achieve global stability under control constraints. Output Feedback Reinforcement Learning Control for Linear Systems will be a valuable reference for graduate students, control theorists working on optimal control systems, engineers, and applied mathematicians.




Online Learning Algorithms for Differential Dynamic Games and Optimal Control


Book Description

Optimal control deals with the problem of finding a control law for a given system such that a certain optimality criterion is achieved. Such a law can be derived using Pontryagin's maximum principle (a necessary condition) or by solving the Hamilton-Jacobi-Bellman equation (a sufficient condition). A major drawback of optimal control is that it is computed offline. Adaptive control modifies the control law used by a controller to cope with the fact that the system is unknown or uncertain, but adaptive controllers are not optimal. Adaptive optimal controllers have been proposed, either by adding optimality criteria to an adaptive controller or by adding adaptive characteristics to an optimal controller.

In this work, online adaptive learning algorithms are developed for optimal control and differential dynamic games using measurements along the trajectory or input/output data. These algorithms are based on actor/critic schemes, involve simultaneous tuning of the actor and critic neural networks, and provide online solutions to complex Hamilton-Jacobi equations, along with convergence and Lyapunov stability proofs.

The research begins with the development of an online algorithm, based on policy iteration, for learning the continuous-time (CT) optimal control solution with infinite-horizon cost for nonlinear systems with known dynamics. That is, the algorithm learns online, in real time, the solution to the optimal control design Hamilton-Jacobi (HJ) equation; this is called 'synchronous' policy iteration. An online learning algorithm is then developed to solve the continuous-time two-player zero-sum game with infinite-horizon cost for nonlinear systems. This algorithm learns online, in real time, the solution to the game design Hamilton-Jacobi-Isaacs equation, and is called 'synchronous' zero-sum game policy iteration.

One of the major outcomes of this work is an online learning algorithm that solves continuous-time multi-player non-zero-sum games with infinite horizon for linear and nonlinear systems. The adaptive algorithm learns online the solution of coupled Riccati equations and coupled Hamilton-Jacobi equations for linear and nonlinear systems, respectively. The optimal-adaptive algorithm is implemented as a separate actor/critic parametric network approximator structure for every player and involves simultaneous continuous-time adaptation of the actor/critic networks.

The next result shows how to implement approximate dynamic programming methods using only measured input/output data from the system: policy and value iteration algorithms are developed that converge to an optimal controller requiring only output feedback. The notion of graphical games is then developed for dynamical systems in which the dynamics and performance indices for each node depend only on local neighbor information. A cooperative policy iteration algorithm is given for graphical games that converges to the best response when the neighbors of each agent do not update their policies, and to the cooperative Nash equilibrium when all agents update their policies simultaneously.

Finally, a synchronous policy iteration algorithm based on integral reinforcement learning is given. This algorithm does not need the drift dynamics, as sketched below.
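
In the linear-quadratic case, the integral reinforcement learning idea reduces to evaluating a policy through the integral Bellman relation x(t)'Px(t) = [integral over [t, t+T] of x'Qx + u'Ru] + x(t+T)'Px(t+T), which involves only trajectory data, and then improving the policy using only the input matrix. The following Python sketch is a minimal illustration, not code from this work; the stable second-order system, the reinforcement interval, and the batch least-squares evaluation are assumptions.

```python
# Minimal sketch of integral reinforcement learning (IRL) for a linear system.
# Policy evaluation uses only trajectory data; policy improvement uses the
# input matrix B but never the drift matrix A. All numbers are assumptions.
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [-1.0, -2.0]])  # drift dynamics: unknown to the learner
B = np.array([[0.0], [1.0]])              # input matrix: assumed known
Q, R = np.eye(2), np.eye(1)               # quadratic cost weights
T = 0.1                                   # reinforcement interval

def rollout(x0, K):
    """Integrate the closed loop u = -Kx and its running cost over [0, T]."""
    def f(t, z):
        x, u = z[:2], -K @ z[:2]
        return np.concatenate([A @ x + B @ u, [x @ Q @ x + u @ R @ u]])
    z = solve_ivp(f, (0.0, T), np.concatenate([x0, [0.0]]), rtol=1e-8).y[:, -1]
    return z[:2], z[2]                    # state at t+T, accumulated cost

def basis(x):
    """Basis for x'Px with symmetric P = [[p0, p1], [p1, p2]]."""
    return np.array([x[0] ** 2, 2.0 * x[0] * x[1], x[1] ** 2])

rng = np.random.default_rng(0)
K = np.zeros((1, 2))                      # initial stabilizing gain (A is stable)
for _ in range(10):
    # Policy evaluation via the integral Bellman equation:
    # x(t)'Px(t) - x(t+T)'Px(t+T) = integral of the running cost over [t, t+T].
    rows, rhs = [], []
    for _ in range(20):
        x0 = rng.uniform(-1.0, 1.0, 2)
        x1, cost = rollout(x0, K)
        rows.append(basis(x0) - basis(x1))
        rhs.append(cost)
    p = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)[0]
    P = np.array([[p[0], p[1]], [p[1], p[2]]])
    K = np.linalg.solve(R, B.T @ P)       # policy improvement: no A required
print("learned gain:", K)                 # approaches the LQR-optimal gain
```

Each iteration mirrors Kleinman-style policy iteration, but the Lyapunov-equation evaluation step is replaced by least squares over measured interval costs, which is precisely what removes the need for the drift dynamics.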