Reinforcement Learning in Eco-driving for Connected and Automated Vehicles


Book Description

Connected and Automated Vehicles (CAVs) can significantly improve transportation efficiency by taking advantage of advanced connectivity technologies. Meanwhile, combining CAV technology with powertrain electrification, as in Hybrid Electric Vehicles (HEVs) and Plug-in Hybrid Electric Vehicles (PHEVs), offers greater potential to improve fuel economy thanks to the extra control flexibility compared to vehicles with a single power source. In this context, the eco-driving control optimization problem seeks to design optimal speed and powertrain usage profiles, based on information received through advanced mapping or Vehicle-to-Everything (V2X) communications, that minimize the energy consumed by the vehicle over a given itinerary. To overcome the real-time computational complexity and embrace the stochastic nature of the driving task, this dissertation studies the application and extension of state-of-the-art (SOTA) Deep Reinforcement Learning (DRL) algorithms to the eco-driving problem for a mild HEV.

For better training and a more comprehensive evaluation, an RL environment is developed, consisting of a mild-HEV powertrain and vehicle dynamics model and a large-scale microscopic traffic simulator. To benchmark the developed strategies, two causal controllers, namely a baseline strategy representing human drivers and a deterministic optimal-control-based strategy, are implemented alongside the non-causal wait-and-see solution.

In the first RL application, the eco-driving problem is formulated as a Partially Observable Markov Decision Process (POMDP) and solved with a SOTA model-free DRL (MFDRL) algorithm: Proximal Policy Optimization (PPO) with a Long Short-Term Memory (LSTM) network as function approximator. Evaluated over 100 trips randomly generated in the city of Columbus, OH, the MFDRL agent achieves a 17% fuel economy improvement over the baseline strategy while keeping the average travel time comparable. While performing on par with the optimal-control-based strategy, the actor of the MFDRL agent provides an explicit control policy that significantly reduces the onboard computation.

Subsequently, a model-based DRL (MBDRL) algorithm, Safe Model-based Off-policy Reinforcement Learning (SMORL), is proposed. It addresses three issues that emerged during the MFDRL development: a) the cumbersome process required to design the reward mechanism, b) the lack of constraint satisfaction and feasibility guarantees, and c) the low sample efficiency. Specifically, SMORL consists of three key components: a massively parallelizable dynamic programming trajectory optimizer, a value function learned in an off-policy fashion, and a safe set learned as a generative model. Evaluated under the same conditions, the SMORL agent achieves a 21% reduction in fuel consumption over the baseline and outperforms both the MFDRL agent and the deterministic optimal-control-based controller.
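To make the first application concrete: a recurrent actor of the kind described, an LSTM that encodes the history of partial observations and outputs a Gaussian action distribution for PPO, might look like the minimal PyTorch sketch below. The observation layout, dimensions, and action semantics are illustrative assumptions, not the dissertation's actual implementation.

```python
import torch
import torch.nn as nn

class RecurrentActor(nn.Module):
    """LSTM policy head: maps an observation history to a Gaussian action."""
    def __init__(self, obs_dim=8, hidden_dim=64, act_dim=2):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
        self.mu = nn.Linear(hidden_dim, act_dim)            # action mean
        self.log_std = nn.Parameter(torch.zeros(act_dim))   # learned, state-independent

    def forward(self, obs_seq, hidden=None):
        # obs_seq: (batch, time, obs_dim) -- hypothetically speed, battery SOC,
        # road grade, distance to the next signal and its SPaT window, etc.
        out, hidden = self.lstm(obs_seq, hidden)
        dist = torch.distributions.Normal(self.mu(out), self.log_std.exp())
        return dist, hidden

actor = RecurrentActor()
dist, h = actor(torch.randn(1, 1, 8))   # one observation step
action = dist.sample()                  # e.g. acceleration / torque-split command
```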




Deep Learning-based Eco-driving System for Battery Electric Vehicles


Book Description

Eco-driving strategies based on connected and automated vehicle (CAV) technology, such as Eco-Approach and Departure (EAD), have attracted significant worldwide interest due to their potential to save energy and reduce tailpipe emissions. In this project, the research team developed and tested a deep learning-based trajectory-planning algorithm (DLTPA) for EAD. The DLTPA has two processes, offline training and online implementation, and is composed of two major modules: 1) a solution feasibility checker that identifies whether there is a feasible trajectory subject to all system constraints, e.g., maximum acceleration or deceleration; and 2) a regressor that predicts the speed of the next time-step. Preliminary simulation with the microscopic traffic modeling software PTV VISSIM showed that the proposed DLTPA can match the energy savings of the optimal solution while striking a better balance between energy savings and computational effort, when compared with baseline scenarios in which no EAD is implemented and with the energy-optimal solution provided by a graph-based trajectory-planning algorithm.
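As a rough illustration of the online process, the two modules could interact as in the sketch below: a simplified kinematic feasibility check gates each step, and a regressor (here a dummy stand-in for the trained deep model) predicts the next-step speed. All limits, state features, and the fallback behavior are assumptions for illustration, not the project's actual design.

```python
import numpy as np

# Toy kinematic limits; the real constraint set also includes signal timing.
V_MAX, A_MAX, A_MIN, DT = 20.0, 2.0, -3.0, 1.0

def feasible(v, dist_to_stop_bar, time_to_red):
    """Module 1: can the vehicle still clear the intersection before red,
    or stop, without violating speed/acceleration limits? Simplified check."""
    v_reach = min(V_MAX, v + A_MAX * time_to_red)
    best_dist = 0.5 * (v + v_reach) * time_to_red   # max distance before red
    stop_dist = v * v / (2.0 * -A_MIN)              # min distance to stop
    return best_dist >= dist_to_stop_bar or stop_dist <= dist_to_stop_bar

def next_speed(state, regressor):
    """Module 2: the trained regressor predicts the next time-step speed."""
    return float(np.clip(regressor(state), 0.0, V_MAX))

steps, v = [], 12.0
for t in range(10):
    state = np.array([v, 150.0 - sum(steps), 8.0 - t * DT])
    if not feasible(v, state[1], state[2]):
        break                       # fall back to a stopping maneuver
    v = next_speed(state, regressor=lambda s: s[0] + 0.3)  # dummy model
    steps.append(v * DT)            # distance covered this step
```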




Developing an Adaptive Strategy for Connected Eco-driving Under Uncertain Traffic and Signal Conditions


Book Description

The Eco-Approach and Departure (EAD) application has been proven environmentally efficient for Connected and Automated Vehicle (CAV) systems. In real-world traffic, traffic conditions and signal timings are usually dynamic and uncertain due to mixed vehicle types, varied driving behaviors, and limited sensing range, all of which make EAD development challenging. This research proposes an adaptive strategy for connected eco-driving toward a signalized intersection under real-world conditions. Stochastic graph models are built to link the vehicle data with external (e.g., traffic, signal) data, and dynamic programming is applied to efficiently identify the optimal speed for each vehicle state. From an energy perspective, the adaptive strategy using traffic data can double the effective sensing range in eco-driving. A hybrid reinforcement learning framework is also developed for EAD in mixed traffic conditions, using both short-term and long-term benefits as the action reward. Microsimulation conducted in Unity validates the method, showing over 20% energy savings.
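The graph-plus-DP idea can be pictured with a toy backward induction: stages are distance steps to the stop bar, states are discrete speeds, and the recursion returns the minimum-cost next speed for every state. The grid, the cost surrogate, and the reachability rule below are placeholders for the calibrated energy model and the stochastic traffic and signal terms used in the research.

```python
import numpy as np

speeds = np.arange(0.0, 16.0, 2.0)     # m/s state grid
n_stages, step = 15, 10.0              # 150 m horizon in 10 m stages

def energy_cost(v0, v1):
    # crude surrogate: penalize kinetic-energy gains and, mildly, braking
    dk = 0.5 * (v1**2 - v0**2)
    return max(dk, 0.0) + 0.1 * max(-dk, 0.0) + 0.05 * step

V = np.zeros(len(speeds))              # terminal cost-to-go
policy = np.zeros((n_stages, len(speeds)), dtype=int)
for k in range(n_stages - 1, -1, -1):  # backward induction over stages
    V_new = np.full_like(V, np.inf)
    for i, v0 in enumerate(speeds):
        for j, v1 in enumerate(speeds):
            if abs(v1 - v0) > 3.0:     # reachability within one stage
                continue
            c = energy_cost(v0, v1) + V[j]
            if c < V_new[i]:
                V_new[i], policy[k, i] = c, j
    V = V_new

print("best next speed from 14 m/s at stage 0:", speeds[policy[0, -1]], "m/s")
```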




AI-enabled Technologies for Autonomous and Connected Vehicles


Book Description

This book reports on cutting-edge research and advances in the field of intelligent vehicle systems. It presents a broad range of AI-enabled technologies, with a focus on automated, autonomous, and connected vehicle systems. It covers advanced machine learning technologies, including deep and reinforcement learning algorithms, transfer learning, and learning from big data, as well as control theory applied to mobility and vehicle systems. Furthermore, it reports on cutting-edge technologies for environmental perception and vehicle-to-everything (V2X) communication, discussing the socioeconomic and environmental implications of automated mobility as well as aspects related to human factors and energy efficiency. Gathering chapters written by renowned researchers and professionals, this book offers a good balance of theoretical and practical knowledge. It provides researchers, practitioners, and policy makers with a comprehensive and timely guide to the field of autonomous driving technologies.




Eco-driving of Connected and Automated Vehicles (CAVs)


Book Description

In recent years, the trend in the automotive industry has favored reducing vehicle fuel consumption with the help of new and emerging technologies. This drive stems from developments in communication technologies for Connected and Autonomous Vehicles (CAVs), such as Vehicle-to-Infrastructure (V2I), Vehicle-to-Vehicle (V2V), and Vehicle-to-Everything (V2X) communication. Coupled with the automated driving capabilities of CAVs, a new and exciting era has started in transportation as each transportation agent becomes more and more connected. To keep up with the times, research in academia and industry has focused on utilizing vehicle connectivity for various purposes, one of the most significant being fuel savings. Motivated by fuel-saving applications of Connected Vehicle (CV) technologies, the main focus and contribution of this dissertation is developing and evaluating a complete Eco-Driving strategy for CAVs. Eco-Driving is a term used to describe the energy-efficient use of vehicles. In this dissertation, a complete and comprehensive Eco-Driving strategy for CAVs is studied, in which multiple driving modes simultaneously calculate speed profiles ideal for their own sets of constraints to save as much fuel as possible, while a High Level (HL) controller ensures smooth transitions between the driving modes.

The first step in making a CAV achieve Eco-Driving is to develop a route-dependent, fuel-optimal speed profile called Eco-Cruise. The methods explored to obtain this fuel-optimal speed profile are Dynamic Programming (DP) and Pontryagin's Minimum Principle (PMP). Using a generalized Matlab function that minimizes the fuel rate for a vehicle travelling on a given route, with route gradient, acceleration and deceleration limits, speed limits, and traffic control (traffic light and STOP sign) locations as constraints, a DP-based fuel-optimal velocity profile is found. The ego CAV controlled by the automated driving system follows this Eco-Cruise speed profile as long as no preceding vehicle impedes its motion and no traffic light or STOP sign lies ahead. When the ego CAV approaches a traffic light, a V2I algorithm called Pass-at-Green (PaG) calculates a fuel-economic, Signal Phase and Timing (SPaT)-dependent speed profile. When the ego CAV approaches a STOP sign, the eHorizon (electronic horizon) unit provides the STOP sign location, while the Eco-Stop algorithm calculates a fuel-optimal Eco-Approach speed trajectory so that the ego vehicle smoothly comes to a complete stop at the STOP sign. When the ego CAV departs from the traffic light or STOP sign, the Eco-Departure algorithm calculates a fuel-optimal speed trajectory to smoothly accelerate back to a higher speed.

Beyond its interaction with road infrastructure, the ego vehicle may also encounter other vehicles. When there is a preceding vehicle in front of the ego CAV, an Adaptive Cruise Control (ACC) is typically used to follow the lead vehicle at a constant time gap. The lead vehicle's acceleration, received by the ego CAV through V2V, can be utilized in Cooperative Adaptive Cruise Control (CACC) to follow the preceding vehicle better than ACC does. If the ego CAV's car-following behavior becomes erratic, the Ecological Cooperative Adaptive Cruise Control (Eco-CACC) takes over and calculates a fuel-efficient speed trajectory for car following. If the preceding vehicle acts too erratically or slows down too much, and the ego CAV has a chance to change its lane, the Lane Change mode takes control and changes the lane. The default driving mode in all these scenarios is Eco-Cruise, the fuel-optimal, route-dependent solution obtained with DP; a sketch of this mode arbitration follows below.

Unmanned Aerial Vehicles (UAVs) are part of Intelligent Transportation Systems (ITS) and can communicate with CAVs and other transportation agents. Whenever UAVs with communication capabilities are around the ego CAV, information can be exchanged between the UAV and the CAV. As part of this capability, when the ego CAV approaches a bottleneck or a queue, queue information can be broadcast either from a Roadside Unit (RSU) or from a Connected UAV (C-UAV) acting as an RSU with Dedicated Short Range Communication (DSRC). The queue information is received by the On-Board Unit (OBU), the ego CAV's DSRC communication unit. Using this information, the Dynamic Speed Harmonization (DSH) model can be activated to take the main driving role, generating a smooth deceleration profile as the ego CAV approaches the queue. Once the queue is passed, the ego CAV returns to the default Eco-Cruise mode. The elements of the proposed Eco-Driving method outlined above are first treated individually and then integrated in a holistic manner in this dissertation.

The organization of this dissertation is as follows. Chapter 1, Introduction and Literature Review, summarizes CAVs and the various ways connectivity is utilized in CAV research. Chapter 2, Modelling, Simulation and Testing Environment, presents the state-of-the-art simulation environment used in this dissertation. Chapter 3, Scenario Development and Selection, focuses on the test route development procedure and the types of roadways tested in this work. Chapter 4, Fuel Economic Driving for a Single CAV with V2I in No Traffic, explains the models developed for fuel-optimal speed trajectory calculation using roadway infrastructure. Chapter 5, Fuel Economic Driving for a CAV with V2V in Traffic, details the models developed for an ego CAV travelling among other connected vehicles. Model-in-the-Loop (MIL) simulation results for the Eco-Driving algorithms of Chapters 4 and 5 are presented in Chapter 6, and the corresponding Hardware-in-the-Loop (HIL) simulation results in Chapter 7. Chapter 8 presents results from testing the complete Eco-Driving strategy in a traffic simulator with realistic traffic flow. Chapter 9 touches on CAV-UAV communication and presents Dynamic Speed Harmonization (DSH) as a use case. Chapter 10, Conclusion, summarizes the results of this dissertation and draws conclusions about this work.
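One way to picture the HL controller's arbitration among these driving modes is a simple priority-ordered state machine, sketched below. The flag names and the ordering are illustrative assumptions; the dissertation's actual switching logic is more elaborate.

```python
from enum import Enum, auto

class Mode(Enum):
    ECO_CRUISE = auto()
    PASS_AT_GREEN = auto()
    ECO_STOP = auto()
    ECO_DEPARTURE = auto()
    CACC = auto()
    ECO_CACC = auto()
    LANE_CHANGE = auto()
    DSH = auto()

def high_level_mode(env):
    """Priority-ordered arbitration. `env` is a dict of boolean flags from
    perception/V2X; the names and ordering are illustrative assumptions."""
    if env["queue_ahead"]:
        return Mode.DSH                      # smooth deceleration toward queue
    if env["lead_vehicle"]:
        if env["lead_too_erratic"] and env["lane_free"]:
            return Mode.LANE_CHANGE          # escape an erratic/slow leader
        if env["ego_following_erratic"]:
            return Mode.ECO_CACC             # fuel-efficient car following
        return Mode.CACC                     # V2V-assisted constant time gap
    if env["red_light_ahead"]:
        return Mode.PASS_AT_GREEN            # SPaT-dependent speed profile
    if env["stop_sign_ahead"]:
        return Mode.ECO_STOP                 # smooth stop at the STOP sign
    if env["departing"]:
        return Mode.ECO_DEPARTURE            # fuel-optimal acceleration
    return Mode.ECO_CRUISE                   # default: DP route-optimal profile

flags = dict(queue_ahead=False, lead_vehicle=True, lead_too_erratic=False,
             lane_free=False, ego_following_erratic=False,
             red_light_ahead=False, stop_sign_ahead=False, departing=False)
print(high_level_mode(flags))                # Mode.CACC
```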




Deep Learning for Autonomous Vehicle Control


Book Description

The next generation of autonomous vehicles will provide major improvements in traffic flow, fuel efficiency, and vehicle safety. Several challenges currently prevent the deployment of autonomous vehicles, one aspect of which is robust and adaptable vehicle control. Designing a controller for autonomous vehicles capable of providing adequate performance in all driving scenarios is challenging due to the highly complex environment and the inability to test the system in the wide variety of scenarios it may encounter after deployment. However, deep learning methods have shown great promise not only in providing excellent performance for complex and non-linear control problems, but also in generalizing previously learned rules to new scenarios. For these reasons, the use of deep neural networks for vehicle control has gained significant interest. In this book, we introduce relevant deep learning techniques, discuss recent algorithms applied to autonomous vehicle control, identify strengths and limitations of available methods, discuss research challenges in the field, and provide insights into the future trends in this rapidly evolving field.




Human-Like Decision Making and Control for Autonomous Driving


Book Description

This book details cutting-edge research into human-like driving technology, utilising game theory to better suit a hybrid driving environment of humans and machines. Covering feature identification and modelling of human driving behaviours, the book explains how to design algorithms for decision making and control of autonomous vehicles in complex scenarios. Beginning with a review of current research in the field, the book uses this as a springboard from which to present a new human-like driving framework for autonomous vehicles. Chapters cover system models of decision making and control, driving safety, riding comfort, and travel efficiency. Throughout the book, game theory is applied to human-like decision making, enabling the interaction between the autonomous vehicle and the human driver to be modelled using a noncooperative game-theoretic approach. Game theory is also used to model collaborative decision making between connected autonomous vehicles. This framework enables human-like decision making and control of autonomous vehicles, leading to safer and more efficient driving in complicated traffic scenarios. The book will be of interest to students and professionals alike in the fields of automotive engineering, computer engineering, and control engineering.
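As a toy illustration of the noncooperative modelling idea, consider a 2x2 merge conflict between the autonomous vehicle and a human driver: each chooses Yield or Go, and a pure-strategy Nash equilibrium is any outcome where neither side benefits from deviating unilaterally. The payoff values below are invented purely for illustration and are not taken from the book.

```python
import numpy as np

# Payoffs (AV, human driver) for a merge conflict:
# rows = AV {Yield, Go}, cols = driver {Yield, Go}. Illustrative values.
av = np.array([[2.0, 1.0],
               [4.0, -5.0]])
hd = np.array([[2.0, 4.0],
               [1.0, -5.0]])

def pure_nash(A, B):
    """Return all pure-strategy Nash equilibria of a 2x2 bimatrix game."""
    eq = []
    for i in range(2):
        for j in range(2):
            if A[i, j] >= A[1 - i, j] and B[i, j] >= B[i, 1 - j]:
                eq.append((i, j))
    return eq

print(pure_nash(av, hd))   # [(0, 1), (1, 0)]: one agent yields, the other goes
```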




Reinforcement Learning-Enabled Intelligent Energy Management for Hybrid Electric Vehicles


Book Description

Powertrain electrification, fuel decarbonization, and energy diversification are spreading all over the world, leading to cleaner and more efficient vehicles. Hybrid electric vehicles (HEVs) are today considered a promising technology for addressing growing air pollution and energy scarcity. To realize these gains while maintaining good performance, it is critical for HEVs to have sophisticated energy management systems. Supervised by such a system, HEVs can operate in different modes, such as full-electric mode and power-split mode. Hence, researching and constructing advanced energy management strategies (EMSs) is important for HEV performance. A few books address rule- and optimization-based approaches for formulating energy management systems. Most of them concern traditional techniques, and their efforts focus on searching for optimal control policies offline. There is still much room to introduce learning-enabled energy management systems founded in artificial intelligence, along with their real-time evaluation and application. In this book, a series hybrid electric vehicle is considered as the powertrain model to describe and analyze a reinforcement learning (RL)-enabled intelligent energy management system. The proposed system can not only integrate predictive road information but also achieve online learning and updating. Detailed powertrain modeling, predictive algorithms, and online updating technology are covered, and evaluation and verification of the presented energy management system are conducted.
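In the same spirit, a minimal tabular Q-learning energy manager conveys the core RL loop: the state is a discretized (SOC, power demand) pair, the action an engine power level, and the reward trades off fuel use against charge sustenance. Everything here (grids, the fuel surrogate, the SOC dynamics) is a toy stand-in for the book's detailed powertrain model, not its method.

```python
import numpy as np

n_soc, n_dem, n_act = 10, 5, 4
Q = np.zeros((n_soc, n_dem, n_act))
alpha, gamma, eps = 0.1, 0.95, 0.1
rng = np.random.default_rng(0)

def step(soc, dem, act):
    engine_kw = act * 10.0                       # 0/10/20/30 kW engine power
    batt_kw = dem * 8.0 - engine_kw              # battery covers the rest
    soc_next = int(np.clip(soc - np.sign(batt_kw), 0, n_soc - 1))
    fuel = 0.08 * engine_kw                      # crude fuel-rate surrogate
    reward = -fuel - 0.5 * abs(soc_next - n_soc // 2)  # charge-sustaining term
    return soc_next, rng.integers(n_dem), reward # demand drawn at random

soc, dem = n_soc // 2, 2
for t in range(20000):
    act = rng.integers(n_act) if rng.random() < eps else int(Q[soc, dem].argmax())
    soc2, dem2, r = step(soc, dem, act)
    # standard one-step Q-learning update
    Q[soc, dem, act] += alpha * (r + gamma * Q[soc2, dem2].max() - Q[soc, dem, act])
    soc, dem = soc2, dem2

print("greedy engine power at mid SOC:", Q[n_soc // 2, 2].argmax() * 10, "kW")
```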




Mixed Platoon Control Strategy of Connected and Automated Vehicles Based on Physics-informed Deep Reinforcement Learning


Book Description

This dissertation presents a distributed platoon control strategy for connected and automated vehicles (CAVs) based on physics-informed Deep Reinforcement Learning (DRL) for mixed traffic of CAVs and human-driven vehicles (HDVs). The dissertation consists of three parts: (i) a generic DRL-based CAV control framework for mixed traffic flow; (ii) DRL-based distributed CAV control under communication failure in a fully connected automated environment; and (iii) distributed CAV control for mixed traffic flow, with real-time aggregated macroscopic car-following behavior estimation based on DRL.

For the first part, we discuss the current challenges for CAV control in mixed traffic flow. For distributed CAV control, we categorize the local downstream environment into two broad traffic scenarios based on the composition of CAVs and HDVs, so as to accommodate any possible CAV-HDV platoon configuration: (i) a fully connected automated environment, where all local downstream vehicles are CAVs, forming a CAV-CAVs topology; and (ii) a mixed local downstream environment, comprising the closest downstream CAV followed by one or more HDVs, creating a CAV-HDVs-CAV topology. This generic control framework effectively accommodates any CAV-HDV platoon topology that may emerge within a mixed traffic platoon. This part is discussed in Section 3.

For the second part, the study proposes a DRL-based distributed longitudinal control strategy for CAVs under communication failure to stabilize traffic oscillations. Specifically, Signal-to-Interference-plus-Noise Ratio (SINR) based vehicle-to-vehicle (V2V) communication is incorporated into the DRL training environment to reproduce realistic communication and time-space-varying information flow topologies (IFTs). A dynamic information fusion mechanism is designed to smooth the high-jerk control signals caused by the dynamic IFTs. On this basis, each CAV controlled by the DRL-based agent receives real-time state information from downstream CAVs and takes longitudinal actions to achieve equilibrium consensus in the multi-agent system. Simulated experiments are conducted to tune the communication adjustment mechanism and to validate the control performance, oscillation-dampening performance, and generalization capability of the proposed algorithm. This part is discussed in Section 4.

The third part proposes an innovative distributed longitudinal control strategy for CAVs in mixed traffic of CAVs and HDVs, incorporating high-dimensional platoon information. In mixed traffic, traditional CAV control methods focus on microscopic trajectory information, which may not handle HDV stochasticity (e.g., long reaction times, varied driving styles) and mixed-traffic heterogeneity efficiently. Different from traditional methods, our method, for the first time, characterizes consecutive HDVs as a whole (an aggregated HDV, or AHDV) to reduce HDV stochasticity and utilizes its macroscopic features to control the following CAVs. The new control strategy takes advantage of platoon information to anticipate disturbances and traffic features induced downstream under mixed traffic scenarios, and it greatly outperforms traditional methods.

In particular, the control algorithm is based on DRL to achieve efficient car-following control, and it further addresses the stochasticity of the aggregated car-following behavior by embedding it in the training environment. To better utilize the macroscopic traffic features, a general mixed-traffic platoon is categorized as a CAV-HDVs-CAV pattern and described by corresponding DRL states. The macroscopic traffic flow properties are built upon the Newell car-following model to capture the characteristics of the aggregated HDVs' joint behavior. Simulated experiments are conducted to validate the proposed strategy. The results demonstrate that the proposed control method performs outstandingly in terms of oscillation dampening, eco-driving, and generalization capability. This part is discussed in Section 5.
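For reference, Newell's simplified car-following model states that a follower reproduces its leader's trajectory shifted back by a time lag tau and a space gap d; aggregating n successive HDVs then amounts to summing their shifts, which is one way to read the AHDV abstraction above. The sketch below applies this shift to a synthetic braking wave; the parameter values and leader profile are illustrative, not the dissertation's calibration.

```python
import numpy as np

tau, d, n_hdv = 1.2, 7.0, 3          # per-vehicle time lag [s], space gap [m]
dt = 0.1
t = np.arange(0.0, 60.0, dt)
v_lead = 15.0 - 3.0 * np.exp(-((t - 20.0) / 4.0) ** 2)   # braking wave
x_lead = np.cumsum(v_lead) * dt                          # leader trajectory

def newell_shift(x, k):
    """Trajectory of the k-th follower: x_k(t) = x_lead(t - k*tau) - k*d."""
    lag = int(round(k * tau / dt))
    x_shift = np.concatenate([np.full(lag, x[0]), x[:-lag]]) if lag else x.copy()
    return x_shift - k * d

x_after_platoon = newell_shift(x_lead, n_hdv)   # effective leader for the CAV
print("gap behind 3 aggregated HDVs at t=30s:",
      round(x_lead[300] - x_after_platoon[300], 1), "m")
```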