Proceedings of 2021 International Conference on Autonomous Unmanned Systems (ICAUS 2021)


Book Description

This book includes original, peer-reviewed research papers from ICAUS 2021, which offers a unique platform for scientists, engineers, and practitioners throughout the world to present and share their most recent research and innovative ideas. The aim of ICAUS 2021 is to stimulate researchers active in areas pertinent to intelligent unmanned systems. The topics covered include, but are not limited to, Unmanned Aerial/Ground/Surface/Underwater Systems, Robotics, Autonomous Control/Navigation and Positioning/Architecture, Energy and Task Planning and Effectiveness Evaluation Technologies, and Artificial Intelligence Algorithms/Bionic Technology and Their Application in Unmanned Systems. The papers showcased here share the latest findings on Unmanned Systems, Robotics, Automation, Intelligent Systems, Control Systems, Integrated Networks, and Modeling and Simulation, making the book a valuable asset for researchers, engineers, and university students alike.




Uncertainty-aware Spatiotemporal Perception for Autonomous Vehicles


Book Description

Autonomous vehicles are set to revolutionize transportation in terms of safety and efficiency. However, autonomous systems still have challenges operating in complex human environments, such as an autonomous vehicle in a cluttered, dynamic urban setting. A key obstacle to deploying autonomous systems on the road is understanding, anticipating, and making inferences about human behaviors. Autonomous perception builds a general understanding of the environment for a robot. This includes making inferences about human behaviors in both space and time. Humans are difficult to model due to their vastly diverse behaviors and rapidly evolving objectives. Moreover, in cluttered settings, there are computational and visibility limitations. However, humans also possess desirable capabilities, such as their ability to generalize beyond their observed environment. Although learning-based systems have had success in recent years in modeling and imitating human behavior, efficiently capturing the data and model uncertainty for these systems remains an open problem.

This thesis proposes algorithmic advances to uncertainty-aware autonomous perception systems in human environments. We make system-level contributions to spatiotemporal robot perception that reasons about human behavior, and foundational advancements in uncertainty-aware machine learning models for trajectory prediction. These contributions enable robotic systems to make uncertainty- and socially-aware spatiotemporal inferences about human behavior.

Traditional robot perception is object-centric and modular, consisting of object detection, tracking, and trajectory prediction stages. These systems can fail prior to the prediction stage due to partial occlusions in the environment. We thus propose an alternative end-to-end paradigm for spatiotemporal environment prediction from a map-centric occupancy grid representation. Occupancy grids are robust to partial occlusions, can handle an arbitrary number of human agents in the scene, and do not require a priori information regarding the environment. We investigate the performance of computer vision techniques in this context and develop new mechanisms tailored to the task of spatiotemporal environment prediction.

Spatially, robots also need to reason about fully occluded agents in their environment, which may occur due to sensor limitations or other agents on the road obstructing the field of view. Humans excel at extrapolating from their experiences by making inferences from observed social behaviors. We draw inspiration from human intuition to fill in portions of the robot's map that are not observable by traditional sensors. We infer occupancy in these occluded regions by learning a multimodal mapping from observed human driver behaviors to the environment ahead of them, thus treating people as sensors. Our system handles multiple observed agents to maximally inform the occupancy map around the robot.

In order to safely integrate human behavior modeling into the robot autonomy stack, the perception system must efficiently account for uncertainty. Human behavior is often modeled using discrete latent spaces in learning-based models to capture the multimodality in the distribution. For example, in a trajectory prediction task, there may be multiple valid future predictions given a past trajectory. To accurately model this latent distribution, the latent space needs to be sufficiently large, leading to tractability concerns for downstream tasks, such as path planning.
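
As a toy illustration of the multimodality issue described above, a discrete-latent trajectory predictor can emit K candidate futures together with a categorical distribution over them; the larger K is, the more behavior modes it can represent, but also the more hypotheses a downstream planner must evaluate. The following sketch (in PyTorch, with all layer sizes and names chosen purely for illustration; it is not the thesis's model) shows this structure:

import torch
import torch.nn as nn

class DiscreteLatentTrajectoryPredictor(nn.Module):
    """Toy multimodal trajectory predictor with K discrete latent modes.

    Given encoded past-trajectory features, it outputs K candidate future
    trajectories and a categorical distribution over those modes.
    """

    def __init__(self, history_dim=32, horizon=12, num_modes=16):
        super().__init__()
        self.horizon = horizon
        self.num_modes = num_modes
        self.encoder = nn.Sequential(nn.Linear(history_dim, 64), nn.ReLU())
        self.mode_logits = nn.Linear(64, num_modes)               # p(z | history)
        self.trajectory_heads = nn.Linear(64, num_modes * horizon * 2)

    def forward(self, history_features):
        h = self.encoder(history_features)
        probs = torch.softmax(self.mode_logits(h), dim=-1)        # (B, K)
        trajs = self.trajectory_heads(h).view(
            -1, self.num_modes, self.horizon, 2)                  # (B, K, T, 2)
        return probs, trajs

# With 16 modes, a planner already receives 16 weighted future trajectories
# per agent -- the tractability concern discussed above.
model = DiscreteLatentTrajectoryPredictor(num_modes=16)
probs, trajs = model(torch.randn(4, 32))
print(probs.shape, trajs.shape)   # torch.Size([4, 16]) torch.Size([4, 16, 12, 2])
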
We address this issue by proposing a sparsification algorithm for discrete latent sample spaces that can be applied post hoc without sacrificing model performance. Our approach successfully balances multimodality and sparsity to achieve efficient data uncertainty estimation (an illustrative sketch of this idea follows this description).

Aside from modeling data uncertainty, learning-based autonomous systems must be aware of their model uncertainty, or what they do not know. Flagging out-of-distribution or unknown scenarios encountered in the real world could be helpful to downstream autonomy stack components and to engineers for further system development. Although the machine learning community has been prolific in model uncertainty estimation for small benchmark problems, relatively little work has been done on estimating this uncertainty in complex, learning-based robotic systems. We propose efficiently learning the model uncertainty over an interpretable, low-dimensional latent space in the context of a trajectory prediction task.

The algorithms presented in this thesis were validated on real-world autonomous driving data and baselined against state-of-the-art techniques. We show that drawing inspiration from human-level reasoning while modeling the associated uncertainty can inform environment understanding for autonomous perception systems. The contributions made in this thesis are a step towards uncertainty- and socially-aware autonomous systems that can function seamlessly in human environments.
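
For illustration, the sparsification idea referenced in this description can be sketched as a simple post-hoc truncation of a discrete mode distribution: keep only the most likely modes until a chosen probability mass is covered, then renormalize. This greedy mass-threshold rule and its parameter names are assumptions for the example, not the sparsification algorithm actually proposed in the thesis:

import numpy as np

def sparsify_modes(probs, mass_threshold=0.95):
    """Keep the smallest set of latent modes whose cumulative probability
    reaches mass_threshold, then renormalize over the kept modes.

    probs: (K,) categorical distribution over discrete latent modes.
    Returns (kept_indices, renormalized_probs_over_kept_modes).
    """
    order = np.argsort(probs)[::-1]                     # most likely modes first
    cumulative = np.cumsum(probs[order])
    keep = order[: int(np.searchsorted(cumulative, mass_threshold) + 1)]
    kept_probs = probs[keep] / probs[keep].sum()
    return keep, kept_probs

# Example: a 16-mode distribution dominated by a few modes collapses to a
# handful of hypotheses for the planner while retaining most of the mass.
probs = np.array([0.40, 0.25, 0.15, 0.08] + [0.01] * 12)
keep, kept_probs = sparsify_modes(probs, mass_threshold=0.9)
print(keep, kept_probs.round(3))

In this toy case, the distribution collapses to a small set of modes while keeping roughly 90% of its probability mass, which is the kind of trade-off between multimodality and sparsity described above.
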




Autonomous Driving Perception


Book Description

Discover the world of computer vision and deep learning for autonomous driving with our comprehensive, in-depth guide. It offers an exploration of cutting-edge topics carefully crafted to engage tertiary students and to spark the curiosity of researchers and professionals in the field. From fundamental principles to practical applications, this guide provides a gentle introduction, expert evaluations of state-of-the-art methods, and inspiring research directions. With its broad range of topics, it is also an invaluable resource for university programs offering computer vision and deep learning courses. The book gives clear, simplified algorithm descriptions that make complex concepts easy for beginners to understand, and includes carefully selected problems and examples to help reinforce your learning.




Multimodal Scene Understanding


Book Description

Multimodal Scene Understanding: Algorithms, Applications and Deep Learning presents recent advances in multi-modal computing, with a focus on computer vision and photogrammetry. It provides the latest algorithms and applications that involve combining multiple sources of information and describes the role and approaches of multi-sensory data and multi-modal deep learning. The book is ideal for researchers from the fields of computer vision, remote sensing, robotics, and photogrammetry, helping to foster interdisciplinary interaction and collaboration between these realms. Researchers collecting and analyzing multi-sensory data collections (for example, the KITTI benchmark, stereo + laser) from different platforms, such as autonomous vehicles, surveillance cameras, UAVs, planes, and satellites, will find this book very useful.




Robust Environmental Perception and Reliability Control for Intelligent Vehicles


Book Description

This book presents the most recent state-of-the-art algorithms for robust environmental perception and reliability control in intelligent vehicle systems. By integrating object detection, semantic segmentation, trajectory prediction, multi-object tracking, multi-sensor fusion, and reliability control in a systematic way, the book aims to guarantee that intelligent vehicles can run safely in complex road traffic scenes. The book:

- Applies multi-sensor data fusion-based neural networks to environmental perception fault-tolerance algorithms, solving the problem of perception reliability when some sensors fail by exploiting data redundancy (a toy sketch of this redundancy idea follows this description).
- Presents a camera-based monocular approach to robust perception tasks, which introduces sequential feature association, depth hint augmentation, and seven adaptive methods.
- Proposes efficient and robust semantic segmentation of traffic scenes through real-time deep dual-resolution networks and representation separation of vision transformers.
- Focuses on trajectory prediction and proposes phased and progressive trajectory prediction methods that are more consistent with human psychological characteristics and that take both social interactions and personal intentions into account.
- Puts forward methods based on conditional random fields and multi-task segmentation learning to solve the robust multi-object tracking problem for environment perception in autonomous vehicle scenarios.
- Presents novel reliability control strategies for intelligent vehicles to optimize dynamic tracking performance and investigates autonomous vehicle tracking under completely unknown dynamics with actuator faults.
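
As a rough illustration of the data-redundancy idea in the first point above (a minimal sketch only; the sensor list, variances, and function names are assumptions, not the book's fault-tolerance algorithm), redundant estimates of the same quantity can be fused with inverse-variance weights, and a failed sensor can simply be excluded from the weighting:

import numpy as np

def fuse_redundant_estimates(estimates, variances, valid):
    """Inverse-variance fusion of redundant sensor estimates.

    estimates: (num_sensors, dim) position estimates from each sensor
    variances: (num_sensors,) scalar variance per sensor
    valid:     (num_sensors,) boolean flags, False for failed sensors
    """
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    valid = np.asarray(valid, dtype=bool)

    if not valid.any():
        raise ValueError("all sensors failed; no estimate available")

    weights = np.where(valid, 1.0 / variances, 0.0)   # drop failed sensors
    weights = weights / weights.sum()                  # renormalize remaining weights
    return weights @ estimates                         # weighted average

# Example: camera, lidar, and radar estimates of an object's (x, y) position;
# the camera is flagged as failed, so the fused result uses lidar + radar only.
fused = fuse_redundant_estimates(
    estimates=[[10.2, 3.1], [10.0, 3.0], [10.4, 2.9]],
    variances=[0.5, 0.1, 0.3],
    valid=[False, True, True],
)
print(fused)
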




Multimodal Perception and Secure State Estimation for Robotic Mobility Platforms


Book Description

This book provides a novel perspective on secure state estimation and multimodal perception for robotic mobility platforms such as autonomous vehicles, enabling readers to understand important new trends in multimodal perception for mobile robotics. It thoroughly evaluates filter-based secure dynamic pose estimation approaches for autonomous vehicles under multiple attack signals and shows that they outperform conventional Kalman-filtered results (a minimal Kalman filter sketch is included after this description for context). As a modern learning resource, it contains extensive simulation and experimental results that have been successfully implemented on various models and real platforms. To aid reader comprehension, detailed and illustrative examples of algorithm implementation and performance evaluation are also presented. Written by four qualified authors in the field, the book covers sample topics including:

- Secure state estimation that focuses on system robustness under cyber-attacks
- Multi-sensor fusion that helps improve system performance based on the complementary characteristics of different sensors
- A geometric pose estimation framework that incorporates measurements and constraints into a unified fusion scheme, validated using public and self-collected data
- How to achieve real-time road-constrained and heading-assisted pose estimation

This book will appeal to graduate-level students and professionals in the fields of ground vehicle pose estimation and perception who are looking for modern and updated insight into key concepts related to robotic mobility platforms.
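
For context on the conventional Kalman-filtered baseline mentioned above, the following is a minimal scalar Kalman filter sketch (illustrative only; the constant-position model, noise parameters, and the spoofed measurement are assumptions, and the book's secure estimation schemes are not reproduced here). The spoofed measurement at one step shows the failure mode that secure state estimation aims to mitigate:

def kalman_update(x, P, z, F=1.0, Q=0.01, H=1.0, R=0.1):
    """One predict/update step of a scalar Kalman filter.

    x, P : prior state estimate and its variance
    z    : new measurement (e.g., a pose component from one sensor)
    F, Q : state transition and process-noise variance
    H, R : measurement model and measurement-noise variance
    """
    # Predict
    x_pred = F * x
    P_pred = F * P * F + Q
    # Update
    K = P_pred * H / (H * P_pred * H + R)   # Kalman gain
    x_new = x_pred + K * (z - H * x_pred)
    P_new = (1.0 - K * H) * P_pred
    return x_new, P_new

# Track a roughly constant position from noisy measurements; the spoofed
# (attacked) measurement of 5.0 visibly pulls the conventional filter off
# course, which secure estimation schemes are designed to resist.
x, P = 0.0, 1.0
measurements = [1.0, 1.1, 0.9, 1.0, 1.05, 5.0, 1.0, 0.95]
for z in measurements:
    x, P = kalman_update(x, P, z)
    print(f"estimate: {x:.3f}")
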




Robustness of Multimodal 3D Object Detection Using Deep Learning Approach for Autonomous Vehicles


Book Description

In this thesis, we study the robustness of a multimodal 3D object detection model in the context of autonomous vehicles. Self-driving cars need to accurately detect and localize pedestrians and other vehicles in their 3D surroundings to drive safely on the roads, and robustness is one of the most critical aspects of an algorithm for the self-driving 3D perception problem. In this work, we therefore propose a method to evaluate a 3D object detector's robustness. To this end, we trained a representative multimodal 3D object detector on three different datasets and then evaluated the trained models on datasets we constructed specifically to assess robustness under diverse weather and lighting conditions. Our method uses two different approaches to build these evaluation datasets: in one, we use artificially corrupted images (illustrated in the sketch following this description), and in the other, we use real images captured in diverse weather and lighting conditions.

To detect objects such as cars and pedestrians in traffic scenes, the multimodal model relies on images and 3D point clouds; multimodal approaches to 3D object detection exploit different sensors, such as cameras and range detectors, to detect the objects of interest in the surrounding environment. We leveraged three well-known autonomous driving datasets: KITTI, nuScenes, and Waymo. We conducted extensive experiments to investigate the proposed method for evaluating model robustness and provide quantitative and qualitative results, observing that the proposed method can measure the robustness of the model effectively.
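
To illustrate the artificially-corrupted-image approach mentioned in this description (a sketch under assumed corruption types and parameters; it is not the thesis's actual corruption benchmark), a clean camera image can be darkened and given additive Gaussian noise before being passed to the trained detector, and the detection metrics can then be compared against the clean-image results:

import numpy as np

def corrupt_image(image, noise_std=25.0, brightness_scale=0.5, seed=0):
    """Apply simple synthetic corruptions to an RGB image (uint8, HxWx3).

    noise_std        : standard deviation of additive Gaussian noise
    brightness_scale : multiplicative darkening to mimic low-light conditions
    """
    rng = np.random.default_rng(seed)
    img = image.astype(np.float32)
    img = img * brightness_scale                        # darken the scene
    img = img + rng.normal(0.0, noise_std, img.shape)   # add sensor-like noise
    return np.clip(img, 0, 255).astype(np.uint8)

# Example: corrupt a dummy image; in a robustness study, the corrupted copy of
# each test image would be fed to the trained 3D detector and the resulting
# detection metrics compared against those obtained on the clean images.
clean = np.full((375, 1242, 3), 128, dtype=np.uint8)    # KITTI-sized placeholder
corrupted = corrupt_image(clean, noise_std=30.0, brightness_scale=0.4)
print(corrupted.shape, corrupted.dtype)
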