Computational Model for Neural Architecture Search


Book Description

"A long-standing goal in Deep Learning (DL) research is to design efficient architectures for a given dataset that are both accurate and computationally inexpensive. At present, designing deep learning architectures for a real-world application requires both human expertise and considerable effort as they are either handcrafted by careful experimentation or modified from a handful of existing models. This method is inefficient as the process of architecture design is highly time-consuming and computationally expensive. The research presents an approach to automate the process of deep learning architecture design through a modeling procedure. In particular, it first introduces a framework that treats the deep learning architecture design problem as a systems architecting problem. The framework provides the ability to utilize novel and intuitive search spaces to find efficient architectures using evolutionary methodologies. Secondly, it uses a parameter sharing approach to speed up the search process and explores its limitations with search space. Lastly, it introduces a multi-objective approach to facilitate architecture design based on hardware constraints that are often associated with real-world deployment. From the modeling perspective, instead of designing and staging explicit algorithms to process images/sentences, the contribution lies in the design of hybrid architectures that use the deep learning literature developed so far. This approach enjoys the benefit of a single problem formulation to perform end-to-end training and architecture design with limited computational resources"--Abstract, page iii.
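The evolutionary search the abstract describes can be illustrated with a minimal (1+λ) loop over a toy architecture encoding. Everything here is a hypothetical stand-in, not the author's actual method: the (depth, width) encoding, the mutation operator, and the fitness function (a proxy for validation accuracy minus a cost term).

```python
import random

def fitness(arch):
    # Hypothetical proxy score: reward depth near a sweet spot, penalize width.
    # A stand-in for "accuracy minus computational cost", not a real metric.
    depth, width = arch
    return -(depth - 6) ** 2 - 0.1 * width

def mutate(arch, rng):
    # Perturb one architectural choice at random, clamped to valid values.
    depth, width = arch
    if rng.random() < 0.5:
        depth = max(1, depth + rng.choice([-1, 1]))
    else:
        width = max(8, width + rng.choice([-8, 8]))
    return (depth, width)

def evolve(generations=50, offspring=4, seed=0):
    rng = random.Random(seed)
    parent = (2, 64)  # initial architecture: (depth, width)
    for _ in range(generations):
        children = [mutate(parent, rng) for _ in range(offspring)]
        # (1+lambda) selection: keep the parent unless a child beats it,
        # so fitness never decreases across generations.
        parent = max(children + [parent], key=fitness)
    return parent

print(evolve())
```

Parameter sharing, as mentioned in the abstract, would replace the `fitness` call with a cheap evaluation of the candidate inside one jointly trained super-network; the selection loop itself is unchanged.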




Automated Machine Learning


Book Description

This open access book presents the first comprehensive overview of general methods in Automated Machine Learning (AutoML), collects descriptions of existing systems based on these methods, and discusses the first series of international challenges of AutoML systems. The recent success of commercial ML applications and the rapid growth of the field have created a high demand for off-the-shelf ML methods that can be used easily and without expert knowledge. However, many recent machine learning successes crucially rely on human experts, who manually select appropriate ML architectures (deep learning architectures or more traditional ML workflows) and their hyperparameters. To overcome this problem, the field of AutoML targets a progressive automation of machine learning, based on principles from optimization and machine learning itself. This book serves as a point of entry into this quickly developing field for researchers and advanced students alike, as well as a reference for practitioners aiming to use AutoML in their work.




Evolutionary Deep Neural Architecture Search: Fundamentals, Methods, and Recent Advances


Book Description

This book systematically narrates the fundamentals, methods, and recent advances of evolutionary deep neural architecture search chapter by chapter, providing target readers with sufficient detail to learn the subject from scratch. In particular, the method chapters are devoted to the architecture search of unsupervised and supervised deep neural networks. The main audience is people who would like to use deep neural networks but have little or no expertise in manually designing optimal deep architectures. This may include researchers who focus on developing novel evolutionary deep architecture search methods for general tasks, students who would like to study evolutionary deep neural architecture search and perform related research in the future, and practitioners from computer vision, natural language processing, and other fields where deep neural networks have been successfully and widely used.




Efficient Neural Architecture Search with Multiobjective Evolutionary Optimization


Book Description

Deep neural networks have become very successful at solving many complex tasks such as image classification, image segmentation, and speech recognition. These models are composed of multiple layers that have the capacity to learn increasingly higher-level features without prior handcrafted specifications. However, the success of a deep neural network relies on finding the proper configuration for the task at hand. Given the vast number of hyperparameters and the massive search space, manually designing or fine-tuning deep learning architectures requires extensive knowledge, time, and computational resources. There is growing interest in developing methods that automatically design a neural network's architecture, known as neural architecture search (NAS). NAS is usually modeled as a single-objective optimization problem where the aim is to find an architecture that maximizes the prediction accuracy. However, most deep learning applications require accurate as well as efficient architectures to reduce memory consumption and enable their use in computationally limited environments. This has led to the need to model NAS as a multiobjective problem that optimizes both the predictive performance and the efficiency of the network. Furthermore, most NAS frameworks have focused on optimizing either the micro-structure (the structure of the basic cell) or the macro-structure (the optimal number of cells and their connections) of the architecture. Consequently, manual engineering is required to find the topology of the non-optimized structure. Although NAS has demonstrated great potential in automatically designing an architecture, it remains a computationally expensive and time-consuming process because it requires training and evaluating many potential configurations. Recent work has focused on improving the search time of NAS algorithms, but most techniques have been developed and applied only to single-objective optimization problems.
Given that optimizing multiple objectives has a higher complexity and requires more iterations to approximate the Pareto front, it is critical to investigate algorithms that decrease the search time of multiobjective NAS. One critical application of deep learning is medical image segmentation. Segmentation of medical images provides valuable information for various critical tasks such as analyzing anatomical structures, monitoring disease progression, and predicting patient outcomes. Nonetheless, achieving accurate segmentation is challenging due to the inherent variability in appearance, shape, and location of the region of interest (ROI) between patients and the differences in imaging equipment and acquisition protocols. Therefore, neural networks are usually tailored to a specific application, anatomical region, and image modality. Moreover, medical image data is often volumetric, requiring expensive 3D operations that result in large and complex architectures. Hence, training and deploying them requires considerable storage and memory bandwidth, which makes them less suitable for clinical applications. To overcome these challenges, the main goal of this research is to automatically design accurate and efficient deep neural networks using multiobjective optimization algorithms for medical image segmentation. The proposed research consists of three major objectives: (1) to design a deep neural network that uses a multiobjective evolutionary based algorithm to automatically adapt to different medical image datasets while minimizing the model's size; (2) to design a self-adaptive 2D-3D fully convolutional network (FCN) ensemble that incorporates volumetric information and optimizes both the performance and the size of the architecture; and (3) to design an efficient multiobjective neural architecture search framework that decreases the search time while simultaneously optimizing the micro- and macro-structure of the neural architecture.
For the first objective, a multiobjective adaptive convolutional neural network named AdaResU-Net is presented for 2D medical image segmentation. The proposed AdaResU-Net comprises a fixed architecture and a learning framework that adjusts the hyperparameters to a particular training dataset using a multiobjective evolutionary based algorithm (MEA algorithm). The MEA algorithm evolves the AdaResU-Net network to optimize both the segmentation accuracy and the model size. In the second objective, a self-adaptive ensemble of 2D-3D FCNs named AdaEn-Net is proposed for 3D medical image segmentation. The AdaEn-Net comprises a 2D FCN that extracts intra-slice and long-range 2D context, and a 3D FCN that exploits inter-slice and volumetric information. The 2D and 3D FCN architectures are automatically fitted to a specific medical image segmentation task by simultaneously optimizing the expected segmentation error and the size of the network using the MEA algorithm. Finally, for the third objective, an efficient multiobjective neural architecture search framework named EMONAS is presented for 3D medical image segmentation. EMONAS has two main components: a novel search space that includes the hyperparameters defining the micro- and macro-structure of the architecture, and a surrogate-assisted multiobjective evolutionary based algorithm (SaMEA algorithm) that efficiently searches for the best hyperparameter values using a Random Forest surrogate and guiding selection probabilities. The broader impact of the proposed research is as follows: (1) automating the design of deep neural networks' architectures and hyperparameters to improve the performance and efficiency of the models; and (2) increasing the accessibility of deep learning to a broader range of organizations and people by reducing the need for expert knowledge and GPU time when automatically designing deep neural networks.
In the medical area, the proposed models aim to improve the automatic extraction of data from medical images to potentially enhance diagnosis, treatment planning, and survival prediction for diseases such as cardiac disease and prostate cancer. Although the proposed techniques are applied to medical image segmentation tasks, they can also be implemented in other applications where accurate and resource-efficient deep neural networks are needed, such as autonomous navigation, augmented reality, and the internet of things.
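The multiobjective selection at the core of the MEA-style algorithms described above hinges on Pareto dominance between (error, size) pairs. A minimal sketch of extracting the Pareto front follows; the candidate tuples are made-up illustrative values, not results from the dissertation:

```python
def dominates(a, b):
    """True if candidate a = (error, size) is at least as good as b on both
    objectives and strictly better on at least one (both minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Return the non-dominated subset of (error, model_size) tuples."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o != c)]

# Hypothetical (segmentation error, parameter count in millions) pairs:
archs = [(0.08, 31.0), (0.10, 12.0), (0.09, 40.0), (0.12, 5.0), (0.10, 20.0)]
print(pareto_front(archs))  # -> [(0.08, 31.0), (0.10, 12.0), (0.12, 5.0)]
```

An evolutionary algorithm such as the MEA algorithm would then bias selection toward this non-dominated set, presenting the final front to the user as the accuracy/size trade-off curve.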




Computational Architectures Integrating Neural and Symbolic Processes


Book Description

Computational Architectures Integrating Neural and Symbolic Processes: A Perspective on the State of the Art focuses on a currently emerging body of research. Since the reemergence of neural networks in the 1980s, with their emphasis on overcoming some of the limitations of symbolic AI, there has clearly been a need to support some form of high-level symbolic processing in connectionist networks. As argued by many researchers on both the symbolic AI and connectionist sides, many cognitive tasks, e.g. language understanding and common sense reasoning, seem to require high-level symbolic capabilities. How these capabilities are realized in connectionist networks is a difficult question, and it constitutes the focus of this book. Computational Architectures Integrating Neural and Symbolic Processes addresses the underlying architectural aspects of the integration of neural and symbolic processes. In order to provide a basis for a deeper understanding of existing divergent approaches and insight for further developments in this field, this book presents: (1) an examination of specific architectures (grouped according to their approaches), their strengths and weaknesses, why they work, and what they predict, and (2) a critique and comparison of these approaches. Computational Architectures Integrating Neural and Symbolic Processes helps researchers, graduate students, and interested laymen in areas such as cognitive science, artificial intelligence, computer science, cognitive psychology, and neurocomputing keep up to date with the newest research trends. It is a comprehensive, in-depth introduction to this newly emerging field.




Unsupervised Learning


Book Description

Since its founding in 1989 by Terrence Sejnowski, Neural Computation has become the leading journal in the field. Foundations of Neural Computation collects, by topic, the most significant papers that have appeared in the journal over the past nine years. This volume of Foundations of Neural Computation, on unsupervised learning algorithms, focuses on neural network learning algorithms that do not require an explicit teacher. The goal of unsupervised learning is to extract an efficient internal representation of the statistical structure implicit in the inputs. These algorithms provide insights into the development of the cerebral cortex and implicit learning in humans. They are also of interest to engineers working in areas such as computer vision and speech recognition who seek efficient representations of raw input data.




New Perspectives in Neural Architecture Search


Book Description

Neural architecture search (NAS) has made significant strides in recent years, but challenges remain in terms of the stability of search performance and the high computational requirements of sampling-based NAS. Studying architecture representations offers a promising solution to these challenges, as it encourages neural architectures with similar structures or computations to cluster together. This helps to map neural architectures with similar performance to the same regions in the latent space and leads to smoother transitions in the latent space, benefiting downstream search. Additionally, learning curve extrapolation can accelerate the search process by estimating the final validation accuracy of a neural network from the learning curve of a partially trained network. Overall, understanding neural architecture representations and their associated learning curves through theoretical analysis and empirical evaluations is crucial for achieving stable and scalable NAS.

This dissertation presents our contributions to the field of neural architecture search (NAS), which push the limits of NAS and achieve state-of-the-art performance. Our contributions include efficient one-shot NAS via hierarchical masking, addressing the joint optimization problem of architecture representations and search through unsupervised pre-training, improving the generalization ability of architecture representations with computation-aware self-supervised training, and developing a method for facilitating multi-fidelity NAS research.
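Learning curve extrapolation, as mentioned above, can be sketched with a simple parametric fit. The a − b/t curve model and the synthetic partial curve below are illustrative assumptions, not the dissertation's actual estimator:

```python
def extrapolate_final_accuracy(curve):
    """Fit acc(t) ~ a - b/t to a partial validation curve by least squares
    (the model is linear in x = 1/t), and return the asymptote a as the
    predicted final accuracy of the fully trained network."""
    n = len(curve)
    xs = [1.0 / t for t in range(1, n + 1)]
    mean_x = sum(xs) / n
    mean_y = sum(curve) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, curve))
    slope = sxy / sxx                # coefficient on 1/t
    return mean_y - slope * mean_x   # intercept = accuracy as t -> infinity

# Synthetic partial curve following 0.92 - 0.5/t for 10 epochs:
partial = [0.92 - 0.5 / t for t in range(1, 11)]
print(round(extrapolate_final_accuracy(partial), 4))  # -> 0.92
```

In a NAS loop, candidates whose extrapolated accuracy falls below the current best can be discarded after a few epochs, which is where the claimed acceleration comes from.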




Neural Networks with Model Compression


Book Description

Deep learning has achieved impressive results in image classification, computer vision and natural language processing. To achieve better performance, deeper and wider networks have been designed, which increase the demand for computational resources. The number of floating-point operations (FLOPs) has increased dramatically with larger networks, and this has become an obstacle for convolutional neural networks (CNNs) being developed for mobile and embedded devices. In this context, our book will focus on CNN compression and acceleration, which are important for the research community. We will describe numerous methods, including parameter quantization, network pruning, low-rank decomposition and knowledge distillation. More recently, to reduce the burden of handcrafted architecture design, neural architecture search (NAS) has been used to automatically build neural networks by searching over a vast architecture space. Our book will also introduce NAS due to its superiority and state-of-the-art performance in various applications, such as image classification and object detection. We also describe extensive applications of compressed deep models on image classification, speech recognition, object detection and tracking. These topics can help researchers better understand the usefulness and the potential of network compression on practical applications. Moreover, interested readers should have basic knowledge about machine learning and deep learning to better understand the methods described in this book.
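Of the compression methods the description lists, parameter quantization is the simplest to illustrate. A minimal symmetric int8 scheme follows; the weight values are made-up examples, and real frameworks add per-channel scales and zero-points:

```python
def quantize_int8(weights):
    """Map float weights to int8 levels [-127, 127] with one shared
    symmetric scale, so each weight is stored in 1 byte instead of 4."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 levels."""
    return [level * scale for level in q]

weights = [0.31, -0.74, 0.05, 1.27, -1.02]   # made-up layer weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
assert max_err <= scale / 2                  # error bounded by half a step
```

The other methods mentioned trade different resources: pruning removes small weights entirely, low-rank decomposition factorizes weight matrices, and distillation trains a small network to mimic a large one.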




Probabilistic Graphical Models


Book Description

A general framework for constructing and using probabilistic models of complex systems that would enable a computer to use available information for making decisions. Most tasks require a person or an automated system to reason—to reach conclusions based on available information. The framework of probabilistic graphical models, presented in this book, provides a general approach for this task. The approach is model-based, allowing interpretable models to be constructed and then manipulated by reasoning algorithms. These models can also be learned automatically from data, allowing the approach to be used in cases where manually constructing a model is difficult or even impossible. Because uncertainty is an inescapable aspect of most real-world applications, the book focuses on probabilistic models, which make the uncertainty explicit and provide models that are more faithful to reality. Probabilistic Graphical Models discusses a variety of models, spanning Bayesian networks, undirected Markov networks, discrete and continuous models, and extensions to deal with dynamical systems and relational data. For each class of models, the text describes the three fundamental cornerstones: representation, inference, and learning, presenting both basic concepts and advanced techniques. Finally, the book considers the use of the proposed framework for causal reasoning and decision making under uncertainty. The main text in each chapter provides the detailed technical development of the key ideas. Most chapters also include boxes with additional material: skill boxes, which describe techniques; case study boxes, which discuss empirical cases related to the approach described in the text, including applications in computer vision, robotics, natural language understanding, and computational biology; and concept boxes, which present significant concepts drawn from the material in the chapter. 
Instructors (and readers) can group chapters in various combinations, from core topics to more technically advanced material, to suit their particular needs.
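The representation/inference pairing the book describes can be made concrete with the classic sprinkler network. The conditional probability tables below are textbook-style illustrative numbers (not taken from the book), and inference is done by brute-force enumeration over the joint:

```python
from itertools import product

# CPTs for a three-node Bayesian network:
#   Rain -> Sprinkler, and (Rain, Sprinkler) -> GrassWet.
P_rain = {True: 0.2, False: 0.8}
P_sprinkler = {True: {True: 0.01, False: 0.99},   # given rain
               False: {True: 0.4, False: 0.6}}    # given no rain
P_wet = {(True, True): 0.99, (True, False): 0.8,  # given (sprinkler, rain)
         (False, True): 0.9, (False, False): 0.0}

def joint(rain, sprinkler, wet):
    # Chain rule over the network: P(R) * P(S|R) * P(W|S,R).
    p = P_rain[rain] * P_sprinkler[rain][sprinkler]
    p_wet = P_wet[(sprinkler, rain)]
    return p * (p_wet if wet else 1.0 - p_wet)

def posterior_rain_given_wet():
    # P(Rain | GrassWet) by summing the joint over the hidden variable.
    num = sum(joint(True, s, True) for s in (True, False))
    den = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
    return num / den

print(round(posterior_rain_given_wet(), 4))  # -> 0.4131
```

Enumeration is exponential in the number of hidden variables; the inference chapters the description mentions cover algorithms (variable elimination, belief propagation, sampling) that exploit the graph structure to do better.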




Advances in Neural Computation, Machine Learning, and Cognitive Research VI


Book Description

This book describes new theories and applications of artificial neural networks, with a special focus on answering questions in neuroscience, biology and biophysics and cognitive research. It covers a wide range of methods and technologies, including deep neural networks, large-scale neural models, brain–computer interface, signal processing methods, as well as models of perception, studies on emotion recognition, self-organization and many more. The book includes both selected and invited papers presented at the XXIV International Conference on Neuroinformatics, held on October 17–21, 2022, in Moscow, Russia.