Processing-in-Memory for AI


Book Description

This book provides a comprehensive introduction to processing-in-memory (PIM) technology, from its architectures to circuits implementations on multiple memory types and describes how it can be a viable computer architecture in the era of AI and big data. The authors summarize the challenges of AI hardware systems, processing-in-memory (PIM) constraints and approaches to derive system-level requirements for a practical and feasible PIM solution. The presentation focuses on feasible PIM solutions that can be implemented and used in real systems, including architectures, circuits, and implementation cases for each major memory type (SRAM, DRAM, and ReRAM).




From Artificial Intelligence to Brain Intelligence


Book Description

Research in Artificial Intelligence (AI) is not new, it has been around since 1950’s. AI resurfaced at that time while Moore’s law was on an aggressive path of scaling, with the transformation of NMOS and later bipolar technology to CMOS for high performance, low power as well as low cost applications.Several breakthroughs in the electronics industry helped to push Moore’s law in chip miniaturization along with increased computing power (parallel and distributed processing) and memory bandwidth. Once this paradigm shift occurred it naturally opened doors for AI as it required big data manipulations, and thus AI could thrive again. AI has already shown success in industries such as finance, marketing, health care, transportation, gaming, education and the defence and space, to name but a few.The human brain amazingly has a memory in the order of millions of digital bits, however it cannot compete with machines for data crunching and speed. Thus tomorrow’s world will be a World of Wonders of Artificial Intelligence (WOW- AI), to compensate the computational limitations of human beings. In short, AI research and applications will continue to grow with the development of software, algorithms and hardware accelerators.To continue the development of AI, an advanced AI Compute Symposium was launched with the sponsorship of IBM, IEEE CAS and EDS, from which this book came. Overall, the book covers two broad topics: general AI advances, and applications to neuromorphic computing.




Efficient Processing of Deep Neural Networks


Book Description

This book provides a structured treatment of the key principles and techniques for enabling efficient processing of deep neural networks (DNNs). DNNs are currently widely used for many artificial intelligence (AI) applications, including computer vision, speech recognition, and robotics. While DNNs deliver state-of-the-art accuracy on many AI tasks, it comes at the cost of high computational complexity. Therefore, techniques that enable efficient processing of deep neural networks to improve key metrics—such as energy-efficiency, throughput, and latency—without sacrificing accuracy or increasing hardware costs are critical to enabling the wide deployment of DNNs in AI systems. The book includes background on DNN processing; a description and taxonomy of hardware architectural approaches for designing DNN accelerators; key metrics for evaluating and comparing different designs; features of DNN processing that are amenable to hardware/algorithm co-design to improve energy efficiency and throughput; and opportunities for applying new technologies. Readers will find a structured introduction to the field as well as formalization and organization of key concepts from contemporary work that provide insights that may spark new ideas.




In-/Near-Memory Computing


Book Description

This book provides a structured introduction of the key concepts and techniques that enable in-/near-memory computing. For decades, processing-in-memory or near-memory computing has been attracting growing interest due to its potential to break the memory wall. Near-memory computing moves compute logic near the memory, and thereby reduces data movement. Recent work has also shown that certain memories can morph themselves into compute units by exploiting the physical properties of the memory cells, enabling in-situ computing in the memory array. While in- and near-memory computing can circumvent overheads related to data movement, it comes at the cost of restricted flexibility of data representation and computation, design challenges of compute capable memories, and difficulty in system and software integration. Therefore, wide deployment of in-/near-memory computing cannot be accomplished without techniques that enable efficient mapping of data-intensive applications to such devices, without sacrificing accuracy or increasing hardware costs excessively. This book describes various memory substrates amenable to in- and near-memory computing, architectural approaches for designing efficient and reliable computing devices, and opportunities for in-/near-memory acceleration of different classes of applications.




Deep In-memory Architectures for Machine Learning


Book Description

This book describes the recent innovation of deep in-memory architectures for realizing AI systems that operate at the edge of energy-latency-accuracy trade-offs. From first principles to lab prototypes, this book provides a comprehensive view of this emerging topic for both the practicing engineer in industry and the researcher in academia. The book is a journey into the exciting world of AI systems in hardware.




TinyML


Book Description

Deep learning networks are getting smaller. Much smaller. The Google Assistant team can detect words with a model just 14 kilobytes in size—small enough to run on a microcontroller. With this practical book you’ll enter the field of TinyML, where deep learning and embedded systems combine to make astounding things possible with tiny devices. Pete Warden and Daniel Situnayake explain how you can train models small enough to fit into any environment. Ideal for software and hardware developers who want to build embedded systems using machine learning, this guide walks you through creating a series of TinyML projects, step-by-step. No machine learning or microcontroller experience is necessary. Build a speech recognizer, a camera that detects people, and a magic wand that responds to gestures Work with Arduino and ultra-low-power microcontrollers Learn the essentials of ML and how to train your own models Train models to understand audio, image, and accelerometer data Explore TensorFlow Lite for Microcontrollers, Google’s toolkit for TinyML Debug applications and provide safeguards for privacy and security Optimize latency, energy usage, and model and binary size




Memory Storage Patterns in Parallel Processing


Book Description

This project had its beginnings in the Fall of 1980. At that time Robert Wagner suggested that I investigate compiler optimi zation of data organization, suitable for use in a parallel or vector machine environment. We developed a scheme in which the compiler, having knowledge of the machine's access patterns, does a global analysis of a program's operations, and automatically determines optimum organization for the data. For example, for certain architectures and certain operations, large improvements in performance can be attained by storing a matrix in row major order. However a subsequent operation may require the matrix in column major order. A determination must be made whether or not it is the best solution globally to store the matrix in row order, column order, or even have two copies of it, each organized differently. We have developed two algorithms for making this determination. The technique shows promise in a vector machine environ ment, particularly if memory interleaving is used. Supercomputers such as the Cray, the CDC Cyber 205, the IBM 3090, as well as superminis such as the Convex are possible environments for implementation.




2021 18th International SoC Design Conference (ISOCC)


Book Description

SoC, Analog Circuits, Digital Circuits, Data Converters, RF Microwave Wireless Circuits, Memories, Design Methodology, Circuits and Systems for Emerging Technologies, AI




Artificial Intelligence Hardware Design


Book Description

ARTIFICIAL INTELLIGENCE HARDWARE DESIGN Learn foundational and advanced topics in Neural Processing Unit design with real-world examples from leading voices in the field In Artificial Intelligence Hardware Design: Challenges and Solutions, distinguished researchers and authors Drs. Albert Chun Chen Liu and Oscar Ming Kin Law deliver a rigorous and practical treatment of the design applications of specific circuits and systems for accelerating neural network processing. Beginning with a discussion and explanation of neural networks and their developmental history, the book goes on to describe parallel architectures, streaming graphs for massive parallel computation, and convolution optimization. The authors offer readers an illustration of in-memory computation through Georgia Tech’s Neurocube and Stanford’s Tetris accelerator using the Hybrid Memory Cube, as well as near-memory architecture through the embedded eDRAM of the Institute of Computing Technology, the Chinese Academy of Science, and other institutions. Readers will also find a discussion of 3D neural processing techniques to support multiple layer neural networks, as well as information like: A thorough introduction to neural networks and neural network development history, as well as Convolutional Neural Network (CNN) models Explorations of various parallel architectures, including the Intel CPU, Nvidia GPU, Google TPU, and Microsoft NPU, emphasizing hardware and software integration for performance improvement Discussions of streaming graph for massive parallel computation with the Blaize GSP and Graphcore IPU An examination of how to optimize convolution with UCLA Deep Convolutional Neural Network accelerator filter decomposition Perfect for hardware and software engineers and firmware developers, Artificial Intelligence Hardware Design is an indispensable resource for anyone working with Neural Processing Units in either a hardware or software capacity.




Memory-Based Language Processing


Book Description

Memory-based language processing--a machine learning and problem solving method for language technology--is based on the idea that the direct re-use of examples using analogical reasoning is more suited for solving language processing problems than the application of rules extracted from those examples. This book discusses the theory and practice of memory-based language processing, showing its comparative strengths over alternative methods of language modelling. Language is complex, with few generalizations, many sub-regularities and exceptions, and the advantage of memory-based language processing is that it does not abstract away from this valuable low-frequency information.