GPU Gems 2


Book Description

More useful techniques, tips, and tricks for harnessing the power of the new generation of powerful GPUs.




Efficient Execution of Irregular Dataflow Graphs


Book Description

This book focuses on the acceleration of emerging irregular sparse workloads, posed by novel artificial intelligent (AI) models and sparse linear algebra. Specifically, the book outlines several co-optimized hardware-software solutions for a highly promising class of emerging sparse AI models called Probabilistic Circuit (PC) and a similar sparse matrix workload for triangular linear systems (SpTRSV). The authors describe optimizations for the entire stack, targeting applications, compilation, hardware architecture and silicon implementation, resulting in orders of magnitude higher performance and energy-efficiency compared to the existing state-of-the-art solutions. Thus, this book provides important building blocks for the upcoming generation of edge AI platforms.




Embedded Systems Development


Book Description

This book offers readers broad coverage of techniques to model, verify and validate the behavior and performance of complex distributed embedded systems. The authors attempt to bridge the gap between the three disciplines of model-based design, real-time analysis and model-driven development, for a better understanding of the ways in which new development flows can be constructed, going from system-level modeling to the correct and predictable generation of a distributed implementation, leveraging current and future research results.




Algorithms for Optimization


Book Description

A comprehensive introduction to optimization with a focus on practical algorithms for the design of engineering systems. This book offers a comprehensive introduction to optimization with a focus on practical algorithms. The book approaches optimization from an engineering perspective, where the objective is to design a system that optimizes a set of metrics subject to constraints. Readers will learn about computational approaches for a range of challenges, including searching high-dimensional spaces, handling problems where there are multiple competing objectives, and accommodating uncertainty in the metrics. Figures, examples, and exercises convey the intuition behind the mathematical approaches. The text provides concrete implementations in the Julia programming language. Topics covered include derivatives and their generalization to multiple dimensions; local descent and first- and second-order methods that inform local descent; stochastic methods, which introduce randomness into the optimization process; linear constrained optimization, when both the objective function and the constraints are linear; surrogate models, probabilistic surrogate models, and using probabilistic surrogate models to guide optimization; optimization under uncertainty; uncertainty propagation; expression optimization; and multidisciplinary design optimization. Appendixes offer an introduction to the Julia language, test functions for evaluating algorithm performance, and mathematical concepts used in the derivation and analysis of the optimization methods discussed in the text. The book can be used by advanced undergraduates and graduate students in mathematics, statistics, computer science, any engineering field, (including electrical engineering and aerospace engineering), and operations research, and as a reference for professionals.




Big Data in Astronomy


Book Description

Big Data in Radio Astronomy: Scientific Data Processing for Advanced Radio Telescopes provides the latest research developments in big data methods and techniques for radio astronomy. Providing examples from such projects as the Square Kilometer Array (SKA), the world's largest radio telescope that generates over an Exabyte of data every day, the book offers solutions for coping with the challenges and opportunities presented by the exponential growth of astronomical data. Presenting state-of-the-art results and research, this book is a timely reference for both practitioners and researchers working in radio astronomy, as well as students looking for a basic understanding of big data in astronomy. - Bridges the gap between radio astronomy and computer science - Includes coverage of the observation lifecycle as well as data collection, processing and analysis - Presents state-of-the-art research and techniques in big data related to radio astronomy - Utilizes real-world examples, such as Square Kilometer Array (SKA) and Five-hundred-meter Aperture Spherical radio Telescope (FAST)




Performance Analysis and Tuning on Modern CPUs


Book Description

Performance tuning is becoming more important than it has been for the last 40 years. Read this book to understand your application's performance that runs on a modern CPU and learn how you can improve it. The 170+ page guide combines the knowledge of many optimization experts from different industries.




Network and Parallel Computing


Book Description

This book constitutes the proceedings of the 19th IFIP WG 10.3 International Conference on Network and Parallel Computing, NPC 2022, which was held in Jinan, China, during September 24-25, 2022. The 23 full papers and 8 short papers presented in this volume were carefully reviewed and selected from 89 submissions. They were organized in topical sections as follows: computer architecture; cloud computing; deep learning; emerging applications; and storage and IO.




Parallel and High Performance Computing


Book Description

Parallel and High Performance Computing offers techniques guaranteed to boost your code’s effectiveness. Summary Complex calculations, like training deep learning models or running large-scale simulations, can take an extremely long time. Efficient parallel programming can save hours—or even days—of computing time. Parallel and High Performance Computing shows you how to deliver faster run-times, greater scalability, and increased energy efficiency to your programs by mastering parallel techniques for multicore processor and GPU hardware. About the technology Write fast, powerful, energy efficient programs that scale to tackle huge volumes of data. Using parallel programming, your code spreads data processing tasks across multiple CPUs for radically better performance. With a little help, you can create software that maximizes both speed and efficiency. About the book Parallel and High Performance Computing offers techniques guaranteed to boost your code’s effectiveness. You’ll learn to evaluate hardware architectures and work with industry standard tools such as OpenMP and MPI. You’ll master the data structures and algorithms best suited for high performance computing and learn techniques that save energy on handheld devices. You’ll even run a massive tsunami simulation across a bank of GPUs. What's inside Planning a new parallel project Understanding differences in CPU and GPU architecture Addressing underperforming kernels and loops Managing applications with batch scheduling About the reader For experienced programmers proficient with a high-performance computing language like C, C++, or Fortran. About the author Robert Robey works at Los Alamos National Laboratory and has been active in the field of parallel computing for over 30 years. Yuliana Zamora is currently a PhD student and Siebel Scholar at the University of Chicago, and has lectured on programming modern hardware at numerous national conferences. Table of Contents PART 1 INTRODUCTION TO PARALLEL COMPUTING 1 Why parallel computing? 2 Planning for parallelization 3 Performance limits and profiling 4 Data design and performance models 5 Parallel algorithms and patterns PART 2 CPU: THE PARALLEL WORKHORSE 6 Vectorization: FLOPs for free 7 OpenMP that performs 8 MPI: The parallel backbone PART 3 GPUS: BUILT TO ACCELERATE 9 GPU architectures and concepts 10 GPU programming model 11 Directive-based GPU programming 12 GPU languages: Getting down to basics 13 GPU profiling and tools PART 4 HIGH PERFORMANCE COMPUTING ECOSYSTEMS 14 Affinity: Truce with the kernel 15 Batch schedulers: Bringing order to chaos 16 File operations for a parallel world 17 Tools and resources for better code




Professional CUDA C Programming


Book Description

Break into the powerful world of parallel GPU programming with this down-to-earth, practical guide Designed for professionals across multiple industrial sectors, Professional CUDA C Programming presents CUDA -- a parallel computing platform and programming model designed to ease the development of GPU programming -- fundamentals in an easy-to-follow format, and teaches readers how to think in parallel and implement parallel algorithms on GPUs. Each chapter covers a specific topic, and includes workable examples that demonstrate the development process, allowing readers to explore both the "hard" and "soft" aspects of GPU programming. Computing architectures are experiencing a fundamental shift toward scalable parallel computing motivated by application requirements in industry and science. This book demonstrates the challenges of efficiently utilizing compute resources at peak performance, presents modern techniques for tackling these challenges, while increasing accessibility for professionals who are not necessarily parallel programming experts. The CUDA programming model and tools empower developers to write high-performance applications on a scalable, parallel computing platform: the GPU. However, CUDA itself can be difficult to learn without extensive programming experience. Recognized CUDA authorities John Cheng, Max Grossman, and Ty McKercher guide readers through essential GPU programming skills and best practices in Professional CUDA C Programming, including: CUDA Programming Model GPU Execution Model GPU Memory model Streams, Event and Concurrency Multi-GPU Programming CUDA Domain-Specific Libraries Profiling and Performance Tuning The book makes complex CUDA concepts easy to understand for anyone with knowledge of basic software development with exercises designed to be both readable and high-performance. For the professional seeking entrance to parallel computing and the high-performance computing community, Professional CUDA C Programming is an invaluable resource, with the most current information available on the market.




Programming Massively Parallel Processors


Book Description

Programming Massively Parallel Processors: A Hands-on Approach, Second Edition, teaches students how to program massively parallel processors. It offers a detailed discussion of various techniques for constructing parallel programs. Case studies are used to demonstrate the development process, which begins with computational thinking and ends with effective and efficient parallel programs. This guide shows both student and professional alike the basic concepts of parallel programming and GPU architecture. Topics of performance, floating-point format, parallel patterns, and dynamic parallelism are covered in depth. This revised edition contains more parallel programming examples, commonly-used libraries such as Thrust, and explanations of the latest tools. It also provides new coverage of CUDA 5.0, improved performance, enhanced development tools, increased hardware support, and more; increased coverage of related technology, OpenCL and new material on algorithm patterns, GPU clusters, host programming, and data parallelism; and two new case studies (on MRI reconstruction and molecular visualization) that explore the latest applications of CUDA and GPUs for scientific research and high-performance computing. This book should be a valuable resource for advanced students, software engineers, programmers, and hardware engineers. - New coverage of CUDA 5.0, improved performance, enhanced development tools, increased hardware support, and more - Increased coverage of related technology, OpenCL and new material on algorithm patterns, GPU clusters, host programming, and data parallelism - Two new case studies (on MRI reconstruction and molecular visualization) explore the latest applications of CUDA and GPUs for scientific research and high-performance computing