GPU Computing Gems Emerald Edition


Book Description

GPU Computing Gems Emerald Edition offers practical techniques in parallel computing using graphics processing units (GPUs) to enhance scientific research. The first volume in Morgan Kaufmann's Applications of GPU Computing Series, this book offers the latest insights and research in computer vision, electronic design automation, and emerging data-intensive applications. It also covers life sciences, medical imaging, ray tracing and rendering, scientific simulation, signal and audio processing, statistical modeling, and video and image processing. This book is intended to help those who face the challenge of programming systems to use GPUs effectively and meet efficiency and performance goals. It offers developers a window into diverse application areas, and the opportunity to gain insights from others' algorithm work that they may apply to their own projects. Readers will learn from the leading researchers in parallel programming, who have gathered their solutions and experience in one volume under the guidance of expert area editors. Each chapter is written to be accessible to researchers from other domains, allowing knowledge to cross-pollinate across the GPU spectrum. Many examples leverage NVIDIA's CUDA parallel computing architecture, the most widely adopted massively parallel programming solution. The insights and ideas as well as practical hands-on skills in the book can be immediately put to use. Computer programmers, software engineers, hardware engineers, and computer science students will find this volume a helpful resource. For useful source code discussed throughout the book, the editors invite readers to the following website: ...
Covers the breadth of industry from scientific simulation and electronic design automation to audio / video processing, medical imaging, computer vision, and more.
Many examples leverage NVIDIA's CUDA parallel computing architecture, the most widely adopted massively parallel programming solution.
Offers insights and ideas as well as practical "hands-on" skills you can immediately put to use.




High Performance Pattern Matching and Data Remanence on Graphics Processing Units


Book Description

Pattern matching is an important task in a plethora of different fields ranging from computer science to medical applications, but it is also a resource-consuming problem. With the increase in network link speed and the tremendous amounts of data generated, serial pattern matching on the Central Processing Unit (CPU) is close to being rendered obsolete. The ubiquitous Graphics Processing Units (GPUs) have become the focus of much interest within the scientific community due to their highly parallel computing capabilities and the cost-effectiveness of the hardware. This thesis presents an empirical investigation of massively parallel single and multi-pattern matching algorithms, as well as security and privacy concerns for data processing on GPUs. This thesis demonstrates a trie reduction algorithm that reduces the size of the data stored in GPU memory. GPUs have a limited amount of memory, and with the increasing number of patterns to be searched for, space complexity is of great importance. This work addresses these challenges and investigates different memory hierarchies for different matching problems, increasing the overall performance. This work also presents a Digital Forensic (DF) and Reverse Engineering (RE) methodology based on an evaluation of the different memory hierarchies present in a GPU. The results of the investigation presented in this thesis show that single and multi-pattern matching algorithms can benefit from the massively parallel capabilities of GPUs when implemented with hardware design in mind. Furthermore, it is demonstrated that data offloaded to the GPU is subject to data leaks.
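
To make the idea of trie-based multi-pattern matching concrete, here is a minimal Python sketch. It is not the thesis's trie reduction algorithm, whose details the description does not give; it is a plain, unreduced trie, and the function names (build_trie, match_from, find_all) are hypothetical. Each start position is scanned independently, which is the kind of work a GPU implementation could plausibly assign to one thread per position.

```python
def build_trie(patterns):
    """Build a nested-dict trie; the key None marks the end of a pattern."""
    root = {}
    for pattern in patterns:
        node = root
        for ch in pattern:
            node = node.setdefault(ch, {})
        node[None] = pattern  # terminal marker stores the matched pattern
    return root


def match_from(text, start, root):
    """Report every pattern that begins exactly at text[start]."""
    found = []
    node = root
    for i in range(start, len(text)):
        node = node.get(text[i])
        if node is None:  # no pattern continues with this character
            break
        if None in node:
            found.append((start, node[None]))
    return found


def find_all(text, patterns):
    """Scan all start positions; each iteration is independent of the others."""
    root = build_trie(patterns)
    hits = []
    for start in range(len(text)):
        hits.extend(match_from(text, start, root))
    return hits


if __name__ == "__main__":
    print(find_all("ushers", ["he", "she", "hers"]))
```

In this sketch the trie grows with the total length of the pattern set, which illustrates why the description stresses space complexity when many patterns must fit in limited GPU memory.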




Multi-GPU Graph Processing


Book Description

While modern GPU graph analytics libraries provide usable programming models and good single-node performance, the memory size and computation power of a single GPU are still too limited for analyzing large graphs. Scaling graph analytics is challenging, however, because of the characteristics of graph applications: irregular computation, low computation-to-communication ratios, and limited communication bandwidth on multi-GPU platforms. Addressing these challenges while still maintaining programmability is yet another difficulty. In this work, I target the scalability of graph analytics to multiple GPUs. I begin by targeting multiple GPUs within a single node. Compared to GPU clusters, single-node multi-GPU platforms are easier to manage and program, but can still act as good development environments for multi-GPU graph processing. My work targets several aspects of multi-GPU graph analytics: the inputs that graph application programmers provide to the multi-GPU framework; how the graph should be distributed across GPUs; the interaction between local computation and remote communication; what and when to communicate; how to combine received and local data; and when the application should stop. I answer these questions by extending the Gunrock graph analytics framework for a single GPU to multiple GPUs, showing that most graph applications scale well in my system. I also show that direction-optimizing breadth-first search (DOBFS) is the most difficult scaling challenge because of its extremely low compute-to-communication ratio. To address the DOBFS scaling challenge, I demonstrate a DOBFS implementation with efficient graph representation, local computation, and remote communication, based on the idea of separating high- and low-degree vertices. I particularly target communication costs, using global reduction with bit masks on high-degree vertices and point-to-point communication for low-degree vertices. This greatly reduces overall communication cost and results in good DOBFS scaling with log-scale graphs on more than a hundred GPUs in the Sierra early access system (the testing bed for the Sierra Supercomputer).
Next, I revisit the design choices I made for the single-node multi-GPU framework in view of recent hardware and software developments, such as better peer GPU access and unified virtual memory. I analyze 9 newly developed complex graph applications for the DARPA HIVE program, implemented in the Gunrock framework, and show a wide range of potential scalabilities. More importantly, the questions of when and how to do communication are more diverse than those in the single-node framework. With this analysis, I conclude that future multi-GPU frameworks, whether single- or multiple-node, need to be more flexible: instead of only communicating at iteration boundaries, they should support a more flexible, general communication model. I also propose other research directions for future heterogeneous graph processing, including asynchronous computation and communication, specialized graph representation, and heterogeneous processing.
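
The communication scheme described above (a global reduction with bit masks for high-degree vertices, point-to-point messages for low-degree vertices) can be illustrated with a small single-process Python simulation. This is only a sketch under stated assumptions, not the author's Gunrock implementation: the degree threshold, the owner mapping, and the function names split_by_degree and exchange_frontier are all hypothetical, and the bitwise OR loop merely stands in for a real all-reduce across GPUs.

```python
def split_by_degree(degrees, threshold):
    """Partition vertex ids into high- and low-degree lists (threshold is assumed)."""
    high = [v for v, d in enumerate(degrees) if d >= threshold]
    low = [v for v, d in enumerate(degrees) if d < threshold]
    return high, low


def exchange_frontier(discovered, high, owner):
    """Simulate one communication step after a BFS iteration.

    discovered[g] is the set of vertices GPU g found this iteration.
    High-degree vertices are shared through one bit each in a mask;
    low-degree vertices are sent only to the GPU that owns them.
    """
    bit_of = {v: i for i, v in enumerate(high)}
    global_mask = 0
    inbox = [set() for _ in discovered]
    for gpu, found in enumerate(discovered):
        local_mask = 0
        for v in found:
            if v in bit_of:
                local_mask |= 1 << bit_of[v]  # one bit per high-degree vertex
            elif owner[v] != gpu:
                inbox[owner[v]].add(v)  # point-to-point message to the owner
        global_mask |= local_mask  # stands in for a bitwise-OR all-reduce
    new_high = [v for v in high if (global_mask >> bit_of[v]) & 1]
    return new_high, inbox


if __name__ == "__main__":
    degrees = [5, 1, 4, 1, 1]
    high, low = split_by_degree(degrees, threshold=4)
    owner = {1: 0, 3: 1, 4: 1}  # hypothetical owner GPU of each low-degree vertex
    print(exchange_frontier([{2, 3}, {0, 1, 4}], high, owner))
```

The design point the sketch tries to show is that the mask costs one bit per high-degree vertex regardless of how many GPUs discovered it, while low-degree updates travel only to the single GPU that needs them.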




Real-time Time-warped Multiscale Signal Processing for Scientific Visualization


Book Description

This thesis considers the problem of visualizing simulations of phenomena that span large ranges of spatial scales. These datasets tend to be extremely large, presenting challenges both to human comprehension and to high-performance computing. The main problems considered are how to effectively represent scale and how to efficiently compute and visualize multiscale representations for large, real-time datasets. Time-warped signal processing techniques are shown to be useful for formulating a localized notion of scale. In this case, we use time-warping to adapt the standard Fourier basis to local properties of the signal, giving the advantage of being localized in the frequency spectrum as compared with the standard linear notions of scale. Time-warping is also shown to have theoretical advantages in terms of signal reconstruction quality and random noise removal. In practice, these advantages are shown to hold only under certain conditions. The thesis then shows how convolution-based reconstruction techniques can be mapped onto graphics processing units (GPUs) for high-performance implementation of a multiscale molecular visualization framework. We show how the same technique can likely be used for time-warped multiscale reconstruction.




Numerical Computations with GPUs


Book Description

This book brings together research on numerical methods adapted for Graphics Processing Units (GPUs). It explains recent efforts to adapt classic numerical methods, including the solution of linear equations and the FFT, for massively parallel GPU architectures. This volume consolidates recent research and adaptations, covering widely used methods that are at the core of many scientific and engineering computations. Each chapter is written by authors working on a specific group of methods; these leading experts provide mathematical background, parallel algorithms and implementation details leading to reusable, adaptable and scalable code fragments. This book also serves as a GPU implementation manual for many numerical algorithms, sharing tips on GPUs that can increase application efficiency. The valuable insights into parallelization strategies for GPUs are supplemented by ready-to-use code fragments. Numerical Computations with GPUs targets professionals and researchers working in high performance computing and GPU programming. Advanced-level students focused on computer science and mathematics will also find this book useful as a secondary textbook or reference.




Medical Imaging 2008


Book Description




Shape Analysis in Medical Image Analysis


Book Description

This book contains thirteen contributions from invited experts of international recognition addressing important issues in shape analysis in medical image analysis, including techniques for image segmentation, registration, modelling and classification and applications in biology, as well as in cardiac, brain, spine, chest, lung and clinical practice. This volume treats topics such as anatomic and functional shape representation and matching; shape-based medical image segmentation; shape registration; statistical shape analysis; shape deformation; shape-based abnormity detection; shape tracking and longitudinal shape analysis; machine learning for shape modeling and analysis; shape-based computer-aided-diagnosis; shape-based medical navigation; and benchmark and validation of shape representation, analysis and modeling algorithms. This work will be of interest to researchers, students and manufacturers in the fields of artificial intelligence, bioengineering, biomechanics, computational mechanics, computational vision, computer sciences, human motion, mathematics, medical imaging, medicine, pattern recognition and physics.




Deep Learning and Data Labeling for Medical Applications


Book Description

This book constitutes the refereed proceedings of two workshops held at the 19th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2016, in Athens, Greece, in October 2016: the First Workshop on Large-Scale Annotation of Biomedical Data and Expert Label Synthesis, LABELS 2016, and the Second International Workshop on Deep Learning in Medical Image Analysis, DLMIA 2016. The 28 revised regular papers presented in this book were carefully reviewed and selected from a total of 52 submissions. The 7 papers selected for LABELS deal with topics from the following fields: crowd-sourcing methods; active learning; transfer learning; semi-supervised learning; and modeling of label uncertainty. The 21 papers selected for DLMIA span a wide range of topics such as image description; medical imaging-based diagnosis; medical signal-based diagnosis; medical image reconstruction and model selection using deep learning techniques; meta-heuristic techniques for fine-tuning parameters in deep learning-based architectures; and applications based on deep learning techniques.




Brain, Body and Machine


Book Description

The reader will find here papers on human-robot interaction as well as human safety algorithms; haptic interfaces; innovative instruments and algorithms for the sensing of motion and the identification of brain neoplasms; and even a paper on a saxophone-playing robot.