Scalable Shared-Memory Multiprocessing


Book Description

Dr. Lenoski and Dr. Weber have experience with leading-edge research and practical issues involved in implementing large-scale parallel systems. They were key contributors to the architecture and design of the DASH multiprocessor. Currently, they are involved with commercializing scalable shared-memory technology.







Cache and Interconnect Architectures in Multiprocessors


Book Description

Cache And Interconnect Architectures In Multiprocessors Eilat, Israel May 25-261989 Michel Dubois UniversityofSouthernCalifornia Shreekant S. Thakkar SequentComputerSystems The aim of the workshop was to bring together researchers working on cache coherence protocols for shared-memory multiprocessors with various interconnect architectures. Shared-memory multiprocessors have become viable systems for many applications. Bus based shared-memory systems (Eg. Sequent's Symmetry, Encore's Multimax) are currently limited to 32 processors. The fIrst goal of the workshop was to learn about the performance ofapplications on current cache-based systems. The second goal was to learn about new network architectures and protocols for future scalable systems. These protocols and interconnects would allow shared-memory architectures to scale beyond current imitations. The workshop had 20 speakers who talked about their current research. The discussions were lively and cordial enough to keep the participants away from the wonderful sand and sun for two days. The participants got to know each other well and were able to share their thoughts in an informal manner. The workshop was organized into several sessions. The summary of each session is described below. This book presents revisions of some of the papers presented at the workshop.







Distributed Shared Memory


Book Description

The papers present in this text survey both distributed shared memory (DSM) efforts and commercial DSM systems. The book discusses relevant issues that make the concept of DSM one of the most attractive approaches for building large-scale, high-performance multiprocessor systems. The authors provide a general introduction to the DSM field as well as a broad survey of the basic DSM concepts, mechanisms, design issues, and systems. The book concentrates on basic DSM algorithms, their enhancements, and their performance evaluation. In addition, it details implementations that employ DSM solutions at the software and the hardware level. This guide is a research and development reference that provides state-of-the art information that will be useful to architects, designers, and programmers of DSM systems.







Cache and Memory Hierarchy Design


Book Description

An authoritative book for hardware and software designers. Caches are by far the simplest and most effective mechanism for improving computer performance. This innovative book exposes the characteristics of performance-optimal single and multi-level cache hierarchies by approaching the cache design process through the novel perspective of minimizing execution times. It presents useful data on the relative performance of a wide spectrum of machines and offers empirical and analytical evaluations of the underlying phenomena. This book will help computer professionals appreciate the impact of caches and enable designers to maximize performance given particular implementation constraints.




Reducing Memory Access Delays in Large-scale Shared-memory Multiprocessors


Book Description

Memory access time is a key factor limiting the performance of large-scale, shared-memory multiprocessors. In such systems, limited bandwidth in the interconnection between the processors and the memories, coupled with long delays resulting from network and memory conflicts, can produce serious memory access delays. Incorporating memory hierarchies and asynchronous block transfer mechanisms are common methods for reducing these delays. However, for these two mechanisms to be wed advantageously, they must be managed effectively, either in hardware or in software. Although this memory management problem is becoming increasingly important, good techniques are still lacking. The problem of reducing memory access delays can be attacked at several levels. The first is to attempt to improve the performance of the shared-memory system itself, where the shared-memory system includes implicitly both the network and the memory modules themselves. The second is to develop techniques to manage the memory hierarchy more effectively and to make use of the block transfer mechanisms. This thesis addresses this problem at both of these levels. The first part examines the behavior of a realistic shared-memory system and evaluates cost-effective hardware modifications for improving this balance. An additional goal is to achieve memory system scalability, where the term scalable describes systems whose per-processor performance is roughly constant across the range of system sizes examined. The remainder of this thesis addresses the problem of improving utilization of local storage in shared-memory systems where, at the very least, each processor has access to local (private) storage in addition to the global (shared) memory. A combined flow-and-dependence analysis algorithm is developed which produces the analytical information needed to optimize data accesses. It is shown how this information can be used as part of an intergrated hardware/software approach to eliminating redundant (unnecessary) memory accesses and prefetching data.




The Cache Coherence Problem in Shared-Memory Multiprocessors


Book Description

The book illustrates state-of-the-art software solutions for cache coherence maintenance in shared-memory multiprocessors. It begins with a brief overview of the cache coherence problem and introduces software solutions to the problem. The text defines and details static and dynamic software schemes, techniques for modeling performance evaluation mechanisms, and performance evaluation studies.