Final Report -- Center for Programming Models for Scalable Parallel Computing (UIUC Subgroup)


Book Description

The mission of the Center for Scalable Programming Models (Pmodels) was to create new ways of programming parallel computers that are much easier for humans to conceptualize, that allow programs to be written, updated, and debugged quickly, and that run extremely efficiently, even on computers with thousands or millions of processors. At UIUC, our work for Pmodels focused on support for I/O in a massively parallel environment and included both research and technology transfer activities.
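
The report itself includes no source code; as a hedged illustration of what parallel I/O in a massively parallel environment looks like in practice, the following minimal C sketch uses the standard MPI-IO interface (MPI_File_open, MPI_File_write_at_all). The file name out.dat and the block size are arbitrary choices for the example, not details from the UIUC work.

    /* Minimal sketch of collective parallel I/O using MPI-IO.
       Illustrative only; this is not the UIUC Pmodels code.
       Compile with an MPI C compiler (e.g., mpicc) and run
       with mpiexec. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, nprocs;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* Each rank contributes one block of 1024 integers. */
        enum { N = 1024 };
        int buf[N];
        for (int i = 0; i < N; i++)
            buf[i] = rank;

        MPI_File fh;
        MPI_File_open(MPI_COMM_WORLD, "out.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY,
                      MPI_INFO_NULL, &fh);

        /* Collective write: every rank writes its block at a
           disjoint offset, letting the MPI-IO layer coalesce
           the requests into large, efficient file accesses. */
        MPI_Offset offset = (MPI_Offset)rank * N * sizeof(int);
        MPI_File_write_at_all(fh, offset, buf, N, MPI_INT,
                              MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        MPI_Finalize();
        return 0;
    }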




Final Report, Center for Programming Models for Scalable Parallel Computing


Book Description

The major accomplishment of this project is the production of CafLib, an 'object-oriented' parallel numerical library written in Co-Array Fortran. CafLib contains distributed objects, such as block vectors and block matrices, along with procedures attached to each object that perform basic linear algebra operations such as matrix multiplication, matrix transpose, and LU decomposition. It also contains constructors and destructors for each object that hide the details of data decomposition from the programmer, and collective operations that allow the programmer to compute global reductions, such as global sums, global minima, and global maxima, as well as vector and matrix norms of several kinds. CafLib is designed to be extensible so that programmers can define distributed grid and field objects, based on the vector and matrix objects from the library, for finite difference algorithms that solve partial differential equations.

A very important extra benefit of the project is the inclusion of the co-array programming model in the next Fortran standard, Fortran 2008. It is the first parallel programming model ever included as a standard part of the language. Co-arrays will be a supported feature in all Fortran compilers, and the portability provided by standardization will encourage a large number of programmers to adopt the model for new parallel application development. The combination of object-oriented programming in Fortran 2003 with co-arrays in Fortran 2008 provides a very powerful programming model for high-performance scientific computing.

The project produced additional benefits beyond its original goal. A program provided access to the co-array model through the Cray compiler as a resource for teaching and research, and several academics included the co-array model as a topic in their courses on parallel computing for the first time. A separate collaborative project with LANL and PNNL showed how to extend the co-array model to other languages in a small experimental version of Co-array Python. Another collaborative project defined a Fortran 95 interface to ARMCI to encourage Fortran programmers to use the one-sided communication model in anticipation of their later conversion to the co-array model. A collaborative project with the Earth Sciences community at NASA Goddard and GFDL experimented with the co-array model within computational kernels related to their climate models, first using CafLib and then extending the co-array model to use design patterns.

Future work will build on the design-pattern idea with a redesign of CafLib as a true object-oriented library using Fortran 2003 and as a parallel numerical library using Fortran 2008.
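
CafLib's Fortran source is not reproduced in this report. As a rough, language-neutral illustration of the one-sided put/get semantics that co-arrays and the Fortran 95 ARMCI interface express, the following C sketch uses MPI's RMA operations; the ring communication pattern and window layout are invented for the example. A co-array assignment such as x[p] = x corresponds, loosely, to a put into image p's memory.

    /* Sketch of the one-sided (put/get) communication style that
       co-arrays expose at the language level, written here with
       MPI RMA in C for illustration only. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, nprocs;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* Each process exposes one double, analogous to a
           co-array declared as real :: x[*]. */
        double *x;
        MPI_Win win;
        MPI_Win_allocate(sizeof(double), sizeof(double),
                         MPI_INFO_NULL, MPI_COMM_WORLD, &x, &win);
        *x = (double)rank;

        /* One-sided put: write this rank's value into the right
           neighbor's memory without the neighbor participating. */
        int right = (rank + 1) % nprocs;
        double val = *x;
        MPI_Win_fence(0, win);
        MPI_Put(&val, 1, MPI_DOUBLE, right, 0, 1, MPI_DOUBLE, win);
        MPI_Win_fence(0, win);

        printf("rank %d now holds %.1f\n", rank, *x);

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }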




Center for Programming Models for Scalable Parallel Computing


Book Description

The mission of the Pmodels center project is to develop software technology to support scalable parallel programming models for terascale systems. The goal of the UD subproject is to develop an efficient and robust methodology and tools for HPC programming. More specifically, the focus is on developing new programming models that help programmers port their applications onto parallel high-performance computing systems. During the course of the research in the past five years, the landscape of microprocessor chip architecture has undergone a fundamental change: multi-core/many-core chips have emerged as the mainstream technology and will have a major impact on future generations of parallel machines. The shared-address-space programming model is becoming critical for such multi-core architectures. The highlight of our research is an in-depth study of proposed fine-grain parallelism/multithreading support on such future-generation multi-core architectures. Our research has demonstrated the significant impact such a fine-grain multithreading model can have on the productivity of parallel programming models and their efficient implementation.
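
The UD group's fine-grain execution models are not detailed in this summary. The following minimal C sketch, using standard OpenMP tasks, is only meant to convey what fine-grain multithreading looks like from the programmer's side: many small units of work handed to a runtime scheduler. The recursive Fibonacci computation is a deliberately simple stand-in workload.

    /* Illustrative sketch of fine-grain multithreading using
       OpenMP tasks; not the UD subproject's runtime.
       Compile with, e.g., cc -fopenmp. */
    #include <omp.h>
    #include <stdio.h>

    /* Each call spawns small tasks; the runtime schedules them
       across cores, the essence of fine-grain parallelism. */
    static long fib(int n)
    {
        if (n < 2)
            return n;

        long a, b;
        #pragma omp task shared(a)
        a = fib(n - 1);
        #pragma omp task shared(b)
        b = fib(n - 2);
        #pragma omp taskwait
        return a + b;
    }

    int main(void)
    {
        long result;
        #pragma omp parallel
        {
            #pragma omp single
            result = fib(20);
        }
        printf("fib(20) = %ld\n", result);
        return 0;
    }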




Programming Models for Parallel Computing


Book Description

An overview of the most prominent contemporary parallel processing programming models, written in a unique tutorial style. With the coming of the parallel computing era, computer scientists have turned their attention to designing programming models suited for high-performance parallel computing and supercomputing systems. Programming parallel systems is complicated by the fact that multiple processing units are simultaneously computing and moving data. This book offers an overview of some of the most prominent parallel programming models used in high-performance computing and supercomputing systems today. The chapters describe the programming models in a unique tutorial style rather than using the formal approach taken in the research literature. The aim is to cover a wide range of parallel programming models, enabling the reader to understand what each has to offer. The book begins with a description of the Message Passing Interface (MPI), the most common parallel programming model for distributed memory computing. It goes on to cover one-sided communication models, ranging from low-level runtime libraries (GASNet, OpenSHMEM) to high-level programming models (UPC, GA, Chapel); task-oriented programming models (Charm++, ADLB, Scioto, Swift, CnC) that allow users to describe their computation and data units as tasks so that the runtime system can manage computation and data movement as necessary; and parallel programming models intended for on-node parallelism in the context of multicore architectures or attached accelerators (OpenMP, Cilk Plus, TBB, CUDA, OpenCL). The book will be a valuable resource for graduate students, researchers, and any scientist who works with data sets and large computations.

Contributors: Timothy Armstrong, Michael G. Burke, Ralph Butler, Bradford L. Chamberlain, Sunita Chandrasekaran, Barbara Chapman, Jeff Daily, James Dinan, Deepak Eachempati, Ian T. Foster, William D. Gropp, Paul Hargrove, Wen-mei Hwu, Nikhil Jain, Laxmikant Kale, David Kirk, Kath Knobe, Sriram Krishnamoorthy, Jeffery A. Kuehn, Alexey Kukanov, Charles E. Leiserson, Jonathan Lifflander, Ewing Lusk, Tim Mattson, Bruce Palmer, Steven C. Pieper, Stephen W. Poole, Arch D. Robison, Frank Schlimbach, Rajeev Thakur, Abhinav Vishnu, Justin M. Wozniak, Michael Wilde, Kathy Yelick, Yili Zheng
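
As a small, hedged illustration of the message-passing model the blurb above opens with, here is a minimal MPI program in C. It is not taken from the book; the message value and the reduction operation are arbitrary choices for the example.

    /* Minimal MPI example: rank 0 sends a value to rank 1, and
       all ranks join a collective reduction. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, nprocs;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* Point-to-point: explicit two-sided send/receive. */
        if (rank == 0 && nprocs > 1) {
            int msg = 42;
            MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            int msg;
            MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 1 received %d\n", msg);
        }

        /* Collective: a global sum across all ranks. */
        int local = rank, total = 0;
        MPI_Allreduce(&local, &total, 1, MPI_INT, MPI_SUM,
                      MPI_COMM_WORLD);
        if (rank == 0)
            printf("sum of ranks = %d\n", total);

        MPI_Finalize();
        return 0;
    }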




Center for Programming Models for Scalable Parallel Computing


Book Description

Rice University's achievements as part of the Center for Programming Models for Scalable Parallel Computing include: (1) the design and implementation of cafc, the first multi-platform CAF compiler for distributed- and shared-memory machines; (2) performance studies of the efficiency of programs written using the CAF and UPC programming models; (3) a novel technique for analyzing explicitly-parallel SPMD programs that facilitates optimization; (4) the design, implementation, and evaluation of new language features for CAF, including communication topologies, multi-version variables, and distributed multithreading, to simplify development of high-performance codes in CAF; and (5) a synchronization strength reduction transformation that automatically replaces barrier-based synchronization with more efficient point-to-point synchronization. The prototype Co-array Fortran compiler cafc developed in this project is available as open source software from http://www.hipersoft.rice.edu/caf.
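
The cafc transformation itself operates on CAF source and is not shown here. The following schematic C/MPI sketch only conveys the idea behind synchronization strength reduction: in a halo exchange, the matched pairwise operations already order the communicating ranks, so the global barrier a naive version would insert can be dropped. The function and variable names are invented for the example.

    /* Schematic illustration (not the cafc implementation) of
       synchronization strength reduction: pairwise MPI_Sendrecv
       calls replace a full barrier, synchronizing only the
       ranks that actually communicate. */
    #include <mpi.h>

    void halo_exchange(double *left_ghost, double *right_ghost,
                       double left_edge, double right_edge,
                       MPI_Comm comm)
    {
        int rank, nprocs;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &nprocs);
        int left  = (rank - 1 + nprocs) % nprocs;
        int right = (rank + 1) % nprocs;

        /* A barrier-based version would call MPI_Barrier(comm)
           to order the phases globally; after strength
           reduction, the matched send/receive pairs below
           provide exactly the ordering that is needed. */
        MPI_Sendrecv(&right_edge, 1, MPI_DOUBLE, right, 0,
                     left_ghost,  1, MPI_DOUBLE, left,  0,
                     comm, MPI_STATUS_IGNORE);
        MPI_Sendrecv(&left_edge,  1, MPI_DOUBLE, left,  1,
                     right_ghost, 1, MPI_DOUBLE, right, 1,
                     comm, MPI_STATUS_IGNORE);
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        double lg = 0.0, rg = 0.0;
        halo_exchange(&lg, &rg, 1.0, 2.0, MPI_COMM_WORLD);
        MPI_Finalize();
        return 0;
    }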




Introduction to Parallel Computing


Book Description

A comprehensive source of information on almost all aspects of parallel computing, from an introduction to architectures, programming paradigms, algorithms, and programming standards. It covers traditional computer science algorithms, scientific computing algorithms, and data-intensive algorithms.




Programming Massively Parallel Processors


Book Description

Programming Massively Parallel Processors: A Hands-on Approach, Second Edition, teaches students how to program massively parallel processors. It offers a detailed discussion of various techniques for constructing parallel programs. Case studies are used to demonstrate the development process, which begins with computational thinking and ends with effective and efficient parallel programs. This guide shows students and professionals alike the basic concepts of parallel programming and GPU architecture. Topics of performance, floating-point format, parallel patterns, and dynamic parallelism are covered in depth. This revised edition contains more parallel programming examples, commonly used libraries such as Thrust, and explanations of the latest tools. It also provides new coverage of CUDA 5.0, improved performance, enhanced development tools, and increased hardware support; expanded coverage of related technology, including OpenCL, with new material on algorithm patterns, GPU clusters, host programming, and data parallelism; and two new case studies (on MRI reconstruction and molecular visualization) that explore the latest applications of CUDA and GPUs for scientific research and high-performance computing. This book should be a valuable resource for advanced students, software engineers, programmers, and hardware engineers.




Final Report


Book Description

Most large parallel computers now built use a hybrid architecture called a shared memory cluster. In this design, a computer consists of several nodes connected by an interconnection network. Each node contains a pool of memory and multiple processors that share direct access to it. Because shared memory clusters combine architectural features of shared memory computers and distributed memory computers, they support several different styles of parallel programming or programming models. (Further information on the design of these systems and their programming models appears in Section 2.) The purpose of this project was to investigate the programming models available on these systems and to answer three questions: (1) How easy to use are the different programming models in real applications? (2) How do the hardware and system software on different computers affect the performance of these programming models? (3) What are the performance characteristics of different programming models for typical LLNL applications on various shared memory clusters?
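
The LLNL applications studied in the report are not reproduced here. As a hedged sketch of the hybrid style that shared memory clusters support, the following C program combines MPI between processes with OpenMP threads sharing memory inside each process; the summed series is an arbitrary stand-in workload.

    /* Sketch of the hybrid programming model on a shared memory
       cluster: MPI between nodes, OpenMP threads within a node.
       Illustrative only. Compile with, e.g., mpicc -fopenmp. */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        /* Request thread support so OpenMP threads may coexist
           with MPI inside each process. */
        int provided;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED,
                        &provided);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Within the node: threads share the local partial sum. */
        double local = 0.0;
        #pragma omp parallel for reduction(+:local)
        for (int i = 0; i < 1000000; i++)
            local += 1.0 / (1.0 + (double)i);

        /* Across nodes: message passing combines the
           per-process results. */
        double global = 0.0;
        MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0,
                   MPI_COMM_WORLD);
        if (rank == 0)
            printf("global sum = %f\n", global);

        MPI_Finalize();
        return 0;
    }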




High-Level Parallel Programming Models and Supportive Environments


Book Description

On the 23rd of April, 2001, the 6th Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS 2001) was held in San Francisco. HIPS has been held over the past six years in conjunction with IPDPS, the International Parallel and Distributed Processing Symposium. The HIPS workshop focuses on high-level programming of networks of workstations, computing clusters, and massively parallel machines. Its goal is to bring together researchers working in the areas of applications, language design, compilers, system architecture, and programming tools to discuss new developments in programming such systems.

In recent years, several standards have emerged in response to an increasing demand for support for parallel and distributed processing. On one side, message-passing frameworks such as PVM, MPI, and VIA provide support for basic communication. On the other, distributed object standards such as CORBA and DCOM provide support for handling remote objects in a client-server fashion while also ensuring certain guarantees for the quality of services.

The key issues for the success of programming parallel and distributed environments are high-level programming concepts and efficiency. In addition, other quality categories have to be taken into account, such as scalability, security, bandwidth guarantees, and fault tolerance, to name a few. Today's challenge is to provide high-level programming concepts without sacrificing efficiency. This is only possible by carefully designing for those concepts and by providing supportive programming environments that facilitate program development and tuning.