Databases on Modern Hardware


Book Description

Data management systems enable various influential applications from high-performance online services (e.g., social networks like Twitter and Facebook or financial markets) to big data analytics (e.g., scientific exploration, sensor networks, business intelligence). As a result, data management systems have been one of the main drivers for innovations in the database and computer architecture communities for several decades. Recent hardware trends require software to take advantage of the abundant parallelism existing in modern and future hardware. The traditional design of the data management systems, however, faces inherent scalability problems due to its tightly coupled components. In addition, it cannot exploit the full capability of the aggressive micro-architectural features of modern processors. As a result, today's most commonly used server types remain largely underutilized leading to a huge waste of hardware resources and energy. In this book, we shed light on the challenges present while running DBMS on modern multicore hardware. We divide the material into two dimensions of scalability: implicit/vertical and explicit/horizontal. The first part of the book focuses on the vertical dimension: it describes the instruction- and data-level parallelism opportunities in a core coming from the hardware and software side. In addition, it examines the sources of under-utilization in a modern processor and presents insights and hardware/software techniques to better exploit the microarchitectural resources of a processor by improving cache locality at the right level of the memory hierarchy. The second part focuses on the horizontal dimension, i.e., scalability bottlenecks of database applications at the level of multicore and multisocket multicore architectures. It first presents a systematic way of eliminating such bottlenecks in online transaction processing workloads, which is based on minimizing unbounded communication, and shows several techniques that minimize bottlenecks in major components of database management systems. Then, it demonstrates the data and work sharing opportunities for analytical workloads, and reviews advanced scheduling mechanisms that are aware of nonuniform memory accesses and alleviate bandwidth saturation.




Databases on Modern Hardware


Book Description

Data management systems enable various influential applications from high-performance online services (e.g., social networks like Twitter and Facebook or financial markets) to big data analytics (e.g., scientific exploration, sensor networks, business intelligence). As a result, data management systems have been one of the main drivers for innovations in the database and computer architecture communities for several decades. Recent hardware trends require software to take advantage of the abundant parallelism existing in modern and future hardware. The traditional design of the data management systems, however, faces inherent scalability problems due to its tightly coupled components. In addition, it cannot exploit the full capability of the aggressive micro-architectural features of modern processors. As a result, today's most commonly used server types remain largely underutilized leading to a huge waste of hardware resources and energy. In this book, we shed light on the challenges present while running DBMS on modern multicore hardware. We divide the material into two dimensions of scalability: implicit/vertical and explicit/horizontal. The first part of the book focuses on the vertical dimension: it describes the instruction- and data-level parallelism opportunities in a core coming from the hardware and software side. In addition, it examines the sources of under-utilization in a modern processor and presents insights and hardware/software techniques to better exploit the microarchitectural resources of a processor by improving cache locality at the right level of the memory hierarchy. The second part focuses on the horizontal dimension, i.e., scalability bottlenecks of database applications at the level of multicore and multisocket multicore architectures. It first presents a systematic way of eliminating such bottlenecks in online transaction processing workloads, which is based on minimizing unbounded communication, and shows several techniques that minimize bottlenecks in major components of database management systems. Then, it demonstrates the data and work sharing opportunities for analytical workloads, and reviews advanced scheduling mechanisms that are aware of nonuniform memory accesses and alleviate bandwidth saturation.




Transaction Processing on Modern Hardware


Book Description

The last decade has brought groundbreaking developments in transaction processing. This resurgence of an otherwise mature research area has spurred from the diminishing cost per GB of DRAM that allows many transaction processing workloads to be entirely memory-resident. This shift demanded a pause to fundamentally rethink the architecture of database systems. The data storage lexicon has now expanded beyond spinning disks and RAID levels to include the cache hierarchy, memory consistency models, cache coherence and write invalidation costs, NUMA regions, and coherence domains. New memory technologies promise fast non-volatile storage and expose unchartered trade-offs for transactional durability, such as exploiting byte-addressable hot and cold storage through persistent programming that promotes simpler recovery protocols. In the meantime, the plateauing single-threaded processor performance has brought massive concurrency within a single node, first in the form of multi-core, and now with many-core and heterogeneous processors. The exciting possibility to reshape the storage, transaction, logging, and recovery layers of next-generation systems on emerging hardware have prompted the database research community to vigorously debate the trade-offs between specialized kernels that narrowly focus on transaction processing performance vs. designs that permit transactionally consistent data accesses from decision support and analytical workloads. In this book, we aim to classify and distill the new body of work on transaction processing that has surfaced in the last decade to navigate researchers and practitioners through this intricate research subject.




Data Processing Techniques on Modern Hardware Architectures


Book Description

The last decade has been characterized by radical changes in the computing landscape. We have witnessed the advent of multi-core processors, flash-based storage systems and the proliferation of scale out architectures, such as map-reduce-based systems and massively parallel databases. Although data management systems have embraced modern hardware technologies to some extent, they have not realized their full potential.The goal of this thesis is two-fold. Primarily, it demonstrates the staggering potential for performance improvement offered by modern hardware architectures and, then, proposes how data management systems must alter in order to realize this potential. Additionally, this thesis demonstrates that fully utilizing modern hardware architectures is important both for performance and energy-efficiency. Towards this goal, we propose query processing and indexing techniques for chip multiprocessors and we analyze the trade-offs of executing complex database queries on modern processor technologies. Subsequently, we propose query processing methods tailored to flash-based storage systems. Finally, we analyze the power consumption of database systems and we reveal opportunities for improving their energy efficiency.




Data Management on New Hardware


Book Description

This book contains selected papers from the 7th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, ADMS 2016, and the 4th International Workshop on In-Memory Data Management and Analytics, IMDM 2016, held in New Dehli, India, in September 2016. The joint Workshops were co-located with VLDB 2016. The 9 papers presented were carefully reviewed and selected from 18 submissions. They investigate opportunities in accelerating analytics/data management systems and workloads (including traditional OLTP, data warehousing/OLAP, ETL streaming/real-time, business analytics, and XML/RDF processing) running memory-only environments, using processors (e.g. commodity and specialized multi-core, GPUs and FPGAs, storage systems (e.g. storage-class memories like SSDs and phase-change memory), and hybrid programming models like CUDA, OpenCL, and Open ACC. The papers also explore the interplay between overall system design, core algorithms, query optimization strategies, programming approaches, performance modeling and evaluation, from the perspective of data management applications.




Data Processing in the Modern Hardware Landscape


Book Description

Enterprises are increasingly relying on data-driven decisions in nearly every aspect of their business functions, and databases are at the center of the computing infrastructure that enables this capability. Naturally, the demand for faster analytics continues to grow unabated. Database engineers and researchers rely on effectively harnessing the power of hardware to achieve high data processing performance. In recent years, the hardware landscape has been changing drastically as a result of two key technological trends. Adapting data processing algorithms and systems to the new hardware reality is crucial in order to meet the processing needs of the future. The first key technological trend is termed the ``Memory Wall'', a name that captures that processors are being increasingly bottlenecked by memory performance in traditional hardware architectures. We propose new algorithms for the core data operation of searching sorted data and the class of top-K queries with large outputs. The proposed algorithms are designed around modern processing needs and the modern hardware performance characteristics as they are shaped by the "Memory Wall''. The second key technological trend we study is the result of Dennard's scaling breaking down. We will soon reach the limit of being able to improve the performance of processors by simply scaling down the manufacturing scale. Novel architectures and emerging hardware technologies are being studied as solutions to keep improving performance. In the last chapter of this thesis, we take a codesign approach to develop a database software system and an associative processor for analytic workloads. Associative processors have re-emerged as an attractive architecture that offers very large data-level parallelism. We show that following a codesign approach associative processors can be used to accelerate analytic query processing.




Transaction Processing on Modern Hardware


Book Description

The last decade has brought groundbreaking developments in transaction processing. This resurgence of an otherwise mature research area has spurred from the diminishing cost per GB of DRAM that allows many transaction processing workloads to be entirely memory-resident. This shift demanded a pause to fundamentally rethink the architecture of database systems. The data storage lexicon has now expanded beyond spinning disks and RAID levels to include the cache hierarchy, memory consistency models, cache coherence and write invalidation costs, NUMA regions, and coherence domains. New memory technologies promise fast non-volatile storage and expose unchartered trade-offs for transactional durability, such as exploiting byte-addressable hot and cold storage through persistent programming that promotes simpler recovery protocols. In the meantime, the plateauing single-threaded processor performance has brought massive concurrency within a single node, first in the form of multi-core, and now with many-core and heterogeneous processors. The exciting possibility to reshape the storage, transaction, logging, and recovery layers of next-generation systems on emerging hardware have prompted the database research community to vigorously debate the trade-offs between specialized kernels that narrowly focus on transaction processing performance vs. designs that permit transactionally consistent data accesses from decision support and analytical workloads. In this book, we aim to classify and distill the new body of work on transaction processing that has surfaced in the last decade to navigate researchers and practitioners through this intricate research subject.




Database Machines


Book Description

This volume consists of a collection of 28 papers presented at the NATO Advanced Study Institute held July 14-27, 1985 in the beautiful resort at Les Arcs, France. The director of this ASI was A. K. Sood and A. H. Qureshi was the co-director. Since its introduction in the early 1970s the relational data model has been widely accepted. Several research and industrial efforts are being undertaken to develop special purpose database machines to implement the relational model. In addition, database machines are being explored for applications such as image processing and information retrieval. In this NATO-ASI the lecturers discussed special purpose database machine architectures from the viewpoint of architecture and hardware detail, software, user needs, theoretical framework and applications. The papers presented were of two types - regular papers and short papers. The research in database machines is being conducted in several countries. This fact is under scored when it is noted that papers in this volume are authored by researchers in France, Germany, Italy, Japan, Portugal, Turkey, U.K. and U.S.A. The first paper discusses the experience and applications of users with a commercially available database machine. In the following eight papers the characteristics of six database machines are discussed. The second, third and fourth papers deal with the RDBM project at the Technical University of Braunschweig (Germany). Zeidler discusses the design objectives, architecture and system design of RDBM. Teich presents the hardware utilized for sorting.




Information Systems for Business and Beyond


Book Description

"Information Systems for Business and Beyond introduces the concept of information systems, their use in business, and the larger impact they are having on our world."--BC Campus website.




The Design and Implementation of Modern Column-Oriented Database Systems


Book Description

The Design and Implementation of Modern Column-Oriented Database Systems discusses modern column-stores, their architecture and evolution as well the benefits they can bring in data analytics.