Main Memory Database Systems


Book Description

With growing memory sizes and memory prices dropping by a factor of 10 every 5 years, data having a "primary home" in memory is now a reality. Main-memory databases eschew many of the traditional architectural pillars of relational database systems that were optimized for disk-resident data. These memory-optimized designs yield systems that take innovative approaches to fundamental issues (e.g., concurrency control, query processing) and achieve orders-of-magnitude performance improvements over traditional designs. This monograph provides an overview of recent developments in main-memory database systems. It covers five main issues and architectural choices that need to be made when building a high-performance, main-memory-optimized database: data organization and storage, indexing, concurrency control, durability and recovery techniques, and query processing and compilation. The monograph focuses on four commercial and research systems: H-Store/VoltDB, Hekaton, HyPer, and SAP HANA. These systems are diverse in their design choices and form a representative sample of the state of the art in main-memory database systems. It also covers other commercial and academic systems, along with current and future research trends.




A Course in In-Memory Data Management


Book Description

Recent achievements in hardware and software development, such as multi-core CPUs and DRAM capacities of multiple terabytes per server, have enabled the introduction of a revolutionary technology: in-memory data management. This technology supports the flexible and extremely fast analysis of massive amounts of enterprise data. Professor Hasso Plattner and his research group at the Hasso Plattner Institute in Potsdam, Germany, have been investigating and teaching the corresponding concepts and their adoption in the software industry for years. This book is based on an online course that was first launched in autumn 2012 with more than 13,000 enrolled students and marked the successful starting point of the openHPI e-learning platform. The course is mainly designed for students of computer science, software engineering, and IT-related subjects, but it also addresses business experts, software developers, technology experts, and IT analysts. Plattner and his group focus on exploring the inner mechanics of a column-oriented, dictionary-encoded in-memory database. Topics covered include, among others, physical data storage and access, basic database operators, compression mechanisms, and parallel join algorithms. Beyond that, implications for future enterprise applications and their development are discussed. Step by step, readers will understand the radical differences and advantages of the new technology over traditional row-oriented, disk-based databases. In this completely revised 2nd edition, we incorporate the feedback of thousands of course participants on openHPI and take into account the latest advancements in hardware and software. Improved figures, explanations, and examples further ease the understanding of the concepts presented. We introduce advanced data management techniques such as transparent aggregate caches and provide new showcases that demonstrate the potential of in-memory databases for two diverse industries: retail and life sciences.




Main Memory Database Systems


Book Description

This article provides an overview of recent developments in main-memory database systems. With growing memory sizes and memory prices dropping by a factor of 10 every 5 years, data having a "primary home" in memory is now a reality. Main-memory databases eschew many of the traditional architectural pillars of relational database systems that were optimized for disk-resident data. These memory-optimized designs yield systems that take innovative approaches to fundamental issues (e.g., concurrency control, query processing) and achieve orders-of-magnitude performance improvements over traditional designs. Our survey covers five main issues and architectural choices that need to be made when building a high-performance, main-memory-optimized database: data organization and storage, indexing, concurrency control, durability and recovery techniques, and query processing and compilation. We focus our survey on four commercial and research systems: H-Store/VoltDB, Hekaton, HyPer, and SAP HANA. These systems are diverse in their design choices and form a representative sample of the state of the art in main-memory database systems. We also cover other commercial and academic systems, along with current and future research trends.




Main Memory Databases


Book Description

"The declining cost of main memory coupled with higher performance expectations has inspired researchers and developers to look into systems with massive amounts of memory, especially for systems with databases. When memory is large enough, an entire database can permanently reside in main memory. These database systems are known as memory-resident database or Main Memory DataBase (MMDB) systems. When a database is main-memory resident, systems can achieve significant performance improvements over traditional disk-based database systems by eliminating the overhead required to perform Input/Output operations. However, the problem of crash recovery becomes by far the most challenging issue in MMDB systems, since crashes can occur frequently and cause a drastic crisis by wiping out the entire content of the database. The kind of crisis that can be encountered due to memory failures, the way to prevent them from occurring in the first place, methods of dealing with them upon their occurrence, strategies to make the system more reliable to avoid them in the future, and several other issues like these must be well understood. Approaches to deal with crash recoveries must be sound, well designed, and correctly implemented. The objective of this thesis is to study some of the pioneering works of researchers and developers in the area of MMDB recovery and to perform an in-depth analysis of MMDB recovery issues and approaches taken by several major recovery techniques. The analysis examines the reasons for these techniques, their design and architecture, and their benefits and limitations relative to one another. My independent evaluation of the relative benefits and limitations of each technique coupled with marketability analysis is also presented. The approach used in this study is different from that of other studies that have been published so far in this area. 
In this study, the reader is given a basic introduction to the topic and then gradually taken through the final stages of analysis and comparison of several recovery techniques."--Author's abstract.




Main Memory Management on Relational Database Systems


Book Description

This book provides basic knowledge about main memory management in relational databases as it is needed to support large-scale applications processed completely in memory. In business operations, real-time predictability and high speed are a must. Hence every opportunity must be exploited to improve performance, including reducing dependency on the hard disk, adding more memory to make more data resident in memory, and even deploying an in-memory system where all data can be kept in memory. The book provides one chapter for each of the main related topics, i.e., the memory system, memory management, virtual memory, and databases and their memory systems, and it is complemented by a short survey of six commercial systems: TimesTen, MySQL, VoltDB, Hekaton, HyPer/ScyPer, and SAP HANA.




Implementation techniques for main memory database systems


Book Description

With the availability of very large, relatively inexpensive main memories, it is becoming possible to keep large databases resident in main memory. This paper considers the changes necessary to permit a relational database system to take advantage of large amounts of main memory. The authors evaluate AVL vs. B+-tree access methods for main memory databases, hash-based query processing strategies vs. sort-merge, and study recovery issues when most or all of the database fits in main memory. As expected, B+-trees are the preferred storage mechanism unless more than 80-90% of the database fits in main memory. A somewhat surprising result is that hash-based query processing strategies are advantageous for large memory situations. Keywords: Access; Algorithms. (Author).




In Memory Data Management and Analysis


Book Description

This book constitutes the thoroughly refereed post-conference proceedings of the First and Second International Workshops on In Memory Data Management and Analysis, held in Riva del Garda, Italy, in August 2013 and in Hangzhou, China, in September 2014. The 11 revised full papers were carefully reviewed and selected from 18 submissions and cover topics from main-memory graph analytics platforms to main-memory OLTP applications.




In-Memory Data Management


Book Description

This book examines, for the first time, how in-memory computing is changing the way businesses are run. The authors describe techniques that allow analytical and transactional processing at the speed of thought and enable new ways of doing business.







Database Internals


Book Description

When it comes to choosing, using, and maintaining a database, understanding its internals is essential. But with so many distributed databases and tools available today, it’s often difficult to understand what each one offers and how they differ. With this practical guide, Alex Petrov guides developers through the concepts behind modern database and storage engine internals. Throughout the book, you’ll explore relevant material gleaned from numerous books, papers, blog posts, and the source code of several open source databases. These resources are listed at the end of parts one and two. You’ll discover that the most significant distinctions among many modern databases reside in subsystems that determine how storage is organized and how data is distributed. This book examines:
Storage engines: Explore storage classification and taxonomy, and dive into B-Tree-based and immutable Log Structured storage engines, with differences and use-cases for each
Storage building blocks: Learn how database files are organized to build efficient storage, using auxiliary data structures such as Page Cache, Buffer Pool and Write-Ahead Log
Distributed systems: Learn step-by-step how nodes and processes connect and build complex communication patterns
Database clusters: Which consistency models are commonly used by modern databases and how distributed storage systems achieve consistency