Data Warehouses and OLAP


Book Description

Data warehouses and online analytical processing (OLAP) are emerging key technologies for enterprise decision support systems. They provide sophisticated technologies ranging from data integration, data collection and retrieval, query optimization, and data analysis to advanced user interfaces. New research and technological achievements in data warehousing are being implemented in commercial database management systems, and organizations are building data warehouse systems into their information system infrastructures. Data Warehouses and OLAP: Concepts, Architectures and Solutions covers a wide range of technical, technological, and research issues. It provides theoretical frameworks, presents challenges and their possible solutions, and examines the latest empirical research findings in the area. It is a resource of possible solutions and technologies that can be applied when designing, implementing, and deploying a data warehouse, and it assists in the dissemination of knowledge in this field.




Oracle 10g Data Warehousing


Book Description

Oracle 10g Data Warehousing is a guide to using the Data Warehouse features in the latest version of Oracle, Oracle Database 10g. Written by people on the Oracle development team that designed and implemented the code and by people with industry experience implementing warehouses using Oracle technology, this thoroughly updated and extended edition provides an insider's view of how the Oracle Database 10g software is best used for your application. It provides a detailed look at the new features of Oracle Database 10g and other Oracle products and how these are used in the data warehouse. This book will show you how to deploy the Oracle database and correctly use the new Oracle Database 10g features for your data warehouse. It contains walkthroughs and examples on how to use tools such as Oracle Discoverer and Reports to query the warehouse and generate reports that can be deployed over the web and give you better insight into your business. This how-to guide provides step-by-step instructions, including screen captures, to make it easier to design, build, and optimize the performance of the data warehouse or data mart.
It is a 'must have' reference for database developers, administrators and IT professionals who want to get to work now with all of the newest features of Oracle Database 10g. Topics include:
- How to use the Summary Management features, including Materialized Views and query rewrite, to radically improve query performance
- How to deploy business intelligence to the Web to satisfy today's changing and demanding business requirements
- Using Oracle OLAP and Data Mining options
- How to understand the warehouse hardware environment and how it is used by new features in the database, including how to implement a high availability warehouse environment
- Using the new management infrastructure in Oracle Database 10g and how this helps you manage your warehouse environment




Exploratory Data Analytics for Healthcare


Book Description

Exploratory data analysis helps to recognize natural patterns hidden in the data. This book describes tools for hypothesis generation through graphical visualization of data and provides accessible insight into advanced analytics concepts. The book addresses the complete workflow of data visualization technologies, explores basic and high-level concepts of computer science and engineering in medical science, and provides an overview of the clinical scientific research areas that enable smart diagnosis equipment. It discusses techniques and tools used to explore large volumes of medical data and offers case studies that focus on innovative technological upgrades and the challenges faced today. The primary audience for the book includes specialists, researchers, graduates, designers, experts, physicians, and engineers who are doing research in this domain.




High-Performance Big Data Computing


Book Description

An in-depth overview of an emerging field that brings together high-performance computing, big data processing, and deep learning. Over the last decade, the exponential explosion of data known as big data has changed the way we understand and harness the power of data. The emerging field of high-performance big data computing, which brings together high-performance computing (HPC), big data processing, and deep learning, aims to meet the challenges posed by large-scale data processing. This book offers an in-depth overview of high-performance big data computing and the associated technical issues, approaches, and solutions. The book covers basic concepts and necessary background knowledge, including data processing frameworks, storage systems, and hardware capabilities; offers a detailed discussion of technical issues in accelerating big data computing in terms of computation, communication, memory and storage, codesign, workload characterization and benchmarking, and system deployment and management; and surveys benchmarks and workloads for evaluating big data middleware systems. It presents a detailed discussion of big data computing systems and applications with high-performance networking, computing, and storage technologies, including state-of-the-art designs for data processing and storage systems. Finally, the book considers some advanced research topics in high-performance big data computing, including designing high-performance deep learning over big data (DLoBD) stacks and HPC cloud technologies.




Database Internals


Book Description

When it comes to choosing, using, and maintaining a database, understanding its internals is essential. But with so many distributed databases and tools available today, it’s often difficult to understand what each one offers and how they differ. With this practical guide, Alex Petrov guides developers through the concepts behind modern database and storage engine internals. Throughout the book, you’ll explore relevant material gleaned from numerous books, papers, blog posts, and the source code of several open source databases. These resources are listed at the end of parts one and two. You’ll discover that the most significant distinctions among many modern databases reside in subsystems that determine how storage is organized and how data is distributed. This book examines:
- Storage engines: Explore storage classification and taxonomy, and dive into B-Tree-based and immutable Log Structured storage engines, with differences and use-cases for each
- Storage building blocks: Learn how database files are organized to build efficient storage, using auxiliary data structures such as Page Cache, Buffer Pool and Write-Ahead Log
- Distributed systems: Learn step-by-step how nodes and processes connect and build complex communication patterns
- Database clusters: Which consistency models are commonly used by modern databases and how distributed storage systems achieve consistency




Database and Expert Systems Applications


Book Description

This book constitutes the refereed proceedings of the 17th International Conference on Database and Expert Systems Applications, DEXA 2006. The book presents 90 revised full papers together with 1 invited paper. The papers are organized in topical sections on XML, data and information, data mining and data warehouses, database applications, WWW, bioinformatics, process automation and workflow, knowledge management and expert systems, database theory, query processing, and privacy and security.




Encyclopedia of Data Warehousing and Mining


Book Description

Data Warehousing and Mining (DWM) is the science of managing and analyzing large datasets and discovering novel patterns; in recent years it has emerged as a particularly exciting and industrially relevant area of research. Prodigious amounts of data are now being generated in domains as diverse as market research, functional genomics and pharmaceuticals; intelligently analyzing these data, with the aim of answering crucial questions and helping make informed decisions, is the challenge that lies ahead. The Encyclopedia of Data Warehousing and Mining provides a comprehensive, critical and descriptive examination of concepts, issues, trends, and challenges in this rapidly expanding field. The encyclopedia brings together more than 350 contributors from 32 countries and includes 1,800 terms and definitions and more than 4,400 references. This authoritative publication offers in-depth coverage of evolutions, theories, methodologies, functionalities, and applications of DWM in such interdisciplinary fields as healthcare informatics, artificial intelligence, financial modeling, and applied statistics, making it a single source of knowledge and the latest discoveries in DWM.




Proceedings of the Fall 2010 Future SOC Lab Day


Book Description

In cooperation with industry partners, the Hasso-Plattner-Institut (HPI) is establishing the "HPI Future SOC Lab", which provides a complete infrastructure of highly complex on-demand systems running on the latest massively parallel (multi-/many-core) hardware, not yet available on the market, with enormous main memory capacities, together with software designed for it. The HPI Future SOC Lab has prototype 4- and 8-way Intel 64-bit server systems from Fujitsu and Hewlett-Packard with 32 and 64 cores, respectively, and 1 to 2 TB of main memory. High-performance storage systems from EMC2 and virtualization solutions from VMware are also in use. SAP provides its latest Business by Design (ByD) software, and complex real-world enterprise data is also available and can be accessed for research purposes. Interested researchers from university and non-university research institutions can use the HPI Future SOC Lab to study future highly complex IT systems, develop new ideas, data structures, and algorithms, and pursue them through to practical testing. This technical report presents the first results of the research projects launched in connection with the opening of the Future SOC Lab in June 2010. Selected projects presented their results on 27 October 2010 at the Future SOC Lab Day event.