Very Large Data Bases


Proceedings of the 7th International Conference on Emerging Databases


Book Description

This proceedings volume presents selected papers from the 7th International Conference on Emerging Databases: Technologies, Applications, and Theory (EDB 2017), which was held in Busan, Korea from 7 to 9 August, 2017. This conference series was launched by the Korean Institute of Information Scientists and Engineers (KIISE) Database Society of Korea as an annual forum for exploring novel technologies, applications, and research advances in the field of emerging databases. This forum has evolved into the premier international venue for researchers and practitioners to discuss current research issues, challenges, new technologies, and solutions.




Mining of Data with Complex Structures


Book Description

Mining of Data with Complex Structures:

- Clarifies the type and nature of data with complex structure, including sequences, trees and graphs.
- Provides a detailed background of the state of the art in sequence mining, tree mining and graph mining.
- Defines the essential aspects of the tree mining problem: subtree types, support definitions, constraints.
- Outlines the implementation issues one needs to consider when developing tree mining algorithms (enumeration strategies, data structures, etc.).
- Details the Tree Model Guided (TMG) approach for tree mining and provides the mathematical model for the worst-case estimate of the complexity of mining ordered induced and embedded subtrees.
- Explains the mechanism of the TMG framework for mining ordered/unordered induced/embedded and distance-constrained embedded subtrees.
- Provides a detailed comparison of the different tree mining approaches, highlighting the characteristics and benefits of each approach.
- Overviews the implications and potential applications of tree mining in general knowledge management related tasks, and uses Web, health and bioinformatics related applications as case studies.
- Details the extension of the TMG framework for sequence mining.
- Provides an overview of future research directions with respect to technical extensions and application areas.

The primary audience is 3rd- and 4th-year undergraduate students, Masters and PhD students, and academics. The book can be used for both teaching and research. The secondary audience is practitioners in industry, business, commerce, government, and consortiums, alliances and partnerships, who can learn how to introduce techniques for mining data with complex structures into their applications and use them efficiently. The scope of the book is both theoretical and practical, and as such it will reach a broad market both within academia and industry. In addition, its subject matter is a rapidly emerging field that is critical for efficient analysis of knowledge stored in various domains.




Machine Learning for Data Streams


Book Description

A hands-on approach to tasks and techniques in data stream mining and real-time analytics, with examples in MOA, a popular freely available open-source software framework. Today many information sources—including sensor networks, financial markets, social networks, and healthcare monitoring—are so-called data streams, arriving sequentially and at high speed. Analysis must take place in real time, with partial data and without the capacity to store the entire data set. This book presents algorithms and techniques used in data stream mining and real-time analytics. Taking a hands-on approach, the book demonstrates the techniques using MOA (Massive Online Analysis), a popular, freely available open-source software framework, allowing readers to try out the techniques after reading the explanations. The book first offers a brief introduction to the topic, covering big data mining, basic methodologies for mining data streams, and a simple example of MOA. More detailed discussions follow, with chapters on sketching techniques, change, classification, ensemble methods, regression, clustering, and frequent pattern mining. Most of these chapters include exercises, an MOA-based lab session, or both. Finally, the book discusses the MOA software, covering the MOA graphical user interface, the command line, use of its API, and the development of new methods within MOA. The book will be an essential reference for readers who want to use data stream mining as a tool, researchers in innovation or data stream mining, and programmers who want to create new algorithms for MOA.




Data Warehousing and Knowledge Discovery


Book Description

The Second International Conference on Data Warehousing and Knowledge Discovery (DaWaK 2000) was held in Greenwich, UK, on 4–6 September 2000. DaWaK 2000 was a forum where researchers from data warehousing and knowledge discovery disciplines could exchange ideas on improving next generation decision support and data mining systems. The conference focused on the logical and physical design of data warehousing and knowledge discovery systems. The scope of the papers covered the most recent and relevant topics in the areas of data warehousing, multidimensional databases, OLAP, knowledge discovery and mining complex databases. These proceedings contain the technical papers selected for presentation at the conference. We received more than 90 papers from over 20 countries, and the program committee finally selected 31 long papers and 11 short papers. The conference program included three invited talks, namely, “A Foolish Consistency: Technical Challenges in Consistency Management” by Professor Anthony Finkelstein, University College London, UK; “European Plan for Research in Data Warehousing and Knowledge Discovery” by Dr. Harald Sonnberger (Head of Unit A4, Eurostat, European Commission); and “Security in Data Warehousing” by Professor Bharat Bhargava, Purdue University, USA.




11th International Symposium on High Performance Distributed Computing


Book Description

Forty-two full papers from the July 2002 conference in Edinburgh discuss data servers and grid storage, adapting to grid behavior, grid resource management, applications frameworks, parallel application analysis optimizing grid performance, grid practice and experience, communication and RPC protocols, grid job submission and scheduling, and adapti







Adaptive Query Processing


Book Description

Adaptive Query Processing surveys the fundamental issues, techniques, costs, and benefits of adaptive query processing. It begins with a broad overview of the field, identifying the dimensions of adaptive techniques. It then looks at the spectrum of approaches available to adapt query execution at runtime - primarily in a non-streaming context. The emphasis is on simplifying and abstracting the key concepts of each technique, rather than reproducing the full details available in the papers. The authors identify the strengths and limitations of the different techniques, demonstrate when they are most useful, and suggest possible avenues of future research. Adaptive Query Processing serves as a valuable reference for students of databases, providing a thorough survey of the area. Database researchers will benefit from a more complete point of view, including a number of approaches which they may not have focused on within the scope of their own research.




Databases Theory and Applications


Book Description

This book constitutes the refereed proceedings of the 25th Australasian Database Conference, ADC 2014, held in Brisbane, QLD, Australia, in July 2014. The 15 full papers presented together with 6 short papers and 2 keynotes were carefully reviewed and selected from 38 submissions. A large variety of subjects are covered, including hot topics such as data warehousing; database integration; mobile databases; cloud, distributed, and parallel databases; high dimensional and temporal data; image/video retrieval and databases; database performance and tuning; privacy and security in databases; query processing and optimization; semi-structured data and XML; spatial data processing and management; stream and sensor data management; uncertain and probabilistic databases; web databases; graph databases; web service management; and social media data management.