Mining Sequential Patterns from Large Data Sets


Book Description

In many applications, e.g., bioinformatics, web access traces, system u- lization logs, etc., the data is naturally in the form of sequences. It has been of great interests to analyze the sequential data to find their inherent char- teristics. The sequential pattern is one of the most widely studied models to capture such characteristics. Examples of sequential patterns include but are not limited to protein sequence motifs and web page navigation traces. In this book, we focus on sequential pattern mining. To meet different needs of various applications, several models of sequential patterns have been proposed. We do not only study the mathematical definitions and application domains of these models, but also the algorithms on how to effectively and efficiently find these patterns. The objective of this book is to provide computer scientists and domain - perts such as life scientists with a set of tools in analyzing and understanding the nature of various sequences by : (1) identifying the specific model(s) of - quential patterns that are most suitable, and (2) providing an efficient algorithm for mining these patterns. Chapter 1 INTRODUCTION Data Mining is the process of extracting implicit knowledge and discovery of interesting characteristics and patterns that are not explicitly represented in the databases. The techniques can play an important role in understanding data and in capturing intrinsic relationships among data instances. Data mining has been an active research area in the past decade and has been proved to be very useful.




Proceedings of the Third SIAM International Conference on Data Mining


Book Description

The third SIAM International Conference on Data Mining provided an open forum for the presentation, discussion and development of innovative algorithms, software and theories for data mining applications and data intensive computation. This volume includes 21 research papers.




Mining Sequential Patterns from Large Data Sets


Book Description

In many applications, e.g., bioinformatics, web access traces, system u- lization logs, etc., the data is naturally in the form of sequences. It has been of great interests to analyze the sequential data to find their inherent char- teristics. The sequential pattern is one of the most widely studied models to capture such characteristics. Examples of sequential patterns include but are not limited to protein sequence motifs and web page navigation traces. In this book, we focus on sequential pattern mining. To meet different needs of various applications, several models of sequential patterns have been proposed. We do not only study the mathematical definitions and application domains of these models, but also the algorithms on how to effectively and efficiently find these patterns. The objective of this book is to provide computer scientists and domain - perts such as life scientists with a set of tools in analyzing and understanding the nature of various sequences by : (1) identifying the specific model(s) of - quential patterns that are most suitable, and (2) providing an efficient algorithm for mining these patterns. Chapter 1 INTRODUCTION Data Mining is the process of extracting implicit knowledge and discovery of interesting characteristics and patterns that are not explicitly represented in the databases. The techniques can play an important role in understanding data and in capturing intrinsic relationships among data instances. Data mining has been an active research area in the past decade and has been proved to be very useful.




Frequent Pattern Mining


Book Description

This comprehensive reference consists of 18 chapters from prominent researchers in the field. Each chapter is self-contained, and synthesizes one aspect of frequent pattern mining. An emphasis is placed on simplifying the content, so that students and practitioners can benefit from the book. Each chapter contains a survey describing key research on the topic, a case study and future directions. Key topics include: Pattern Growth Methods, Frequent Pattern Mining in Data Streams, Mining Graph Patterns, Big Data Frequent Pattern Mining, Algorithms for Data Clustering and more. Advanced-level students in computer science, researchers and practitioners from industry will find this book an invaluable reference.




Pattern Discovery Using Sequence Data Mining


Book Description

"This book provides a comprehensive view of sequence mining techniques, and present current research and case studies in Pattern Discovery in Sequential data authored by researchers and practitioners"--




High-Utility Pattern Mining


Book Description

This book presents an overview of techniques for discovering high-utility patterns (patterns with a high importance) in data. It introduces the main types of high-utility patterns, as well as the theory and core algorithms for high-utility pattern mining, and describes recent advances, applications, open-source software, and research opportunities. It also discusses several types of discrete data, including customer transaction data and sequential data. The book consists of twelve chapters, seven of which are surveys presenting the main subfields of high-utility pattern mining, including itemset mining, sequential pattern mining, big data pattern mining, metaheuristic-based approaches, privacy-preserving pattern mining, and pattern visualization. The remaining five chapters describe key techniques and applications, such as discovering concise representations and regular patterns.




Sequence Data Mining


Book Description

Understanding sequence data, and the ability to utilize this hidden knowledge, will create a significant impact on many aspects of our society. Examples of sequence data include DNA, protein, customer purchase history, web surfing history, and more. This book provides thorough coverage of the existing results on sequence data mining as well as pattern types and associated pattern mining methods. It offers balanced coverage on data mining and sequence data analysis, allowing readers to access the state-of-the-art results in one place.




Mining of Massive Datasets


Book Description

Now in its second edition, this book focuses on practical algorithms for mining data from even the largest datasets.




Advances in Database Technology EDBT '96


Book Description

This book presents the refereed proceedings of the Fifth International Conference on Extending Database Technology, EDBT'96, held in Avignon, France in March 1996. The 31 full revised papers included were selected from a total of 178 submissions; also included are some industrial-track papers, contributed by partners of several ESPRIT projects. The volume is organized in topical sections on data mining, active databases, design tools, advanced DBMS, optimization, warehousing, system issues, temporal databases, the web and hypermedia, performance, workflow management, database design, and parallel databases.




Periodic Pattern Mining


Book Description

This book provides an introduction to the field of periodic pattern mining, reviews state-of-the-art techniques, discusses recent advances, and reviews open-source software. Periodic pattern mining is a popular and emerging research area in the field of data mining. It involves discovering all regularly occurring patterns in temporal databases. One of the major applications of periodic pattern mining is the analysis of customer transaction databases to discover sets of items that have been regularly purchased by customers. Discovering such patterns has several implications for understanding the behavior of customers. Since the first work on periodic pattern mining, numerous studies have been published and great advances have been made in this field. The book consists of three main parts: introduction, algorithms, and applications. The first chapter is an introduction to pattern mining and periodic pattern mining. The concepts of periodicity, periodic support, search space exploration techniques, and pruning strategies are discussed. The main types of algorithms are also presented such as periodic-frequent pattern growth, partial periodic pattern-growth, and periodic high-utility itemset mining algorithm. Challenges and research opportunities are reviewed. The chapters that follow present state-of-the-art techniques for discovering periodic patterns in (1) transactional databases, (2) temporal databases, (3) quantitative temporal databases, and (4) big data. Then, the theory on concise representations of periodic patterns is presented, as well as hiding sensitive information using privacy-preserving data mining techniques. The book concludes with several applications of periodic pattern mining, including applications in air pollution data analytics, accident data analytics, and traffic congestion analytics.