Knowledge Discovery from Data Streams


Book Description

Since the beginning of the Internet age and the increased use of ubiquitous computing devices, the large volume and continuous flow of distributed data have imposed new constraints on the design of learning algorithms. Exploring how to extract knowledge structures from evolving and time-changing data, Knowledge Discovery from Data Streams presents




Machine Learning for Data Streams


Book Description

A hands-on approach to tasks and techniques in data stream mining and real-time analytics, with examples in MOA, a popular freely available open-source software framework. Today many information sources—including sensor networks, financial markets, social networks, and healthcare monitoring—are so-called data streams, arriving sequentially and at high speed. Analysis must take place in real time, with partial data and without the capacity to store the entire data set. This book presents algorithms and techniques used in data stream mining and real-time analytics. Taking a hands-on approach, the book demonstrates the techniques using MOA (Massive Online Analysis), a popular, freely available open-source software framework, allowing readers to try out the techniques after reading the explanations. The book first offers a brief introduction to the topic, covering big data mining, basic methodologies for mining data streams, and a simple example of MOA. More detailed discussions follow, with chapters on sketching techniques, change, classification, ensemble methods, regression, clustering, and frequent pattern mining. Most of these chapters include exercises, an MOA-based lab session, or both. Finally, the book discusses the MOA software, covering the MOA graphical user interface, the command line, use of its API, and the development of new methods within MOA. The book will be an essential reference for readers who want to use data stream mining as a tool, researchers in innovation or data stream mining, and programmers who want to create new algorithms for MOA.




Adaptive and Intelligent Systems


Book Description

This book constitutes the proceedings of the International Conference on Adaptive and Intelligent Systems, ICAIS 2011, held in Klagenfurt, Austria, in September 2011. The 36 full papers included in these proceedings together with the abstracts of 4 invited talks, were carefully reviewed and selected from 72 submissions. The contributions are organized under the following topical sections: incremental learning; adaptive system architecture; intelligent system engineering; data mining and pattern recognition; intelligent agents; and computational intelligence.




Learning from Data Streams


Book Description

Processing data streams has raised new research challenges over the last few years. This book provides the reader with a comprehensive overview of stream data processing, including famous prototype implementations like the Nile system and the TinyOS operating system. Applications in security, the natural sciences, and education are presented. The huge bibliography offers an excellent starting point for further reading and future research.




Machine Learning and Knowledge Discovery in Databases


Book Description

This book constitutes the refereed proceedings of the joint conference on Machine Learning and Knowledge Discovery in Databases: ECML PKDD 2009, held in Bled, Slovenia, in September 2009. The 106 papers presented in two volumes, together with 5 invited talks, were carefully reviewed and selected from 422 paper submissions. In addition to the regular papers the volume contains 14 abstracts of papers appearing in full version in the Machine Learning Journal and the Knowledge Discovery and Databases Journal of Springer. The conference intends to provide an international forum for the discussion of the latest high quality research results in all areas related to machine learning and knowledge discovery in databases. The topics addressed are application of machine learning and data mining methods to real-world problems, particularly exploratory research that describes novel learning and mining tasks and applications requiring non-standard techniques.




Learning from Data Streams in Evolving Environments


Book Description

This edited book covers recent advances of techniques, methods and tools treating the problem of learning from data streams generated by evolving non-stationary processes. The goal is to discuss and overview the advanced techniques, methods and tools that are dedicated to manage, exploit and interpret data streams in non-stationary environments. The book includes the required notions, definitions, and background to understand the problem of learning from data streams in non-stationary environments and synthesizes the state-of-the-art in the domain, discussing advanced aspects and concepts and presenting open problems and future challenges in this field. Provides multiple examples to facilitate the understanding data streams in non-stationary environments; Presents several application cases to show how the methods solve different real world problems; Discusses the links between methods to help stimulate new research and application directions.




Learning in Non-Stationary Environments


Book Description

Recent decades have seen rapid advances in automatization processes, supported by modern machines and computers. The result is significant increases in system complexity and state changes, information sources, the need for faster data handling and the integration of environmental influences. Intelligent systems, equipped with a taxonomy of data-driven system identification and machine learning algorithms, can handle these problems partially. Conventional learning algorithms in a batch off-line setting fail whenever dynamic changes of the process appear due to non-stationary environments and external influences. Learning in Non-Stationary Environments: Methods and Applications offers a wide-ranging, comprehensive review of recent developments and important methodologies in the field. The coverage focuses on dynamic learning in unsupervised problems, dynamic learning in supervised classification and dynamic learning in supervised regression problems. A later section is dedicated to applications in which dynamic learning methods serve as keystones for achieving models with high accuracy. Rather than rely on a mathematical theorem/proof style, the editors highlight numerous figures, tables, examples and applications, together with their explanations. This approach offers a useful basis for further investigation and fresh ideas and motivates and inspires newcomers to explore this promising and still emerging field of research.




Learning from Imbalanced Data Sets


Book Description

This book provides a general and comprehensible overview of imbalanced learning. It contains a formal description of a problem, and focuses on its main features, and the most relevant proposed solutions. Additionally, it considers the different scenarios in Data Science for which the imbalanced classification can create a real challenge. This book stresses the gap with standard classification tasks by reviewing the case studies and ad-hoc performance metrics that are applied in this area. It also covers the different approaches that have been traditionally applied to address the binary skewed class distribution. Specifically, it reviews cost-sensitive learning, data-level preprocessing methods and algorithm-level solutions, taking also into account those ensemble-learning solutions that embed any of the former alternatives. Furthermore, it focuses on the extension of the problem for multi-class problems, where the former classical methods are no longer to be applied in a straightforward way. This book also focuses on the data intrinsic characteristics that are the main causes which, added to the uneven class distribution, truly hinders the performance of classification algorithms in this scenario. Then, some notes on data reduction are provided in order to understand the advantages related to the use of this type of approaches. Finally this book introduces some novel areas of study that are gathering a deeper attention on the imbalanced data issue. Specifically, it considers the classification of data streams, non-classical classification problems, and the scalability related to Big Data. Examples of software libraries and modules to address imbalanced classification are provided. This book is highly suitable for technical professionals, senior undergraduate and graduate students in the areas of data science, computer science and engineering. It will also be useful for scientists and researchers to gain insight on the current developments in this area of study, as well as future research directions.




Optimum-Path Forest


Book Description

Optimum-Path Forest: Theory, Algorithms, and Applications was first published in 2008 in its supervised and unsupervised versions with applications in medicine and image classification. Since then, it has expanded to a variety of other applications such as remote sensing, electrical and petroleum engineering, and biology. In recent years, multi-label and semi-supervised versions were also developed to handle video classification problems. The book presents the principles, algorithms and applications of Optimum-Path Forest, giving the theory and state-of-the-art as well as insights into future directions. Presents the first book on Optimum-path Forest Shows how it can be used with Deep Learning Gives a wide range of applications Includes the methods, underlying theory and applications of Optimum-Path Forest (OPF)




Evaluating Learning Algorithms


Book Description

The field of machine learning has matured to the point where many sophisticated learning approaches can be applied to practical applications. Thus it is of critical importance that researchers have the proper tools to evaluate learning approaches and understand the underlying issues. This book examines various aspects of the evaluation process with an emphasis on classification algorithms. The authors describe several techniques for classifier performance assessment, error estimation and resampling, obtaining statistical significance as well as selecting appropriate domains for evaluation. They also present a unified evaluation framework and highlight how different components of evaluation are both significantly interrelated and interdependent. The techniques presented in the book are illustrated using R and WEKA, facilitating better practical insight as well as implementation. Aimed at researchers in the theory and applications of machine learning, this book offers a solid basis for conducting performance evaluations of algorithms in practical settings.