Outlier Ensembles


Book Description

This book discusses a variety of methods for outlier ensembles and organizes them by the specific principles with which accuracy improvements are achieved. In addition, it covers the techniques with which such methods can be made more effective. A formal classification of these methods is provided, and the circumstances in which they work well are examined. The authors cover how outlier ensembles relate (both theoretically and practically) to the ensemble techniques used commonly for other data mining problems like classification. The similarities and (subtle) differences in the ensemble techniques for the classification and outlier detection problems are explored. These subtle differences do impact the design of ensemble algorithms for the latter problem. This book can be used for courses in data mining and related curricula. Many illustrative examples and exercises are provided in order to facilitate classroom teaching. Familiarity is assumed with the outlier detection problem and with the generic problem of ensemble analysis in classification. This is because many of the ensemble methods discussed in this book are adaptations from their counterparts in the classification domain. Some techniques explained in this book, such as wagging, randomized feature weighting, and geometric subsampling, provide new insights that are not available elsewhere. Also included is an analysis of the performance of various types of base detectors and their relative effectiveness. The book is valuable for researchers and practitioners who wish to leverage ensemble methods for optimal algorithmic design.




Outlier Analysis


Book Description

This book provides comprehensive coverage of the field of outlier analysis from a computer science point of view. It integrates methods from data mining, machine learning, and statistics within the computational framework and therefore appeals to multiple communities. The chapters of this book can be organized into three categories: Basic algorithms: Chapters 1 through 7 discuss the fundamental algorithms for outlier analysis, including probabilistic and statistical methods, linear methods, proximity-based methods, high-dimensional (subspace) methods, ensemble methods, and supervised methods. Domain-specific methods: Chapters 8 through 12 discuss outlier detection algorithms for various domains of data, such as text, categorical data, time-series data, discrete sequence data, spatial data, and network data. Applications: Chapter 13 is devoted to various applications of outlier analysis. Some guidance is also provided for the practitioner. The second edition of this book is more detailed and is written to appeal to both researchers and practitioners. Significant new material has been added on topics such as kernel methods, one-class support-vector machines, matrix factorization, neural networks, outlier ensembles, time-series methods, and subspace methods. It is written as a textbook and can be used for classroom teaching.




Towards Extensible and Adaptable Methods in Computing


Book Description

This book addresses extensible and adaptable computing, a broad range of methods and techniques used to systematically tackle the future growth of systems and respond proactively and seamlessly to change. The book is divided into five main sections: Agile Software Development, Data Management, Web Intelligence, Machine Learning and Computing in Education. These sub-domains of computing work together in mutually complementary ways to build systems and applications that scale well, and which can successfully meet the demands of changing times and contexts. The topics under each track have been carefully selected to highlight certain qualitative aspects of applications and systems, such as scalability, flexibility, integration, efficiency and context awareness. The first section (Agile Software Development) includes six contributions that address related issues, including risk management, test case prioritization and tools, open source software reliability and predicting the change proneness of software. The second section (Data Management) includes discussions on myriad issues, such as extending database caches using solid-state devices, efficient data transmission, healthcare applications and data security. In turn, the third section (Machine Learning) gathers papers that investigate ML algorithms and present their specific applications such as portfolio optimization, disruption classification and outlier detection. The fourth section (Web Intelligence) covers emerging applications such as metaphor detection, language identification and sentiment analysis, and brings to the fore web security issues such as fraud detection and trust/reputation systems. In closing, the fifth section (Computing in Education) focuses on various aspects of computer-aided pedagogical methods.




Artificial Intelligence Applications and Innovations


Book Description

This two-volume set of IFIP-AICT 675 and 676 constitutes the refereed proceedings of the 19th IFIP WG 12.5 International Conference on Artificial Intelligence Applications and Innovations, AIAI 2023, held in León, Spain, during June 14–17, 2023. This event was held in hybrid mode. The 75 regular papers and 17 short papers presented in this two-volume set were carefully reviewed and selected from 185 submissions. The papers cover the following topics: Deep Learning (Reinforcement/Recurrent Gradient Boosting/Adversarial); Agents/Case Based Reasoning/Sentiment Analysis; Biomedical - Image Analysis; CNN - Convolutional Neural Networks YOLO CNN; Cyber Security/Anomaly Detection; Explainable AI/Social Impact of AI; Graph Neural Networks/Constraint Programming; IoT/Fuzzy Modeling/Augmented Reality; LEARNING (Active-AutoEncoders-Federated); Machine Learning; Natural Language; Optimization-Genetic Programming; Robotics; Spiking NN; and Text Mining/Transfer Learning.




Computational Intelligence in Data Mining - Volume 2


Book Description

The contributed volume aims to explicate and address the difficulties and challenges of seamlessly integrating two core disciplines of computer science, i.e., computational intelligence and data mining. Data mining aims at the automatic discovery of underlying non-trivial knowledge from datasets by applying intelligent analysis techniques. Interest in this research area has grown considerably in recent years due to two key factors: (a) knowledge hidden in organizations’ databases can be exploited to improve strategic and managerial decision-making; (b) the large volume of data managed by organizations makes it impossible to carry out a manual analysis. The book addresses different methods and techniques of integration for enhancing the overall goal of data mining. The book helps to disseminate knowledge about some innovative, active research directions in the field of data mining, machine and computational intelligence, along with some current issues and applications of related topics.




Encyclopedia of Bioinformatics and Computational Biology


Book Description

Encyclopedia of Bioinformatics and Computational Biology: ABC of Bioinformatics, Three Volume Set combines elements of computer science, information technology, mathematics, statistics and biotechnology, providing the methodology and in silico solutions to mine biological data and processes. The book covers Theory, Topics and Applications, with a special focus on Integrative –omics and Systems Biology. The theoretical and methodological underpinnings of BCB, including phylogeny, are covered, as are more current areas of focus, such as translational bioinformatics, cheminformatics, and environmental informatics. Finally, Applications provide guidance for commonly asked questions. This major reference work spans basic and cutting-edge methodologies authored by leaders in the field, providing an invaluable resource for students, scientists, professionals in research institutes, and a broad swath of researchers in biotechnology and the biomedical and pharmaceutical industries.
Brings together information from computer science, information technology, mathematics, statistics and biotechnology
Written and reviewed by leading experts in the field, providing a unique and authoritative resource
Focuses on the main theoretical and methodological concepts before expanding on specific topics and applications
Includes interactive images, multimedia tools and crosslinking to further resources and databases




Data Mining


Book Description

This textbook explores the different aspects of data mining from the fundamentals to the complex data types and their applications, capturing the wide diversity of problem domains for data mining issues. It goes beyond the traditional focus on data mining problems to introduce advanced data types such as text, time series, discrete sequences, spatial data, graph data, and social networks. Until now, no single book has addressed all these topics in a comprehensive and integrated way. The chapters of this book fall into one of three categories: Fundamental chapters: Data mining has four main problems, which correspond to clustering, classification, association pattern mining, and outlier analysis. These chapters comprehensively discuss a wide variety of methods for these problems. Domain chapters: These chapters discuss the specific methods used for different domains of data such as text data, time-series data, sequence data, graph data, and spatial data. Application chapters: These chapters study important applications such as stream mining, Web mining, ranking, recommendations, social networks, and privacy preservation. The domain chapters also have an applied flavor. Appropriate for both introductory and advanced data mining courses, Data Mining: The Textbook balances mathematical details and intuition. It contains the necessary mathematical details for professors and researchers, but it is presented in a simple and intuitive style to improve accessibility for students and industrial practitioners (including those with a limited mathematical background). Numerous illustrations, examples, and exercises are included, with an emphasis on semantically interpretable examples. Praise for Data Mining: The Textbook - “As I read through this book, I have already decided to use it in my classes. This is a book written by an outstanding researcher who has made fundamental contributions to data mining, in a way that is both accessible and up to date. 
The book is complete with theory and practical use cases. It’s a must-have for students and professors alike!” -- Qiang Yang, Chair of Computer Science and Engineering at Hong Kong University of Science and Technology “This is the most amazing and comprehensive text book on data mining. It covers not only the fundamental problems, such as clustering, classification, outliers and frequent patterns, and different data types, including text, time series, sequences, spatial data and graphs, but also various applications, such as recommenders, Web, social network and privacy. It is a great book for graduate students and researchers as well as practitioners.” -- Philip S. Yu, UIC Distinguished Professor and Wexler Chair in Information Technology at University of Illinois at Chicago




Information, Communication and Computing Technology


Book Description

This book constitutes the refereed proceedings of the 5th International Conference on Information, Communication and Computing Technology, ICICCT 2020, held in New Delhi, India*, in May 2020. The 24 full papers and one short paper presented in this volume were carefully reviewed and selected from 220 submissions. The papers are organized in topical sections on data communication & networking; advanced computing using machine learning. *The conference was held virtually due to the COVID-19 pandemic.




Proceedings of Data Analytics and Management


Book Description

This book includes original unpublished contributions presented at the International Conference on Data Analytics and Management (ICDAM 2021), held at Jan Wyzykowski University, Poland, during June 2021. The book covers topics in data analytics, data management, big data, computational intelligence, and communication networks. It presents innovative work by leading academics, researchers, and experts from industry that is useful to young researchers and students.




Hands-On Ensemble Learning with R


Book Description

Explore powerful R packages to create predictive models using ensemble methods

Key Features

Implement machine learning algorithms to build ensemble-efficient models
Explore powerful R packages to create predictive models using ensemble methods
Learn to build ensemble models on large datasets using a practical approach

Book Description

Ensemble techniques are used for combining two or more similar or dissimilar machine learning algorithms to create a stronger model. Such a model delivers superior prediction power and can give your datasets a boost in accuracy. Hands-On Ensemble Learning with R begins with the important statistical resampling methods. You will then walk through the central trilogy of ensemble techniques – bagging, random forest, and boosting – then you'll learn how they can be used to provide greater accuracy on large datasets using popular R packages. You will learn how to combine model predictions using different machine learning algorithms to build ensemble models. In addition to this, you will explore how to improve the performance of your ensemble models. By the end of this book, you will have learned how machine learning algorithms can be combined to reduce common problems and build simple efficient ensemble models with the help of real-world examples.

What you will learn

Carry out an essential review of re-sampling methods, bootstrap, and jackknife
Explore the key ensemble methods: bagging, random forests, and boosting
Use multiple algorithms to make strong predictive models
Enjoy a comprehensive treatment of boosting methods
Supplement methods with statistical tests, such as ROC
Walk through data structures in classification, regression, survival, and time series data
Use the supplied R code to implement ensemble methods
Learn stacking method to combine heterogeneous machine learning models

Who this book is for

This book is for you if you are a data scientist or machine learning developer who wants to implement machine learning techniques by building ensemble models with the power of R. You will learn how to combine different machine learning algorithms to perform efficient data processing. Basic knowledge of machine learning techniques and programming knowledge of R would be an added advantage.