Automatic Design of Decision-Tree Induction Algorithms


Book Description

Presents a detailed study of the major design components that constitute a top-down decision-tree induction algorithm, including split criteria, stopping criteria, pruning, and approaches for dealing with missing values. Whereas the prevailing strategy is still to use a 'generic' decision-tree induction algorithm regardless of the data, the authors argue for the benefits that a bias-fitting strategy could bring to decision-tree induction, in which the ultimate goal is the automatic generation of a decision-tree induction algorithm tailored to the application domain of interest. To that end, they discuss how one can effectively discover the most suitable set of components of decision-tree induction algorithms for a wide variety of applications through the paradigm of evolutionary computation, following the emergence of a novel field called hyper-heuristics. "Automatic Design of Decision-Tree Induction Algorithms" would be highly useful for machine learning and evolutionary computation students and researchers alike.




Meta-Learning in Decision Tree Induction


Book Description

The book focuses on different variants of decision tree induction but also describes the meta-learning approach in general, which is applicable to other types of machine learning algorithms. It is a useful source of information for readers wishing to review some of the techniques used in decision tree learning, as well as different ensemble methods that involve decision trees. It is shown that the knowledge of different components used within decision tree learning needs to be systematized to enable the system to generate and evaluate different variants of machine learning algorithms with the aim of identifying the top performers or potentially the best one. A unified view of decision tree learning makes it possible to emulate different decision tree algorithms simply by setting certain parameters. As meta-learning requires running many different processes with the aim of obtaining performance results, a detailed description of the experimental methodology and evaluation framework is provided. Meta-learning is discussed in great detail in the second half of the book. The exposition starts by presenting a comprehensive review of many meta-learning approaches explored in the past and described in the literature, including, for instance, approaches that provide a ranking of algorithms. The approach described can be related to other work that exploits planning with the aim of constructing data mining workflows. The book stimulates interchange of ideas between different, albeit related, approaches.




Automated Design of Machine Learning and Search Algorithms


Book Description

This book presents recent advances in automated machine learning (AutoML) and automated algorithm design and indicates the future directions in this fast-developing area. Methods have been developed to automate the design of neural networks, heuristics and metaheuristics using techniques such as metaheuristics, statistical techniques, machine learning and hyper-heuristics. The book first defines the field of automated design, distinguishing it from the similar but different topics of automated algorithm configuration and automated algorithm selection. The chapters report on the current state of the art by experts in the field and include reviews of AutoML and automated design of search, theoretical analyses of automated algorithm design, automated design of control software for robot swarms, and overfitting as a benchmark and design tool. Also covered are automated generation of constructive and perturbative low-level heuristics, selection hyper-heuristics for automated design, automated design of deep-learning approaches using hyper-heuristics, genetic programming hyper-heuristics with transfer knowledge and automated design of classification algorithms. The book concludes by examining future research directions of this rapidly evolving field. The information presented here will especially interest researchers and practitioners in the fields of artificial intelligence, computational intelligence, evolutionary computation and optimisation.







Automating the Design of Data Mining Algorithms


Book Description

Data mining is a very active research area with many successful real-world applications. It consists of a set of concepts and methods used to extract interesting or useful knowledge (or patterns) from real-world datasets, providing valuable support for decision making in industry, business, government, and science. Although there are already many types of data mining algorithms available in the literature, it is still difficult for users to choose the best possible data mining algorithm for their particular data mining problem. In addition, data mining algorithms have been manually designed; therefore they incorporate human biases and preferences. This book proposes a new approach to the design of data mining algorithms. Instead of relying on the slow and ad hoc process of manual algorithm design, this book proposes systematically automating the design of data mining algorithms with an evolutionary computation approach. More precisely, we propose a genetic programming system (a type of evolutionary computation method that evolves computer programs) to automate the design of rule induction algorithms, a type of classification method that discovers a set of classification rules from data. We focus on genetic programming in this book because it is the paradigmatic type of machine learning method for automating the generation of programs and because it has the advantage of performing a global search in the space of candidate solutions (data mining algorithms in our case), but in principle other types of search methods for this task could be investigated in the future.







Data Mining with Decision Trees


Book Description

This is the first comprehensive book dedicated entirely to the field of decision trees in data mining and covers all aspects of this important technique. Decision trees have become one of the most powerful and popular approaches in knowledge discovery and data mining, the science and technology of exploring large and complex bodies of data in order to discover useful patterns. The area is of great importance because it enables modeling and knowledge extraction from the abundance of data available. Both theoreticians and practitioners are continually seeking techniques to make the process more efficient, cost-effective and accurate. Decision trees, originally implemented in decision theory and statistics, are highly effective tools in other areas such as data mining, text mining, information extraction, machine learning, and pattern recognition. This book invites readers to explore the many benefits in data mining that decision trees offer: Self-explanatory and easy to follow when compacted; Able to handle a variety of input data: nominal, numeric and textual; Able to process datasets that may have errors or missing values; High predictive performance for a relatively small computational effort; Available in many data mining packages over a variety of platforms; Useful for various tasks, such as classification, regression, clustering and feature selection. Contents: Introduction to Decision Trees; Growing Decision Trees; Evaluation of Classification Trees; Splitting Criteria; Pruning Trees; Advanced Decision Trees; Decision Forests; Incremental Learning of Decision Trees; Feature Selection; Fuzzy Decision Trees; Hybridization of Decision Trees with Other Techniques; Sequence Classification Using Decision Trees.
Readership: Researchers, graduate and undergraduate students in information systems, engineering, computer science, statistics and management.




Advanced Hybrid Information Processing


Book Description

This book constitutes the refereed proceedings of the First International Conference on Advanced Hybrid Information Processing, ADHIP 2017, held in Harbin, China, in July 2017. The 64 full papers were selected from 134 submissions and focus on advanced methods and applications for hybrid information processing.




Advanced Methodologies and Technologies in Business Operations and Management


Book Description

Businesses consistently work on new projects, products, and workflows to remain competitive and successful in the modern business environment. To keep that edge while competitors work to succeed as well, businesses must employ the most effective methods and tools in human resources, project management, and overall business plan execution. Advanced Methodologies and Technologies in Business Operations and Management provides emerging research on business tools such as employee engagement, payout policies, and financial investing to promote operational success. While highlighting the challenges facing modern organizations, the book shows readers how corporate social responsibility and the use of artificial intelligence can improve a company’s culture and management. This book is an ideal resource for executives and managers, researchers, accountants, and financial investors seeking current research on business operations and management.




Computational Methods for Data Analysis


Book Description

This graduate text covers a variety of mathematical and statistical tools for the analysis of big data coming from biology, medicine and economics. Neural networks, Markov chains, tools from statistical physics and wavelet analysis are used to develop efficient computational algorithms, which are then applied to real-life data using MATLAB.