Instance Selection and Construction for Data Mining


Book Description

The ability to analyze and understand massive data sets lags far behind the ability to gather and store the data. To meet this challenge, knowledge discovery and data mining (KDD) is growing rapidly as an emerging field. However, no matter how powerful computers are now or will be in the future, KDD researchers and practitioners must consider how to manage ever-growing data which is, ironically, due to the extensive use of computers and ease of data collection with computers. Many different approaches have been used to address the data explosion issue, such as algorithm scale-up and data reduction. Instance, example, or tuple selection pertains to methods or algorithms that select or search for a representative portion of data that can fulfill a KDD task as if the whole data is used. Instance selection is directly related to data reduction and becomes increasingly important in many KDD applications due to the need for processing efficiency and/or storage efficiency. One of the major means of instance selection is sampling whereby a sample is selected for testing and analysis, and randomness is a key element in the process. Instance selection also covers methods that require search. Examples can be found in density estimation (finding the representative instances - data points - for a cluster); boundary hunting (finding the critical instances to form boundaries to differentiate data points of different classes); and data squashing (producing weighted new data with equivalent sufficient statistics). Other important issues related to instance selection extend to unwanted precision, focusing, concept drifts, noise/outlier removal, data smoothing, etc. Instance Selection and Construction for Data Mining brings researchers and practitioners together to report new developments and applications, to share hard-learned experiences in order to avoid similar pitfalls, and to shed light on the future development of instance selection. 
This volume serves as a comprehensive reference for graduate students, practitioners and researchers in KDD.




Instance-Specific Algorithm Configuration


Book Description

This book presents a modular and expandable technique in the rapidly emerging research area of automatic configuration and selection of the best algorithm for the instance at hand. The author presents the basic model behind ISAC and then details a number of modifications and practical applications. In particular, he addresses automated feature generation, offline algorithm configuration for portfolio generation, algorithm selection, adaptive solvers, online tuning, and parallelization. The author's related thesis was honorably mentioned (runner-up) for the ACP Dissertation Award in 2014, and this book includes some expanded sections and notes on recent developments. Additionally, the techniques described in this book have been successfully applied to a number of solvers competing in the SAT and MaxSAT International Competitions, winning a total of 18 gold medals between 2011 and 2014. The book will be of interest to researchers and practitioners in artificial intelligence, in particular in the area of machine learning and constraint programming.




Multiple Instance Learning


Book Description

This book provides a general overview of multiple instance learning (MIL), defining the framework and covering the central paradigms. The authors discuss the most important algorithms for MIL such as classification, regression and clustering. With a focus on classification, a taxonomy is set and the most relevant proposals are specified. Efficient algorithms are developed to discover relevant information when working with uncertainty. Key representative applications are included. This book carries out a study of the key related fields of distance metrics and alternative hypothesis. Chapters examine new and developing aspects of MIL such as data reduction for multi-instance problems and imbalanced MIL data. Class imbalance for multi-instance problems is defined at the bag level, a type of representation that utilizes ambiguity due to the fact that bag labels are available, but the labels of the individual instances are not defined. Additionally, multiple instance multiple label learning is explored. This learning framework introduces flexibility and ambiguity in the object representation providing a natural formulation for representing complicated objects. Thus, an object is represented by a bag of instances and is allowed to have associated multiple class labels simultaneously. This book is suitable for developers and engineers working to apply MIL techniques to solve a variety of real-world problems. It is also useful for researchers or students seeking a thorough overview of MIL literature, methods, and tools.




Essays on Realist Instance Ontology and its Logic


Book Description

Structure or system is a ubiquitous and uneliminable feature of all our experience and theory, and requires an ontological analysis. The essays collected in this volume provide an account of structure founded upon the proper analysis of polyadic relations as the irreducible and defining elements of structure. It is argued that polyadic relations are ontic predicates in the insightful sense of intension-determined agent-combinators, monadic properties being the limiting and historically misleading case. This assay of ontic predicates has a number of powerful explanatory implications, including fundamentally: providing ontology with a principium individuationis, demonstrating the perennial theory that properties and relations are individuated as unit attributes or ‘instances’, giving content to the ontology of facts or states of affairs, and providing a means to precisely differentiate identity from indiscernibility. The differentiation of the unrepeatable combinatorial and repeatable intension aspects of ontic predicates makes it possible to properly diagnose and disarm the classic Bradley Regress Argument aimed against attributes and universals, an argument that trades on confusing these aspects. It is argued that these two aspects of ontic predicates form a ‘composite simple’, an explanation that sheds light on the nature and necessity of the medieval formal distinction, e.g., the distinctio formalis a parte rei of Scotus. Following from this analysis of ontic predication there is given a number of principles delineating realist instance ontology, together with a critique of both nominalistic trope theory and modern revivals of Aristotle’s instance ontology of the Categories. It is shown how the resulting theory of facts can, via ‘horizontal’ and ‘vertical’ composition, account for all the hierarchical structuring of our experience and theory, and, importantly, how this can rest upon an atomic ontic level composed of only dependent ontic predicates.
The latter is a desideratum for the proposed ‘Structural Realism’ ontology for micro-physics where at its lowest level the physical is said to be totally relational/structural. Nullified is the classic and insidious assumption that dependent entities presuppose a class of independent substrata or ‘substances’, and with this any pressure to admit ‘bare particulars’ and intensionless relations or ‘ties’. The logic inherent in realist instance ontology, termed ‘PPL’, is formalized in detail and given a consistency proof. Demonstrated is the logic’s power to distinguish legitimate from illegitimate impredicative definitions, and in this how it provides a general solution to the classic self-referential paradoxes. PPL corresponds to Gödel’s programmatic ‘Theory of Concepts’. The last essay, not previously published, provides a detailed differentiation of identity from indiscernibility, preliminary to which is given an explanation of in what sense a predicate logic presupposes an ontology of predication. The principles needed for the differentiation have the significant implication (e.g., for the foundations of mathematics) of implying an infinity of logical entities, viz., instances of the identity relation.




Climate Engineering as an Instance of Politicization


Book Description

This book examines the academic discussion on climate engineering as an instance of politicization – as a subject of deliberation and decision-making. It traces legitimizing and delegitimizing frames applied to discuss both Carbon Dioxide Removal and Solar Radiation Management approaches in academic publications, and their implications for political decision-making. Moreover, it offers insights into how academic discourse on climate technology can influence political decision-making – especially at a technological stage where a socio-technical system with a high degree of inertia does not (yet) exist. The high degree of diversity of frames in the academic discussion is understood as an opportunity for deliberate decision-making concerning the future roles of these approaches in global climate policy. This book demonstrates how insights from science and technology studies can be operationalized in empirical political analysis. It appeals to scholars in both political science and environmental science who are interested in climate change policy-making and the science–policy nexus.




Online Visual Tracking of Weighted Multiple Instance Learning via Neutrosophic Similarity-Based Objectness Estimation


Book Description

An online neutrosophic similarity-based objectness tracking with a weighted multiple instance learning algorithm (NeutWMIL) is proposed. Each training sample is extracted surrounding the object location, and the distribution of these samples is symmetric. To provide a more robust weight for each sample in the positive bag, the asymmetry of the importance of the samples is considered. The neutrosophic similarity-based objectness estimation with object properties (super straddling) is applied.




Semantic Web and Web Science


Book Description

The book will focus on exploiting state-of-the-art research in semantic web and web science. The rapidly evolving world-wide-web has led to revolutionary changes in the whole of society. The research and development of the semantic web covers a number of global standards of the web and cutting-edge technologies, such as linked data, social semantic web, semantic web search, smart data integration, semantic web mining and web scale computing. These proceedings are from the 6th Chinese Semantics Web Symposium.




Factories and Workshops


Book Description







Abstracts of Theses


Book Description