Relational Data Clustering


Book Description

A culmination of the authors' years of extensive research on this topic, Relational Data Clustering: Models, Algorithms, and Applications addresses the fundamentals and applications of relational data clustering. It describes theoretic models and algorithms and, through examples, shows how to apply these models and algorithms to solve real-world problems. After defining the field, the book introduces different types of model formulations for relational data clustering, presents various algorithms for the corresponding models, and demonstrates applications of the models and algorithms through extensive experimental results. The authors cover six topics of relational data clustering: clustering on bi-type heterogeneous relational data; multi-type heterogeneous relational data; homogeneous relational data clustering; clustering on the most general case of relational data; the individual relational clustering framework; and recent research on evolutionary clustering. This book focuses on both practical algorithm derivation and theoretical framework construction for relational data clustering. It provides a complete, self-contained introduction to advances in the field.




Clustering in Relational Data and Ontologies


Book Description

This dissertation studies the problem of clustering objects represented by relational data. This is a pertinent problem as many real-world data sets can only be represented by relational data for which object-based clustering algorithms are not designed. Relational data are encountered in many fields including biology, management, industrial engineering, and social sciences. Unlike numerical object data, which are represented by a set of feature values (e.g. height, weight, shoe size) of an object, relational object data are the numerical values of (dis)similarity between objects. For this reason, conventional cluster analysis methods such as k-means and fuzzy c-means cannot be used directly with relational data. I focus on three main problems of cluster analysis of relational data: (i) tendency prior to clustering: how many clusters are there?; (ii) partitioning of objects: which objects belong to which cluster?; and (iii) validity of the resultant clusters: are the partitions "good"? Analyses are included in this dissertation that prove that the Visual Assessment of cluster Tendency (VAT) algorithm has a direct relation to single-linkage hierarchical clustering and Dunn's cluster validity index. These analyses are important to the development of two novel clustering algorithms, CLODD (CLustering in Ordered Dissimilarity Data) and ReSL (Rectangular Single-Linkage clustering). Last, this dissertation addresses clustering in ontologies; examples include the Gene Ontology, the MeSH ontology, patient medical records, and web documents. I apply an extension to the Self-Organizing Map (SOM) to produce a new algorithm, the Ontological Self-Organizing Map (OSOM). OSOM provides visualization and linguistic summarization of ontology-based data.




Relational Data Mining


Book Description

As the first book devoted to relational data mining, this coherently written multi-author monograph provides a thorough introduction and systematic overview of the area. The first part introduces the reader to the basics and principles of classical knowledge discovery in databases and inductive logic programming; subsequent chapters by leading experts assess the techniques in relational data mining in a principled and comprehensive way; finally, three chapters deal with advanced applications in various fields and refer the reader to resources for relational data mining. This book will become a valuable source of reference for R&D professionals active in relational data mining. Students as well as IT professionals and ambitious practitioners interested in learning about relational data mining will appreciate the book as a useful text and gentle introduction to this exciting new field.




Constrained Clustering


Book Description

Since the initial work on constrained clustering, there have been numerous advances in methods, applications, and our understanding of the theoretical properties of constraints and constrained clustering algorithms. Bringing these developments together, Constrained Clustering: Advances in Algorithms, Theory, and Applications presents an extensive collection of the latest innovations in clustering data analysis methods that use background knowledge encoded as constraints. Algorithms: The first five chapters of this volume investigate advances in the use of instance-level, pairwise constraints for partitional and hierarchical clustering. The book then explores other types of constraints for clustering, including cluster size balancing, minimum cluster size, and cluster-level relational constraints. Theory: It also describes variations of the traditional clustering under constraints problem as well as approximation algorithms with helpful performance guarantees. Applications: The book ends by applying clustering with constraints to relational data, privacy-preserving data publishing, and video surveillance data. It discusses an interactive visual clustering approach, a distance metric learning approach, existential constraints, and automatically generated constraints. With contributions from industrial researchers and leading academic experts who pioneered the field, this volume delivers thorough coverage of the capabilities and limitations of constrained clustering methods and introduces new types of constraints and clustering algorithms.







Advances in Fuzzy Clustering and its Applications


Book Description

A comprehensive, coherent, and in-depth presentation of the state of the art in fuzzy clustering. Fuzzy clustering is now a mature and vibrant area of research with highly innovative advanced applications. Encapsulating this through presenting a careful selection of research contributions, this book addresses timely and relevant concepts and methods, whilst identifying major challenges and recent developments in the area. Split into five clear sections, Fundamentals, Visualization, Algorithms and Computational Aspects, Real-Time and Dynamic Clustering, and Applications and Case Studies, the book covers a wealth of novel, original and fully updated material, and in particular offers: a focus on the algorithmic and computational augmentations of fuzzy clustering and its effectiveness in handling high dimensional problems, distributed problem solving and uncertainty management; presentations of the important and relevant phases of cluster design, including the role of information granules and of fuzzy sets in the realization of the human-centricity facet of data analysis, as well as system modelling; demonstrations of how the results facilitate further detailed development of models and enhance interpretation aspects; and a carefully organized illustrative series of applications and case studies in which fuzzy clustering plays a pivotal role. This book will be of key interest to engineers associated with fuzzy control, bioinformatics, data mining, image processing, and pattern recognition, while computer engineers, students and researchers in most engineering disciplines will find this an invaluable resource and research tool.




Artificial Intelligence and Computational Intelligence


Book Description

This three-volume proceedings contains revised selected papers from the Second International Conference on Artificial Intelligence and Computational Intelligence, AICI 2011, held in Taiyuan, China, in September 2011. The total of 265 high-quality papers presented were carefully reviewed and selected from 1073 submissions. The topics covered in Part I are: applications of artificial intelligence; applications of computational intelligence; automated problem solving; biomedical informatics and computation; brain models/cognitive science; data mining and knowledge discovering; distributed AI and agents; evolutionary programming; expert and decision support systems; fuzzy computation; fuzzy logic and soft computing; and genetic algorithms.




Learning in Graphical Models


Book Description

In the past decade, a number of different research communities within the computational sciences have studied learning in networks, starting from a number of different points of view. There has been substantial progress in these different communities and surprising convergence has developed between the formalisms. The awareness of this convergence and the growing interest of researchers in understanding the essential unity of the subject underlie the current volume. Two research communities which have used graphical or network formalisms to particular advantage are the belief network community and the neural network community. Belief networks arose within computer science and statistics and were developed with an emphasis on prior knowledge and exact probabilistic calculations. Neural networks arose within electrical engineering, physics and neuroscience and have emphasised pattern recognition and systems modelling problems. This volume draws together researchers from these two communities and presents both kinds of networks as instances of a general unified graphical formalism. The book focuses on probabilistic methods for learning and inference in graphical models, algorithm analysis and design, theory and applications. Exact methods, sampling methods and variational methods are discussed in detail. Audience: A wide cross-section of computationally oriented researchers, including computer scientists, statisticians, electrical engineers, physicists and neuroscientists.







Knowledge Discovery in Spatial Data


Book Description

When I first came across the term data mining and knowledge discovery in databases, I was excited and curious to find out what it was all about. I was excited because the term tends to convey a new field that is in the making. I was curious because I wondered what it was doing that the other fields of research, such as statistics and the broad field of artificial intelligence, were not doing. After reading up on the literature, I have come to realize that it is not much different from conventional data analysis. The commonly used definition of knowledge discovery in databases: “the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data” is actually in line with the core mission of conventional data analysis. The process employed by conventional data analysis is by no means trivial, and the patterns in data to be unraveled have, of course, to be valid, novel, useful and understandable. Therefore, what is the commotion all about? Careful scrutiny of the main lines of research in data mining and knowledge discovery again told me that they are not much different from that of conventional data analysis. Putting aside data warehousing and database management aspects, again a main area of research in conventional database research, the rest of the tasks in data mining are largely the main concerns of conventional data analysis.