Algorithms on Trees and Graphs


Book Description

Graph algorithms is a well-established subject in mathematics and computer science. Beyond classical application fields, such as approximation, combinatorial optimization, graphics, and operations research, graph algorithms have recently attracted increased attention from computational molecular biology and computational chemistry. Centered around the fundamental issue of graph isomorphism, this text goes beyond classical graph problems of shortest paths, spanning trees, flows in networks, and matchings in bipartite graphs. Advanced algorithmic results and techniques of practical relevance are presented in a coherent and consolidated way. This book introduces graph algorithms on an intuitive basis followed by a detailed exposition in a literate programming style, with correctness proofs as well as worst-case analyses. Furthermore, full C++ implementations of all algorithms presented are given using the LEDA library of efficient data structures and algorithms.




Phylogenetic Supertrees


Book Description

This is the first book on "phylogenetic supertrees", a recent but controversial development for inferring evolutionary trees. Rather than analyze the combined primary character data directly, supertree construction proceeds by combining the tree topologies derived from those data. This difference in strategy has allowed for the exciting possibility of larger, more complete phylogenies than are otherwise currently possible, with the potential to revolutionize evolutionarily based research. This book provides a comprehensive look at supertrees, ranging from the methods used to build supertrees to the significance of supertrees to bioinformatic and biological research. Reviews of many of the major supertree methods are provided and four new techniques, including a Bayesian implementation of supertrees, are described for the first time. The far-reaching impact of supertrees on biological research is highlighted both in general terms and through specific examples from diverse clades such as flowering plants, even-toed ungulates, and primates. The book also critically examines the many outstanding challenges and problem areas for this relatively new field, showing the way for supertree construction in the age of genomics. Interdisciplinary contributions from the majority of the leading authorities on supertree construction in all areas of the bioinformatic community (biology, computer sciences, and mathematics) will ensure that this book is a valuable reference with wide appeal to anyone interested in phylogenetic inference.




Case-Based Reasoning Research and Development


Book Description

This book constitutes the refereed proceedings of the 8th International Conference on Case-Based Reasoning, ICCBR 2009, held in Seattle, WA, USA, in July 2009. The 17 revised full papers and 17 revised poster papers presented together with 2 invited talks were carefully reviewed and selected from 55 submissions. Covering a wide range of CBR topics of interest both to practitioners and researchers, the papers are devoted to theoretical/methodological as well as to applicative aspects of current CBR analysis.




Tree Balance Indices


Book Description

Whether you are looking for an introduction to the field of tree balance, a reference work on the multitude of available balance indices or inspiration for your future research, this book offers all three. It delves into the significance of tree balance in phylogenetics and other research domains, where numerous indices have been introduced over the years. While the variations in definitions and underlying principles among these indices have long remained a challenge, this survey addresses the problem by presenting formal definitions of balance and imbalance indices and establishing desirable properties. The book is comprehensive both in the inclusion of a variety of indices and in the information provided on them: the authors meticulously analyze and categorize established indices, shedding light on their general, statistical and combinatorial properties. They reveal that, while some known balance indices fail to meet the most basic criteria, certain tree shape statistics from other contexts prove to be effective balance measures. The collected properties are neatly presented, numerous new results are established, open research questions are highlighted, and possible applications are discussed. Reviewing over twenty (im)balance indices, the book provides a wealth of mathematical insights, accompanied by real-world examples showcasing the importance of tree balance in diverse research areas. Catering to researchers, students, mathematicians, and biologists, the book can be used as a textbook for university seminars, a reference on tree balance, and a source of inspiration for future research. It is accompanied by the free R package 'treebalance', a powerful tool to further explore and apply the discussed concepts, and a website allowing quick access to the main information and the latest developments in the field.




Bioconsensus


Book Description

In the rapidly developing field of bioconsensus, consensus methods from the social and behavioral sciences are used when choosing among alternative solutions to problems in the biological sciences. This text for mathematical and evolutionary biologists and computer scientists contains 16 contributions on various topics in bioconsensus. Included are papers originally presented at meetings held in 2000 and 2001 at the DIMACS Center as well as some from experts who did not attend the meetings. These papers provide historical background, discuss the axiomatic foundations of the field of bioconsensus, and apply consensus methods to real data. The volume is not indexed. Annotation (c)2003 Book News, Inc., Portland, OR (booknews.com).




Classification, 2nd Edition


Book Description

As the amount of information recorded and stored electronically grows ever larger, it becomes increasingly useful, if not essential, to develop better and more efficient ways to summarize and extract information from these large, multivariate data sets. The field of classification does just that: it investigates sets of "objects" to see if they can be summarized into a small number of classes comprising similar objects. Researchers have made great strides in the field over the last twenty years, and classification is no longer perceived as being concerned solely with exploratory analyses. The second edition of Classification incorporates many of the new and powerful methodologies developed since its first edition. Like its predecessor, this edition describes both clustering and graphical methods of representing data, and offers advice on how to decide which methods of analysis best apply to a particular data set. It goes even further, however, by providing critical overviews of recent developments not widely known, including efficient clustering algorithms, cluster validation, consensus classifications, and the classification of symbolic data. The author has taken an approach accessible to researchers in the wide variety of disciplines that can benefit from classification analysis and methods. He illustrates the methodologies by applying them to data sets: smaller sets given in the text, larger ones available through a Web site. Large multivariate data sets can be difficult to comprehend; the sheer volume and complexity can prove overwhelming. Classification methods provide efficient, accurate ways to make them less unwieldy and extract more information. Classification, Second Edition offers the ideal vehicle for gaining the background, learning the methodologies, and beginning to put these techniques to use.




Proceedings of the Fifth SIAM International Conference on Data Mining


Book Description

The Fifth SIAM International Conference on Data Mining continues the tradition of providing an open forum for the presentation and discussion of innovative algorithms as well as novel applications of data mining. Advances in information technology and data collection methods have led to the availability of large data sets in commercial enterprises and in a wide variety of scientific and engineering disciplines. The field of data mining draws upon extensive work in areas such as statistics, machine learning, pattern recognition, databases, and high performance computing to discover interesting and previously unknown information in data. This conference presents results in data mining, including applications, algorithms, software, and systems.




Analytic Combinatorics


Book Description

Analytic combinatorics aims to enable precise quantitative predictions of the properties of large combinatorial structures. The theory has emerged over recent decades as essential both for the analysis of algorithms and for the study of scientific models in many disciplines, including probability theory, statistical physics, computational biology, and information theory. With a careful combination of symbolic enumeration methods and complex analysis, drawing heavily on generating functions, results of sweeping generality emerge that can be applied in particular to fundamental structures such as permutations, sequences, strings, walks, paths, trees, graphs and maps. This account is the definitive treatment of the topic. The authors give full coverage of the underlying mathematics and a thorough treatment of both classical and modern applications of the theory. The text is complemented with exercises, examples, appendices and notes to aid understanding. The book can be used for an advanced undergraduate or a graduate course, or for self-study.




Exploratory Data Analysis with MATLAB


Book Description

Since the publication of the bestselling first edition, many advances have been made in exploratory data analysis (EDA). Covering innovative approaches for dimensionality reduction, clustering, and visualization, Exploratory Data Analysis with MATLAB®, Second Edition uses numerous examples and applications to show how the methods are used in practice. New to the Second Edition: discussions of nonnegative matrix factorization, linear discriminant analysis, curvilinear component analysis, independent component analysis, and smoothing splines; an expanded set of methods for estimating the intrinsic dimensionality of a data set; several clustering methods, including probabilistic latent semantic analysis and spectral-based clustering; additional visualization methods, such as a rangefinder boxplot, scatterplots with marginal histograms, biplots, and a new method called Andrews’ images; and instructions on a free MATLAB GUI toolbox for EDA. Like its predecessor, this edition continues to focus on using EDA methods, rather than theoretical aspects. The MATLAB codes for the examples, EDA toolboxes, data sets, and color versions of all figures are available for download at http://pi-sigma.info




Random Trees


Book Description

The aim of this book is to provide a thorough introduction to various aspects of trees in random settings and a systematic treatment of the mathematical analysis techniques involved. It should serve as a reference book as well as a basis for future research.