Nearest Neighbor Search:


Book Description

Modern applications are both data- and computationally intensive and require the storage and manipulation of voluminous traditional (alphanumeric) and nontraditional data sets (images, text, geometric objects, time-series). Examples of such emerging application domains are: Geographical Information Systems (GIS), Multimedia Information Systems, CAD/CAM, Time-Series Analysis, Medical Information Systems, On-Line Analytical Processing (OLAP), and Data Mining. These applications pose diverse requirements with respect to the information and the operations that need to be supported. From the database perspective, new techniques and tools therefore need to be developed towards increased processing efficiency. This monograph explores the way spatial database management systems aim at supporting queries that involve the spatial characteristics of the underlying data, and discusses query processing techniques for nearest neighbor queries. It provides both basic concepts and state-of-the-art results in spatial databases and parallel processing research, and studies numerous applications of nearest neighbor queries.




Dimensionality Reduction with Unsupervised Nearest Neighbors


Book Description

This book is devoted to a novel approach for dimensionality reduction based on the well-known nearest neighbor method, a powerful classification and regression approach. It starts with an introduction to machine learning concepts and a real-world application from the energy domain. Then, unsupervised nearest neighbors (UNN) is introduced as an efficient iterative method for dimensionality reduction. Various UNN models are developed step by step, ranging from a simple iterative strategy for discrete latent spaces to a stochastic kernel-based algorithm for learning submanifolds with independent parameterizations. Extensions that allow the embedding of incomplete and noisy patterns are introduced. Various optimization approaches are compared, from evolutionary to swarm-based heuristics. Experimental comparisons to related methodologies, taking into account artificial test data sets and also real-world data, demonstrate the behavior of UNN in practical scenarios. The book contains numerous color figures to illustrate the introduced concepts and to highlight the experimental results.




Data Algorithms


Book Description

If you are ready to dive into the MapReduce framework for processing large datasets, this practical book takes you step by step through the algorithms and tools you need to build distributed MapReduce applications with Apache Hadoop or Apache Spark. Each chapter provides a recipe for solving a massive computational problem, such as building a recommendation system. You’ll learn how to implement the appropriate MapReduce solution with code that you can use in your projects. Dr. Mahmoud Parsian covers basic design patterns, optimization techniques, and data mining and machine learning solutions for problems in bioinformatics, genomics, statistics, and social network analysis. This book also includes an overview of MapReduce, Hadoop, and Spark. Topics include: market basket analysis for a large set of transactions; data mining algorithms (K-means, KNN, and Naive Bayes); using huge genomic data to sequence DNA and RNA; Naive Bayes theorem and Markov chains for data and market prediction; recommendation algorithms and pairwise document similarity; linear regression, Cox regression, and Pearson correlation; allelic frequency and mining DNA; and social network analysis (recommendation systems, counting triangles, sentiment analysis).




Nearest-neighbor Methods in Learning and Vision


Book Description

This text presents theoretical and practical discussions of nearest neighbour (NN) methods in machine learning and examines computer vision as an application domain in which the benefit of these advanced methods is often dramatic.




Lectures on the Nearest Neighbor Method


Book Description

This text presents a wide-ranging and rigorous overview of nearest neighbor methods, one of the most important paradigms in machine learning. Now in one self-contained volume, this book systematically covers key statistical, probabilistic, combinatorial and geometric ideas for understanding, analyzing and developing nearest neighbor methods. Gérard Biau is a professor at Université Pierre et Marie Curie (Paris). Luc Devroye is a professor at the School of Computer Science at McGill University (Montreal).




Proceedings of the International Congress of Mathematicians 2018 (ICM 2018) (In 4 Volumes)


Book Description

The Proceedings of the ICM publishes the talks given by invited speakers at the conference organized by the International Mathematical Union every four years. It covers several areas of mathematics and includes the laudatios for the Fields Medal, the Nevanlinna, Gauss and Leelavati Prizes, and the Chern Medal.







Fundamentals of Database Indexing and Searching


Book Description

Fundamentals of Database Indexing and Searching presents well-known database searching and indexing techniques. It focuses on similarity search queries, showing how to use distance functions to measure the notion of dissimilarity. After defining database queries and similarity search queries, the book organizes the most common and representative index structures according to their characteristics. The author first describes low-dimensional index structures, memory-based index structures, and hierarchical disk-based index structures. He then outlines useful distance measures and index structures that use the distance information to efficiently solve similarity search queries. Focusing on the difficult dimensionality phenomenon, he also presents several indexing methods that specifically deal with high-dimensional spaces. In addition, the book covers data reduction techniques, including embedding, various data transforms, and histograms. Through numerous real-world examples, this book explores how to effectively index and search for information in large collections of data. Requiring only a basic computer science background, it is accessible to practitioners and advanced undergraduate students.




Data Mining in Agriculture


Book Description

Data Mining in Agriculture represents a comprehensive effort to provide graduate students and researchers with an analytical text on data mining techniques applied to agriculture and environment-related fields. This book offers both theoretical and practical insights, presenting the context of each data mining technique intuitively, with ample concrete examples represented graphically and with algorithms written in MATLAB®.




Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms


Book Description

Proceedings of a conference that took place in Austin, Texas, in January 1993. Contributors include well-known names from the field of computer science, among them Donald Knuth, author of several computer books of "biblical" importance. The diverse selection of paper topics includes dynamic point location, ray shooting, and shortest paths in planar maps; optimistic sorting and information theoretic complexity; and an optimal randomized algorithm for the cow-path problem. No index. Annotation copyright by Book News, Inc., Portland, OR.