Graph-Based Clustering and Data Visualization Algorithms


Book Description

This work presents a data visualization technique that combines graph-based topology representation and dimensionality reduction methods to visualize the intrinsic data structure in a low-dimensional vector space. The application of graphs in clustering and visualization has several advantages. A graph of important edges (where edges characterize relations and weights represent similarities or distances) provides a compact representation of the entire complex data set. This text describes clustering and visualization methods that are able to utilize information hidden in these graphs, based on the synergistic combination of clustering, graph-theory, neural networks, data visualization, dimensionality reduction, fuzzy methods, and topology learning. The work contains numerous examples to aid in the understanding and implementation of the proposed algorithms, supported by a MATLAB toolbox available at an associated website.







Data Clustering: Theory, Algorithms, and Applications, Second Edition


Book Description

Data clustering, also known as cluster analysis, is an unsupervised process that divides a set of objects into homogeneous groups. Since the publication of the first edition of this monograph in 2007, development in the area has exploded, especially in clustering algorithms for big data and open-source software for cluster analysis. This second edition reflects these new developments, covers the basics of data clustering, includes a list of popular clustering algorithms, and provides program code that helps users implement clustering algorithms. Data Clustering: Theory, Algorithms and Applications, Second Edition will be of interest to researchers, practitioners, and data scientists as well as undergraduate and graduate students.




Managing and Mining Graph Data


Book Description

Managing and Mining Graph Data is a comprehensive survey book in graph management and mining. It contains extensive surveys on a variety of important graph topics such as graph languages, indexing, clustering, data generation, pattern mining, classification, keyword search, pattern matching, and privacy. It also studies a number of domain-specific scenarios such as stream mining, web graphs, social networks, chemical and biological data. The chapters are written by well known researchers in the field, and provide a broad perspective of the area. This is the first comprehensive survey book in the emerging topic of graph data processing. Managing and Mining Graph Data is designed for a varied audience composed of professors, researchers and practitioners in industry. This volume is also suitable as a reference book for advanced-level database students in computer science and engineering.




Graph Algorithms


Book Description

Discover how graph algorithms can help you leverage the relationships within your data to develop more intelligent solutions and enhance your machine learning models. You’ll learn how graph analytics are uniquely suited to unfold complex structures and reveal difficult-to-find patterns lurking in your data. Whether you are trying to build dynamic network models or forecast real-world behavior, this book illustrates how graph algorithms deliver value—from finding vulnerabilities and bottlenecks to detecting communities and improving machine learning predictions. This practical book walks you through hands-on examples of how to use graph algorithms in Apache Spark and Neo4j—two of the most common choices for graph analytics. Also included: sample code and tips for over 20 practical graph algorithms that cover optimal pathfinding, importance through centrality, and community detection. Learn how graph analytics vary from conventional statistical analysis Understand how classic graph algorithms work, and how they are applied Get guidance on which algorithms to use for different types of questions Explore algorithm examples with working code and sample datasets from Spark and Neo4j See how connected feature extraction can increase machine learning accuracy and precision Walk through creating an ML workflow for link prediction combining Neo4j and Spark




Euro-Par 2015: Parallel Processing Workshops


Book Description

This book constitutes the thoroughly refereed post-conference proceedings of 12 workshops held at the 21st International Conference on Parallel and Distributed Computing, Euro-Par 2015, in Vienna, Austria, in August 2015. The 67 revised full papers presented were carefully reviewed and selected from 121 submissions. The volume includes papers from the following workshops: BigDataCloud: 4th Workshop on Big Data Management in Clouds - Euro-EDUPAR: First European Workshop on Parallel and Distributed Computing Education for Undergraduate Students - Hetero Par: 13th International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms - LSDVE: Third Workshop on Large Scale Distributed Virtual Environments - OMHI: 4th International Workshop on On-chip Memory Hierarchies and Interconnects - PADAPS: Third Workshop on Parallel and Distributed Agent-Based Simulations - PELGA: Workshop on Performance Engineering for Large-Scale Graph Analytics - REPPAR: Second International Workshop on Reproducibility in Parallel Computing - Resilience: 8th Workshop on Resiliency in High Performance Computing in Clusters, Clouds, and Grids - ROME: Third Workshop on Runtime and Operating Systems for the Many Core Era - UCHPC: 8th Workshop on UnConventional High Performance Computing - and VHPC: 10th Workshop on Virtualization in High-Performance Cloud Computing.




Algorithms and Models for Network Data and Link Analysis


Book Description

A hands-on, entry-level guide to algorithms for extracting information about social and economic behavior from network data.




Graph-theoretic Techniques for Web Content Mining


Book Description

This book describes exciting new opportunities for utilizing robust graph representations of data with common machine learning algorithms. Graphs can model additional information which is often not present in commonly used data representations, such as vectors. Through the use of graph distance ? a relatively new approach for determining graph similarity ? the authors show how well-known algorithms, such as k-means clustering and k-nearest neighbors classification, can be easily extended to work with graphs instead of vectors. This allows for the utilization of additional information found in graph representations, while at the same time employing well-known, proven algorithms.To demonstrate and investigate these novel techniques, the authors have selected the domain of web content mining, which involves the clustering and classification of web documents based on their textual substance. Several methods of representing web document content by graphs are introduced; an interesting feature of these representations is that they allow for a polynomial time distance computation, something which is typically an NP-complete problem when using graphs. Experimental results are reported for both clustering and classification in three web document collections using a variety of graph representations, distance measures, and algorithm parameters.In addition, this book describes several other related topics, many of which provide excellent starting points for researchers and students interested in exploring this new area of machine learning further. These topics include creating graph-based multiple classifier ensembles through random node selection and visualization of graph-based data using multidimensional scaling.




Handbook of Artificial Intelligence in Healthcare


Book Description

Artificial Intelligence (AI) has transformed many aspects of our daily activities. Health and well-being of humans stand as one of the key domains where AI has achieved significant progresses, saving time, costs, and potentially lives, as well as fostering economic resilience, particularly under the COVID-19 pandemic environments. This book is a sequel of the Handbook of Artificial Intelligence in Healthcare. The first volume of the Handbook is dedicated to present advances and applications of AI methodologies in several specific areas, i.e., signal, image, and video processing as well as information and data analytics. In this second volume of the Handbook, general practicality challenges and future prospects of AI methodologies pertaining to healthcare and related domains are presented in Part 1 and Part 2, respectively. It is envisaged that the selected studies will provide readers a general perspective on the issues, challenges, and opportunities in designing, developing, and implementing AI-based tools and solutions in the healthcare sector, bringing benefits to transform and advance health and well-being development of humans..




Principal Manifolds for Data Visualization and Dimension Reduction


Book Description

The book starts with the quote of the classical Pearson definition of PCA and includes reviews of various methods: NLPCA, ICA, MDS, embedding and clustering algorithms, principal manifolds and SOM. New approaches to NLPCA, principal manifolds, branching principal components and topology preserving mappings are described. Presentation of algorithms is supplemented by case studies. The volume ends with a tutorial PCA deciphers genome.