Statistical Network Analysis: Models, Issues, and New Directions


Book Description

This book constitutes the thoroughly refereed post-proceedings of the International Workshop on Statistical Network Analysis: Models, Issues, and New Directions, held in Pittsburgh, PA, USA in June 2006 as an associated event of the 23rd International Conference on Machine Learning, ICML 2006. It covers probabilistic methods for network analysis, paying special attention to model design and computational issues of learning and inference.




A Survey of Statistical Network Models


Book Description

Networks are ubiquitous in science and have become a focal point for discussion in everyday life. Formal statistical models for the analysis of network data have emerged as a major topic of interest in diverse areas of study, and most of these involve a form of graphical representation. Probability models on graphs date back to 1959. Along with empirical studies in social psychology and sociology from the 1960s, these early works generated an active network community and a substantial literature in the 1970s. This effort moved into the statistical literature in the late 1970s and 1980s, and the past decade has seen a burgeoning network literature in statistical physics and computer science. The growth of the World Wide Web and the emergence of online networking communities such as Facebook, MySpace, and LinkedIn, and a host of more specialized professional network communities has intensified interest in the study of networks and network data. Our goal in this review is to provide the reader with an entry point to this burgeoning literature. We begin with an overview of the historical development of statistical network modeling and then we introduce a number of examples that have been studied in the network literature. Our subsequent discussion focuses on a number of prominent static and dynamic network models and their interconnections. We emphasize formal model descriptions, and pay special attention to the interpretation of parameters and their estimation. We end with a description of some open problems and challenges for machine learning and statistics.




Inference in Weak Factor Models


Book Description

In this paper, we consider statistical inference for high-dimensional approximate factor models. We posit a weak factor structure, in which the factor loading matrix can be sparse and the signal eigenvalues may diverge more slowly than the cross-sectional dimension, N. We propose a novel inferential procedure to decide whether each component of the factor loadings is zero or not, and prove that this controls the false discovery rate (FDR) below a pre-assigned level, while the power tends to unity. This "factor selection" procedure is primarily based on a de-sparsified (or debiased) version of the WF-SOFAR estimator of Uematsu and Yamagata (2020), but is also applicable to the principal component (PC) estimator. After the factor selection, the re-sparsified WF-SOFAR and sparsified PC estimators are proposed and their consistency is established. Finite sample evidence supports the theoretical results. We apply our procedure to the FRED-MD macroeconomic and financial data, consisting of 128 series from June 1999 to May 2019. The results strongly suggest the existence of sparse factor loadings and exhibit a clear association of each of the extracted factors with a group of macroeconomic variables. In particular, we find a price factor, a housing factor, an output and income factor, and a money, credit and stock market factor.




Advances in Statistical Bioinformatics


Book Description

Providing genome-informed personalized treatment is a goal of modern medicine. Identifying new translational targets in nucleic acid characterizations is an important step toward that goal. The information tsunami produced by such genome-scale investigations is stimulating parallel developments in statistical methodology and inference, analytical frameworks, and computational tools. Within the context of genomic medicine and with a strong focus on cancer research, this book describes the integration of high-throughput bioinformatics data from multiple platforms to inform our understanding of the functional consequences of genomic alterations. This includes rigorous and scalable methods for simultaneously handling diverse data types such as gene expression array, miRNA, copy number, methylation, and next-generation sequencing data. This material is written for statisticians who are interested in modeling and analyzing high-throughput data. Chapters by experts in the field offer a thorough introduction to the biological and technical principles behind multiplatform high-throughput experimentation.




Statistical Modeling and Computation


Book Description

This textbook on statistical modeling and statistical inference will assist advanced undergraduate and graduate students. Statistical Modeling and Computation provides a unique introduction to modern Statistics from both classical and Bayesian perspectives. It also offers an integrated treatment of Mathematical Statistics and modern statistical computation, emphasizing statistical modeling, computational techniques, and applications. Each of the three parts will cover topics essential to university courses. Part I covers the fundamentals of probability theory. In Part II, the authors introduce a wide variety of classical models that include, among others, linear regression and ANOVA models. In Part III, the authors address the statistical analysis and computation of various advanced models, such as generalized linear, state-space and Gaussian models. Particular attention is paid to fast Monte Carlo techniques for Bayesian inference on these models. Throughout the book the authors include a large number of illustrative examples and solved problems. The book also features a section with solutions, an appendix that serves as a MATLAB primer, and a mathematical supplement.




Statistical Learning Using Neural Networks


Book Description

Statistical Learning Using Neural Networks: A Guide for Statisticians and Data Scientists with Python introduces artificial neural networks, starting from the basics and gradually moving to more demanding material, so that readers can learn the theory and its applications in statistical methods through concrete Python code examples. It presents a wide range of widely used statistical methodologies, applied in several research areas, with Python code examples that are available online. It is suitable for scientists and developers as well as graduate students. Key features: discusses applications in several research areas; covers a wide range of widely used statistical methodologies; includes Python code examples; gives numerous neural network models. This book covers fundamental concepts of neural networks, including Multivariate Statistics Neural Networks, Regression Neural Network Models, Survival Analysis Networks, Time Series Forecasting Networks, Control Chart Networks, and Statistical Inference Results. This book is suitable for both teaching and research. It introduces neural networks and is a guide for those outside academia working in data mining and artificial intelligence (AI). This book brings together data analysis from statistics to computer science using neural networks.




Modern Analysis of Customer Surveys


Book Description

Customer survey studies deal with customer, consumer and user satisfaction with a product or service. In practice, many of the customer surveys conducted by business and industry are analyzed in a very simple way, without using models or statistical methods. Typical reports include descriptive statistics and basic graphical displays. As demonstrated in this book, integrating such basic analysis with more advanced tools provides insights into non-obvious patterns and important relationships between the survey variables. This knowledge can significantly affect the conclusions derived from a survey. Key features: Provides an integrated, case-study-based approach to analysing customer survey data. Presents a general introduction to customer surveys within an organization’s business cycle. Combines classical techniques with modern and non-standard tools. Focuses on probabilistic techniques from the area of statistics/data analysis and covers all major recent developments. Accompanied by a supporting website containing datasets and R scripts. Customer survey specialists, quality managers and market researchers will benefit from this book, as will specialists in marketing, data mining and business intelligence.




Statistical and Machine Learning Approaches for Network Analysis


Book Description

Explore the multidisciplinary nature of complex networks through machine learning techniques. Statistical and Machine Learning Approaches for Network Analysis provides an accessible framework for structurally analyzing graphs by bringing together known and novel approaches on graph classes and graph measures for classification. By providing different approaches based on experimental data, the book sets itself apart from the current literature by exploring the application of machine learning techniques to various types of complex networks. Comprising chapters written by internationally renowned researchers in the field of interdisciplinary network theory, the book presents current and classical methods to analyze networks statistically. Methods from machine learning, data mining, and information theory are strongly emphasized throughout. Real data sets are used to showcase the discussed methods and topics, which include: a survey of computational approaches to reconstruct and partition biological networks; an introduction to complex networks (measures, statistical properties, and models); modeling for evolving biological networks; the structure of an evolving random bipartite graph; density-based enumeration in structured data; and hyponym extraction employing a weighted graph kernel. Statistical and Machine Learning Approaches for Network Analysis is an excellent supplemental text for graduate-level, cross-disciplinary courses in applied discrete mathematics, bioinformatics, pattern recognition, and computer science. The book is also a valuable reference for researchers and practitioners in the fields of applied discrete mathematics, machine learning, data mining, and biostatistics.