Sufficient Dimension Reduction


Book Description

Sufficient dimension reduction is a rapidly developing research field with wide applications in regression diagnostics, data visualization, machine learning, genomics, image processing, pattern recognition, and medicine, all of which produce large datasets with many variables. Sufficient Dimension Reduction: Methods and Applications with R introduces the basic theories and the main methodologies, provides practical and easy-to-use algorithms and computer code to implement these methodologies, and surveys recent advances at the frontiers of the field.

Features
* Provides comprehensive coverage of this emerging research field.
* Synthesizes a wide variety of dimension reduction methods under a few unifying principles such as projection in Hilbert spaces, kernel mapping, and von Mises expansion.
* Reflects the most recent advances, such as nonlinear sufficient dimension reduction, dimension folding for tensorial data, and sufficient dimension reduction for functional data.
* Includes a set of computer codes written in R that readers can easily implement.
* Uses real data sets available online to illustrate the usage and power of the described methods.

Sufficient dimension reduction has undergone momentous development in recent years, partly due to the increased demand for techniques to process high-dimensional data, a hallmark of our age of Big Data. This book will serve as an entry point into the field for beginning researchers and as a handy reference for advanced ones.

The author, Bing Li, obtained his Ph.D. from the University of Chicago. He is currently a Professor of Statistics at the Pennsylvania State University. His research interests cover sufficient dimension reduction, statistical graphical models, functional data analysis, machine learning, estimating equations and quasilikelihood, and robust statistics. He is a fellow of the Institute of Mathematical Statistics and the American Statistical Association. He is an Associate Editor for The Annals of Statistics and the Journal of the American Statistical Association.




Regression Graphics


Book Description

Since the rise of computer technology, many new possibilities have emerged for the graphical presentation of regression data that go beyond the classical approaches. The author often breaks new ground with his ideas; he illustrates them with numerous examples, diagrams, and figures (the corresponding 3D and color versions are available on the Internet). (11/98)




Dimension Reduction


Book Description

We give a tutorial overview of several foundational methods for dimension reduction. We divide the methods into projective methods and methods that model the manifold on which the data lie. For projective methods, we review projection pursuit, principal component analysis (PCA), kernel PCA, probabilistic PCA, canonical correlation analysis (CCA), kernel CCA, Fisher discriminant analysis, oriented PCA, and several techniques for sufficient dimension reduction. For the manifold methods, we review multidimensional scaling (MDS), landmark MDS, Isomap, locally linear embedding, Laplacian eigenmaps, and spectral clustering. Although the review focuses on foundations, we also provide pointers to some more modern techniques. We also describe the correlation dimension as one method for estimating the intrinsic dimension, and we point out that the notion of dimension can be a scale-dependent quantity. The Nyström method, which links several of the manifold algorithms, is also reviewed. We use a publicly available dataset to illustrate some of the methods. The goal is to provide a self-contained overview of key concepts underlying many of these algorithms, and to give pointers for further reading.




Advances in Data Science


Book Description

Data science unifies statistics, data analysis and machine learning to achieve a better understanding of the masses of data produced today, and to improve prediction. Special kinds of data (symbolic, network, complex, compositional) are increasingly frequent in data science. These data require specific methodologies, but reference works in this field are lacking. Advances in Data Science fills this gap. It presents a collection of up-to-date contributions by eminent scholars following two international workshops held in Beijing and Paris. The 10 chapters are organized into four parts: Symbolic Data, Complex Data, Network Data and Clustering. They include fundamental contributions, as well as applications to several domains, including business and the social sciences.




Partially Linear Models


Book Description

In the last ten years, there has been increasing interest and activity in the general area of partially linear regression smoothing in statistics. Many methods and techniques have been proposed and studied. This monograph aims to give an up-to-date presentation of the state of the art in partially linear regression techniques. The emphasis is on methodology rather than theory, with a particular focus on applications of partially linear regression techniques to various statistical problems. These problems include least squares regression, asymptotically efficient estimation, bootstrap resampling, censored data analysis, linear measurement error models, nonlinear measurement models, and nonlinear and nonparametric time series models.




Modern Dimension Reduction


Book Description

Data are not only ubiquitous in society, but are increasingly complex in both size and dimensionality. Dimension reduction offers researchers and scholars the ability to make such complex, high dimensional data spaces simpler and more manageable. This Element offers readers a suite of modern unsupervised dimension reduction techniques, along with hundreds of lines of R code, to efficiently represent the original high dimensional data space in a simplified, lower dimensional subspace. Launching from the earliest dimension reduction technique, principal components analysis, and using real social science data, I introduce and walk readers through application of the following techniques: locally linear embedding, t-distributed stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection, self-organizing maps, and deep autoencoders. The result is a well-stocked toolbox of unsupervised algorithms for tackling the complexities of high dimensional data so common in modern society. All code is publicly accessible on GitHub.




Robust Statistics


Book Description

A new edition of this popular text on robust statistics, thoroughly updated to include new and improved methods and to focus on implementing the methodology with the increasingly popular open-source software R. Classical statistical methods fail to cope well with outliers associated with deviations from standard distributions. Robust statistical methods take these deviations into account when estimating the parameters of parametric models, thus increasing the reliability of fitted models and associated inference. This second edition of Robust Statistics: Theory and Methods (with R) presents broad coverage of the theory of robust statistics, integrated with computing methods and applications. Updated to include important new research results of the last decade and to focus on the use of the popular software package R, it features in-depth coverage of the key methodology, including regression, multivariate analysis, and time series modeling. The book is illustrated throughout by a range of examples and applications, supported by a companion website featuring data sets and R code that allow the reader to reproduce the examples given in the book. Robust Statistics: Theory and Methods (with R) is presented as the most comprehensive, definitive, and up-to-date treatment of the subject. It features chapters on estimating location and scale; measuring robustness; linear regression with fixed and with random predictors; multivariate analysis; generalized linear models; time series; numerical algorithms; and asymptotic theory of M-estimates.

* Explains both the use and theoretical justification of robust methods
* Guides readers in selecting and using the most appropriate robust methods for their problems
* Features computational algorithms for the core methods

Research results of the last decade covered in this second edition include fast deterministic robust regression, finite-sample robustness, robust regularized regression, robust location and scatter estimation with missing data, robust estimation with independent outliers in variables, and robust mixed linear models. Robust Statistics aims to stimulate the use of robust methods as a powerful tool to increase the reliability and accuracy of statistical modelling and data analysis. It is an ideal resource for researchers, practitioners, and graduate students in statistics, engineering, computer science, and physical and social sciences.




Active Subspaces


Book Description

Scientists and engineers use computer simulations to study relationships between a model's input parameters and its outputs. However, thorough parameter studies are challenging, if not impossible, when the simulation is expensive and the model has several inputs. To enable studies in these instances, the engineer may attempt to reduce the dimension of the model's input parameter space. Active subspaces are an emerging set of dimension reduction tools that identify important directions in the parameter space. This book describes techniques for discovering a model's active subspace and proposes methods for exploiting the reduced dimension to enable otherwise infeasible parameter studies. Readers will find new ideas for dimension reduction, easy-to-implement algorithms, and several examples of active subspaces in action.
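As a rough illustration of what "identifying important directions in the parameter space" can mean, the sketch below uses one common formulation: average the outer products of sampled gradients of the model output and take the leading eigenvectors of the resulting matrix. This is a minimal sketch under stated assumptions, not the book's algorithm; the function name `active_subspace`, the toy model f(x) = sin(w · x), and the sampling range are all hypothetical choices made for the example.

```python
import numpy as np

def active_subspace(gradients, k):
    """Estimate a k-dimensional active subspace from sampled gradients.

    gradients: array of shape (n_samples, n_inputs), one gradient per row.
    Returns the eigenvalues (descending) and the top-k eigenvectors.
    """
    grads = np.asarray(gradients, dtype=float)
    # Monte Carlo estimate of C = E[grad f(x) grad f(x)^T].
    C = grads.T @ grads / grads.shape[0]
    # eigh returns eigenvalues in ascending order; reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order[:k]]

# Hypothetical toy model: f(x) = sin(w . x), so grad f(x) = cos(w . x) * w.
# The output varies only along w, so the active subspace is one-dimensional.
rng = np.random.default_rng(0)
w = np.array([3.0, 1.0, 0.0, 0.0, 0.0])
X = rng.uniform(-1.0, 1.0, size=(500, 5))
grads = np.cos(X @ w)[:, None] * w

eigvals, W1 = active_subspace(grads, k=1)
# Cosine of the angle between the recovered direction and w (sign ignored).
alignment = abs(W1[:, 0] @ w) / np.linalg.norm(w)
print("eigenvalues:", eigvals)
print("alignment with w:", alignment)
```

In this toy case every gradient is a multiple of `w`, so only one eigenvalue is meaningfully nonzero and the recovered direction lines up with `w`. For a real simulation the gradients would come from the model itself (for example by adjoint or finite-difference methods), and a gap in the eigenvalues is what would suggest how many directions to keep.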




Theory of Spatial Statistics


Book Description

Theory of Spatial Statistics: A Concise Introduction presents the most important models used in spatial statistics, including random fields and point processes, from a rigorous mathematical point of view and shows how to carry out statistical inference. It contains full proofs, real-life examples and theoretical exercises. Solutions to the latter are available in an appendix. Assuming maturity in probability and statistics, these concise lecture notes are self-contained and cover enough material for a semester course. They may also serve as a reference book for researchers.

Features
* Presents the mathematical foundations of spatial statistics.
* Contains worked examples from mining, disease mapping, forestry, soil and environmental science, and criminology.
* Gives pointers to the literature to facilitate further study.
* Provides example code in R to encourage the student to experiment.
* Offers exercises and their solutions to test and deepen understanding.

The book is suitable for postgraduate and advanced undergraduate students in mathematics and statistics.