High-Dimensional Data Analysis with Low-Dimensional Models


Book Description

Connecting theory with practice, this systematic and rigorous introduction covers the fundamental principles, algorithms and applications of key mathematical models for high-dimensional data analysis. Comprehensive in its approach, it provides unified coverage of many different low-dimensional models and analytical techniques, including sparse and low-rank models, and both convex and non-convex formulations. Readers will learn how to develop efficient and scalable algorithms for solving real-world problems, supported by numerous examples and exercises throughout, and how to use the computational tools learnt in several application contexts. Applications presented include scientific imaging, communication, face recognition, 3D vision, and deep networks for classification. With code available online, this is an ideal textbook for senior and graduate students in computer science, data science, and electrical engineering, as well as for those taking courses on sparsity, low-dimensional structures, and high-dimensional data. Foreword by Emmanuel Candès.




High-Dimensional Data Analysis in Cancer Research


Book Description

Multivariate analysis is a mainstay of the statistical tools used in the analysis of biomedical data. It is concerned with associating data matrices of n rows by p columns, with rows representing samples (or patients) and columns representing attributes of samples, with response variables such as patient outcome. Classically, the sample size n is much larger than p, the number of variables. The properties of statistical models have mostly been discussed under the assumption of fixed p and infinite n. Advances in the biological sciences and technologies have revolutionized the investigation of cancer. Biomedical data collection has become more automated and more extensive. We are now in an era where p is a large fraction of n, or even much larger than n. Take proteomics as an example. Although proteomic techniques have been researched and developed for many decades to identify proteins or peptides uniquely associated with a given disease state, until recently this has mostly been a laborious process, carried out one protein at a time. The advent of high-throughput proteome-wide technologies such as liquid chromatography-tandem mass spectrometry makes it possible to generate proteomic signatures that facilitate rapid development of new strategies for proteomics-based detection of disease. This poses new challenges and calls for scalable solutions to the analysis of such high-dimensional data. In this volume, we present systematic and analytical approaches and strategies from both biostatistics and bioinformatics for the analysis of correlated and high-dimensional data.




Statistics for High-Dimensional Data


Book Description

Modern statistics deals with large and complex data sets, and consequently with models containing a large number of parameters. This book presents a detailed account of recently developed approaches, including the Lasso and versions of it for various models, boosting methods, undirected graphical modeling, and procedures controlling false positive selections. A special characteristic of the book is that it contains comprehensive mathematical theory on high-dimensional statistics combined with methodology, algorithms and illustrations with real data examples. This in-depth approach highlights the methods’ great potential and practical applicability in a variety of settings. As such, it is a valuable resource for researchers, graduate students and experts in statistics, applied mathematics and computer science.




High-Dimensional Statistics


Book Description

A coherent introductory text from a groundbreaking researcher, focusing on clarity and motivation to build intuition and understanding.




Analysis of Multivariate and High-Dimensional Data


Book Description

This modern approach integrates classical and contemporary methods, fusing theory and practice and bridging the gap to statistical learning.




Statistical Analysis for High-Dimensional Data


Book Description

This book features research contributions from The Abel Symposium on Statistical Analysis for High Dimensional Data, held in Nyvågar, Lofoten, Norway, in May 2014. The focus of the symposium was on statistical and machine learning methodologies specifically developed for inference in “big data” situations, with particular reference to genomic applications. The contributors, who are among the most prominent researchers on the theory of statistics for high dimensional inference, present new theories and methods, as well as challenging applications and computational solutions. Specific themes include, among others, variable selection and screening, penalised regression, sparsity, thresholding, low dimensional structures, computational challenges, non-convex situations, learning graphical models, sparse covariance and precision matrices, semi- and non-parametric formulations, multiple testing, classification, factor models, clustering, and preselection. Highlighting cutting-edge research and casting light on future research directions, the contributions will benefit graduate students and researchers in computational biology, statistics and the machine learning community.




High-dimensional Data Analysis


Book Description

Over the last few years, significant developments have been taking place in high-dimensional data analysis, driven primarily by a wide range of applications in many fields such as genomics and signal processing. In particular, substantial advances have been made in the areas of feature selection, covariance estimation, classification and regression. This book intends to examine important issues arising from high-dimensional data analysis to explore key ideas for statistical inference and prediction. It is structured around topics on multiple hypothesis testing, feature selection, regression, cla.




High-Dimensional Single Cell Analysis


Book Description

This volume highlights the most interesting biomedical and clinical applications of high-dimensional flow and mass cytometry. It reviews current practical approaches used to perform high-dimensional experiments and addresses key bioinformatic techniques for the analysis of data sets involving dozens of parameters in millions of single cells. Topics include single cell cancer biology; studies of the human immunome; exploration of immunological cell types such as CD8+ T cells; decipherment of signaling processes of cancer; mass-tag cellular barcoding; analysis of protein interactions by proximity ligation assays; Cytobank, a platform for the analysis of cytometry data; computational analysis of high-dimensional flow cytometric data; computational deconvolution approaches for the description of intracellular signaling dynamics and hyperspectral cytometry. All 10 chapters of this book have been written by respected experts in their fields. It is an invaluable reference book for both basic and clinical researchers.




Geometric Structure of High-Dimensional Data and Dimensionality Reduction


Book Description

"Geometric Structure of High-Dimensional Data and Dimensionality Reduction" adopts data geometry as a framework to address various methods of dimensionality reduction. In addition to the introduction to well-known linear methods, the book moreover stresses the recently developed nonlinear methods and introduces the applications of dimensionality reduction in many areas, such as face recognition, image segmentation, data classification, data visualization, and hyperspectral imagery data analysis. Numerous tables and graphs are included to illustrate the ideas, effects, and shortcomings of the methods. MATLAB code of all dimensionality reduction algorithms is provided to aid the readers with the implementations on computers. The book will be useful for mathematicians, statisticians, computer scientists, and data analysts. It is also a valuable handbook for other practitioners who have a basic background in mathematics, statistics and/or computer algorithms, like internet search engine designers, physicists, geologists, electronic engineers, and economists. Jianzhong Wang is a Professor of Mathematics at Sam Houston State University, U.S.A.




Introduction to High-Dimensional Statistics


Book Description

Praise for the first edition: "[This book] succeeds singularly at providing a structured introduction to this active field of research. ... it is arguably the most accessible overview yet published of the mathematical ideas and principles that one needs to master to enter the field of high-dimensional statistics. ... recommended to anyone interested in the main results of current research in high-dimensional statistics as well as anyone interested in acquiring the core mathematical skills to enter this area of research." (Journal of the American Statistical Association)

Introduction to High-Dimensional Statistics, Second Edition preserves the philosophy of the first edition: to be a concise guide for students and researchers discovering the area and interested in the mathematics involved. The main concepts and ideas are presented in simple settings, thereby avoiding unessential technicalities. High-dimensional statistics is a fast-evolving field, and much progress has been made on a large variety of topics, providing new insights and methods. Offering a succinct presentation of the mathematical foundations of high-dimensional statistics, this new edition:
Offers revised chapters from the previous edition, with additional material on important topics including compressed sensing, estimation with convex constraints, the slope estimator, simultaneously low-rank and row-sparse linear regression, and aggregation of a continuous set of estimators.
Introduces three new chapters on iterative algorithms, clustering, and minimax lower bounds.
Provides enhanced appendices, mainly with the addition of the Davis-Kahan perturbation bound and of two simple versions of the Hanson-Wright concentration inequality.
Covers cutting-edge statistical methods including model selection, sparsity and the Lasso, iterative hard thresholding, aggregation, support vector machines, and learning theory.
Provides detailed exercises at the end of every chapter with collaborative solutions on a wiki site.
Illustrates concepts with simple but clear practical examples.