Estimation and Inference of Change Points in High Dimensional Factor Models


Book Description

In this paper, we consider the estimation of break points in high-dimensional factor models in which the unobserved factors are estimated by principal component analysis (PCA). The factor loading matrix is assumed to have a structural break at an unknown date. We establish conditions under which the least squares (LS) estimator is consistent for the break date; our consistency result holds for both large and small breaks. We also derive the asymptotic distribution of the LS estimator. Simulation results confirm that the break date can be estimated accurately by LS even when the breaks are small. In two empirical applications, we apply our method to estimate break points in the U.S. stock market and the U.S. macroeconomy, respectively.
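The two-step procedure described above, PCA estimation of the factors followed by a least squares search over candidate break dates, can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the trimming fraction, and the use of full-sample PCA factors for every candidate date are our own simplifying choices.

```python
import numpy as np

def estimate_break_date(X, r, trim=0.1):
    """Least squares break-date search using PCA-estimated factors.

    X is a T x N panel, r the number of factors. Factors are estimated
    once by PCA on the full sample; each candidate date k then gets its
    own pre- and post-break loadings, and the date minimizing the total
    sum of squared residuals is returned.
    """
    T, _ = X.shape
    # PCA: factor estimates proportional to the top-r eigenvectors of X X'
    _, eigvecs = np.linalg.eigh(X @ X.T)
    F = np.sqrt(T) * eigvecs[:, -r:]                 # T x r factor estimates
    lo, hi = int(trim * T), int((1 - trim) * T)      # trimmed search range
    ssr = np.full(T, np.inf)
    for k in range(lo, hi):
        total = 0.0
        for F_seg, X_seg in ((F[:k], X[:k]), (F[k:], X[k:])):
            # segment-specific loadings by least squares
            B, *_ = np.linalg.lstsq(F_seg, X_seg, rcond=None)
            total += np.sum((X_seg - F_seg @ B) ** 2)
        ssr[k] = total
    return int(np.argmin(ssr))
```

Allowing the loadings to differ on each side of the candidate date is what makes the sum of squared residuals dip at the true break.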




Large Dimensional Factor Analysis


Book Description

Large Dimensional Factor Analysis provides a survey of the main theoretical results for large dimensional factor models, emphasizing results that have implications for empirical work. The authors focus on the development of static factor models and on the use of estimated factors in subsequent estimation and inference. Large Dimensional Factor Analysis discusses how to determine the number of factors, how to conduct inference when estimated factors are used in regressions, how to assess the adequacy of observed variables as proxies for latent factors, how to exploit the estimated factors to test for unit roots and common trends, and how to estimate panel cointegration models.
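On the first of these questions, determining the number of factors, a compact runnable illustration is the eigenvalue-ratio estimator. Note this is a stand-in for exposition: it is one common approach in this literature, not necessarily the criterion the survey emphasizes.

```python
import numpy as np

def eigenvalue_ratio_factors(X, r_max=8):
    """Pick the number of factors as the k maximizing the ratio of
    consecutive eigenvalues of X X' / (T N).

    Factor-driven eigenvalues stay bounded away from zero while the
    noise eigenvalues vanish, so the ratio spikes at the true number.
    Illustrative sketch only.
    """
    T, N = X.shape
    vals = np.linalg.eigvalsh(X @ X.T / (T * N))[::-1]  # descending order
    ratios = vals[:r_max] / vals[1:r_max + 1]
    return int(np.argmax(ratios)) + 1
```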




Inference on Structural Changes in High Dimensional Linear Regression Models


Book Description

This dissertation is dedicated to studying the problem of constructing asymptotically valid confidence intervals for change points in high-dimensional linear models, where the number of parameters may vastly exceed the sampling period.

In Chapter 2, we develop an algorithmic estimator for a single change point and establish the optimal rate of estimation, Op(ξ^-2), where ξ represents the jump size under a high-dimensional scaling. This optimality ensures the existence of limiting distributions. Asymptotic distributions are derived under both vanishing and non-vanishing regimes of the jump size: in the former case, the limit is the argmax of a two-sided Brownian motion, and in the latter, the argmax of a two-sided random walk, both with negative drifts. We also characterize the relationship between the two distributions, which allows construction of confidence intervals that adapt to the regime (vanishing vs. non-vanishing).

In Chapter 3, we extend the analysis to statistical inference for multiple change points in high-dimensional linear regression models. We develop locally refitted estimators and evaluate their convergence rates both component-wise and simultaneously. As in Chapter 2, we achieve an optimal rate of estimation in the component-wise setting, which guarantees the existence of limiting distributions; we also establish a simultaneous rate that improves on the sharpest available in the literature by a logarithmic factor. Component-wise and joint limiting distributions are derived under vanishing and non-vanishing regimes of the jump sizes, demonstrating the relationship between the distributions in the two regimes.

Lastly, in Chapter 4, we introduce a novel implementation for finding preliminary change point estimates via integer linear programming, an approach not yet explored in the literature.

Overall, this dissertation provides a comprehensive framework for inference on single and multiple change points in high-dimensional linear models, offering novel and efficient algorithms with strong theoretical guarantees. All theoretical results are supported by Monte Carlo simulations.
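The non-vanishing-regime limit mentioned above, the argmax of a two-sided random walk with negative drift, is straightforward to simulate, which is essentially how quantiles for such confidence intervals are obtained in practice. A hedged sketch follows; the drift and step distribution are generic placeholders, and the scaling constants from the actual theory are omitted.

```python
import numpy as np

def argmax_two_sided_walk(drift, n_steps=200, n_sim=2000, rng=None):
    """Draw samples from the argmax of a two-sided Gaussian random walk
    with negative drift (value 0 at the origin, drifting down on both
    sides). Illustrative only: real confidence intervals would plug in
    the problem-specific drift and noise scale.
    """
    rng = np.random.default_rng() if rng is None else rng
    draws = np.empty(n_sim, dtype=int)
    for i in range(n_sim):
        steps = rng.standard_normal((2, n_steps)) - drift  # drift > 0
        right = np.concatenate(([0.0], np.cumsum(steps[0])))
        left = np.concatenate(([0.0], np.cumsum(steps[1])))
        # index runs over k = -n_steps, ..., 0, ..., n_steps
        path = np.concatenate((left[::-1], right[1:]))
        draws[i] = int(np.argmax(path)) - n_steps
    return draws
```

Empirical quantiles of the returned draws then give the interval endpoints around the estimated change point.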




Methods for Estimation and Inference for High-dimensional Models


Book Description

This thesis tackles three different problems in high-dimensional statistics. The first two parts of the thesis focus on estimation of sparse high-dimensional undirected graphical models under non-standard conditions, specifically non-Gaussianity and missingness, when observations are continuous. To address estimation under non-Gaussianity, we propose a general framework that augments the score matching losses introduced in Hyvärinen [2005, 2007] with an l1-regularizing penalty. This method, which we refer to as regularized score matching, allows for computationally efficient treatment of Gaussian and non-Gaussian continuous exponential family models because the considered loss becomes a penalized quadratic and thus yields piecewise linear solution paths. Under suitable irrepresentability conditions and distributional assumptions, we show that regularized score matching generates consistent graph estimates in sparse high-dimensional settings. Through numerical experiments and an application to RNA-seq data, we confirm that regularized score matching achieves state-of-the-art performance in the Gaussian case and provides a valuable tool for computationally efficient estimation in non-Gaussian graphical models. To address estimation of sparse high-dimensional undirected graphical models with missing observations, we propose adapting the regularized score matching framework by substituting surrogates of the relevant statistics, as in Loh and Wainwright [2012] and Kolar and Xing [2012]. For Gaussian and non-Gaussian continuous exponential family models, the use of these surrogates may result in a loss of semi-definiteness, and thus nonconvexity, in the objective. Nevertheless, under suitable distributional assumptions, the global optimum is close to the truth in matrix l1 norm with high probability in sparse high-dimensional settings.
Furthermore, under the same set of assumptions, we show that the composite gradient descent algorithm we propose for minimizing the modified objective converges at a geometric rate to a solution close to the global optimum with high probability. The last part of the thesis moves away from undirected graphical models and is instead concerned with inference in high-dimensional regression models. Specifically, we investigate how to construct asymptotically valid confidence intervals and p-values for the fixed effects in a high-dimensional linear mixed effects model. The framework we propose, largely founded on recent work [Bühlmann, 2013], entails de-biasing a ‘naive’ ridge estimator. We show via numerical experiments that the method controls the Type I error rate in hypothesis testing and generates confidence intervals that achieve target coverage, outperforming competitors that assume observations are homogeneous when observations are, in fact, correlated within groups.
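In the Gaussian case, the "penalized quadratic" structure of regularized score matching can be made concrete: the empirical loss in the inverse covariance K reduces to tr(K S K)/2 - tr(K) for sample covariance S, plus the l1 penalty. The sketch below minimizes this by proximal gradient descent with an off-diagonal-only penalty; these are our own simplifications for illustration, not the thesis's algorithm, which exploits the piecewise linear solution path.

```python
import numpy as np

def regularized_score_matching_gaussian(X, lam, n_iter=500, step=None):
    """Minimize tr(K S K)/2 - tr(K) + lam * ||K||_1(off-diag) over
    symmetric K by proximal gradient descent. Sketch only: assumes
    centered data and leaves the diagonal unpenalized.
    """
    n, p = X.shape
    S = X.T @ X / n                      # sample covariance (centered data)
    K = np.eye(p)
    if step is None:
        step = 1.0 / np.linalg.eigvalsh(S)[-1]   # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = (S @ K + K @ S) / 2 - np.eye(p)   # gradient of smooth part
        K = K - step * grad
        # proximal step: soft-threshold off-diagonal entries only
        off = np.sign(K) * np.maximum(np.abs(K) - step * lam, 0.0)
        np.fill_diagonal(off, np.diag(K))
        K = (off + off.T) / 2                    # keep the iterate symmetric
    return K
```

Because the smooth part is quadratic in K, each iteration is a matrix multiply plus a soft-threshold, which is what makes the Gaussian case computationally cheap.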




Estimation and Inference in High-dimensional Models


Book Description

A wide variety of problems encountered in different fields can be formulated as inference problems. Common examples include estimating the parameters of a model from observations; inverse problems, where an unobserved signal is to be estimated based on a given model and some measurements; or a combination of the two, where hidden signals and model parameters are estimated jointly. For example, various tasks in machine learning, such as image inpainting and super-resolution, can be cast as inverse problems over deep neural networks. Similarly, in computational neuroscience, a common task is to estimate the parameters of a nonlinear dynamical system from neuronal activity. Despite the wide application of different models and algorithms to these problems, our theoretical understanding of how these algorithms work is often incomplete. In this work, we try to bridge the gap between theory and practice by providing theoretical analyses of three estimation problems. First, we consider the problem of estimating the input and hidden-layer signals in a given multi-layer stochastic neural network in which all the signals are matrix-valued. Problems such as multitask regression and classification, and inverse problems that use deep generative priors, can be modeled as inference over multi-layer neural networks. We consider different types of estimators for such problems and exactly analyze their performance in a certain high-dimensional regime known as the large system limit. Our analysis yields the estimation error of all the hidden signals in the deep network as expectations over low-dimensional random variables that are characterized via a set of equations called the state evolution. Next, we analyze the problem of estimating a signal from convolutional observations via ridge estimation.
Such convolutional inverse problems arise naturally in several fields, such as imaging and seismology. The shared weights of the convolution operator introduce dependencies in the observations that make analysis of such estimators difficult. By looking at the problem in the Fourier domain and using results on the Fourier transform of a class of random processes, we show that the problem can be reduced to the analysis of multiple ordinary ridge estimators, one for each frequency. This allows us to write the estimation error of the ridge estimator as an integral that depends on the spectrum of the underlying random process generating the input features. Finally, we conclude by considering the problem of estimating the parameters of a multi-dimensional autoregressive generalized linear model with discrete values. Such a process takes a linear combination of its past outputs as the mean parameter of a generalized linear model that generates future values. The coefficients of the linear combination are the parameters of the model, and we seek to estimate them under the assumption that they are sparse. This model can be used, for example, to model the spiking activity of neurons. For this problem, we obtain a high-probability upper bound on the estimation error of the parameters. Our experiments further support these theoretical results.
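The frequency-by-frequency reduction can be made concrete in the simplest case of circular convolution, where the design matrix is circulant and the ridge problem diagonalizes exactly in the DFT basis. This is a one-sample sketch of that reduction, not the thesis's estimator or its asymptotic analysis.

```python
import numpy as np

def ridge_circular_deconvolution(x, y, lam):
    """Ridge estimate of a filter w from circular-convolution data
    y = w (*) x + noise, solved one frequency at a time.

    The circulant design diagonalizes under the DFT, so the full ridge
    solution reduces to scalar ridge estimates
        W_hat(f) = conj(X(f)) Y(f) / (|X(f)|^2 + lam)
    at each frequency f.
    """
    X = np.fft.fft(x)
    Y = np.fft.fft(y)
    W = np.conj(X) * Y / (np.abs(X) ** 2 + lam)
    return np.real(np.fft.ifft(W))
```

Each frequency is an ordinary one-dimensional ridge problem, which is exactly the decoupling the Fourier-domain argument exploits.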




Estimation and Inference in Changepoint Models


Book Description

This thesis is motivated by statistical challenges that arise in the analysis of calcium imaging data, a new technology in neuroscience that makes it possible to record from huge numbers of neurons at single-neuron resolution. We consider the problem of estimating a neuron’s spike times from calcium imaging data. A simple and natural model suggests a non-convex optimization problem for this task. We show that by recasting the non-convex problem as a changepoint detection problem, we can efficiently solve it for the global optimum using a clever dynamic programming strategy. Furthermore, we introduce a new framework to quantify the uncertainty associated with a set of estimated changepoints in a change-in-mean model. In particular, we propose a new framework to test the null hypothesis that there is no change in mean around an estimated changepoint. This framework can be efficiently carried out in the case of changepoints estimated by binary segmentation and its variants, l0 segmentation, or the fused lasso, and is valid in finite samples. Our setup allows us to condition on much less information than existing approaches, thereby yielding higher powered tests. These ideas can be generalized to the spike estimation problem.
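The change-in-mean analogue of that dynamic program is short to write down: optimal partitioning solves the l0-penalized segmentation exactly in O(n^2) time. This is a generic sketch of the recast problem, without the AR(1) calcium decay that the spike model adds.

```python
import numpy as np

def l0_changepoint_dp(y, lam):
    """Exact l0-penalized change-in-mean segmentation by dynamic
    programming: minimize the sum of within-segment squared errors
    plus lam per changepoint. Returns the sorted changepoint indices.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    cs = np.concatenate(([0.0], np.cumsum(y)))
    cs2 = np.concatenate(([0.0], np.cumsum(y ** 2)))

    def seg_cost(i, j):
        # squared error of y[i:j] around its segment mean
        s = cs[j] - cs[i]
        return cs2[j] - cs2[i] - s * s / (j - i)

    # F[j]: optimal cost of y[:j]; back[j]: start of the last segment
    F = np.full(n + 1, np.inf)
    F[0] = -lam                       # first segment pays no penalty
    back = np.zeros(n + 1, dtype=int)
    for j in range(1, n + 1):
        costs = [F[i] + lam + seg_cost(i, j) for i in range(j)]
        i_star = int(np.argmin(costs))
        F[j], back[j] = costs[i_star], i_star
    # recover the changepoints by backtracking
    cps, j = [], n
    while j > 0:
        j = back[j]
        if j > 0:
            cps.append(j)
    return sorted(cps)
```

Because the recursion considers every possible last segment at each step, the returned segmentation is a global optimum of the non-convex objective, which is the key point of the recasting.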




The Estimation and Inference of Complex Models


Book Description

In this thesis, we investigate estimation and inference problems for complex models. We emphasize two major categories of complex models: generalized linear models and time series models. For generalized linear models, we consider a fundamental problem, sure screening for interaction terms in ultra-high dimensional feature space; for time series models, we consider the Markov property, an important model assumption. The first part of this thesis addresses the significant interaction pursuit problem for ultra-high dimensional models with two-way interaction effects. We propose a simple sure screening procedure (SSI) to detect significant interactions between the explanatory variables and the response in high or ultra-high dimensional generalized linear regression models. Sure screening is a simple but powerful tool for the first step of feature or variable selection with ultra-high dimensional data. We establish the sure screening properties of the proposed method theoretically. Furthermore, we show that the proposed method can control the false discovery rate at a reasonable level, so regularized variable selection methods can then be applied to obtain more accurate feature selection in subsequent model selection procedures. Moreover, from the viewpoint of computational efficiency, we propose a more efficient algorithm, discretized SSI (DSSI), to implement the sure screening method in practice. We investigate the properties of the two algorithms, SSI and DSSI, in simulation studies and apply them to real data analyses for illustration. The second part concerns testing the Markov property in time series processes. The Markovian assumption plays an extremely important role in time series analysis and is a fundamental assumption in economic and financial models.
However, little existing research has focused on how to test the Markov property in time series processes. We therefore propose a new procedure to test whether a beta-mixing time series possesses the Markov property. Our test is based on the conditional distance covariance (CDCov). We investigate the theoretical properties of the proposed method: the asymptotic distribution of the test statistic under the null hypothesis is obtained, and the power of the test under local alternative hypotheses is studied. Simulation studies demonstrate the finite-sample performance of our test.
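Returning to the first part, the core of a marginal interaction screening step of the kind SSI performs can be sketched briefly. This is a generic illustration using absolute marginal correlation as the screening statistic; the thesis's actual statistic and its generalized linear model setting may differ.

```python
import numpy as np

def interaction_screen(X, y, top_k):
    """Rank all two-way interaction candidates by the absolute marginal
    correlation between the product feature X_j * X_k and the response,
    and keep the top_k pairs. Generic sketch of interaction screening,
    not the thesis's SSI procedure.
    """
    n, p = X.shape
    yc = (y - y.mean()) / y.std()
    scores = {}
    for j in range(p):
        for k in range(j + 1, p):
            z = X[:, j] * X[:, k]
            sd = z.std()
            if sd == 0:
                continue  # constant product feature carries no signal
            scores[(j, k)] = abs(np.dot((z - z.mean()) / sd, yc)) / n
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

The screened set is then small enough for a regularized variable selection method to be run on it, which is the two-stage workflow the description outlines.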