Optimal Inference in a Class of Regression Models


Book Description

We consider the problem of constructing confidence intervals (CIs) for a linear functional of a regression function, such as its value at a point, the regression discontinuity parameter, or a regression coefficient in a linear or partly linear regression. Our main assumption is that the regression function is known to lie in a convex function class, which covers most smoothness and/or shape assumptions used in econometrics. We derive finite-sample optimal CIs and sharp efficiency bounds under normal errors with known variance. We show that these results translate to uniform (over the function class) asymptotic results when the error distribution is not known. When the function class is centrosymmetric, these efficiency bounds imply that minimax CIs are close to efficient at smooth regression functions. This implies, in particular, that it is impossible to form CIs that are tighter using data-dependent tuning parameters, and maintain coverage over the whole function class. We specialize our results to inference in a linear regression, and inference on the regression discontinuity parameter, and illustrate them in simulations and an empirical application.




Optimal Inference in Regression Models with Nearly Integrated Regressors


Book Description

This paper considers the problem of conducting inference on the regression coefficient in a bivariate regression model with a highly persistent regressor. Gaussian power envelopes are obtained for a class of testing procedures satisfying a conditionality restriction. In addition, the paper proposes feasible testing procedures that attain these Gaussian power envelopes whether or not the innovations of the regression model are normally distributed.




On Optimal Inference in the Linear IV Model


Book Description

This paper considers tests and confidence sets (CS's) concerning the coefficient on the endogenous variable in the linear IV regression model with homoskedastic normal errors and one right-hand side endogenous variable. The paper derives a finite-sample lower bound function for the probability that a CS constructed using a two-sided invariant similar test has infinite length and shows numerically that the conditional likelihood ratio (CLR) CS of Moreira (2003) is not always “very close,” say .005 or less, to this lower bound function. This implies that the CLR test is not always very close to the two-sided asymptotically-efficient (AE) power envelope for invariant similar tests of Andrews, Moreira, and Stock (2006) (AMS). On the other hand, the paper establishes the finite-sample optimality of the CLR test when the correlation between the structural and reduced-form errors, or between the two reduced-form errors, goes to 1 or -1 and other parameters are held constant, where optimality means achievement of the two-sided AE power envelope of AMS. These results cover the full range of (non-zero) IV strength. The paper investigates in detail scenarios in which the CLR test is not on the two-sided AE power envelope of AMS. Also, theory and numerical results indicate that the CLR test is close to having greatest average power, where the average is over a grid of concentration parameter values and over pairs of alternative hypothesis values of the parameter of interest, uniformly over pairs of alternative hypothesis values and uniformly over the correlation between the structural and reduced-form errors. Here, “close” means .015 or less for k≤20, where k denotes the number of IV's, and .025 or less for 0




A Note on Optimal Inference in the Linear IV Model


Book Description

This paper considers tests and confidence sets (CS's) concerning the coefficient on the endogenous variable in the linear IV regression model with homoskedastic normal errors and one right-hand side endogenous variable. The paper derives a finite-sample lower bound function for the probability that a CS constructed using a two-sided invariant similar test has infinite length and shows numerically that the conditional likelihood ratio (CLR) CS of Moreira (2003) is not always "very close" to this lower bound function. This implies that the CLR test is not always very close to the two-sided asymptotically-efficient (AE) power envelope for invariant similar tests of Andrews, Moreira, and Stock (2006) (AMS). On the other hand, the paper establishes the finite-sample optimality of the CLR test when the correlation between the structural and reduced-form errors, or between the two reduced-form errors, goes to 1 or -1 and other parameters are held constant, where optimality means achievement of the two-sided AE power envelope of AMS. These results cover the full range of (non-zero) IV strength. The paper investigates in detail scenarios in which the CLR test is not on the two-sided AE power envelope of AMS. Also, the paper shows via theory and numerical work that the CLR test is close to having greatest average power, where the average is over a grid of concentration parameter values and over pairs of alternative hypothesis values of the parameter of interest, uniformly over pairs of alternative hypothesis values and uniformly over the correlation between the structural and reduced-form errors. The paper concludes that, although the CLR test is not always very close to the two-sided AE power envelope of AMS, CLR tests and CS's have very good overall properties.




Handbook of Econometrics


Book Description

Handbook of Econometrics, Volume 7A, examines recent advances in foundational issues and "hot" topics within econometrics, such as inference for moment inequalities and estimation of high dimensional models. With its world-class editors and contributors, it succeeds in unifying leading studies of economic models, mathematical statistics and economic data. Our flourishing ability to address empirical problems in economics by using economic theory and statistical methods has driven the field of econometrics to unimaginable places. By designing methods of inference from data based on models of human choice behavior and social interactions, econometricians have created new subfields now sufficiently mature to require sophisticated literature summaries. Presents a broader and more comprehensive view of this expanding field than any other handbook. Emphasizes the connection between econometrics and economics. Highlights current topics for which no good summaries exist.




Optimal Designs for Minimizing Variances of Parameter Estimates in Linear Regression Models


Book Description

In statistical inference, it is important to estimate the parameters of a regression model in such a way that the variances of the estimates are as small as possible. Motivated by this fact, we address this problem using optimal design theory. We start with some optimal design theory and determine the optimality conditions in terms of a directional derivative. We construct the optimal designs for minimizing variances of the parameter estimates in two ways. The first is the analytic approach, in which we derive the derivatives of our criterion and solve the resulting equations. In the second approach, we construct the designs using a class of algorithms. We also construct designs for minimizing the total variance of some parameter estimates. This is motivated by a practical problem in chemistry. We attempt to improve the convergence of the algorithm by using the properties of the directional derivatives.




Simultaneous Inference in Regression


Book Description

Simultaneous confidence bands enable more intuitive and detailed inference of regression analysis than the standard inferential methods of parameter estimation and hypothesis testing. Simultaneous Inference in Regression provides a thorough overview of the construction methods and applications of simultaneous confidence bands for various inferentia




Beyond Multiple Linear Regression


Book Description

Beyond Multiple Linear Regression: Applied Generalized Linear Models and Multilevel Models in R is designed for undergraduate students who have successfully completed a multiple linear regression course, helping them develop an expanded modeling toolkit that includes non-normal responses and correlated structure. Even though there is no mathematical prerequisite, the authors still introduce fairly sophisticated topics such as likelihood theory, zero-inflated Poisson, and parametric bootstrapping in an intuitive and applied manner. The case studies and exercises feature real data and real research questions; thus, most of the data in the textbook comes from collaborative research conducted by the authors and their students, or from student projects. Every chapter features a variety of conceptual exercises, guided exercises, and open-ended exercises using real data. After working through this material, students will develop an expanded toolkit and a greater appreciation for the wider world of data and statistical modeling. A solutions manual for all exercises is available to qualified instructors at the book’s website at www.routledge.com, and data sets and Rmd files for all case studies and exercises are available at the authors’ GitHub repo (https://github.com/proback/BeyondMLR).




Inference on Structural Changes in High Dimensional Linear Regression Models


Book Description

This dissertation is dedicated to studying the problem of constructing asymptotically valid confidence intervals for change points in high-dimensional linear models, where the number of parameters may vastly exceed the sampling period. In Chapter 2, we develop an algorithmic estimator for a single change point and establish the optimal rate of estimation, O_p(ξ^{-2}), where ξ represents the jump size under a high dimensional scaling. The optimal result ensures the existence of limiting distributions. Asymptotic distributions are derived under both vanishing and non-vanishing regimes of jump size. In the former case, the limiting distribution is the argmax of a two-sided Brownian motion, while in the latter case it is the argmax of a two-sided random walk, both with negative drifts. We also provide the relationship between the two distributions, which allows construction of regime (vanishing vs non-vanishing) adaptive confidence intervals. In Chapter 3, we extend our analysis to statistical inference for multiple change points in high-dimensional linear regression models. We develop locally refitted estimators and evaluate their convergence rates both component-wise and simultaneously. Following a similar approach to Chapter 2, we achieve an optimal rate of estimation under the component-wise scenario, which guarantees the existence of limiting distributions. We also establish the simultaneous rate, which is the sharpest available by a logarithmic factor. Component-wise and joint limiting distributions are derived under vanishing and non-vanishing regimes of jump sizes, demonstrating the relationship between distributions in the two regimes. Lastly, in Chapter 4, we introduce a novel implementation method for finding preliminary change point estimates via integer linear programming, which has not yet been explored in the current literature. Overall, this dissertation provides a comprehensive framework for inference on single and multiple change points in high-dimensional linear models, offering novel and efficient algorithms with strong theoretical guarantees. All theoretical results are supported by Monte Carlo simulations.




Biometrika


Book Description

The year 2001 marks the centenary of Biometrika, one of the world's leading academic journals in statistical theory and methodology. In celebration of this, the book brings together two sets of papers from the journal. The first comprises seven specially commissioned articles (authors: D.R. Cox, A.C. Davison, Anthony C. Atkinson and R.A. Bailey, David Oakes, Peter Hall, T.M.F. Smith, and Howell Tong). These articles review the history of the journal and the most important contributions made by papers appearing in the journal in a number of important areas of statistical activity, including general theory and methodology, surveys and time series. In the process the papers describe the general development of statistical science during the twentieth century. The second group of ten papers is a selection of particularly seminal articles from the journal's first hundred years. The book opens with an introduction by the editors Professor D.M. Titterington and Sir David Cox.