Subset Selection in Regression


Book Description

Originally published in 1990, the first edition of Subset Selection in Regression filled a significant gap in the literature, and its critical and popular success has continued for more than a decade. Thoroughly revised to reflect progress in theory, methods, and computing power, the second edition promises to continue that tradition. The author ha




Optimal Subset Selection


Book Description

In the course of one's research, the expediency of meeting contractual and other externally imposed deadlines too often seems to take priority over what may be more significant research findings in the longer run. Such is the case with this volume which, despite our best intentions, has been put aside time and again since 1971 in favor of what seemed to be more urgent matters. Despite this delay, to our knowledge the principal research results and documentation presented here have not been superseded by other publications. The background of this endeavor may be of some historical interest, especially to those who agree that research is not a straightforward, mechanistic process whose outcome or even direction is known in advance. In the process of this brief recounting, we would like to express our gratitude to those individuals and organizations who facilitated and supported our efforts. We were introduced to the Beale, Kendall and Mann algorithm, the source of all our efforts, quite by chance. Professor Britton Harris suggested to me in April 1967 that I might like to attend a CEIR half-day seminar on optimal regression being given by Professor M. G. Kendall in Washington, D.C. I agreed that the topic seemed interesting and went along. Had it not been for Harris' suggestion and financial support, this work almost certainly would have never begun.




Feature Engineering and Selection


Book Description

The process of developing predictive models includes many stages. Most resources focus on the modeling algorithms but neglect other critical aspects of the modeling process. This book describes techniques for finding the best representations of predictors for modeling and for finding the best subset of predictors for improving model performance. A variety of example data sets are used to illustrate the techniques along with R programs for reproducing the results.







Developing a Protocol for Observational Comparative Effectiveness Research: A User's Guide


Book Description

This User’s Guide is a resource for investigators and stakeholders who develop and review observational comparative effectiveness research protocols. It explains how to (1) identify key considerations and best practices for research design; (2) build a protocol based on these standards and best practices; and (3) judge the adequacy and completeness of a protocol. Eleven chapters cover all aspects of research design, including: developing study objectives, defining and refining study questions, addressing the heterogeneity of treatment effect, characterizing exposure, selecting a comparator, defining and measuring outcomes, and identifying optimal data sources. Checklists of guidance and key considerations for protocols are provided at the end of each chapter. The User’s Guide was created by researchers affiliated with AHRQ’s Effective Health Care Program, particularly those who participated in AHRQ’s DEcIDE (Developing Evidence to Inform Decisions About Effectiveness) program. Chapters were subject to multiple internal and external independent reviews. For more information, please consult the Agency website: www.effectivehealthcare.ahrq.gov




Linear Models in Statistics


Book Description

The essential introduction to the theory and application of linear models, now in a valuable new edition. Since most advanced statistical tools are generalizations of the linear model, it is necessary to first master the linear model in order to move forward to more advanced concepts. The linear model remains the main tool of the applied statistician and is central to the training of any statistician regardless of whether the focus is applied or theoretical. This completely revised and updated new edition successfully develops the basic theory of linear models for regression, analysis of variance, analysis of covariance, and linear mixed models. Recent advances in the methodology related to linear mixed models, generalized linear models, and the Bayesian linear model are also addressed. Linear Models in Statistics, Second Edition includes full coverage of advanced topics, such as mixed and generalized linear models, Bayesian linear models, two-way models with empty cells, geometry of least squares, vector-matrix calculus, simultaneous inference, and logistic and nonlinear regression. Algebraic, geometrical, frequentist, and Bayesian approaches to both the inference of linear models and the analysis of variance are also illustrated. Through the expansion of relevant material and the inclusion of the latest technological developments in the field, this book provides readers with the theoretical foundation to correctly interpret computer software output as well as effectively use, customize, and understand linear models. This modern Second Edition features:
• New chapters on Bayesian linear models as well as random and mixed linear models
• Expanded discussion of two-way models with empty cells
• Additional sections on the geometry of least squares
• Updated coverage of simultaneous inference
The book is complemented with easy-to-read proofs, real data sets, and an extensive bibliography.
A thorough review of the requisite matrix algebra has been added for transitional purposes, and numerous theoretical and applied problems have been incorporated with selected answers provided at the end of the book. A related Web site includes additional data sets and SAS® code for all numerical examples. Linear Models in Statistics, Second Edition is a must-have book for courses in statistics, biostatistics, and mathematics at the upper-undergraduate and graduate levels. It is also an invaluable reference for researchers who need to gain a better understanding of regression and analysis of variance.




Statistical Learning with Sparsity


Book Description

Discover New Methods for Dealing with High-Dimensional Data. A sparse statistical model has only a small number of nonzero parameters or weights; therefore, it is much easier to estimate and interpret than a dense model. Statistical Learning with Sparsity: The Lasso and Generalizations presents methods that exploit sparsity to help recover the underl




Security and Intelligent Information Systems


Book Description

This book constitutes the thoroughly refereed post-conference proceedings of the Joint Meeting of the 2nd Luxembourg-Polish Symposium on Security and Trust and the 19th International Conference Intelligent Information Systems, held as International Joint Conference on Security and Intelligent Information Systems, SIIS 2011, in Warsaw, Poland, in June 2011. The 29 revised full papers presented together with 2 invited lectures were carefully reviewed and selected from 60 initial submissions during two rounds of selection and improvement. The papers are organized in the following three thematic tracks: security and trust, data mining and machine learning, and natural language processing.




Learning Statistics with R


Book Description

"Learning Statistics with R" covers the contents of an introductory statistics class, as typically taught to undergraduate psychology students, focusing on the use of the R statistical software and adopting a light, conversational style throughout. The book discusses how to get started in R, and gives an introduction to data manipulation and writing scripts. From a statistical perspective, the book discusses descriptive statistics and graphing first, followed by chapters on probability theory, sampling and estimation, and null hypothesis testing. After introducing the theory, the book covers the analysis of contingency tables, t-tests, ANOVAs and regression. Bayesian statistics are covered at the end of the book. For more information (and the opportunity to check the book out before you buy!) visit http://ua.edu.au/ccs/teaching/lsr or http://learningstatisticswithr.com




Applications of Regression Models in Epidemiology


Book Description

A one-stop guide for public health students and practitioners learning the applications of classical regression models in epidemiology. This book is written for public health professionals and students interested in applying regression models in the field of epidemiology. The academic material is usually covered in public health courses including (i) Applied Regression Analysis, (ii) Advanced Epidemiology, and (iii) Statistical Computing. The book is composed of 13 chapters, including an introduction chapter that covers basic concepts of statistics and probability. Among the topics covered are the linear regression model, the polynomial regression model, weighted least squares, methods for selecting the best regression equation, and generalized linear models and their applications to different epidemiological study designs. An example is provided in each chapter that applies the theoretical aspects presented in that chapter. In addition, exercises are included and the final chapter is devoted to the solutions of these academic exercises with answers in all of the major statistical software packages, including STATA, SAS, SPSS, and R. It is assumed that readers of this book have taken a basic course in biostatistics, epidemiology, and introductory calculus. The book will be of interest to anyone looking to understand the statistical fundamentals to support quantitative research in public health. In addition, this book:
• Is based on the authors’ course notes from 20 years teaching regression modeling in public health courses
• Provides exercises at the end of each chapter
• Contains a solutions chapter with answers in STATA, SAS, SPSS, and R
• Provides real-world public health applications of the theoretical aspects contained in the chapters
Applications of Regression Models in Epidemiology is a reference for graduate students in public health and public health practitioners.
ERICK SUÁREZ is a Professor of the Department of Biostatistics and Epidemiology at the University of Puerto Rico School of Public Health. He received a Ph.D. degree in Medical Statistics from the London School of Hygiene and Tropical Medicine. He has 29 years of experience teaching biostatistics.
CYNTHIA M. PÉREZ is a Professor of the Department of Biostatistics and Epidemiology at the University of Puerto Rico School of Public Health. She received an M.S. degree in Statistics and a Ph.D. degree in Epidemiology from Purdue University. She has 22 years of experience teaching epidemiology and biostatistics.
ROBERTO RIVERA is an Associate Professor at the College of Business at the University of Puerto Rico at Mayaguez. He received a Ph.D. degree in Statistics from the University of California in Santa Barbara. He has more than five years of experience teaching statistics courses at the undergraduate and graduate levels.
MELISSA N. MARTÍNEZ is an Account Supervisor at Havas Media International. She holds an MPH in Biostatistics from the University of Puerto Rico and an MSBA from the National University in San Diego, California. For the past seven years, she has been performing analyses for the biomedical research and media advertising fields.