Missing Data and Small-Area Estimation


Book Description

This book evolved from lectures, courses and workshops on missing data and small-area estimation that I presented during my tenure as the first Campion Fellow (2000–2002). For the Fellowship I proposed these two topics as areas in which academic statistics could contribute to the development of government statistics, in exchange for access to the operational details and background that would inform the direction and sharpen the focus of academic research. After a few years of involvement, I have come to realise that the separation of ‘academic’ and ‘industrial’ statistics is not well suited to either party, and their integration is the key to progress in both branches. Most of the work on this monograph was done while I was a visiting lecturer at Massey University, Palmerston North, New Zealand. The hospitality and stimulating academic environment of their Institute of Information Sciences and Technology is gratefully acknowledged. I could not name all those who commented on my lecture notes and on the presentations themselves; apart from them, I want to thank the organisers and silent attendees of all the events, and, with a modicum of reluctance, the ‘grey figures’ who kept inquiring whether I was any nearer the completion of whatever stage I had been foolish enough to attach a date to.




Introduction to Small Area Estimation Techniques


Book Description

This guide to small area estimation aims to help users compile more reliable granular or disaggregated data in cost-effective ways. It explains small area estimation techniques with examples of how the easily accessible R analytical platform can be used to implement them, particularly to estimate indicators on poverty, employment, and health outcomes. The guide is intended for staff of national statistics offices and for other development practitioners. It aims to help them develop and implement targeted socioeconomic policies that ensure vulnerable segments of society are not left behind, and to monitor progress toward the Sustainable Development Goals.
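
For a flavour of the kind of R workflow such a guide builds on, the sketch below computes direct survey-weighted poverty rates by area and flags the thinly sampled areas where direct estimates become unstable, which is exactly the gap small area estimation fills. The data frame and column names (svy, area, poor, wt) are invented for illustration, not taken from the guide.

    # Hypothetical household survey: one row per household
    # area: small-area code, poor: 1 if below the poverty line, wt: survey weight
    set.seed(1)
    svy <- data.frame(
      area = sample(paste0("A", 1:20), 500, replace = TRUE),
      poor = rbinom(500, 1, 0.3),
      wt   = runif(500, 50, 150)
    )

    # Direct (survey-weighted) poverty rate per area
    direct <- aggregate(cbind(num = poor * wt, den = wt, n = 1) ~ area,
                        data = svy, FUN = sum)
    direct$rate <- direct$num / direct$den

    # Thinly sampled areas give unstable direct estimates, the gap
    # that model-based small area estimation is designed to fill
    direct[direct$n < 15, c("area", "n", "rate")]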




Statistical Analysis with Missing Data


Book Description

An up-to-date, comprehensive treatment of a classic text on missing data in statistics.

The topic of missing data has gained considerable attention in recent decades. This new edition by two acknowledged experts on the subject offers an up-to-date account of practical methodology for handling missing data problems. Blending theory and application, authors Roderick Little and Donald Rubin review historical approaches to the subject and describe simple methods for multivariate analysis with missing values. They then provide a coherent theory for analysis of problems based on likelihoods derived from statistical models for the data and the missing-data mechanism, and apply the theory to a wide range of important missing data problems.

Statistical Analysis with Missing Data, Third Edition starts by introducing readers to the subject and approaches to solving it. It looks at the patterns and mechanisms that create the missing data, as well as a taxonomy of missing data. It then goes on to examine missing data in experiments, before discussing complete-case and available-case analysis, including weighting methods. The new edition expands its coverage to include recent work on topics such as nonresponse in sample surveys, causal inference, diagnostic methods, and sensitivity analysis, among a host of other topics.

- An updated “classic” written by renowned authorities on the subject
- Features over 150 exercises (including many new ones)
- Covers recent work on important methods such as multiple imputation, robust alternatives to weighting, and Bayesian methods
- Revises previous topics based on past student feedback and class experience
- Contains an updated and expanded bibliography

The authors were awarded the Karl Pearson Prize in 2017 by the International Statistical Institute for a research contribution that has had profound influence on statistical theory, methodology or applications; their work “has been no less than defining and transforming” (ISI).

Statistical Analysis with Missing Data, Third Edition is an ideal textbook for upper undergraduate and beginning graduate students of the subject. It is also an excellent source of information for applied statisticians and practitioners in government and industry.
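
For a concrete sense of the complete-case versus available-case distinction discussed above, here is a minimal base-R sketch; the toy data frame is invented for illustration and is not an example from the book.

    # Toy data frame with missing values
    df <- data.frame(
      x = c(1, 2, NA, 4, 5),
      y = c(2, NA, 3, 5, 6),
      z = c(1, 1, 2, NA, 3)
    )

    # Complete-case analysis: drop every row with any missing value
    cc <- na.omit(df)
    colMeans(cc)

    # Available-case analysis: each pairwise covariance uses all rows
    # observed for that particular pair of variables
    cov(df, use = "pairwise.complete.obs")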




Multiple Imputation of Missing Data in Practice


Book Description

Multiple Imputation of Missing Data in Practice: Basic Theory and Analysis Strategies provides a comprehensive introduction to the multiple imputation approach to missing data problems that are often encountered in data analysis. Over the past 40 years or so, multiple imputation has undergone rapid development in both theory and applications. It is nowadays the most versatile, popular, and effective missing-data strategy, used by researchers and practitioners across different fields. There is a strong need to better understand and learn about multiple imputation in the research and practical community.

Accessible to a broad audience, this book explains the statistical concepts of missing data problems and the associated terminology. It focuses on how to address missing data problems using multiple imputation. It describes the basic theory behind multiple imputation and many commonly used models and methods. These ideas are illustrated by examples from a wide variety of missing data problems. Real data from studies with different designs and features (e.g., cross-sectional data, longitudinal data, complex surveys, survival data, and studies subject to measurement error) are used to demonstrate the methods. So that readers not only know how to use the methods but also understand why multiple imputation works and how to choose appropriate methods, simulation studies are used to assess the performance of the multiple imputation methods. Example datasets and sample programming code are either included in the book or available at a GitHub site (https://github.com/he-zhang-hsu/multiple_imputation_book).

Key Features
- Provides an overview of statistical concepts that are useful for better understanding missing data problems and multiple imputation analysis
- Provides a detailed discussion of multiple imputation models and methods targeted to different types of missing data problems (e.g., univariate and multivariate missing data problems, missing data in survival analysis, longitudinal data, and complex surveys)
- Explores measurement error problems with multiple imputation
- Discusses analysis strategies for multiple imputation diagnostics
- Discusses data production issues when the goal of multiple imputation is to release datasets for public use, as done by organizations that process and manage large-scale surveys with nonresponse problems

For some examples, illustrative datasets and sample programming code from popular statistical packages (e.g., SAS, R, WinBUGS) are included in the book; for others, they are available at the GitHub site above.
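
For readers who want a quick feel for multiple imputation in R before turning to the book, here is a minimal sketch using the widely used mice package; it is not code from the book or its GitHub repository, and nhanes is a small example dataset shipped with mice.

    # install.packages("mice")
    library(mice)

    # Create m = 5 completed datasets by chained equations
    imp <- mice(nhanes, m = 5, seed = 123, printFlag = FALSE)

    # Fit the analysis model on each completed dataset
    fits <- with(imp, lm(bmi ~ age + chl))

    # Combine the m sets of estimates with Rubin's rules
    summary(pool(fits))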




Big Data Meets Survey Science


Book Description

Offers a clear view of the utility and place for survey data within the broader Big Data ecosystem.

This book presents a collection of snapshots from two sides of the Big Data perspective. It assembles an array of tangible tools, methods, and approaches that illustrate how Big Data sources and methods are being used in the survey and social sciences to improve official statistics and estimates for human populations. It also provides examples of how survey data are being used to evaluate and improve the quality of insights derived from Big Data.

Big Data Meets Survey Science: A Collection of Innovative Methods shows how survey data and Big Data are used together for the benefit of one or more sources of data, with numerous chapters providing consistent illustrations and examples of survey data enriching the evaluation of Big Data sources. Examples of how machine learning, data mining, and other data science techniques are inserted into virtually every stage of the survey lifecycle are presented. Topics covered include total error frameworks for found data; the performance and sensitivities of home detection on mobile phone data; assessing community wellbeing using Google Street View and satellite imagery; using surveys to build and assess the RBS religious flag; and more.

- Presents groundbreaking survey methods being utilized today in the field of Big Data
- Explores how machine learning methods can be applied to the design, collection, and analysis of social science data
- Filled with examples and illustrations that show how survey data benefit Big Data evaluation
- Covers methods and applications used in combining Big Data with survey statistics
- Examines regulations as well as ethical and privacy issues

Big Data Meets Survey Science: A Collection of Innovative Methods is an excellent book for both the survey and social science communities as they learn to capitalize on this new revolution. It will also appeal to the broader data and computer science communities looking for new areas of application for emerging methods and data sources.




Small Area Estimation and Microsimulation Modeling


Book Description

Small Area Estimation and Microsimulation Modeling is the first practical handbook to comprehensively present modern statistical SAE methods within the framework of spatial microsimulation modeling, together with the novel approach of creating synthetic spatial microdata. Along with describing the necessary theories and their advantages and limitations, the authors illustrate the practical application of the techniques to a large number of substantive problems, including how to build models, organize and link data, create synthetic microdata, conduct analyses, produce informative tables and graphs, and evaluate how the findings support decision-making processes in government and non-government organizations.

Features
- Covers both theoretical and applied aspects for real-world comparative research and regional statistics production
- Thoroughly explains how microsimulation modeling technology can be constructed using available datasets for reliable small area statistics
- Provides SAS code that allows readers to utilize these latest technologies in their own work

This book is designed for advanced graduate students, academics, professionals and applied practitioners who are generally interested in small area estimation and/or microsimulation modeling and who deal with vital issues in the social and behavioural sciences, applied economics and policy analysis, government and social statistics, health sciences, business, psychology, environmental and agricultural modeling, computational statistics and data simulation, spatial statistics, transport and urban planning, and geospatial modeling.

Dr Azizur Rahman is a Senior Lecturer in Statistics and convenor of the Graduate Program in Applied Statistics at Charles Sturt University, and an Adjunct Associate Professor of Public Health and Biostatistics at the University of Canberra. His research encompasses small area estimation, applied economics, microsimulation modeling, Bayesian inference and public health. He has more than 60 scholarly publications, including two books. Dr Rahman’s research is funded by the Australian federal and state governments, and he serves on a range of editorial boards, including that of the International Journal of Microsimulation (IJM).

Professor Ann Harding, AO, is an Emeritus Professor of Applied Economics and Social Policy at the National Centre for Social and Economic Modelling (NATSEM) of the University of Canberra. She was the founder and inaugural Director of this world-class research centre for more than sixteen years, and also a co-founder of the International Microsimulation Association (IMA), serving as its inaugural elected president from 2004 to 2011. She is a Fellow of the Academy of the Social Sciences in Australia and has more than 300 publications, including several books on microsimulation modeling.
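
The core of creating synthetic spatial microdata is reweighting survey records so they reproduce known small-area benchmarks. The book itself provides SAS code; the sketch below shows the same general idea as iterative proportional fitting (raking) in base R, with invented records and benchmark totals.

    # Iterative proportional fitting (raking) in base R: adjust weights so
    # the survey records reproduce known small-area benchmark totals.
    micro <- data.frame(
      sex = c("m", "m", "f", "f", "m", "f"),
      emp = c("y", "n", "y", "y", "n", "n"),
      w   = rep(1, 6)
    )
    bench_sex <- c(m = 520, f = 480)   # area population by sex
    bench_emp <- c(y = 600, n = 400)   # area population by employment status

    for (iter in 1:50) {
      # Scale weights to match the sex margin, then the employment margin
      micro$w <- micro$w * bench_sex[micro$sex] / tapply(micro$w, micro$sex, sum)[micro$sex]
      micro$w <- micro$w * bench_emp[micro$emp] / tapply(micro$w, micro$emp, sum)[micro$emp]
    }

    # The reweighted records serve as synthetic microdata for the area
    tapply(micro$w, micro$sex, sum)
    tapply(micro$w, micro$emp, sum)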




Small Area Estimation


Book Description

Praise for the First Edition:

“This pioneering work, in which Rao provides a comprehensive and up-to-date treatment of small area estimation, will become a classic... I believe that it has the potential to turn small area estimation... into a larger area of importance to both researchers and practitioners.” —Journal of the American Statistical Association

Written by two experts in the field, Small Area Estimation, Second Edition provides a comprehensive and up-to-date account of the methods and theory of small area estimation (SAE), particularly indirect estimation based on explicit small area linking models. The model-based approach to small area estimation offers several advantages, including increased precision, the derivation of "optimal" estimates and associated measures of variability under an assumed model, and the validation of models from the sample data.

Emphasizing real data throughout, the Second Edition maintains a self-contained account of crucial theoretical and methodological developments in the field of SAE. The new edition provides extensive accounts of new and updated research, which often involves complex theory to handle model misspecifications and other complexities. Including information on survey design issues and traditional methods employing indirect estimates based on implicit linking models, Small Area Estimation, Second Edition also features:

- Additional sections describing the use of R code and data sets for readers to use when replicating applications
- Numerous examples of SAE applications throughout each chapter, including recent applications in U.S. federal programs
- New topical coverage of extended design issues, synthetic estimation, further refinements and solutions to the Fay-Herriot area level model, basic unit level models, and spatial and time series models
- A discussion of the advantages and limitations of various SAE methods for model selection from data, as well as comparisons of estimates derived from models with reliable values obtained from external sources, such as previous census or administrative data

Small Area Estimation, Second Edition is an excellent reference for practicing statisticians and survey methodologists as well as practitioners interested in learning SAE methods. It is also an ideal textbook for graduate-level courses in SAE and reliable small area statistics.
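
As a pointer to what the Fay-Herriot area level model mentioned above involves, here is a minimal sketch using the CRAN sae package; the inputs are invented for illustration and this is not code from the book. The model links each area's direct estimate to auxiliary data through direct_i = beta0 + beta1 * x1_i + v_i + e_i, with area effect v_i and sampling error e_i of known variance.

    # install.packages("sae")
    library(sae)

    # One row per small area; values are invented for illustration
    # direct: direct survey estimate, x1: auxiliary covariate,
    # vardir: known sampling variance of the direct estimate
    areas <- data.frame(
      direct = c(12.1, 9.8, 15.3, 11.0, 8.7, 13.6, 10.4, 14.2),
      x1     = c(0.40, 0.31, 0.55, 0.38, 0.25, 0.48, 0.36, 0.51),
      vardir = c(1.2, 2.0, 0.9, 1.5, 2.4, 1.1, 1.7, 1.0)
    )

    # Fit the Fay-Herriot model and extract model-based estimates
    fit <- eblupFH(direct ~ x1, vardir = vardir, data = areas)
    fit$eblup   # EBLUP small area estimates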




A Course on Small Area Estimation and Mixed Models


Book Description

This advanced textbook explores small area estimation techniques, covers the underlying mathematical and statistical theory, and offers hands-on support with their implementation. It presents the theory in a rigorous way and compares and contrasts various statistical methodologies, helping readers understand how to develop new methodologies for small area estimation. It also includes numerous sample applications of small area estimation techniques. The underlying R code is provided in the text and applied to four datasets that mimic data from labor market and living conditions surveys, where the estimated socioeconomic indicators include total unemployment, unemployment rates, average annual household incomes, and poverty indicators. Given its scope, the book will be useful for master’s and PhD students, and for official and other applied statisticians.
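
The mixed-model machinery that underlies unit-level small area estimation can be sketched in a few lines with the lme4 package; the simulated data below are invented for illustration and are not one of the book's four datasets.

    # install.packages("lme4")
    library(lme4)

    # Simulated unit-level data: households nested within areas
    set.seed(42)
    n_area <- 30; n_hh <- 20
    area   <- factor(rep(1:n_area, each = n_hh))
    x      <- rnorm(n_area * n_hh)
    u      <- rnorm(n_area, sd = 0.5)[area]      # area random effects
    income <- 10 + 2 * x + u + rnorm(n_area * n_hh)

    # Nested error (unit-level) model: income_ij = beta0 + beta1 * x_ij + u_i + e_ij
    fit <- lmer(income ~ x + (1 | area))

    # Predicted area effects u_i, a key ingredient of unit-level SAE predictors
    head(ranef(fit)$area)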




Federal Statistics, Multiple Data Sources, and Privacy Protection


Book Description

The environment for obtaining information and providing statistical data for policy makers and the public has changed significantly in the past decade, raising questions about the fundamental survey paradigm that underlies federal statistics. New data sources provide opportunities to develop a new paradigm that can improve timeliness, geographic or subpopulation detail, and statistical efficiency, and that has the potential to reduce the costs of producing federal statistics.

The panel's first report described federal statistical agencies' current paradigm, which relies heavily on sample surveys for producing national statistics, and the challenges agencies are facing; the legal frameworks and mechanisms for protecting the privacy and confidentiality of statistical data and for providing researchers access to data, and challenges to those frameworks and mechanisms; and statistical agencies' access to alternative sources of data. The panel recommended a new approach for federal statistical programs that would combine diverse data from government and private sector sources, and the creation of a new entity that would provide the foundational elements needed for this new approach, including the legal authority to access data and protect privacy.

This second of the panel's two reports builds on the analysis, conclusions, and recommendations in the first one. It assesses alternative methods for implementing a new approach that would combine diverse data from government and private sector sources, including describing statistical models for combining data from multiple sources; examining statistical and computer science approaches that foster privacy protections; evaluating frameworks for assessing the quality and utility of alternative data sources; and reviewing various models for implementing the recommended new entity. Together, the two reports offer ideas and recommendations to help federal statistical agencies examine and evaluate data from alternative sources and then combine them as appropriate to provide the country with more timely, actionable, and useful information for policy makers, businesses, and individuals.
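
Among the privacy-protecting approaches such reports examine are formal methods in the spirit of differential privacy. As a toy illustration only (the count, sensitivity, and epsilon below are invented, and this is not a method prescribed by the report), the Laplace mechanism adds calibrated noise to a released statistic:

    # Laplace mechanism: release a count with noise of scale sensitivity/epsilon,
    # a standard construction for epsilon-differential privacy
    rlaplace <- function(n, scale) {
      u <- runif(n, -0.5, 0.5)
      -scale * sign(u) * log(1 - 2 * abs(u))
    }

    true_count  <- 1234   # hypothetical tabulated count
    sensitivity <- 1      # one person changes a count by at most 1
    epsilon     <- 0.5    # privacy-loss budget

    set.seed(7)
    true_count + rlaplace(1, sensitivity / epsilon)   # noisy released count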