Nonparametric Kernel Density Estimation and Its Computational Aspects


Book Description

This book describes computational problems related to kernel density estimation (KDE), one of the most important and widely used data smoothing techniques. It presents a very detailed description of novel FFT-based algorithms for both KDE computation and bandwidth selection. The theory of KDE appears to have matured and is now well developed and understood. However, little progress has been observed in terms of performance improvements. This book is an attempt to remedy this. The book primarily addresses researchers and advanced graduate or postgraduate students who are interested in KDE and its computational aspects. It contains both background and much more sophisticated material, so more experienced researchers in the KDE area may also find it interesting. The presented material is richly illustrated with many numerical examples using both artificial and real datasets. A number of practical applications related to KDE are also presented.




Multivariate Kernel Smoothing and Its Applications


Book Description

Kernel smoothing has greatly evolved since its inception to become an essential methodology in the data science tool kit for the 21st century. Its widespread adoption is due to its fundamental role for multivariate exploratory data analysis, as well as the crucial role it plays in composite solutions to complex data challenges. Multivariate Kernel Smoothing and Its Applications offers a comprehensive overview of both aspects. It begins with a thorough exposition of the approaches to achieve the two basic goals of estimating probability density functions and their derivatives. The focus then turns to the applications of these approaches to more complex data analysis goals, many with a geometric/topological flavour, such as level set estimation, clustering (unsupervised learning), principal curves, and feature significance. Other topics, while not direct applications of density (derivative) estimation but sharing many commonalities with the previous settings, include classification (supervised learning), nearest neighbour estimation, and deconvolution for data observed with error. For a data scientist, each chapter contains illustrative Open data examples that are analysed by the most appropriate kernel smoothing method. The emphasis is always placed on an intuitive understanding of the data provided by the accompanying statistical visualisations. For a reader wishing to investigate further the details of their underlying statistical reasoning, a graduated exposition to a unified theoretical framework is provided. The algorithms for efficient software implementation are also discussed. José E. Chacón is an associate professor at the Department of Mathematics of the Universidad de Extremadura in Spain. Tarn Duong is a Senior Data Scientist for a start-up which provides short distance carpooling services in France. Both authors have made important contributions to kernel smoothing research over the last couple of decades.




Nonparametric Econometrics


Book Description

A comprehensive, up-to-date textbook on nonparametric methods for students and researchers. Until now, students and researchers in nonparametric and semiparametric statistics and econometrics have had to turn to the latest journal articles to keep pace with these emerging methods of economic analysis. Nonparametric Econometrics fills a major gap by gathering together the most up-to-date theory and techniques and presenting them in a remarkably straightforward and accessible format. The empirical tests, data, and exercises included in this textbook help make it the ideal introduction for graduate students and an indispensable resource for researchers. Nonparametric and semiparametric methods have attracted a great deal of attention from statisticians in recent decades. While the majority of existing books on the subject operate from the presumption that the underlying data is strictly continuous in nature, more often than not social scientists deal with categorical data (nominal and ordinal) in applied settings. The conventional nonparametric approach to dealing with the presence of discrete variables is acknowledged to be unsatisfactory. This book is tailored to the needs of applied econometricians and social scientists. Qi Li and Jeffrey Racine emphasize nonparametric techniques suited to the rich array of data types (continuous, nominal, and ordinal) within one coherent framework. They also emphasize the properties of nonparametric estimators in the presence of potentially irrelevant variables. Nonparametric Econometrics covers all the material necessary to understand and apply nonparametric methods for real-world problems.




Nonparametric Density Estimation


Book Description

This book gives a rigorous, systematic treatment of density estimates, their construction, use and analysis with full proofs. It develops L1 theory, rather than the classical L2, showing how L1 exposes fundamental properties of density estimates masked by L2.




Nonparametric and Semiparametric Models


Book Description

This book presents the statistical and mathematical principles of smoothing, with a focus on applicable techniques. It naturally splits into two parts: the first part is intended for undergraduate students majoring in mathematics, statistics, econometrics or biometrics, whereas the second part is intended for master's and PhD students or researchers. The material is easy to work through, since the e-book character of the text gives a maximum of flexibility in learning (and teaching) intensity.




Combinatorial Methods in Density Estimation


Book Description

Density estimation has evolved enormously since the days of bar plots and histograms, but researchers and users are still struggling with the problem of the selection of the bin widths. This book is the first to explore a new paradigm for the data-based or automatic selection of the free parameters of density estimates in general so that the expected error is within a given constant multiple of the best possible error. The paradigm can be used in nearly all density estimates and for most model selection problems, both parametric and nonparametric.




Density Estimation for Statistics and Data Analysis


Book Description

Although there has been a surge of interest in density estimation in recent years, much of the published research has been concerned with purely technical matters with insufficient emphasis given to the technique's practical value. Furthermore, the subject has been rather inaccessible to the general statistician. The account presented in this book places emphasis on topics of methodological importance, in the hope that this will facilitate broader practical application of density estimation and also encourage research into relevant theoretical work. The book also provides an introduction to the subject for those with general interests in statistics. The important role of density estimation as a graphical technique is reflected by the inclusion of more than 50 graphs and figures throughout the text. Several contexts in which density estimation can be used are discussed, including the exploration and presentation of data, nonparametric discriminant analysis, cluster analysis, simulation and the bootstrap, bump hunting, projection pursuit, and the estimation of hazard rates and other quantities that depend on the density. This book includes a general survey of methods available for density estimation. The kernel method, both for univariate and multivariate data, is discussed in detail, with particular emphasis on ways of deciding how much to smooth and on computational aspects. Attention is also given to adaptive methods, which smooth to a greater degree in the tails of the distribution, and to methods based on the idea of penalized likelihood.




Smoothing of Multivariate Data


Book Description

An applied treatment of the key methods and state-of-the-art tools for visualizing and understanding statistical data. Smoothing of Multivariate Data provides an illustrative and hands-on approach to the multivariate aspects of density estimation, emphasizing the use of visualization tools. Rather than outlining the theoretical concepts of classification and regression, this book focuses on the procedures for estimating a multivariate distribution via smoothing. The author first provides an introduction to various visualization tools that can be used to construct representations of multivariate functions, sets, data, and scales of multivariate density estimates. Next, readers are presented with an extensive review of the basic mathematical tools that are needed to asymptotically analyze the behavior of multivariate density estimators, with coverage of density classes, lower bounds, empirical processes, and manipulation of density estimates. The book concludes with an extensive toolbox of multivariate density estimators, including anisotropic kernel estimators, minimization estimators, multivariate adaptive histograms, and wavelet estimators. A completely interactive experience is encouraged, as all examples and figures can be easily replicated using the R software package, and every chapter concludes with numerous exercises that allow readers to test their understanding of the presented techniques. The R software is freely available on the book's related Web site along with "Code" sections for each chapter that provide short instructions for working in the R environment. Combining mathematical analysis with practical implementations, Smoothing of Multivariate Data is an excellent book for courses in multivariate analysis, data analysis, and nonparametric statistics at the upper-undergraduate and graduate levels. It also serves as a valuable reference for practitioners and researchers in the fields of statistics, computer science, economics, and engineering.




BONUS Algorithm for Large Scale Stochastic Nonlinear Programming Problems


Book Description

This book presents the details of the BONUS algorithm and its real-world applications in areas like sensor placement in large-scale drinking water networks, sensor placement in advanced power systems, water management in power systems, and capacity expansion of energy systems. It demonstrates a generalized method for stochastic nonlinear programming that uses a sampling-based approach for uncertainty analysis and statistical reweighting to obtain probability information. Stochastic optimization problems are difficult to solve since they involve dealing with optimization and uncertainty loops. There are two fundamental approaches used to solve such problems. The first uses decomposition techniques; the second identifies problem-specific structures and transforms the problem into a deterministic nonlinear programming problem. These techniques have significant limitations on either the objective function type or the underlying distributions for the uncertain variables. Moreover, these methods assume that there are a small number of scenarios to be evaluated for calculation of the probabilistic objective function and constraints. This book begins to tackle these issues by describing a generalized method for stochastic nonlinear programming problems. This title is best suited for practitioners, researchers and students in engineering, operations research, and management science who desire a complete understanding of the BONUS algorithm and its applications to the real world.




Nonparametric Functional Estimation


Book Description

Nonparametric Functional Estimation is a compendium of papers, written by experts, in the area of nonparametric functional estimation. This book attempts to be exhaustive in nature and is written both for specialists in the area and for students of statistics taking courses at the postgraduate level. The main emphasis throughout the book is on the discussion of several methods of estimation and on the study of their large sample properties. Chapters are devoted to topics on estimation of density and related functions, the application of density estimation to classification problems, and the different facets of estimation of distribution functions. Statisticians and students of statistics and engineering will find the text very useful.