The Elements of Statistical Learning


Book Description

During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book’s coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting---the first comprehensive treatment of this topic in any book. This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression & path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for “wide” data (p bigger than n), including multiple testing and false discovery rates. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.




Statistical Inference


Book Description

This classic textbook builds theoretical statistics from the first principles of probability theory. Starting from the basics of probability, the authors develop the theory of statistical inference using techniques, definitions, and concepts that are statistical and natural extensions, and consequences, of previous concepts. It covers all topics from a standard inference course including: distributions, random variables, data reduction, point estimation, hypothesis testing, and interval estimation. Features The classic graduate-level textbook on statistical inference Develops elements of statistical theory from first principles of probability Written in a lucid style accessible to anyone with some background in calculus Covers all key topics of a standard course in inference Hundreds of examples throughout to aid understanding Each chapter includes an extensive set of graduated exercises Statistical Inference, Second Edition is primarily aimed at graduate students of statistics, but can be used by advanced undergraduate students majoring in statistics who have a solid mathematics background. It also stresses the more practical uses of statistical theory, being more concerned with understanding basic statistical concepts and deriving reasonable statistical procedures, while less focused on formal optimality considerations. This is a reprint of the second edition originally published by Cengage Learning, Inc. in 2001.




Computer Age Statistical Inference, Student Edition


Book Description

The twenty-first century has seen a breathtaking expansion of statistical methodology, both in scope and influence. 'Data science' and 'machine learning' have become familiar terms in the news, as statistical methods are brought to bear upon the enormous data sets of modern science and commerce. How did we get here? And where are we going? How does it all fit together? Now in paperback and fortified with exercises, this book delivers a concentrated course in modern statistical thinking. Beginning with classical inferential theories - Bayesian, frequentist, Fisherian - individual chapters take up a series of influential topics: survival analysis, logistic regression, empirical Bayes, the jackknife and bootstrap, random forests, neural networks, Markov Chain Monte Carlo, inference after model selection, and dozens more. The distinctly modern approach integrates methodology and algorithms with statistical inference. Each chapter ends with class-tested exercises, and the book concludes with speculation on the future direction of statistics and data science.




All of Statistics


Book Description

Taken literally, the title "All of Statistics" is an exaggeration. But in spirit, the title is apt, as the book does cover a much broader range of topics than a typical introductory book on mathematical statistics. This book is for people who want to learn probability and statistics quickly. It is suitable for graduate or advanced undergraduate students in computer science, mathematics, statistics, and related disciplines. The book includes modern topics like non-parametric curve estimation, bootstrapping, and classification, topics that are usually relegated to follow-up courses. The reader is presumed to know calculus and a little linear algebra. No previous knowledge of probability and statistics is required. Statistics, data mining, and machine learning are all concerned with collecting and analysing data.







Essentials of Statistical Inference


Book Description

Aimed at advanced undergraduates and graduate students in mathematics and related disciplines, this engaging textbook gives a concise account of the main approaches to inference, with particular emphasis on the contrasts between them. It is the first textbook to synthesize contemporary material on computational topics with basic mathematical theory.




An Introduction to Statistical Learning


Book Description

An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance, marketing, and astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, deep learning, survival analysis, multiple testing, and more. Color graphics and real-world examples are used to illustrate the methods presented. This book is targeted at statisticians and non-statisticians alike, who wish to use cutting-edge statistical learning techniques to analyze their data. Four of the authors co-wrote An Introduction to Statistical Learning, With Applications in R (ISLR), which has become a mainstay of undergraduate and graduate classrooms worldwide, as well as an important reference book for data scientists. One of the keys to its success was that each chapter contains a tutorial on implementing the analyses and methods presented in the R scientific computing environment. However, in recent years Python has become a popular language for data science, and there has been increasing demand for a Python-based alternative to ISLR. Hence, this book (ISLP) covers the same materials as ISLR but with labs implemented in Python. These labs will be useful both for Python novices, as well as experienced users.







Essential Statistical Inference


Book Description

​This book is for students and researchers who have had a first year graduate level mathematical statistics course. It covers classical likelihood, Bayesian, and permutation inference; an introduction to basic asymptotic distribution theory; and modern topics like M-estimation, the jackknife, and the bootstrap. R code is woven throughout the text, and there are a large number of examples and problems. An important goal has been to make the topics accessible to a wide audience, with little overt reliance on measure theory. A typical semester course consists of Chapters 1-6 (likelihood-based estimation and testing, Bayesian inference, basic asymptotic results) plus selections from M-estimation and related testing and resampling methodology. Dennis Boos and Len Stefanski are professors in the Department of Statistics at North Carolina State. Their research has been eclectic, often with a robustness angle, although Stefanski is also known for research concentrated on measurement error, including a co-authored book on non-linear measurement error models. In recent years the authors have jointly worked on variable selection methods. ​