Compositional Data Analysis in the Geosciences


Book Description

Since Karl Pearson wrote his paper on spurious correlation in 1897, much has been said about the statistical analysis of compositional data, mainly by geologists such as Felix Chayes. The solution appeared in the 1980s, when John Aitchison proposed using logratios. Since then, the approach has expanded greatly, building mainly on the idea of the 'natural geometry' of the sample space. Statistics is expected to give sense to our perception of the natural scale of the data, and logratios make this possible for compositional data. This publication will be a milestone in this process.




Compositional Data Analysis


Book Description

It is difficult to imagine that the statistical analysis of compositional data has been a major issue of concern for more than 100 years. It is even more difficult to realize that so many statisticians and users of statistics are unaware of the particular problems affecting compositional data, as well as their solutions. The issue of "spurious correlation", as the situation was phrased by Karl Pearson back in 1897, affects all data that measure parts of some whole, such as percentages, proportions, ppm and ppb. Such measurements are present in all fields of science, including geology, biology, environmental sciences, forensic sciences, medicine and hydrology. This book presents the history and development of compositional data analysis along with Aitchison's log-ratio approach. Compositional Data Analysis describes the state of the art both in theory and in applications across the different fields of science. Key Features: Reflects the state of the art in compositional data analysis. Gives an overview of the historical development of compositional data analysis, as well as basic concepts and procedures. Looks at advances in algebra and calculus on the simplex. Presents applications in different fields of science, including genomics, ecology, biology, geochemistry, planetology, chemistry and economics. Explores connections to correspondence analysis and the Dirichlet distribution. Presents a summary of three available software packages for compositional data analysis. Supported by an accompanying website featuring R code. Applied scientists working on compositional data analysis in any field of science, both academics and professionals, will benefit from this book, along with graduate students in any field of science working with compositional data.




Analyzing Compositional Data with R


Book Description

This book presents the statistical analysis of compositional data sets, i.e., data in percentages, proportions, concentrations, etc. The subject is covered from its grounding principles to its practical use in descriptive exploratory analysis, robust linear models and advanced multivariate statistical methods, including the treatment of zeros and missing values, with special attention to data visualization and model display issues. Many illustrated examples and code chunks guide the reader through modeling and interpretation. Although the book primarily serves as a reference guide for the R package "compositions," it is also a general introductory text on Compositional Data Analysis. Awareness of the special characteristics of compositional data spread in the Geosciences in the early sixties, but a strategy for properly dealing with them was not available until the works of Aitchison in the eighties. Since then, research has expanded our understanding of their theoretical principles and the potentials and limitations of their interpretation. This is the first comprehensive textbook addressing these issues, as well as their practical implications with regard to software. The book is intended for scientists interested in statistically analyzing their compositional data. The subject enjoys relatively broad awareness in the geosciences and environmental sciences, but the spectrum of recent applications also covers areas like medicine, official statistics, and economics. Readers should be familiar with basic univariate and multivariate statistics. Knowledge of R is recommended but not required, as the book is self-contained.




Compositional Data Analysis in Practice


Book Description

Compositional data are quantitative descriptions of the parts of some whole, conveying exclusively relative information. Examples are found in various fields, including geology, medicine, chemistry, agriculture, economics, and social science. This concise book presents a very applied introduction to compositional data analysis, focussing on the use of R for analysis. It includes many real examples, code snippets, and colour figures to illustrate the methods.




Handbook of Mathematical Geosciences


Book Description

This Open Access handbook, published at the IAMG's 50th anniversary, presents a compilation of invited path-breaking research contributions by award-winning geoscientists who have been instrumental in shaping the IAMG. It contains 45 chapters that are categorized broadly into five parts: (i) theory, (ii) general applications, (iii) exploration and resource estimation, (iv) reviews, and (v) reminiscences, covering related topics such as mathematical geosciences, mathematical morphology, geostatistics, fractals and multifractals, spatial statistics, multipoint geostatistics, compositional data analysis, informatics, geocomputation, numerical methods, and chaos theory in the geosciences.




Statistics for Geoscientists


Book Description

Presents nearly all the important elementary and analytical methods of statistics, designed for the needs of the geoscientist and completely free from higher mathematics. Translated from the second German edition.




Data Science for the Geosciences


Book Description

Data Science for the Geosciences provides students and instructors with the statistical and machine learning foundations to address Earth science questions using real-world case studies in natural hazards, climate change, environmental contamination and Earth resources. It focuses on techniques that address common characteristics of geoscientific data, including extremes, multivariate, compositional, geospatial and space-time methods. Step-by-step instructions are provided, enabling readers to easily follow the protocols for each method, solve their geoscientific problems and make interpretations. With an emphasis on intuitive reasoning throughout, students are encouraged to develop their understanding without the need for complex mathematics, making this the perfect text for those with limited mathematical or coding experience. Students can test their skills with homework exercises that focus on data scientific analysis, modeling, and prediction problems, and through the use of supplemental Python notebooks that can be applied to real datasets worldwide.




Modeling and Analysis of Compositional Data


Book Description

Modeling and Analysis of Compositional Data presents a practical and comprehensive introduction to the analysis of compositional data, along with numerous examples to illustrate both the theory and the application of each method. Based upon short courses delivered by the authors, it provides a complete and current compendium of fundamental to advanced methodologies, with exercises at the end of each chapter to improve understanding, as well as data and a solutions manual available on an accompanying website. Complementing Pawlowsky-Glahn's earlier collective text that provides an overview of the state of the art in this field, Modeling and Analysis of Compositional Data fills a gap in the literature for a much-needed manual for teaching, self-learning or consulting.




Compositional Data Analysis in Practice


Book Description

Compositional Data Analysis in Practice is a user-oriented practical guide to the analysis of data with the property of a constant sum, for example percentages adding up to 100%. Compositional data can give misleading results if regular statistical methods are applied, and are best analysed by first transforming them to logarithms of ratios. This book explains how this transformation affects the analysis, results and interpretation of this very special type of data. All aspects of compositional data analysis are considered: visualization, modelling, dimension-reduction, clustering and variable selection, with many examples in the fields of food science, archaeology, sociology and biochemistry, and a final chapter containing a complete case study using fatty acid compositions in ecology. The applicability of these methods extends to other fields such as linguistics, geochemistry, marketing, economics and finance.

R Software

The following repository contains data files and R scripts from the book: https://github.com/michaelgreenacre/CODAinPractice

The R package easyCODA, which accompanies this book, is available on CRAN; note that you should have version 0.25 or higher. The latest version of the package will always be available on R-Forge and can be installed from R with this instruction: install.packages("easyCODA", repos="http://R-Forge.R-project.org").
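
As a rough illustration of the log-ratio idea described in this entry, the sketch below computes the centred log-ratio (clr) of a three-part composition in plain Python. It is not taken from the book or from the easyCODA package; the function names `closure` and `clr` and the example values are assumptions chosen purely for illustration.

```python
import math

def closure(parts):
    """Rescale positive parts so that they sum to 1 (hypothetical helper)."""
    total = sum(parts)
    return [p / total for p in parts]

def clr(parts):
    """Centred log-ratio: log of each part minus the mean of the logs.

    This equals the log of each part divided by the geometric mean
    of all parts. All parts must be strictly positive.
    """
    if any(p <= 0 for p in parts):
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Illustrative composition: percentages 60, 30 and 10 closed to sum to 1.
composition = closure([60.0, 30.0, 10.0])
clr_values = clr(composition)

# The same relative information expressed on a different scale.
clr_rescaled = clr([6.0, 3.0, 1.0])
```

The clr values sum to zero (up to rounding), and `clr_rescaled` matches `clr_values`, which reflects the point that only the ratios between parts carry information, not their total.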