Analyzing High-Dimensional Gene Expression and DNA Methylation Data with R


Book Description

Analyzing High-Dimensional Gene Expression and DNA Methylation Data with R is the first practical book that shows a "pipeline" of analytical methods with concrete examples, starting from raw gene expression and DNA methylation data at the genome scale. Methods for quality control, data pre-processing, data mining, and further assessments are presented in the book, and R programs based on simulated and real data are included. All code with example data is reproducible.

Features:
• Provides a sequence of analytical tools for genome-scale gene expression data and DNA methylation data, starting from quality control and pre-processing of raw genome-scale data.
• Organized as a parallel presentation of statistical methods and the corresponding R packages/functions for quality control, pre-processing, and data analyses (e.g., clustering and networks).
• Includes source code with simulated and real data to reproduce the results.

Readers are expected to gain the ability to independently analyze genome-scale expression and methylation data and detect potential biomarkers. This book is ideal for students majoring in statistics, biostatistics, and bioinformatics and for researchers with an interest in high-dimensional genetic and epigenetic studies.




Computational Genomics with R


Book Description

Computational Genomics with R provides a starting point for beginners in genomic data analysis and also guides more advanced practitioners to sophisticated data analysis techniques in genomics. The book covers topics from R programming, to machine learning and statistics, to the latest genomic data analysis techniques. The text provides accessible information and explanations, always with the genomics context in the background. It also contains practical and well-documented examples in R so readers can analyze their data by simply reusing the code presented. As the field of computational genomics is interdisciplinary, it requires different starting points for people with different backgrounds. For example, a biologist might skip sections on basic genome biology and start with R programming, whereas a computer scientist might want to start with genome biology.

After reading:
• You will have the basics of R and be able to dive right into specialized uses of R for computational genomics, such as using Bioconductor packages.
• You will be familiar with statistics and with supervised and unsupervised learning techniques that are important in data modeling and exploratory analysis of high-dimensional data.
• You will understand genomic intervals and operations on them that are used for tasks such as aligned read counting and genomic feature annotation.
• You will know the basics of processing and quality checking high-throughput sequencing data.
• You will be able to do sequence analysis, such as calculating GC content for parts of a genome or finding transcription factor binding sites.
• You will know about visualization techniques used in genomics, such as heatmaps, meta-gene plots, and genomic track visualization.
• You will be familiar with analysis of different high-throughput sequencing data sets, such as RNA-seq, ChIP-seq, and BS-seq.
• You will know basic techniques for integrating and interpreting multi-omics datasets.

Altuna Akalin is a group leader and head of the Bioinformatics and Omics Data Science Platform at the Berlin Institute of Medical Systems Biology, Max Delbrück Center, Berlin. He has been developing computational methods for analyzing and integrating large-scale genomics data sets since 2002. He has published an extensive body of work in this area. The framework for this book grew out of the yearly computational genomics courses he has been organizing and teaching since 2015.




Multivariate Data Integration Using R


Book Description

Large biological data, which are often noisy and high-dimensional, have become increasingly prevalent in biology and medicine. There is a real need for good training in statistics, from data exploration through to analysis and interpretation. This book provides an overview of statistical and dimension reduction methods for high-throughput biological data, with a specific focus on data integration. It starts with some biological background and the key concepts underlying the multivariate methods, and then covers an array of methods implemented using the mixOmics package in R.

Features:
• Provides a broad and accessible overview of methods for multi-omics data integration
• Covers a wide range of multivariate methods, each designed to answer specific biological questions
• Includes comprehensive visualisation techniques to aid in data interpretation
• Includes many worked examples and case studies using real data
• Includes reproducible R code for each multivariate method, using the mixOmics package

The book is suitable for researchers from a wide range of scientific disciplines wishing to apply these methods to obtain new and deeper insights into biological mechanisms and biomedical problems. The suite of tools introduced in this book will enable students and scientists to work at the interface between, and provide critical collaborative expertise to, biologists, bioinformaticians, statisticians and clinicians.




Bayesian Approaches in Oncology Using R and OpenBUGS


Book Description

Bayesian Approaches in Oncology Using R and OpenBUGS serves two audiences: those who are familiar with the theory and applications of the Bayesian approach and wish to learn or enhance their skills in R and OpenBUGS, and those who are enrolled in an R- and OpenBUGS-based course on implementing the Bayesian approach. For those who have never used R/OpenBUGS, the book begins with a self-contained introduction to R that lays the foundation for later chapters. Many books on the Bayesian approach and statistical analysis are advanced, and many are theoretical. While most of them do cover the objective, the fact remains that data analysis cannot be learned without actually doing it, and this means using dedicated statistical software. There are several software packages, each with its own specific objective. All of these packages are free to use, versatile in problem-solving, and work with R and OpenBUGS. The book goes on to cover a range of statistical analysis techniques that are growing in use in oncology. It is intended as a single source of information on Bayesian statistical methodology for oncology research, covering several dimensions of statistical analysis. The book explains data analysis using real examples and includes all the R and OpenBUGS code necessary to reproduce the analyses. The overall aim is to extend the use of the Bayesian approach in oncology practice.

It presents the statistical application framework in four sections:
• Bayesian in Clinical Research and Sample Size Calculation
• Bayesian in Time-to-Event Data Analysis
• Bayesian in Longitudinal Data Analysis
• Bayesian in Diagnostics Test Statistics

This book is intended as a first course in Bayesian biostatistics for oncology students. An oncologist can find useful guidance for implementing Bayesian methods in research work. It serves as a practical guide and an excellent resource for learning the theory and practice of Bayesian methods for the applied statistician, biostatistician, and data scientist.




Introduction to Bioinformatics with R


Book Description

In biological research, the amount of data available to researchers has increased so much over recent years that it is becoming increasingly difficult to understand the current state of the art without some experience and understanding of data analytics and bioinformatics. An Introduction to Bioinformatics with R: A Practical Guide for Biologists leads the reader through the basics of computational analysis of data encountered in modern biological research. With no previous experience with statistics or programming required, readers will develop the ability to plan suitable analyses of biological datasets, and to use the R programming environment to perform these analyses. This is achieved through a series of case studies that use R to answer research questions with molecular biology datasets. Broadly applicable statistical methods are explained, including linear and rank-based correlation, distance metrics and hierarchical clustering, hypothesis testing using linear regression, proportional hazards regression for survival data, and principal component analysis. These methods are then applied as appropriate throughout the case studies, illustrating how they can be used to answer research questions.

Key Features:
· Provides a practical course in computational data analysis suitable for students or researchers with no previous exposure to computer programming.
· Describes in detail the theoretical basis for statistical analysis techniques used throughout the textbook, from basic principles.
· Presents walk-throughs of data analysis tasks using R and example datasets. All R commands are presented and explained in order to enable the reader to carry out these tasks themselves.
· Uses outputs from a large range of molecular biology platforms including DNA methylation and genotyping microarrays; RNA-seq, genome sequencing, ChIP-seq and bisulphite sequencing; and high-throughput phenotypic screens.
· Gives worked-out examples geared towards problems encountered in cancer research, which can also be applied across many areas of molecular biology and medical research.

This book has been developed over years of training biological scientists and clinicians to analyse the large datasets available in their cancer research projects. It is appropriate for use as a textbook or as a practical book for biological scientists looking to gain bioinformatics skills.




Data Integration, Manipulation and Visualization of Phylogenetic Trees


Book Description

Data Integration, Manipulation and Visualization of Phylogenetic Trees introduces and demonstrates data integration, manipulation and visualization of phylogenetic trees using a suite of R packages: tidytree, treeio, ggtree and ggtreeExtra. Using the most comprehensive packages for phylogenetic data integration and visualization, the book contains numerous examples that can be used for teaching and learning. It is ideal for undergraduate readers and researchers with a working knowledge of R and ggplot2.

Key Features:
• Manipulating phylogenetic trees with associated data using tidy verbs
• Integrating phylogenetic data from diverse sources
• Visualizing phylogenetic data using the grammar of graphics




Data Analysis for the Life Sciences with R


Book Description

This book covers several of the statistical concepts and data analytic skills needed to succeed in data-driven life science research. The authors proceed from relatively basic concepts related to computed p-values to advanced topics related to analyzing high-throughput data. They include the R code that performs this analysis and connect the lines of code to the statistical and mathematical concepts explained.







Comprehensive Chemometrics


Book Description

Comprehensive Chemometrics, Second Edition, Four Volume Set features expanded and updated coverage, along with new content that covers advances in the field since the previous edition was published in 2009. Subjects of note include updates in the fields of multidimensional and megavariate data analysis, omics data analysis, big chemical and biochemical data analysis, data fusion and sparse methods. The book follows a similar structure to the previous edition, using the same section titles to frame articles. Many chapters from the previous edition are updated, but there are also many new chapters on the latest developments.

• Presents integrated reviews of each chemical and biological method, examining their merits and limitations through practical examples and extensive visuals
• Bridges a gap in knowledge, covering developments in the field since the first edition was published in 2009
• Meticulously organized, with articles split into 4 sections and 12 sub-sections on key topics to allow students, researchers and professionals to find relevant information quickly and easily
• Written by academics and practitioners from various fields and regions to ensure that the knowledge within is easily understood and applicable to a large audience




Comprehensive Toxicology


Book Description

Comprehensive Toxicology, Third Edition, Fifteen Volume Set discusses chemical effects on biological systems, with a focus on understanding the mechanisms by which chemicals induce adverse health effects. Organized by organ system, this comprehensive reference work addresses the toxicological effects of chemicals on the immune system, the hematopoietic system, cardiovascular system, respiratory system, hepatic toxicology, renal toxicology, gastrointestinal toxicology, reproductive and endocrine toxicology, neuro and behavioral toxicology, developmental toxicology and carcinogenesis. It also includes critical sections that cover the general principles of toxicology, cellular and molecular toxicology, biotransformation, and toxicology testing and evaluation. Each section is examined in state-of-the-art chapters written by domain experts, providing key information to support the investigations of researchers across the medical, veterinary, food, environment and chemical research industries, and national and international regulatory agencies. Thoroughly revised and expanded to 15 volumes that include the latest advances in research, and uniquely organized by organ system for ease of reference and diagnosis, this new edition is an essential reference for researchers of toxicology.

• Organized to cover both the fundamental principles of toxicology and unique aspects of major organ systems
• Thoroughly revised to include the latest advances in the toxicological effects of chemicals on the immune system
• Features additional coverage throughout and a new volume on toxicology of the hematopoietic system
• Presents in-depth, comprehensive coverage from an international author base of domain experts