Algorithms for Next-Generation Sequencing Data


Book Description

The 14 contributed chapters in this book survey the most recent developments in high-performance algorithms for NGS data, offering fundamental insights and technical information specifically on indexing, compression and storage; error correction; alignment; and assembly. The book will be of value to researchers, practitioners and students engaged with bioinformatics, computer science, mathematics, statistics and life sciences.




Computational Genomics with R


Book Description

Computational Genomics with R provides a starting point for beginners in genomic data analysis and also guides more advanced practitioners to sophisticated data analysis techniques in genomics. The book covers topics from R programming, to machine learning and statistics, to the latest genomic data analysis techniques. The text provides accessible information and explanations, always with the genomics context in the background. It also contains practical and well-documented examples in R, so readers can analyze their own data by reusing the code presented. As the field of computational genomics is interdisciplinary, it requires different starting points for people with different backgrounds. For example, a biologist might skip sections on basic genome biology and start with R programming, whereas a computer scientist might want to start with genome biology.

After reading:
* You will have the basics of R and be able to dive right into specialized uses of R for computational genomics, such as using Bioconductor packages.
* You will be familiar with statistics and with supervised and unsupervised learning techniques that are important in data modeling and in exploratory analysis of high-dimensional data.
* You will understand genomic intervals and the operations on them that are used for tasks such as aligned read counting and genomic feature annotation.
* You will know the basics of processing and quality checking high-throughput sequencing data.
* You will be able to do sequence analysis, such as calculating GC content for parts of a genome or finding transcription factor binding sites.
* You will know about visualization techniques used in genomics, such as heatmaps, meta-gene plots, and genomic track visualization.
* You will be familiar with analysis of different high-throughput sequencing data sets, such as RNA-seq, ChIP-seq, and BS-seq.
* You will know basic techniques for integrating and interpreting multi-omics datasets.

Altuna Akalin is a group leader and head of the Bioinformatics and Omics Data Science Platform at the Berlin Institute of Medical Systems Biology, Max Delbrück Center, Berlin. He has been developing computational methods for analyzing and integrating large-scale genomics data sets since 2002. He has published an extensive body of work in this area. The framework for this book grew out of the yearly computational genomics courses he has been organizing and teaching since 2015.




High Performance Computing for Computational Science - VECPAR 2004


Book Description

This book constitutes the thoroughly refereed post-proceedings of the 6th International Conference on High Performance Computing for Computational Science, VECPAR 2004, held in Valencia, Spain, in June 2004. The 48 revised full papers, presented together with 5 invited papers, were carefully selected during two rounds of reviewing and improvement from an initial 130 contributions. The papers are organized in topical sections on large-scale computations, data management and data mining, GRID computing infrastructure, cluster computing, parallel and distributed computing, and computational linear and non-linear algebra.







High-Performance Computing Using FPGAs


Book Description

High-Performance Computing Using FPGAs covers the area of high-performance reconfigurable computing (HPRC), providing an overview of architectures, tools and applications. FPGAs offer very high I/O bandwidth and fine-grained, custom and flexible parallelism. This comes at a time of ever-increasing computational needs coupled with the frequency/power wall, increasing maturity and capabilities of FPGAs, and the advent of multicore processors, which has driven the acceptance of parallel computational models. The part on architectures will introduce different FPGA-based HPC platforms: attached co-processor HPRC architectures such as CHREC’s Novo-G and EPCC’s Maxwell systems; tightly coupled HPRC architectures, e.g. the Convey hybrid-core computer; reconfigurably networked HPRC architectures, e.g. the QPACE system; and standalone HPRC architectures such as EPFL’s CONFETTI system. The part on tools will focus on high-level programming approaches for HPRC, with chapters on C-to-Gate tools (such as Impulse-C, AutoESL, Handel-C, MORA-C++); graphical tools (MATLAB-Simulink, NI LabVIEW); domain-specific languages; and languages for heterogeneous computing (for example OpenCL, Microsoft’s Kiwi and Alchemy projects). The part on applications will present cases from several application domains where HPRC has been used successfully, such as bioinformatics and computational biology; financial computing; stencil computations; information retrieval; lattice QCD; astrophysics simulations; and weather and climate modeling.




Genome-Scale Algorithm Design


Book Description

Guided by standard bioscience workflows in high-throughput sequencing analysis, this book for graduate students, researchers, and professionals in bioinformatics and computer science offers a unified presentation of genome-scale algorithms. This new edition covers the use of minimizers and other advanced data structures in pangenomics approaches.




High Performance Computing


Book Description

The 5th International Symposium on High Performance Computing (ISHPC-V) was held in Odaiba, Tokyo, Japan, October 20–22, 2003. The symposium was thoughtfully planned, organized, and supported by the ISHPC Organizing Committee and its collaborating organizations. The ISHPC-V program included two keynote speeches, several invited talks, two panel discussions, and technical sessions covering theoretical and applied research topics in high-performance computing and representing both academia and industry. One of the regular sessions highlighted the research results of the ITBL project (IT-based research laboratory, http://www.itbl.riken.go.jp/). ITBL is a Japanese national project started in 2001 with the objective of realizing a virtual joint research environment using information technology. ITBL aims to connect 100 supercomputers located in main Japanese scientific research laboratories via high-speed networks. A total of 58 technical contributions from 11 countries were submitted to ISHPC-V. Each paper received at least three peer reviews. After a thorough evaluation process, the program committee selected 14 regular (12-page) papers for presentation at the symposium. In addition, several other papers with favorable reviews were recommended for a poster session presentation. They are also included in the proceedings as short (8-page) papers. The program committee gave a distinguished paper award and a best student paper award to two of the regular papers. The distinguished paper award was given for “Code and Data Transformations for Improving Shared Cache Performance on SMT Processors” by Dimitrios S. Nikolopoulos. The best student paper award was given for “Improving Memory Latency Aware Fetch Policies for SMT Processors” by Francisco J. Cazorla.




Grid Computing for Bioinformatics and Computational Biology


Book Description

The only single, up-to-date source for Grid issues in bioinformatics and biology

Bioinformatics is fast emerging as an important discipline for academic research and industrial applications, creating a need for the use of Grid computing techniques for large-scale distributed applications. This book successfully presents Grid algorithms and their real-world applications, provides details on modern and ongoing research, and explores software frameworks that integrate bioinformatics and computational biology. Additional coverage includes:
* Bio-ontology and data mining
* Data visualization
* DNA assembly, clustering, and mapping
* Molecular evolution and phylogeny
* Gene expression and micro-arrays
* Molecular modeling and simulation
* Sequence search and alignment
* Protein structure prediction
* Grid infrastructure, middleware, and tools for bio data

Grid Computing for Bioinformatics and Computational Biology is an indispensable resource for professionals in several research and development communities including bioinformatics, computational biology, Grid computing, data mining, and more. It also serves as an ideal textbook for undergraduate- and graduate-level courses in bioinformatics and Grid computing.




High Performance Computing - HiPC 2002


Book Description

This book constitutes the refereed proceedings of the 9th International Conference on High Performance Computing, HiPC 2002, held in Bangalore, India in December 2002. The 57 revised full contributed papers and 9 invited papers presented together with various keynote abstracts were carefully reviewed and selected from 145 submissions. The papers are organized in topical sections on algorithms, architecture, systems software, networks, mobile computing and databases, applications, scientific computation, embedded systems, and biocomputing.




8th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB 2014)


Book Description

Biological and biomedical research are increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next generation sequencing technologies, together with novel and ever-evolving types of omics data technologies, have posed an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis of the datasets produced and their integration call for new algorithms and approaches from fields such as Databases, Statistics, Data Mining, Machine Learning, Optimization, Computer Science and Artificial Intelligence. Clearly, Biology is more and more a science of information, requiring tools from the computational sciences. In the last few years, we have seen the surge of a new generation of interdisciplinary scientists who have a strong background in the biological and computational sciences. In this context, the interaction of researchers from different scientific fields is, more than ever, of foremost importance, boosting the research efforts in the field and contributing to the education of a new generation of Bioinformatics scientists. PACBB'14 contributes to this effort by promoting this fruitful interaction. The PACBB'14 technical program included 34 papers spanning many different sub-fields in Bioinformatics and Computational Biology. The conference thus promotes the interaction of scientists from diverse research groups and with distinct backgrounds, such as computer scientists, mathematicians and biologists.