Computational Development Towards High-throughput NMR-based Protein Structure Determination


Book Description

Three-dimensional structures of proteins determined in solution by NMR spectroscopy have the unique advantage of revealing details of molecular structure and dynamics in a physiologically relevant state; however, the many tedious steps needed to solve and validate a structure make this method challenging. The barriers to NMR structure determination become higher for larger proteins whose spectra are harder to resolve. It is clear that advances need to be made in automating protein structure determination by NMR spectroscopy. The goal of my research has been to use computational methods to advance the development of high-throughput NMR spectroscopy. Accelerating and streamlining the structure determination process will enable investigators to spend less time solving structures and more time investigating challenging biomolecular systems. My goals have been to develop an automation protocol that integrates multiple steps, ensures the robustness of each step, incorporates iterative corrections, and includes visualization tools to validate and extend the results. I developed PINE-SPARKY as a graphical interface for checking and extending automated assignments made by the PINE-NMR server. ADAPT-NMR directs fast data collection by reduced dimensionality on the basis of ongoing NMR assignments. I helped develop a version of ADAPT-NMR (originally only for Varian spectrometers) for Bruker spectrometers, and I created ADAPT-NMR Enhancer as a visualization tool for validating and extending assignments made by ADAPT-NMR on either spectrometer system. I developed the PONDEROSA package to automate the next steps. PONDEROSA carries out automatic picking of 3D-NOESY peaks and iterative structure determinations with the protein sequence and the assignments as inputs. These automation and visualization tools cover almost all of the steps involved in protein structure determination by NMR spectroscopy. 
As a practical test of this technology, I solved the structure of the 2A proteinase from the human rhinovirus. As a side project, I built a relational database (PACSY DB) that combines information from the Protein Data Bank (PDB) and the Biological Magnetic Resonance Data Bank (BMRB) and incorporates tools for structure analysis. PACSY DB can carry out complex queries that combine atomic coordinates, NMR parameters, and structural features of proteins.




Protein Structure


Book Description

This text offers in-depth perspectives on every aspect of protein structure identification, assessment, characterization, and utilization, for a clear understanding of the diversity of protein shapes, variations in protein function, and structure-based drug design. The authors cover numerous high-throughput technologies as well as computational methods to study protein structures and residues. A valuable reference, this book reflects current trends in the effort to solve new structures arising from genome initiatives, details methods to detect and identify errors in the prediction of protein structural models, and outlines challenges in the conversion of routine processes into high-throughput platforms.




Precise and Accurate Structural Genomics Protein Structure Determination Using RD and GFT NMR Spectroscopy


Book Description

Nuclear magnetic resonance (NMR) has emerged as a powerful tool for determining three-dimensional structures of proteins in solution. The major part of this dissertation describes efforts to address the core steps involved in NMR-based protein structure determination: NMR data collection, NMR data analysis, and structure calculation and refinement. NMR data collection has been recognized as one of the major bottlenecks of NMR-based structure determination because of the long measurement times it requires. Chapters 2 and 3 describe two NMR data collection and analysis protocols for high-quality protein structure determination based on Reduced Dimensionality (RD) and G-matrix Fourier transform (GFT) NMR, respectively. The rapidly collected RD and GFT NMR data enabled high-quality, high-throughput structure determination of four structural genomics target proteins. Chapter 5 introduces the program UBNMR, which was developed to facilitate NMR data analysis in general and RD and GFT NMR data pre-processing and analysis in particular. RD and GFT NMR-based protocols, with the aid of UBNMR, are expected to have a strong impact on NMR-based structural biology and structural genomics. In addition to rapid NMR data collection and analysis, structure calculation and refinement are also pivotal for obtaining high-quality protein structures. Chapter 5 describes the analysis of the newly implemented simultaneous GFT NOESY to obtain an accurate and precise initial structural fold. Chapter 6 presents a protein structure refinement strategy using NOE data collected in supercooled water at low temperatures.




Structure Computation and Dynamics in Protein NMR


Book Description

Volume 17 is the second in a special topic series devoted to modern techniques in protein NMR, under the Biological Magnetic Resonance series. Volume 16, with the subtitle Modern Techniques in Protein NMR, is the first in this series. These two volumes present some of the recent, significant advances in the biomolecular NMR field with emphasis on developments during the last five years. We are honored to have brought together in these volumes some of the world's foremost experts who have provided broad leadership in advancing this field. Volume 16 contains advances in two broad categories: I. Large Proteins, Complexes, and Membrane Proteins and II. Pulse Methods. Volume 17 contains major advances in: I. Computational Methods and II. Structure and Dynamics. The opening chapter of volume 17 starts with a consideration of some important aspects of modeling from spectroscopic and diffraction data by Wilfred van Gunsteren and his colleagues. The next two chapters deal with combined automated assignments and protein structure determination, an area of intense research in many laboratories since the traditional manual methods are often inadequate or laborious in handling large volumes of NMR data on large proteins. First, Werner Braun and his associates describe their experience with the NOAH/DIAMOD protocol developed in their laboratory.




Development and Application of Methodology for Rapid NMR Data Collection and Protein Structure Determination


Book Description

This thesis focuses on the development and application of methodology for rapid NMR data collection and protein structure determination. In chapter 1, simultaneously cycled (SC) NMR is introduced and exemplified by implementing without readout gradients a set of 2D [1H, 1H] SC Exclusive COSY (E.COSY) NMR experiments and with readout gradients a 2D [1H, 1H] double quantum filtered (DQF) COSY experiment. Spatially selective 1H r.f. pulses are applied as composite pulses such that n steps of the respective cycles are effected simultaneously in n slices of the sample, thus reducing total acquisition time by a factor of n. In chapter 2, the structure of the 142-residue protein Q8ZP25_SALTY encoded in the genome of Salmonella typhimurium (NESG target StR70) was determined by NMR, refined using residual dipolar coupling constraints and compared to the X-ray structure of Q8ZP25_SALTY and the NMR structure of homologous protein HYAE_ECOLI. Protein Q8ZP25_SALTY belongs to Pfam PF07449, which itself belongs to the 'thioredoxin-like clan'. However, protein Q8ZP25_SALTY and the other proteins of Pfam PF07449 do not contain the Cys-X-X-Cys active site sequence motif of thioredoxin. The structures presented here exhibit the expected thioredoxin-like fold and support biochemical data suggesting that members of Pfam family PF07449 specifically interact with Tat signal peptides involved in hydrogenase assembly. In chapter 3, the development of a hardware and software infrastructure designed specifically to support high throughput NMR protein structure determination for structural genomics is described. In addition, a "consensus run" protocol is detailed which uses common results from two disparate programs for automated NMR structure determination, namely, CYANA and AUTOSTRUCTURE, to minimize errors in initial NOESY peak assignments. The high throughput infrastructure and consensus run have supported the determination of 47 protein structures to date.
Finally, in chapter 4, the relaxation agent and gadolinium chelate gadoversetamide is used to affect the T1 and T2 relaxation times of amide and methyl protons in proteins in aqueous solution so as to map protein surfaces. The protocol is explored to distinguish surface and buried residues in order to identify homo-dimer interfaces.




Computational Aspects of the Study of Biological Macromolecules by Nuclear Magnetic Resonance Spectroscopy


Book Description

Without computers - no modern NMR; Parametric estimation in 1-D, 2-D, and 3-D NMR; Computational aspects of multinuclear NMR spectroscopy of proteins at NMRFAM; Principles of multidimensional NMR techniques for measurement of J coupling constants; Comparison of the NMR and X-ray structures of hirudin; The application of the linear prediction principle to NMR spectroscopy; NMR data processing and structure calculations using parallel computers; Software approaches for determination of 3-dimensional molecular structures from multi-dimensional NMR; Applicability and limitations of three-dimensional NMR spectroscopy for the study of proteins in solution; The role of selective two-dimensional NMR correlation methods in supplementing computer-supported multiplet analysis by MARCO POLO; Application of maximum entropy methods to NMR spectra of proteins; Pattern recognition in two-dimensional NMR spectra of proteins; The application and development of software tools for the processing and analysis of heteronuclear multi-dimensional NMR data; Distance geometry in torsion angle space: new developments and applications; Structure determination by NMR: the modeling of NMR parameters as ensemble averages; Time averaged distance restraints in NMR based structural refinement; Analysis of backbone dynamics of interleukin-1 beta; A new version of DADAS (Distance Analysis in Dihedral Angle Space) and its performance; An amateur looks at error analysis in the determination of protein structure by NMR; Structural interpretation of NMR data in the presence of motion; New interactive and automatic algorithms for the assignment of NMR spectra; Outline of a computer program for the analysis of protein NMR spectra; Assignment of the NMR spectra of homologous proteins; Incorporation of internal motion in NMR refinements based on NOESY data; Refinement of three-dimensional protein and DNA structures in solution from NMR data; How to deal with spin-diffusion and internal mobility in biomolecules: a relaxation matrix approach; Interactive computer graphics in the assignment of protein 2D and 3D NMR spectra; Determination of large protein structures from NMR data: definition of the solution structure of the TRP repressor; Interpretation of NMR data in terms of protein structure: summary of a round table discussion; Fast calculation of the relaxation matrix; NMR structures of proteins using stereospecific assignments and relaxation matrix refinement in a hybrid method of distance geometry and simulated annealing; A critique of the interpretation of nuclear Overhauser effects of duplex DNA; Improvement in resolution with nonlinear methods applied to NMR signals from macromolecules; STELLA and CLAIRE: a seraglio of programs for human-aided assignment of 2D 1H NMR spectra of proteins; MolSkop: towards NMR molecular scope; Ribonuclease H: full assignment of backbone proton resonances with heteronuclear 3D NMR and solution structure; Sampling properties of simulated annealing and distance geometry.




Structural Genomics and High Throughput Structural Biology


Book Description

Researchers in structural genomics continue to search for biochemical and cellular functions of proteins as well as the ways in which proteins assemble into functional pathways and networks using either experimental or computational approaches. Based on the experience of leading international experts, Structural Genomics and High Throughput Stru




Biological NMR Spectroscopy


Book Description

This book presents a critical assessment of progress on the use of nuclear magnetic resonance spectroscopy to determine the structure of proteins, including brief reviews of the history of the field along with coverage of current clinical and in vivo applications. The book, in honor of Oleg Jardetsky, one of the pioneers of the field, is edited by two of the most highly respected investigators using NMR, and features contributions by most of the leading workers in the field. It will be valued as a landmark publication that presents the state-of-the-art perspectives regarding one of today's most important technologies.




Protein Structure Determination by Paramagnetic NMR and Computational Hybrid Approach


Book Description

Computational modelling of proteins that relies on either de novo or evolutionary based approaches often produces poor quality structures, primarily due to the limitations in the algorithms or forcefields. Traditional experimental techniques such as X-ray crystallography depend on a narrow set of crystallographic conditions, while solution/solid state nuclear magnetic resonance (NMR) spectroscopy relies on cumbersome spectral analysis and complete resonance assignments. These traditional approaches are slow and costly endeavours. Computational/experimental hybrid approaches, on the other hand, provide a new avenue for reliable, rapid and cost-effective structure determination. Paramagnetic NMR offers easy generation of useful and sparse structural information which can be implemented as restraints in structure prediction algorithms. Pseudocontact shifts (PCS) are the most powerful of the structural restraints generated by paramagnetic NMR; they are long range in nature and can be easily obtained by simple 2D NMR experiments. This thesis demonstrates different approaches involved in protein structure calculations using PCS restraints in Rosetta. Chapter 2 demonstrates structure determination using PCS restraints exclusively obtained from protein samples in microcrystalline state by magic angle spinning (MAS) NMR spectroscopy. Chapter 3 discusses the use of PCS data from multiple metal centres to precisely determine the location of spins in space in a manner analogous to GPS satellites. Chapter 4 extends the usage of PCS data from multiple metal centres to capture distinct conformational states in proteins. Chapter 5 demonstrates new techniques specially developed for structure determination of large proteins involving super secondary structure motifs (Smotifs) and data driven iterative resampling.
These different computational techniques serve the goal of determining accurate 3D models using minimal experimental data, and are applicable to protein systems that are currently beyond the realm of traditional experimental approaches.




Computational Systems Bioinformatics


Book Description

This proceedings volume contains 29 papers covering many of the latest developments in the fast-growing field of bioinformatics. The contributions span a wide range of topics, including computational genomics and genetics, protein function and computational proteomics, the transcriptome, structural bioinformatics, microarray data analysis, motif identification, biological pathways and systems, and biomedical applications. The papers not only cover theoretical aspects of bioinformatics but also delve into the application of new methods, with input from computation, engineering and biology disciplines. This multidisciplinary approach to bioinformatics gives these proceedings a unique viewpoint of the field.