New Approaches to Protein NMR Automation


Book Description

The three-dimensional structure of a protein molecule is the key to understanding its biological and physiological properties. A major problem in bioinformatics is to efficiently determine the three-dimensional structures of query proteins. Protein NMR structure determination is one of the main experimental methods and comprises: (i) protein sample production and isotope labelling, (ii) collecting NMR spectra, and (iii) analysis of the spectra to produce the protein structure. In protein NMR, the three-dimensional structure is determined by exploiting a set of distance restraints between spatially proximate atoms. Currently, no practical automated protein NMR method works without human intervention. We first propose a complete automated protein NMR pipeline, which can efficiently be used to determine the structures of moderate-sized proteins. Second, we propose a novel and efficient semidefinite programming (SDP)-based protein structure determination method. The proposed automated protein NMR pipeline consists of three modules: (i) an automated peak picking method, called PICKY, (ii) a backbone chemical shift assignment method, called IPASS, and (iii) a protein structure determination method, called FALCON-NMR. When tested on four real protein data sets, this pipeline can produce structures with reasonable accuracies, starting from NMR spectra. This general method can be applied to other macromolecule structure determination methods. For example, a promising application is RNA NMR-assisted secondary structure determination.
In the second part of this thesis, due to the shortcomings of FALCON-NMR, we propose a novel SDP-based protein structure determination method from NMR data, called SPROS. Most of the existing prominent protein NMR structure determination methods are based on molecular dynamics coupled with a simulated annealing schedule. In these methods, an objective function representing the error between observed and given distance restraints is minimized; these objective functions are highly non-convex and difficult to optimize. Euclidean distance geometry methods based on SDP provide a natural formulation for realizing a three-dimensional structure from a set of given distance constraints. However, the complexity of the SDP solvers increases cubically with the input matrix size, i.e., the number of atoms in the protein, and the number of constraints. In fact, the complexity of SDP solvers is a major obstacle in their applicability to the protein NMR problem. To overcome these limitations, the SPROS method models the protein molecule as a set of intersecting two- and three-dimensional cliques. We adapt and extend a technique called semidefinite facial reduction for the SDP matrix size reduction, which makes the SDP problem size approximately one quarter of the original problem. The reduced problem is solved nearly one hundred times faster and is more robust against numerical problems. Reasonably accurate results were obtained when SPROS was applied to a set of 20 real protein data sets.
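To make the idea of a distance-restraint objective concrete, the following is a minimal sketch of one common form: the sum of squared violations of lower and upper distance bounds. It is an illustration only and is not taken from FALCON-NMR, SPROS, or any other program named above; the function name, the atom labels, and the bound values are hypothetical.

```python
import math


def restraint_violation(coords, restraints):
    """Return the sum of squared violations of distance bounds.

    coords     : dict mapping an atom label to its (x, y, z) position
    restraints : list of (atom_i, atom_j, lower, upper) tuples, with the
                 bounds in the same length unit as the coordinates
    Both the data layout and the squared-violation form are assumptions
    made for this sketch.
    """
    total = 0.0
    for atom_i, atom_j, lower, upper in restraints:
        d = math.dist(coords[atom_i], coords[atom_j])
        if d < lower:
            # Atoms are closer than the lower bound allows.
            total += (lower - d) ** 2
        elif d > upper:
            # Atoms are farther apart than the upper bound allows.
            total += (d - upper) ** 2
    return total


# Hypothetical two-atom example: the atoms are 5.0 units apart.
coords = {"HA": (0.0, 0.0, 0.0), "HN": (3.0, 4.0, 0.0)}
violated = [("HA", "HN", 1.8, 4.0)]   # upper bound exceeded by 1.0
satisfied = [("HA", "HN", 1.8, 6.0)]  # distance lies within the bounds

print(restraint_violation(coords, violated))
print(restraint_violation(coords, satisfied))
```

Because the penalty depends on interatomic distances rather than directly on coordinates, a function of this kind is non-convex in the atomic positions, which is the difficulty the description attributes to simulated-annealing-based methods.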







Structure Computation and Dynamics in Protein NMR


Book Description

Volume 17 is the second in a special topic series devoted to modern techniques in protein NMR, under the Biological Magnetic Resonance series. Volume 16, with the subtitle Modern Techniques in Protein NMR, is the first in this series. These two volumes present some of the recent, significant advances in the biomolecular NMR field with emphasis on developments during the last five years. We are honored to have brought together in these volumes some of the world's foremost experts who have provided broad leadership in advancing this field. Volume 16 contains advances in two broad categories: I. Large Proteins, Complexes, and Membrane Proteins and II. Pulse Methods. Volume 17 contains major advances in: I. Computational Methods and II. Structure and Dynamics. The opening chapter of volume 17 starts with a consideration of some important aspects of modeling from spectroscopic and diffraction data by Wilfred van Gunsteren and his colleagues. The next two chapters deal with combined automated assignments and protein structure determination, an area of intense research in many laboratories since the traditional manual methods are often inadequate or laborious in handling large volumes of NMR data on large proteins. First, Werner Braun and his associates describe their experience with the NOAH/DIAMOD protocol developed in their laboratory.




Computational Development Towards High-throughput NMR-based Protein Structure Determination


Book Description

Three-dimensional structures of proteins determined in solution by NMR spectroscopy have the unique advantage of revealing details of molecular structure and dynamics in a physiologically relevant state; however, the many tedious steps needed to solve and validate a structure make this method challenging. The barriers to NMR structure determination become higher for larger proteins whose spectra are harder to resolve. It is clear that advances need to be made in automating protein structure determination by NMR spectroscopy. The goal of my research has been to use computational methods to advance the development of high-throughput NMR spectroscopy. Accelerating and streamlining the structure determination process will enable investigators to spend less time solving structures and more time investigating challenging biomolecular systems. My goals have been to develop an automation protocol that integrates multiple steps, ensures the robustness of each step, incorporates iterative corrections, and includes visualization tools to validate and extend the results. I developed PINE-SPARKY as a graphical interface for checking and extending automated assignments made by the PINE-NMR server. ADAPT-NMR directs fast data collection by reduced dimensionality on the basis of ongoing NMR assignments. I helped develop a version of ADAPT-NMR (originally only for Varian spectrometers) for Bruker spectrometers, and I created ADAPT-NMR Enhancer as a visualization tool for validating and extending assignments made by ADAPT-NMR on either spectrometer system. I developed the PONDEROSA package to automate the next steps. PONDEROSA carries out automatic picking of 3D-NOESY peaks and iterative structure determinations with the protein sequence and the assignments as inputs. These automation and visualization tools cover almost all of the steps involved in protein structure determination by NMR spectroscopy. 
As a practical test of this technology, I solved the structure of the 2A proteinase from the human rhinovirus. As a side project, I built a relational database (PACSY DB) that combines information from the Protein Data Bank (PDB) and the Biological Magnetic Resonance Data Bank (BMRB) and incorporates tools for structure analysis. PACSY DB can carry out complex queries that combine atomic coordinates, NMR parameters, and structural features of proteins.







Protein NMR Spectroscopy


Book Description

Nuclear Magnetic Resonance (NMR) spectroscopy, a technique based upon the magnetic properties of certain atomic nuclei, has found a wide range of applications in life sciences over recent decades. This up-to-date volume covers NMR techniques and their application to proteins, with a focus on practical details. It provides newcomers to NMR with practical guidance for carrying out successful experiments with proteins and analyzing the resulting spectra; those familiar with the chemical applications of NMR will also find it useful in understanding the special requirements of protein NMR.










Structural Bioinformatics


Book Description

Structural Bioinformatics was the first major effort to show the application of the principles and basic knowledge of the larger field of bioinformatics to questions focusing on macromolecular structure, such as the prediction of protein structure and how proteins carry out cellular functions, and how the application of bioinformatics to these life science issues can improve healthcare by accelerating drug discovery and development. Designed primarily as a reference, the first edition nevertheless saw widespread use as a textbook in graduate and undergraduate university courses dealing with the theories and associated algorithms, resources, and tools used in the analysis, prediction, and theoretical underpinnings of DNA, RNA, and proteins. This new edition contains not only thorough updates of the advances in structural bioinformatics since publication of the first edition, but also features eleven new chapters dealing with frontier areas of high scientific impact, including: sampling and search techniques; use of mass spectrometry; genome functional annotation; and much more. Offering detailed coverage for practitioners while remaining accessible to the novice, Structural Bioinformatics, Second Edition is a valuable resource and an excellent textbook for a range of readers in the bioinformatics and advanced biology fields. Praise for the previous edition: "This book is a gold mine of fundamental and practical information in an area not previously well represented in book form." —Biochemistry and Molecular Education "... destined to become a classic reference work for workers at all levels in structural bioinformatics...recommended with great enthusiasm for educators, researchers, and graduate students." —BAMBED "...a useful and timely summary of a rapidly expanding field." —Nature Structural Biology "...a terrific job in this timely creation of a compilation of articles that appropriately addresses this issue." —Briefings in Bioinformatics