Computational Development Towards High-throughput NMR-based Protein Structure Determination


Book Description

Three-dimensional structures of proteins determined in solution by NMR spectroscopy have the unique advantage of revealing details of molecular structure and dynamics in a physiologically relevant state; however, the many tedious steps needed to solve and validate a structure make this method challenging. The barriers to NMR structure determination become higher for larger proteins whose spectra are harder to resolve. It is clear that advances need to be made in automating protein structure determination by NMR spectroscopy. The goal of my research has been to use computational methods to advance the development of high-throughput NMR spectroscopy. Accelerating and streamlining the structure determination process will enable investigators to spend less time solving structures and more time investigating challenging biomolecular systems. My goals have been to develop an automation protocol that integrates multiple steps, ensures the robustness of each step, incorporates iterative corrections, and includes visualization tools to validate and extend the results. I developed PINE-SPARKY as a graphical interface for checking and extending automated assignments made by the PINE-NMR server. ADAPT-NMR directs fast data collection by reduced dimensionality on the basis of ongoing NMR assignments. I helped develop a version of ADAPT-NMR (originally only for Varian spectrometers) for Bruker spectrometers, and I created ADAPT-NMR Enhancer as a visualization tool for validating and extending assignments made by ADAPT-NMR on either spectrometer system. I developed the PONDEROSA package to automate the next steps. PONDEROSA carries out automatic picking of 3D-NOESY peaks and iterative structure determinations with the protein sequence and the assignments as inputs. These automation and visualization tools cover almost all of the steps involved in protein structure determination by NMR spectroscopy. 
As a practical test of this technology, I solved the structure of the 2A proteinase from the human rhinovirus. As a side project, I built a relational database (PACSY DB) that combines information from the Protein Data Bank (PDB) and the Biological Magnetic Resonance Data Bank (BMRB) and incorporates tools for structure analysis. PACSY DB can carry out complex queries that combine atomic coordinates, NMR parameters, and structural features of proteins.
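To make the idea of a query that combines coordinates, NMR parameters, and structural features concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and rows are hypothetical toy values invented for illustration; they are not the actual PACSY DB schema or real PDB/BMRB data.

```python
import sqlite3

# Hypothetical three-table layout: coordinates, chemical shifts, and
# secondary structure, keyed by (pdb_id, res_num). Not the real PACSY schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE coord (pdb_id TEXT, res_num INTEGER, atom TEXT, x REAL, y REAL, z REAL);
CREATE TABLE shift (pdb_id TEXT, res_num INTEGER, atom TEXT, ppm REAL);
CREATE TABLE sec_struct (pdb_id TEXT, res_num INTEGER, ss TEXT);
""")

# Toy rows with made-up values (placeholder entry "TOY1").
conn.executemany("INSERT INTO coord VALUES (?, ?, ?, ?, ?, ?)", [
    ("TOY1", 1, "CA", 0.0, 0.0, 0.0),
    ("TOY1", 2, "CA", 3.8, 0.0, 0.0),
    ("TOY1", 3, "CA", 7.6, 0.0, 0.0),
])
conn.executemany("INSERT INTO shift VALUES (?, ?, ?, ?)", [
    ("TOY1", 1, "CA", 58.1),
    ("TOY1", 2, "CA", 57.9),
    ("TOY1", 3, "CA", 53.2),
])
conn.executemany("INSERT INTO sec_struct VALUES (?, ?, ?)", [
    ("TOY1", 1, "H"),
    ("TOY1", 2, "H"),
    ("TOY1", 3, "E"),
])

# One query joining all three kinds of information:
# CA chemical shifts and x coordinates for residues labelled helical ("H").
query = """
SELECT s.res_num, s.ppm, c.x
FROM shift AS s
JOIN coord AS c
  ON c.pdb_id = s.pdb_id AND c.res_num = s.res_num AND c.atom = s.atom
JOIN sec_struct AS t
  ON t.pdb_id = s.pdb_id AND t.res_num = s.res_num
WHERE s.atom = 'CA' AND t.ss = 'H'
ORDER BY s.res_num
"""
rows = conn.execute(query).fetchall()
```

The point of the relational layout is that a single join can filter on a structural feature while returning both an NMR parameter and an atomic coordinate for the same residue.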







New Approaches to Protein NMR Automation


Book Description

The three-dimensional structure of a protein molecule is the key to understanding its biological and physiological properties. A major problem in bioinformatics is to efficiently determine the three-dimensional structures of query proteins. Protein NMR structure determination is one of the main experimental methods and comprises: (i) protein sample production and isotope labelling, (ii) collecting NMR spectra, and (iii) analysis of the spectra to produce the protein structure. In protein NMR, the three-dimensional structure is determined by exploiting a set of distance restraints between spatially proximate atoms. Currently, no practical automated protein NMR method works without human intervention. We first propose a complete automated protein NMR pipeline, which can efficiently be used to determine the structures of moderate-sized proteins. Second, we propose a novel and efficient semidefinite programming-based (SDP) protein structure determination method. The proposed automated protein NMR pipeline consists of three modules: (i) an automated peak picking method, called PICKY, (ii) a backbone chemical shift assignment method, called IPASS, and (iii) a protein structure determination method, called FALCON-NMR. When tested on four real protein data sets, this pipeline can produce structures with reasonable accuracies, starting from NMR spectra. This general method can be applied to the structure determination of other macromolecules. For example, a promising application is RNA NMR-assisted secondary structure determination. In the second part of this thesis, due to the shortcomings of FALCON-NMR, we propose a novel SDP-based protein structure determination method from NMR data, called SPROS. Most of the existing prominent protein NMR structure determination methods are based on molecular dynamics coupled with a simulated annealing schedule. 
In these methods, an objective function representing the error between observed and given distance restraints is minimized; these objective functions are highly non-convex and difficult to optimize. Euclidean distance geometry methods based on SDP provide a natural formulation for realizing a three-dimensional structure from a set of given distance constraints. However, the complexity of the SDP solvers increases cubically with the input matrix size, i.e., the number of atoms in the protein, and the number of constraints. In fact, the complexity of SDP solvers is a major obstacle in their applicability to the protein NMR problem. To overcome these limitations, the SPROS method models the protein molecule as a set of intersecting two- and three-dimensional cliques. We adapt and extend a technique called semidefinite facial reduction for the SDP matrix size reduction, which makes the SDP problem size approximately one quarter of the original problem. The reduced problem is solved nearly one hundred times faster and is more robust against numerical problems. Reasonably accurate results were obtained when SPROS was applied to a set of 20 real protein data sets.
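As an illustration of the kind of objective function that restraint-based methods minimize, here is a minimal sketch in Python. The flat-bottomed quadratic penalty, the function name restraint_violation, the atom labels, and the coordinates are all assumptions chosen for illustration; the description above does not specify the functional form used by any particular program.

```python
import math

def restraint_violation(coords, restraints):
    """Sum of squared violations of distance bounds (assumed penalty form).

    coords: dict mapping an atom label to an (x, y, z) tuple in angstroms.
    restraints: list of (atom_i, atom_j, lower, upper) distance bounds.
    """
    total = 0.0
    for atom_i, atom_j, lower, upper in restraints:
        d = math.dist(coords[atom_i], coords[atom_j])
        if d > upper:
            total += (d - upper) ** 2   # atoms too far apart
        elif d < lower:
            total += (lower - d) ** 2   # atoms too close together
        # distances inside [lower, upper] contribute nothing
    return total

# Toy coordinates and bounds (made-up values, not experimental data).
coords = {
    "HA_1": (0.0, 0.0, 0.0),
    "HN_2": (3.0, 4.0, 0.0),
    "HB_5": (0.0, 0.0, 2.0),
}
restraints = [
    ("HA_1", "HN_2", 1.8, 4.0),  # actual distance 5.0, violates upper bound by 1.0
    ("HA_1", "HB_5", 1.8, 5.0),  # actual distance 2.0, satisfied
]
score = restraint_violation(coords, restraints)
```

Because the score depends on the coordinates only through pairwise distances, it is non-convex in the coordinates themselves, which is the optimization difficulty the paragraph above refers to.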




Structure Computation and Dynamics in Protein NMR


Book Description

Volume 17 is the second in a special topic series devoted to modern techniques in protein NMR, under the Biological Magnetic Resonance series. Volume 16, with the subtitle Modern Techniques in Protein NMR, is the first in this series. These two volumes present some of the recent, significant advances in the biomolecular NMR field with emphasis on developments during the last five years. We are honored to have brought together in these volumes some of the world's foremost experts who have provided broad leadership in advancing this field. Volume 16 contains advances in two broad categories: I. Large Proteins, Complexes, and Membrane Proteins and II. Pulse Methods. Volume 17 contains major advances in: I. Computational Methods and II. Structure and Dynamics. The opening chapter of volume 17 starts with a consideration of some important aspects of modeling from spectroscopic and diffraction data by Wilfred van Gunsteren and his colleagues. The next two chapters deal with combined automated assignments and protein structure determination, an area of intense research in many laboratories since the traditional manual methods are often inadequate or laborious in handling large volumes of NMR data on large proteins. First, Werner Braun and his associates describe their experience with the NOAH/DIAMOD protocol developed in their laboratory.







Protein NMR Techniques


Book Description

When I was asked to edit the second edition of Protein NMR Techniques, my first thought was that the time was ripe for a new edition. The past several years have seen a surge in the development of novel methods that are truly revolutionizing our ability to characterize biological macromolecules in terms of speed, accuracy, and size limitations. I was particularly excited at the prospect of making these techniques accessible to all NMR labs and for the opportunity to ask the experts to divulge their hints and tips and to write, practically, about the methods. I commissioned 19 chapters with wide scope for Protein NMR Techniques, and the volume has been organized with numerous themes in mind. Chapters 1 and 2 deal with recombinant protein expression using two organisms, E. coli and P. pastoris, that can produce high yields of isotopically labeled protein at a reasonable cost. Staying with the idea of isotopic labeling, Chapter 3 describes methods for perdeuteration and site-specific protonation and is the first of several chapters in the book that are relevant to studies of higher molecular weight systems. A different, but equally powerful, method that uses molecular biology to “edit” the spectrum of a large molecule using segmental labeling is presented in Chapter 4. Having successfully produced a high molecular weight target for study, the next logical step is data acquisition. Hence, the final chapter on this theme, Chapter 5, describes TROSY methods for structural studies.




Biological NMR Spectroscopy


Book Description

This book presents a critical assessment of progress on the use of nuclear magnetic resonance spectroscopy to determine the structure of proteins, including brief reviews of the history of the field along with coverage of current clinical and in vivo applications. The book, in honor of Oleg Jardetsky, one of the pioneers of the field, is edited by two of the most highly respected investigators using NMR, and features contributions by most of the leading workers in the field. It will be valued as a landmark publication that presents the state-of-the-art perspectives regarding one of today's most important technologies.




Protein NMR Spectroscopy


Book Description

Nuclear Magnetic Resonance (NMR) spectroscopy, a technique based upon the magnetic properties of certain atomic nuclei, has found a wide range of applications in life sciences over recent decades. This up-to-date volume covers NMR techniques and their application to proteins, with a focus on practical details. It provides newcomers to NMR with practical guidance to carry out successful experiments with proteins and analyze the resulting spectra; those familiar with the chemical applications of NMR will also find it useful in understanding the special requirements of protein NMR.