Joint Center for Structural Genomics, La Jolla, CA 92037, USA.
XtalPred is a web server for prediction of protein crystallizability. The prediction is made by comparing several features of the protein with distributions of these features in TargetDB and combining the results into an overall probability of crystallization. XtalPred provides: (1) a detailed comparison of the protein's features to the corresponding distribution from TargetDB; (2) a summary of protein features and predictions that indicate problems that are likely to be encountered during protein crystallization; (3) prediction of ligands; and (4) (optional) lists of close homologs from complete microbial genomes that are more likely to crystallize. AVAILABILITY: The XtalPred web server is freely available for academic users at http://ffas.burnham.org/XtalPred
Latest citations:
Department of Molecular Physiology and Biological Physics, University of Virginia School of Medicine, Charlottesville, VA 22908-0736, USA.
Until recently, protein crystallization has mostly been regarded as a stochastic event over which the investigator has little or no control. With the dramatic technological advances in synchrotron-radiation sources and detectors and the equally impressive progress in crystallographic software, including automated model building and validation, crystallization has increasingly become the rate-limiting step in X-ray diffraction studies of macromolecules. However, with the advent of recombinant methods it has also become possible to engineer target proteins and their complexes for higher propensity to form crystals with desirable X-ray diffraction qualities. As most proteins that are under investigation today are obtained by heterologous overexpression, these techniques hold the promise of becoming routine tools with the potential to transform classical crystallization screening into a more rational high-success-rate approach. This article presents an overview of protein-engineering methods designed to enhance crystallizability and discusses a number of examples of their successful application.
University of Leeds, Leeds Institute of Molecular Medicine Section of Experimental Therapeutics, St. James's University Hospital, Leeds, United Kingdom.
Folds are the basic building blocks of protein structures. Understanding the emergence of novel protein folds is an important step towards understanding the rules governing the evolution of protein structure and function and for developing tools for protein structure modeling and design. We explored the frequency of occurrences of an exhaustively classified library of supersecondary structural elements (Smotifs), in protein structures, in order to identify features that would define a fold as novel compared to previously known structures. We found that a surprisingly small set of Smotifs is sufficient to describe all known folds. Furthermore, novel folds do not require novel Smotifs, but rather are a new combination of existing ones. Novel folds can be typified by the inclusion of a relatively higher number of rarely occurring Smotifs in their structures and, to a lesser extent, by a novel topological combination of commonly occurring Smotifs. When investigating the structural features of Smotifs, we found that the top 10% of most frequent ones have a higher fraction of internal contacts, while some of the most rare motifs are larger, and contain a longer loop region.
Muse Oke,
Lester G Carter,
Kenneth A Johnson,
Huanting Liu,
Stephen A McMahon,
Xuan Yan,
Melina Kerou,
Nadine D Weikart,
Nadia Kadi,
Md Arif Sheikh,
Stefan Schmelz,
Mark Dorward,
Michal Zawadzki,
Christopher Cozens,
Helen Falconer,
Helen Powers,
Ian M Overton,
C A Johannes van Niekerk,
Xu Peng,
Prakash Patel,
Roger A Garrett,
David Prangishvili,
Catherine H Botting,
Peter J Coote,
David T F Dryden,
Geoffrey J Barton,
Ulrich Schwarz-Linek,
Gregory L Challis,
Garry L Taylor,
Malcolm F White,
James H Naismith
Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST, UK.
The Scottish Structural Proteomics Facility was funded to develop a laboratory scale approach to high throughput structure determination. The effort was successful in that over 40 structures were determined. These structures and the methods harnessed to obtain them are reported here. This report reflects on the value of automation but also on the continued requirement for a high degree of scientific and technical expertise. The efficiency of the process poses challenges to the current paradigm of structural analysis and publication. In the 5 year period we published ten peer-reviewed papers reporting structural data arising from the pipeline. Nevertheless, the number of structures solved exceeded our ability to analyse and publish each new finding. By reporting the experimental details and depositing the structures we hope to maximize the impact of the project by allowing others to follow up the relevant biology.
Midwest Center for Structural Genomics, Biosciences Division, Argonne National Laboratory, 9700 S Cass Ave., Argonne, IL, 60439, USA, gbabnigg@anl.gov.
The high-throughput structure determination pipelines developed by structural genomics programs offer a unique opportunity for data mining. One important question is how protein properties derived from a primary sequence correlate with the protein's propensity to yield X-ray quality crystals (crystallizability) and 3D X-ray structures. A set of protein properties was computed for over 1,300 proteins that expressed well but were insoluble, and for ~720 unique proteins that resulted in X-ray structures. The correlation of the protein's isoelectric point and grand average of hydropathy (GRAVY) with crystallizability was analyzed for full-length and domain constructs of protein targets. In a second step, several additional properties that can be calculated from the protein sequence were added and evaluated. Using statistical analyses, we identified a set of attributes correlating with a protein's propensity to crystallize and implemented a Support Vector Machine (SVM) classifier based on these. We have created applications to analyze and provide optimal boundary information for query sequences and to visualize the data. These tools are available via the web site http://bioinformatics.anl.gov/cgi-bin/tools/pdpredictor.
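GRAVY, one of the sequence-derived properties named in the abstract above, is conventionally computed as the mean Kyte-Doolittle hydropathy index over all residues of a sequence. The following is a minimal Python sketch under that assumption; the function name `gravy` is illustrative and is not taken from the tool the abstract describes, which may compute or normalize the value differently.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Return the grand average of hydropathy for a one-letter protein sequence.

    Hypothetical helper: the sum of per-residue Kyte-Doolittle values
    divided by the sequence length.
    """
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence is empty")
    # Reject non-standard residue codes instead of silently skipping them.
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)
```

Under this convention a positive value indicates a more hydrophobic sequence on average and a negative value a more hydrophilic one; for example, a poly-alanine sequence gives a value of about 1.8.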
Protein Pept Lett. 2010 Jan 4
20044918
Institute for Neuro- and Bioinformatics, University of Lübeck, 23538 Lübeck, Germany. EPNSugan@ntu.edu.sg.
X-ray crystallography is the most widely used method for protein 3-dimensional structure determination. Selecting target proteins that can yield high-quality crystals for X-ray crystallography is a challenging task. Prediction of protein crystallization propensity from sequence information is useful for selecting target proteins for crystallization. Recently, support vector machines have been widely used to solve various biological problems. In this work, we present SVMCRYS, a method that uses a support vector machine to classify protein sequences as 'amenable to crystallization' or 'resistant to crystallization'. SVMCRYS was trained on a dataset containing 728 sequences that gave diffraction-quality crystals and 728 sequences for which work had been stopped before crystals were obtained. The performance of SVMCRYS was compared with other sequence-based crystallization prediction methods such as SECRET, CRYSTALP, OB-Score, ParCrys and XtalPred using three different datasets. SVMCRYS achieved a better prediction rate with higher sensitivity and specificity. Our analysis suggests that SVMCRYS can be used to predict which proteins are amenable to crystallization and which are difficult to crystallize. The SVMCRYS software, dataset and feature set can be obtained from http://www3.ntu.edu.sg/home/EPNSugan/index_files/svmcrys.htm.
Midwest Center for Structural Genomics, Structural Biology Center, Biosciences Division, Argonne National Laboratory, 9700 S Cass Ave., Argonne, IL 60439, USA.
Protein X-ray crystallography recently celebrated its 50th anniversary. The structures of myoglobin and hemoglobin determined by Kendrew and Perutz provided the first glimpses into the complex protein architecture and chemistry. Since then, the field of structural molecular biology has experienced extraordinary progress and now more than 55000 protein structures have been deposited into the Protein Data Bank. In the past decade many advances in macromolecular crystallography have been driven by world-wide structural genomics efforts. This was made possible because of third-generation synchrotron sources, structure phasing approaches using anomalous signal, and cryo-crystallography. Complementary progress in molecular biology, proteomics, hardware and software for crystallographic data collection, structure determination and refinement, computer science, databases, robotics and automation improved and accelerated many processes. These advancements provide the robust foundation for structural molecular biology and assure strong contribution to science in the future. In this report we focus mainly on reviewing structural genomics high-throughput X-ray crystallography technologies and their impact.
Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada.
Production of high-quality crystals is one of the main bottlenecks in X-ray crystallography-driven protein structure determination. The availability of structure determination data repositories, such as TargetDB and PepcDB, and flexibility in target selection in structural genomics motivate the development of methods that predict crystallization propensity from a given protein sequence. We introduce a novel linear model tree-based meta-predictor, MetaPPCP, which takes advantage of the complementarity of state-of-the-art protein crystallization propensity predictors to provide predictions with about 80% accuracy. Our method combines predictions of XtalPred and CRYSTALP2 with information concerning isoelectric point, hydropathy and number of solved structures for similar sequences. Empirical comparison shows that MetaPPCP outperforms current predictors including OB-Score, XtalPred, ParCrys and CRYSTALP2. MetaPPCP obtains over 92% accuracy for over half of its predictions, namely those with a probability (propensity to be predicted as crystallizable or crystallization resistant) above 0.8. The proposed method could provide useful input for target selection procedures of current structural genomics efforts.
Methods Mol Biol. 2009;569:129-56
19623489
The Burnham Institute, La Jolla, CA, USA.
The observation that similar protein sequences fold into similar three-dimensional structures provides a basis for the methods which predict structural features of a novel protein based on the similarity between its sequence and sequences of known protein structures. Similarity over entire sequence or large sequence fragment(s) enables prediction and modeling of entire structural domains while statistics derived from distributions of local features of known protein structures make it possible to predict such features in proteins with unknown structures. The accuracy of models of protein structures is sufficient for many practical purposes such as analysis of point mutation effects, enzymatic reactions, interaction interfaces of protein complexes, and active sites. Protein models are also used for phasing of crystallographic data and, in some cases, for drug design. By using models one can avoid the costly and time-consuming process of experimental structure determination. The purpose of this chapter is to give a practical review of the most popular protein structure prediction methods based on sequence similarity and to outline a practical approach to protein structure prediction. While the main focus of this chapter is on template-based protein structure prediction, it also provides references to other methods and programs which play an important role in protein structure prediction.
J Biomol NMR. 2008 Oct 1
18827972
Department of Chemistry, University of Gothenburg, Box 462, 40530, Gothenburg, Sweden, martin.billeter@chem.gu.se.
This 'Perspective' bears on the present state of protein structure determination by NMR in solution. The focus is on comparing the infrastructure available for NMR structure determination with that available for protein crystal structure determination by X-ray diffraction. The main conclusion emerges that the unique potential of NMR to generate high resolution data also on dynamics, interactions and conformational equilibria has contributed to a lack of standard procedures for structure determination which would be readily amenable to improved efficiency by automation. To spark renewed discussion on the topic of NMR structure determination of proteins, procedural steps with high potential for improvement are identified.
Ian M Overton,
C A Johannes van Niekerk,
Lester G Carter,
Alice Dawson,
David M A Martin,
Scott Cameron,
Stephen A McMahon,
Malcolm F White,
William N Hunter,
James H Naismith,
Geoffrey J Barton
School of Life Sciences Research, University of Dundee, Dow Street, Dundee, DD1 5EH and Centre for Biomolecular Sciences, School of Biomedical Science, North Haugh, The University, St Andrews, KY16 9ST, UK.
TarO (http://www.compbio.dundee.ac.uk/taro) offers a single point of reference for key bioinformatics analyses relevant to selecting proteins or domains for study by structural biology techniques. The protein sequence is analysed by 17 algorithms and compared to 8 databases. TarO gathers putative homologues, including orthologues, and then obtains predictions of properties for these sequences including crystallisation propensity, protein disorder and post-translational modifications. Analyses are run on a high-performance computing cluster, the results integrated, stored in a database and accessed through a web-based user interface. Output is in tabulated format and in the form of an annotated multiple sequence alignment (MSA) that may be edited interactively in the program Jalview. TarO also simplifies the gathering of additional annotations via the Distributed Annotation System, both from the MSA in Jalview and through links to Dasty2. Routes to other information gateways are included, for example to relevant pages from UniProt, COG and the Conserved Domains Database. Open access to TarO is available from a guest account with private accounts for academic use available on request. Future development of TarO will include further analysis steps and integration with the Protein Information Management System (PIMS), a sister project in the BBSRC 'Structural Proteomics of Rational Targets' initiative.
Other papers by authors:
Lukasz Slabinski,
Lukasz Jaroszewski,
Ana P C Rodrigues,
Leszek Rychlewski,
Ian A Wilson,
Scott A Lesley,
Adam Godzik
The process of experimental determination of protein structure is marred with a high ratio of failures at many stages. With availability of large quantities of data from high-throughput structure determination in structural genomics centers, we can now learn to recognize protein features correlated with failures; thus, we can recognize proteins more likely to succeed and eventually learn how to modify those that are less likely to succeed. Here, we identify several protein features that correlate strongly with successful protein production and crystallization and combine them into a single score that assesses "crystallization feasibility." The formula derived here was tested with a jackknife procedure and validated on independent benchmark sets. The "crystallization feasibility" score described here is being applied to target selection in the Joint Center for Structural Genomics, and is now contributing to increasing the success rate, lowering the costs, and shortening the time for protein structure determination. Analyses of PDB depositions suggest that very similar features also play a role in non-high-throughput structure determination, suggesting that this crystallization feasibility score would also be of significant interest to structural biology, as well as to molecular and biochemistry laboratories.
Lukasz Jaroszewski,
Lukasz Slabinski,
John Wooley,
Ashley M Deacon,
Scott A Lesley,
Ian A Wilson,
Adam Godzik
Joint Center for Structural Genomics, Bioinformatics Core, Burnham Institute for Medical Research, 10901 N. Torrey Pines Road, La Jolla, CA 92037, USA.
Even closely homologous proteins often have different crystallization properties and propensities. This observation can be used to introduce an additional dimension into crystallization trials by simultaneously targeting multiple homologs in what we call a "genome pool" strategy. We show that this strategy works because protein physicochemical properties correlated with crystallization success have a surprisingly broad distribution within most protein families. There are also "easy" and "difficult" families where this distribution is tilted in one direction. This leads to uneven structural coverage of protein families, with more "easy" ones solved. Increasing the size of the "genome pool" can improve chances of solving the "difficult" ones. In contrast, our analysis does not indicate that any specific genomes are "easy" or "difficult". Finally, we show that the group of proteins with known 3D structures is systematically different from the general pool of known proteins and we assess the structural consequences of these differences.
J Mol Biol. 2010 Jan 29
20122942
Debanu Das,
Davide Moiani,
Herbert L Axelrod,
Mitchell D Miller,
Daniel McMullan,
Kevin K Jin,
Polat Abdubek,
Tamara Astakhova,
Prasad Burra,
Dennis Carlton,
Hsiu-Ju Chiu,
Thomas Clayton,
Marc C Deller,
Lian Duan,
Dustin Ernst,
Julie Feuerhelm,
Joanna C Grant,
Anna Grzechnik,
Slawomir K Grzechnik,
Gye Won Han,
Lukasz Jaroszewski,
Heath E Klock,
Mark W Knuth,
Piotr Kozbial,
S Sri Krishna,
Abhinav Kumar,
David Marciano,
Andrew T Morse,
Edward Nigoghossian,
Linda Okach,
Jessica Paulsen,
Ron Reyes,
Christopher L Rife,
Natasha Sefcovic,
Henry J Tien,
Christine B Trame,
Henry van den Bedem,
Dana Weekes,
Qingping Xu,
Keith O Hodgson,
John Wooley,
Marc-André Elsliger,
Ashley M Deacon,
Adam Godzik,
Scott A Lesley,
John A Tainer,
Ian A Wilson
Joint Center for Structural Genomics (http://www.jcsg.org); Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA.
Mre11 nuclease plays a central role in the repair of cytotoxic and mutagenic DNA double-strand breaks (DSBs). As x-ray structural information has only been available for the Pyrococcus furiosus enzyme (PfMre11), the conserved and variable features of this nuclease across the domains of life have not been experimentally defined. Our crystal structure and biochemical studies demonstrate that TM1635 from Thermotoga maritima, originally annotated as a putative nuclease, is the Mre11 endo/exonuclease from T. maritima (TmMre11) and the first such structure from eubacteria. TmMre11 and PfMre11 display similar overall structures, despite sequence identity in the twilight zone of only ~20%. However, they differ substantially in their DNA specificity domains and in their dimeric organization. Residues in the nuclease domain are highly conserved, but those in the DNA specificity domain are not. The structural differences likely affect how Mre11s from different organisms recognize and interact with single-stranded DNA, double-stranded DNA and DNA hairpin structures during DNA repair. The TmMre11 nuclease active site has no bound metal ions, but is conserved in sequence and structure with exception of a histidine that is important in PfMre11 nuclease activity. Nevertheless, biochemical characterization confirms that TmMre11 possesses both endonuclease and exonuclease activities on ssDNA and dsDNA substrates, respectively.
J Mol Biol. 2009 Nov 10
19913036
Qingping Xu,
Alex Bateman,
Robert D Finn,
Polat Abdubek,
Tamara Astakhova,
Herbert L Axelrod,
Constantina Bakolitsa,
Dennis Carlton,
Connie Chen,
Hsiu-Ju Chiu,
Michelle Chiu,
Thomas Clayton,
Debanu Das,
Marc C Deller,
Lian Duan,
Kyle Ellrott,
Dustin Ernst,
Carol L Farr,
Julie Feuerhelm,
Joanna C Grant,
Anna Grzechnik,
Gye Won Han,
Lukasz Jaroszewski,
Kevin K Jin,
Heath E Klock,
Mark W Knuth,
Piotr Kozbial,
S Sri Krishna,
Abhinav Kumar,
David Marciano,
Daniel McMullan,
Mitchell D Miller,
Andrew T Morse,
Edward Nigoghossian,
Amanda Nopakun,
Linda Okach,
Christina Puckett,
Ron Reyes,
Christopher L Rife,
Natasha Sefcovic,
Henry J Tien,
Christine B Trame,
Henry van den Bedem,
Dana Weekes,
Tiffany Wooten,
Keith O Hodgson,
John Wooley,
Marc-André Elsliger,
Ashley M Deacon,
Adam Godzik,
Scott A Lesley,
Ian A Wilson
Joint Center for Structural Genomics; Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA.
Pleckstrin homology (PH) domains have been identified only in eukaryotic proteins to date. We have determined crystal structures for three members of an uncharacterized protein family (Pfam PF08000), which provide compelling evidence for the existence of Pleckstrin homology-like domains (PH-like) in bacteria (PHb). The first two structures contain a single PHb domain that forms a dome-shaped, oligomeric ring with C(5) symmetry. The third structure has an additional helical hairpin attached at the C-terminus and forms a similar, but much larger ring with C(12) symmetry. Thus, both molecular assemblies exhibit rare, higher order, cyclic symmetry, but preserve a similar arrangement of their PHb domains, which gives rise to a conserved hydrophilic surface at the intersection of the beta-strands of adjacent protomers that likely mediates protein-protein interactions. As a result of these structures, additional families of bacterial PH (PHb) domains can now be identified, suggesting that PH domains are much more widespread than originally anticipated. Thus, rather than being a eukaryotic innovation, the PH domain superfamily appears to have existed before prokaryotes and eukaryotes diverged.
Ying Zhang,
Ines Thiele,
Dana Weekes,
Zhanwen Li,
Lukasz Jaroszewski,
Krzysztof Ginalski,
Ashley M Deacon,
John Wooley,
Scott A Lesley,
Ian A Wilson,
Bernhard Palsson,
Andrei Osterman,
Adam Godzik
Joint Center for Molecular Modeling (JCMM), Burnham Institute for Medical Research, La Jolla, CA 92037, USA.
Metabolic pathways have traditionally been described in terms of biochemical reactions and metabolites. With the use of structural genomics and systems biology, we generated a three-dimensional reconstruction of the central metabolic network of the bacterium Thermotoga maritima. The network encompassed 478 proteins, of which 120 were determined by experiment and 358 were modeled. Structural analysis revealed that proteins forming the network are dominated by a small number (only 182) of basic shapes (folds) performing diverse but mostly related functions. Most of these folds are already present in the essential core (approximately 30%) of the network, and its expansion by nonessential proteins is achieved with relatively few additional folds. Thus, integration of structural data with networks analysis generates insight into the function, mechanism, and evolution of biological networks.
J Mol Biol. 2009 May 15
19450606
Qingping Xu,
Dennis Carlton,
Mitchell D Miller,
Marc-André Elsliger,
S Sri Krishna,
Polat Abdubek,
Tamara Astakhova,
Prasad Burra,
Hsiu-Ju Chiu,
Thomas Clayton,
Marc C Deller,
Lian Duan,
Ylva Elias,
Julie Feuerhelm,
Joanna C Grant,
Anna Grzechnik,
Slawomir K Grzechnik,
Gye Won Han,
Lukasz Jaroszewski,
Kevin K Jin,
Heath E Klock,
Mark W Knuth,
Piotr Kozbial,
Abhinav Kumar,
David Marciano,
Daniel McMullan,
Andrew T Morse,
Edward Nigoghossian,
Linda Okach,
Silvya Oommachen,
Jessica Paulsen,
Ron Reyes,
Christopher L Rife,
Natasha Sefcovic,
Christine Trame,
Christina V Trout,
Henry van den Bedem,
Dana Weekes,
Keith O Hodgson,
John Wooley,
Ashley M Deacon,
Adam Godzik,
Scott A Lesley,
Ian A Wilson
Joint Center for Structural Genomics; Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Stanford University, Menlo Park, CA, USA.
Cell cycle regulated stalk biogenesis in Caulobacter crescentus is controlled by a multi-step phosphorelay system consisting of the hybrid histidine kinase ShkA, the histidine-phosphotransfer protein ShpA and the response regulator TacA. ShpA shuttles phosphoryl groups between ShkA and TacA. When phosphorylated, TacA triggers a downstream transcription cascade for stalk synthesis in an RpoN-dependent manner. The crystal structure of ShpA was determined to 1.52 Å resolution. ShpA belongs to a family of monomeric histidine phosphotransfer (HPt) proteins, which feature a highly conserved four-helix bundle. The phosphorylatable histidine, His56, is located on the surface of the helix bundle and is fully solvent exposed. One end of the four-helix bundle in ShpA is shorter compared to other characterized histidine phosphotransfer proteins, whereas the face that potentially interacts with the response regulators is structurally conserved. Similarities of the interaction surface around the phosphorylation site suggest that ShpA is likely to share a common mechanism for molecular recognition and phosphotransfer with yeast phosphotransfer protein YPD1 despite low overall sequence similarity.
Qingping Xu,
Sebastian Sudek,
Daniel McMullan,
Mitchell D Miller,
Bernhard Geierstanger,
David H Jones,
S Sri Krishna,
Glen Spraggon,
Badry Bursalay,
Polat Abdubek,
Claire Acosta,
Eileen Ambing,
Tamara Astakhova,
Herbert L Axelrod,
Dennis Carlton,
Jonathan Caruthers,
Hsiu-Ju Chiu,
Thomas Clayton,
Marc C Deller,
Lian Duan,
Ylva Elias,
Marc-André Elsliger,
Julie Feuerhelm,
Slawomir K Grzechnik,
Joanna Hale,
Gye Won Han,
Justin Haugen,
Lukasz Jaroszewski,
Kevin K Jin,
Heath E Klock,
Mark W Knuth,
Piotr Kozbial,
Abhinav Kumar,
David Marciano,
Andrew T Morse,
Edward Nigoghossian,
Linda Okach,
Silvya Oommachen,
Jessica Paulsen,
Ron Reyes,
Christopher L Rife,
Christina V Trout,
Henry van den Bedem,
Dana Weekes,
Aprilfawn White,
Guenter Wolf,
Chloe Zubieta,
Keith O Hodgson,
John Wooley,
Ashley M Deacon,
Adam Godzik,
Scott A Lesley,
Ian A Wilson
Joint Center for Structural Genomics, SLAC National Accelerator Laboratory, Stanford University, Menlo Park, CA 94025, USA; Stanford Synchrotron Radiation Lightsource (SSRL), SLAC National Accelerator Laboratory, Stanford University, Menlo Park, CA 94025, USA.
The crystal structures of two homologous endopeptidases from cyanobacteria Anabaena variabilis and Nostoc punctiforme were determined at 1.05 and 1.60 Å resolution, respectively, and contain a bacterial SH3-like domain (SH3b) and a ubiquitous cell-wall-associated NlpC/P60 (or CHAP) cysteine peptidase domain. The NlpC/P60 domain is a primitive, papain-like peptidase in the CA clan of cysteine peptidases with a Cys126/His176/His188 catalytic triad and a conserved catalytic core. We deduced from structure and sequence analysis, and then experimentally, that these two proteins act as gamma-D-glutamyl-L-diamino acid endopeptidases (EC 3.4.22.-). The active site is located near the interface between the SH3b and NlpC/P60 domains, where the SH3b domain may help define substrate specificity, instead of functioning as a targeting domain, so that only muropeptides with an N-terminal L-alanine can bind to the active site.
Proteins. 2008 Dec 23
19173316
Debanu Das,
Piotr Kozbial,
Herbert L Axelrod,
Mitchell D Miller,
Daniel McMullan,
S Sri Krishna,
Polat Abdubek,
Claire Acosta,
Tamara Astakhova,
Prasad Burra,
Dennis Carlton,
Connie Chen,
Hsiu-Ju Chiu,
Thomas Clayton,
Marc C Deller,
Lian Duan,
Ylva Elias,
Marc-André Elsliger,
Dustin Ernst,
Carol Farr,
Julie Feuerhelm,
Anna Grzechnik,
Slawomir K Grzechnik,
Joanna Hale,
Gye Won Han,
Lukasz Jaroszewski,
Kevin K Jin,
Hope A Johnson,
Heath E Klock,
Mark W Knuth,
Abhinav Kumar,
David Marciano,
Andrew T Morse,
Kevin D Murphy,
Edward Nigoghossian,
Amanda Nopakun,
Linda Okach,
Silvya Oommachen,
Jessica Paulsen,
Christina Puckett,
Ron Reyes,
Christopher L Rife,
Natasha Sefcovic,
Sebastian Sudek,
Henry Tien,
Christine Trame,
Christina V Trout,
Henry van den Bedem,
Dana Weekes,
Aprilfawn White,
Qingping Xu,
Keith O Hodgson,
John Wooley,
Ashley M Deacon,
Adam Godzik,
Scott A Lesley,
Ian A Wilson
Joint Center for Structural Genomics.
ECX21941 represents a very large family (over 600 members) of novel, ocean metagenome-specific proteins identified by clustering of the dataset from the Global Ocean Sampling expedition. The crystal structure of ECX21941 reveals unexpected similarity to Sm/LSm proteins, which are important RNA-binding proteins, despite no detectable sequence similarity. The ECX21941 protein assembles as a homopentamer in solution and in the crystal structure when expressed in Escherichia coli and represents the first pentameric structure for this Sm/LSm family of proteins, although the actual oligomeric form in vivo is currently not known. The genomic neighborhood analysis of ECX21941 and its homologs combined with sequence similarity searches suggest a cyanophage origin for this protein. The specific functions of members of this family are unknown, but our structure analysis of ECX21941 indicates nucleic acid-binding capabilities and suggests a role in RNA and/or DNA processing.
Proteins. 2008 Dec 8
19127588
Debanu Das,
S Sri Krishna,
Daniel McMullan,
Mitchell D Miller,
Qingping Xu,
Polat Abdubek,
Claire Acosta,
Tamara Astakhova,
Herbert L Axelrod,
Prasad Burra,
Dennis Carlton,
Hsiu-Ju Chiu,
Thomas Clayton,
Marc C Deller,
Lian Duan,
Ylva Elias,
Marc-André Elsliger,
Dustin Ernst,
Julie Feuerhelm,
Anna Grzechnik,
Slawomir K Grzechnik,
Joanna Hale,
Gye Won Han,
Lukasz Jaroszewski,
Kevin K Jin,
Heath E Klock,
Mark W Knuth,
Piotr Kozbial,
Abhinav Kumar,
David Marciano,
Andrew T Morse,
Kevin D Murphy,
Edward Nigoghossian,
Linda Okach,
Silvya Oommachen,
Jessica Paulsen,
Ron Reyes,
Christopher L Rife,
Natasha Sefcovic,
Henry Tien,
Christine B Trame,
Christina V Trout,
Henry van den Bedem,
Dana Weekes,
Aprilfawn White,
Keith O Hodgson,
John Wooley,
Ashley M Deacon,
Adam Godzik,
Scott A Lesley,
Ian A Wilson
Joint Center for Structural Genomics.
Proteins. 2008 Nov 18
19089981
Qingping Xu,
Christopher L Rife,
Dennis Carlton,
Mitchell D Miller,
S Sri Krishna,
Marc-André Elsliger,
Polat Abdubek,
Tamara Astakhova,
Hsiu-Ju Chiu,
Thomas Clayton,
Lian Duan,
Julie Feuerhelm,
Slawomir K Grzechnik,
Joanna Hale,
Gye Won Han,
Lukasz Jaroszewski,
Kevin K Jin,
Heath E Klock,
Mark W Knuth,
Abhinav Kumar,
Daniel McMullan,
Andrew T Morse,
Edward Nigoghossian,
Linda Okach,
Silvya Oommachen,
Jessica Paulsen,
Ron Reyes,
Henry van den Bedem,
Keith O Hodgson,
John Wooley,
Ashley M Deacon,
Adam Godzik,
Scott A Lesley,
Ian A Wilson
Joint Center for Structural Genomics (JCSG).
Latest similar papers:
Nucleic Acids Res. 2009 May 27
19474339
Rutgers, the State University of New Jersey, Department of Chemistry & Chemical Biology, BioMaPS Institute for Quantitative Biology, Wright-Rieman Laboratories, 610 Taylor Road, Piscataway, NJ 08854.
The w3DNA (web 3DNA) server is a user-friendly web-based interface to the 3DNA suite of programs for the analysis, reconstruction, and visualization of three-dimensional (3D) nucleic-acid-containing structures, including their complexes with proteins and other ligands. The server allows the user to determine a wide variety of conformational parameters in a given structure, such as the identities and rigid-body parameters of interacting nucleic-acid bases and base-pair steps, the nucleotides comprising helical fragments, etc. It is also possible to build 3D models of arbitrary nucleotide sequences and helical types, customized single-stranded and double-helical structures with user-defined base-pair parameters and sequences, and models of DNA 'decorated' at user-defined sites with proteins and other molecules. The visualization component offers unique, publication-quality representations of nucleic-acid structures, such as 'block' images of bases and base pairs and stacking diagrams of interacting nucleotides. The w3DNA web server, located at http://w3dna.rutgers.edu, is free and open to all users with no login requirement.
Nucleic Acids Res. 2009 May 13
19443452
Cit:2
Department of Computer Science, Department of Bioengineering and Department of Plant and Microbial Biology, University of California, Berkeley, USA.
We present the INTREPID web server for predicting functionally important residues in proteins. INTREPID has been shown to boost the recall and precision of catalytic residue prediction over other sequence-based methods and can be used to identify other types of functional residues. The web server takes an input protein sequence, gathers homologs, constructs a multiple sequence alignment and phylogenetic tree and finally runs the INTREPID method to assign a score to each position. Residues predicted to be functionally important are displayed on homologous 3D structures (where available), highlighting spatial patterns of conservation at various significance thresholds. The INTREPID web server is available at http://phylogenomics.berkeley.edu/intrepid.
Nucleic Acids Res. 2009 May 12;:
19435882
Cit:1
Japan Biological Informatics Consortium (JBIC), 2-45 Aomi, Koto-ku, Tokyo 135-8073, Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-42 Aomi, Koto-ku, Tokyo 135-0064, Mizuho Information & Research Institute, Inc., 2-3 Kanda-Nishikicho, Chiyoda-ku, Tokyo 101-8443 and Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa 277-8562, Japan.
The CentroidFold web server (http://www.ncrna.org/centroidfold/) is a web application for RNA secondary structure prediction powered by one of the most accurate prediction engines. The server accepts two kinds of sequence data: a single RNA sequence and a multiple alignment of RNA sequences. It responds with a prediction result shown in a popular base-pair notation and as a graph representation. A PDF version of the graph representation is also available. For a multiple alignment, the server predicts a common secondary structure. Usage of the server is simple: paste a single RNA sequence (FASTA or plain sequence text) or a multiple alignment (CLUSTAL-W format) into the text area, then click the 'execute CentroidFold' button. The server quickly responds with a prediction result. The major advantage of this server is that it employs our original CentroidFold software as its prediction engine, which achieves the best accuracy in our benchmark results. Our web server is freely available with no login requirement.
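As an illustration of what a base-pair notation encodes, the following minimal sketch converts a structure string into a list of paired positions. It assumes the notation is the common dot-bracket form, where '(' and ')' mark paired bases and '.' marks unpaired ones; the abstract does not name the notation, and the function name `parse_dot_bracket` is hypothetical and not part of CentroidFold.

```python
from typing import List, Tuple


def parse_dot_bracket(structure: str) -> List[Tuple[int, int]]:
    """Return sorted 0-based (i, j) index pairs for matching parentheses.

    Assumption: plain dot-bracket input using only '(', ')' and '.'.
    Pseudoknot symbols such as '[' or '{' are not handled here.
    """
    open_positions: List[int] = []   # stack of unmatched '(' positions
    pairs: List[Tuple[int, int]] = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(pos)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {pos}")
            # The most recent unmatched '(' pairs with this ')'.
            pairs.append((open_positions.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)
```

A stack is used because nested helices close in the reverse order in which they open, so each ')' always pairs with the most recently opened '('.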
Nucleic Acids Res. 2009 May 5;:
19417074
Department of Biochemistry and Howard Hughes Medical Institute and University of Texas, Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390-9050, USA.
The biological properties of proteins are often gleaned through comparative analysis of evolutionary relatives. Although protein structure similarity search methods detect more distant homologs than purely sequence-based methods, structural resemblance can result from either homology (common ancestry) or analogy (similarity without common ancestry). While many existing web servers detect structural neighbors, they do not explicitly address the question of homology versus analogy. Here, we present a web server named HorA (Homology or Analogy) that identifies likely homologs for a query protein structure. Unlike other servers, HorA combines sequence information from state-of-the-art profile methods with structure information from spatial similarity measures using an advanced computational technique. HorA aims to identify biologically meaningful connections rather than purely 3D-geometric similarities. The HorA method finds approximately 90% of remote homologs defined in the manually curated database SCOP. HorA will be especially useful for finding remote homologs that might be overlooked by other sequence or structural similarity search servers. The HorA server is available at http://prodata.swmed.edu/horaserver.
Nucleic Acids Res. 2009 Apr 30;:
19406927
Cit:2
Mark Berjanskii,
Peter Tang,
Jack Liang,
Joseph A Cruz,
Jianjun Zhou,
You Zhou,
Edward Bassett,
Cam Macdonell,
Paul Lu,
Guohui Lin,
David S Wishart
Department of Computing Science, Department of Biological Sciences, University of Alberta and National Research Council, National Institute for Nanotechnology (NINT), Edmonton, AB, Canada T6G 2E8.
GeNMR (GEnerate NMR structures) is a web server for rapidly generating accurate 3D protein structures using sequence data, NOE-based distance restraints and/or NMR chemical shifts as input. GeNMR accepts distance restraints in XPLOR or CYANA format as well as chemical shift files in either SHIFTY or BMRB formats. The web server produces an ensemble of PDB coordinates for the protein within 15-25 min, depending on model complexity and completeness of experimental restraints. GeNMR uses a pipeline of several pre-existing programs and servers to calculate the actual protein structure. In particular, GeNMR combines genetic algorithms for structure optimization along with homology modeling, chemical shift threading, torsion angle and distance predictions from chemical shifts/NOEs as well as ROSETTA-based structure generation and simulated annealing with XPLOR-NIH to generate and/or refine protein coordinates. GeNMR greatly simplifies the task of protein structure determination as users do not have to install or become familiar with complex stand-alone programs or obscure format conversion utilities. Tests conducted on a sample of 90 proteins from the BioMagResBank indicate that GeNMR produces high-quality models for all protein queries, regardless of the type of NMR input data. GeNMR was developed to facilitate rapid, user-friendly determination of protein structures via NMR spectroscopy. GeNMR is accessible at http://www.genmr.ca.
Bioinformatics. 2009 Apr 8;:
19357095
Center for Genomics and Bioinformatics, Indiana University, Bloomington, IN 47405, USA.
SUMMARY: The Generic Genome Browser (GBrowse) is one of the most widely used tools for visualizing genomic features along a reference sequence. However, the installation and configuration of GBrowse are not trivial for biologists. We have developed a web server, WebGBrowse, that allows users to upload genome annotation in the GFF3 format, configure the display of each genomic feature by simply using a web browser, and visualize the configured genomic features with the integrated GBrowse software. AVAILABILITY: WebGBrowse is accessible via http://webgbrowse.cgb.indiana.edu/ and the system is also freely available for local installations. Contact: dongq@indiana.edu.
Bioinformatics. 2009 Mar 5;:
19269988
Department of Computer Science, University of Georgia, Athens, GA 30602.
SUMMARY: RNATOPS-W is a web server to search sequences for RNA secondary structures including pseudoknots. The server accepts an annotated RNA multiple structural alignment as a structural profile and genomic or other sequences to search. It is built upon RNATOPS (Huang et al., 2008), a command-line C++ software package for the same purpose, in which filters to speed up the search are manually selected. RNATOPS-W improves upon RNATOPS by adding automatic selection of a hidden Markov model (HMM) filter and a friendly user interface for selection of a substructure filter by the user. In addition, RNATOPS-W complements existing RNA secondary structure search web servers that either use built-in structure profiles or are not able to detect pseudoknots. RNATOPS-W inherits the efficiency of RNATOPS in detecting large, complex RNA structures. AVAILABILITY: The web server RNATOPS-W is available at www.uga.edu/RNA-Informatics/?f=software&p=RNATOPS-w. The underlying search program RNATOPS can be downloaded at www.uga.edu/RNA-Informatics/?f=software&p=RNATOPS. CONTACT: cai@cs.uga.edu Supplementary Material: The online Supplementary Material contains additional experimental data.
Bioinformatics. 2008 Nov 6;:
18990723
Cit:2
NEC Laboratories of America, Princeton, NJ, Computational Biology Program, Sloan-Kettering Institute, Memorial Sloan-Kettering Cancer Center, New York, NY, Department of Genome Sciences, Department of Computer Science and Engineering, University of Washington, Seattle, WA.
SUMMARY: We present a large-scale implementation of the Rankprop protein homology ranking algorithm in the form of an openly accessible web server. We use the NRDB40 PSI-BLAST all-vs-all protein similarity network of 1.1 million proteins to construct the graph for the Rankprop algorithm, whereas previously, results were only reported for a database of 108,000 proteins. We also describe two algorithmic improvements to the original algorithm, propagation from multiple homologs of the query and better normalization of ranking scores, that lead to higher accuracy and to scores with a probabilistic interpretation. AVAILABILITY: The Rankprop web server and source code are available at http://rankprop.gs.washington.edu. CONTACT: iain@nec-labs.com.
Nucleic Acids Res. 2008 May 24;:
18503087
Cit:30
Howard Hughes Medical Institute and Department of Biochemistry, University of Texas Southwestern Medical Center, 6001 Forest Park Road, Dallas, TX 75390-9050, USA.
Multiple sequence alignments are essential in computational sequence and structural analysis, with applications in homology detection, structure modeling, function prediction and phylogenetic analysis. We report the PROMALS3D web server for constructing alignments for multiple protein sequences and/or structures using information from available 3D structures, database homologs and predicted secondary structures. PROMALS3D shows higher alignment accuracy than a number of other advanced methods. Input to the PROMALS3D web server can be FASTA format protein sequences, PDB format protein structures and/or user-defined alignment constraints. The output page provides alignments in several formats, including a colored alignment augmented with useful information about sequence grouping, predicted secondary structures and consensus sequences. Intermediate results of sequence and structural database searches are also available. The PROMALS3D web server is available at: http://prodata.swmed.edu/promals3d/.