BioInfoBank Library


 
Department of Biomolecular Engineering, University of California at Santa Cruz, Santa Cruz, CA 95064.
MOTIVATION: Knots in polypeptide chains have been found in very few proteins, and consequently should be generally avoided in protein structure prediction methods. Most effective structure prediction methods do not model the protein folding process itself, but rather seek only to correctly obtain the final native state. Consequently, the mechanisms that prevent knots from occurring in native proteins are not relevant to the modeling process, and as a result, knots can occur with significantly higher frequency in protein models. Here we describe Knotfind, a simple algorithm for knot detection that is fast enough for structure prediction, where tens or hundreds of thousands of conformations may be sampled during the course of a prediction. We have used this algorithm to characterize knots in large populations of model structures generated for targets in CASP5 and CASP6 using the Rosetta homology-based modeling method. RESULTS: Analysis of CASP5 models suggested several possible avenues for introduction of knots into these models, and these insights were applied to structure prediction in CASP6, resulting in a significant decrease in the proportion of knotted models generated. Additionally, using the knot detection algorithm on structures in the Protein Data Bank, a previously unreported deep trefoil knot was found in acetylornithine transcarbamylase. AVAILABILITY: The Knotfind algorithm is available in the Rosetta structure prediction program at http://www.rosettacommons.org. CONTACT: bort@soe.ucsc.edu.
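The abstract above does not spell out how Knotfind works, so the following is only an illustrative sketch of one common family of fast knot checks: triangle-elimination chain smoothing, often attributed to Koniaris, Muthukumar and Taylor. It is not the Rosetta implementation. All function names (segment_hits_triangle, smooth_chain, looks_knotted, trefoil_with_tails) are hypothetical, the input is assumed to be a list of C-alpha-like 3D points, and degenerate (coplanar) contacts are deliberately ignored.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def _sub(a: Point, b: Point) -> Point:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Point, b: Point) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Point, b: Point) -> Point:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def segment_hits_triangle(p: Point, q: Point, v0: Point, v1: Point,
                          v2: Point, eps: float = 1e-12) -> bool:
    """Moller-Trumbore test restricted to the segment p-q.

    Segments (nearly) parallel to the triangle plane are reported as
    non-intersecting; that is a simplification of this sketch.
    """
    d = _sub(q, p)
    e1, e2 = _sub(v1, v0), _sub(v2, v0)
    h = _cross(d, e2)
    a = _dot(e1, h)
    if abs(a) < eps:
        return False
    f = 1.0 / a
    s = _sub(p, v0)
    u = f * _dot(s, h)
    if u < 0.0 or u > 1.0:
        return False
    qv = _cross(s, e1)
    v = f * _dot(d, qv)
    if v < 0.0 or u + v > 1.0:
        return False
    t = f * _dot(e2, qv)
    return 0.0 <= t <= 1.0


def smooth_chain(points: Sequence[Point]) -> List[Point]:
    """Repeatedly delete interior points whose triangle (prev, cur, next)
    is not pierced by any non-adjacent segment; the termini stay fixed."""
    pts = list(points)
    changed = True
    while changed and len(pts) > 2:
        changed = False
        i = 1
        while i < len(pts) - 1:
            a, b, c = pts[i - 1], pts[i], pts[i + 1]
            blocked = any(
                segment_hits_triangle(pts[j], pts[j + 1], a, b, c)
                for j in range(len(pts) - 1)
                if j < i - 2 or j > i + 1  # skip segments touching the triangle
            )
            if blocked:
                i += 1
            else:
                del pts[i]  # the straightened chain has the same topology
                changed = True
    return pts


def looks_knotted(points: Sequence[Point]) -> bool:
    """A chain that cannot be smoothed down to its two termini is flagged."""
    return len(smooth_chain(points)) > 2


def trefoil_with_tails(n: int = 100, delta: float = 0.1,
                       tail: float = 10.0) -> List[Point]:
    """Toy input: a parametric trefoil cut at its outermost point, with both
    termini pulled far outside the knot so they cannot slip back through."""
    pts = []
    for k in range(n):
        t = math.pi + delta + k * (2.0 * math.pi - 2.0 * delta) / (n - 1)
        pts.append((math.sin(t) + 2.0 * math.sin(2.0 * t),
                    math.cos(t) - 2.0 * math.cos(2.0 * t),
                    -math.sin(3.0 * t)))
    first, last = pts[0], pts[-1]
    return ([(first[0], -3.0 - tail, first[2])] + pts
            + [(last[0], -3.0 - tail, last[2])])
```

Calling looks_knotted(trefoil_with_tails()) flags the toy trefoil, while an unknotted arc collapses to its two end points. On real open chains the answer depends on where the termini sit, which is one reason a production filter would need more care than this sketch takes.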

Latest citations:

Moffitt Cancer Center, Tampa, FL 33612, USA.
Template-based protein structure prediction (commonly referred to as homology or comparative modeling) uses knowledge of solved structures to model a protein sequence's native or true fold. First, a parent structure is found and then a template structure is built by mapping the target sequence onto the parent structure. This putative structure is refined using a combination of backbone moves, side-chain packing, and loop modeling. Template-based protein structure prediction has always held great promise to produce atomically accurate models close to the native conformation based on two major assumptions. First, similar sequences exhibit similar protein folds. Second, soluble proteins populate a discrete fold space with many representatives already solved in the Protein Data Bank (PDB). Ironically, beginning so close to the native structure is also the primary source of problems confronting this method and is the reason for the lack of progress in this category of structure prediction. In this review, the general concepts and procedures for template-based structure prediction are outlined based on the following topics: sequence alignment, parent structure selection, template structure building, refinement, evaluation, and final structure selection. Then, a description of established software and algorithms is provided, and the advantages and limitations of the different methods are pointed out. This is followed by a discussion of the developments in template-based structure prediction up to the 7th Critical Assessment of Structure Prediction meeting. Lastly, we address the increased difficulty in improving templates that start so close to the native structure, and discuss the improvements needed in this field.
Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA. firas@u.washington.edu
MOTIVATION: Our focus has been on detecting topological properties that are rare in real proteins, but occur more frequently in models generated by protein structure prediction methods such as Rosetta. We previously created the Knotfind algorithm, successfully decreasing the frequency of knotted Rosetta models during CASP6. We observed an additional class of knot-like loops that appeared to be equally un-protein-like and yet do not contain a mathematical knot. These topological features are commonly referred to as slip-knots and are caused by the same mechanisms that result in knotted models. Slip-knots are undetectable by the original Knotfind algorithm. We have generalized our algorithm to detect them, and analyzed CASP6 models built using the Rosetta loop modeling method. RESULTS: After analyzing known protein structures in the PDB, we found that slip-knots do occur in certain proteins, but are rare and fall into a small number of specific classes. Our group used this new Pokefind algorithm to distinguish between these rare real slip-knots and the numerous classes of slip-knots that we discovered in Rosetta models and models submitted by the various CASP7 servers. The goal of this work is to improve future models created by protein structure prediction methods. Both algorithms are able to detect un-protein-like features that current metrics such as GDT are unable to identify, so these topological filters can also be used as additional assessment tools.
Anna L Mallam
St John's College and University Chemical Laboratory, Cambridge, UK.
The issue of how a newly synthesized polypeptide chain folds to form a protein with a unique three-dimensional structure, otherwise known as the 'protein-folding problem', remains a fundamental question in the life sciences. Over the last few decades, much information has been gathered about the mechanisms by which proteins fold. However, despite the vast topological diversity observed in biological structures, it was thought improbable, if not impossible, that a polypeptide chain could 'knot' itself to form a functional protein. Nevertheless, such knotted structures have since been identified, raising questions about how such complex topologies can arise during folding. Their formation does not fit any current folding models or mechanisms, and therefore represents an important piece of the protein-folding puzzle. This article reviews the progress made towards discovering how nature codes for, and contends with, knots during protein folding, and examines the insights gained from both experimental and computational studies. Mechanisms to account for the formation of knotted structures that were previously thought unfeasible, and their implications for protein folding, are also discussed.
Among proteins of known three-dimensional structure, only a few possess complex topological features such as knotted or interlinked (catenated) protein backbones. Such unusual proteins offer potentially unique insights into folding pathways and stabilization mechanisms. They also present special challenges for both theorists and computational scientists interested in understanding and predicting protein-folding behavior. Here, we review complex topological features in proteins with a focus on recent progress on the identification and characterization of knotted and interlinked protein systems. Also, an approach is described for designing an expanded set of knotted proteins.
Among the thousands of known three-dimensional protein folds, only a few have been found whose backbones are in knotted configurations. The rarity of knotted proteins has important implications for how natural proteins reach their natively folded states. Proteins with such unusual features offer unique opportunities for studying the relationships between structure, folding, and stability. Here we report the identification of a unique slipknot feature in the fold of a well-known thermostable protein, alkaline phosphatase. A slipknot is created when a knot is formed by part of a protein chain, after which the backbone doubles back so that the entire structure becomes unknotted in a mathematical sense. Slipknots are therefore not detected by computational tests that look for knots in complete protein structures. A computational survey looking specifically for slipknots in the Protein Data Bank reveals a few other instances in addition to alkaline phosphatase. Unexpected similarities are noted among some of the proteins identified. In addition, two transmembrane proteins are found to contain slipknots. Finally, mutagenesis experiments on alkaline phosphatase are used to probe the contribution the slipknot feature makes to thermal stability. The trends and conserved features observed in these proteins provide new insights into mechanisms of protein folding and stability.
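The slipknot definition above (a knotted segment inside a chain that is unknotted as a whole) suggests a simple scanning strategy, sketched below. This is an assumption-laden illustration, not the survey method used by the authors: find_slipknot and its parameters are hypothetical names, and the knot test itself is passed in as a caller-supplied predicate rather than implemented here.

```python
from typing import Callable, Optional, Sequence, Tuple


def find_slipknot(chain: Sequence,
                  is_knotted: Callable[[Sequence], bool],
                  min_len: int = 3) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the shortest knotted sub-chain chain[start:end]
    when the full chain is unknotted, i.e. a slipknot candidate.

    Returns None if the whole chain is itself knotted (a true knot, not a
    slipknot) or if no sub-chain is knotted.
    """
    if is_knotted(chain):
        return None
    n = len(chain)
    best: Optional[Tuple[int, int]] = None
    for start in range(0, n - min_len + 1):
        for end in range(start + min_len, n + 1):
            if is_knotted(chain[start:end]):
                if best is None or (end - start) < (best[1] - best[0]):
                    best = (start, end)
    return best


# Toy demonstration with a stand-in predicate (an assumption for this
# example only): a fragment counts as "knotted" if it contains residues 10
# and 20 but not residue 30, mimicking a chain that doubles back through
# its own knot.
def toy_is_knotted(fragment: Sequence[int]) -> bool:
    return 10 in fragment and 20 in fragment and 30 not in fragment


residues = list(range(40))
candidate = find_slipknot(residues, toy_is_knotted)
```

The scan calls the predicate on every sub-chain, so it costs a quadratic number of knot tests; for a large structure survey one would likely trim from the termini or cache results instead.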
Transcarbamylases catalyze the transfer of the carbamyl group from carbamyl phosphate (CP) to an amino group of a second substrate such as aspartate, ornithine, or putrescine. Previously, structural determination of a transcarbamylase from Xanthomonas campestris led to the discovery of a novel N-acetylornithine transcarbamylase (AOTCase) that catalyzes the carbamylation of N-acetylornithine. Recently, a novel N-succinylornithine transcarbamylase (SOTCase) from Bacteroides fragilis was identified. Structural comparisons of AOTCase from X. campestris and SOTCase from B. fragilis revealed that residue Glu92 (X. campestris numbering) plays a critical role in distinguishing AOTCase from SOTCase. Enzymatic assays of E92P, E92S, E92V, and E92A mutants of AOTCase demonstrate that each of these mutations converts the AOTCase to an SOTCase. Similarly, the P90E mutation in B. fragilis SOTCase (equivalent to E92 in X. campestris AOTCase) converts the SOTCase to AOTCase. Hence, a single amino acid substitution is sufficient to swap the substrate specificities of AOTCase and SOTCase. X-ray crystal structures of these mutants in complexes with CP and N-acetyl-L-norvaline (an analog of N-acetyl-L-ornithine) or N-succinyl-L-norvaline (an analog of N-succinyl-L-ornithine) substantiate this conversion. In addition to Glu92 (X. campestris numbering), other residues such as Asn185 and Lys30 in AOTCase, which are involved in binding substrates through bridging water molecules, help to define the substrate specificity of AOTCase. These results provide the correct annotation (AOTCase or SOTCase) for a set of the transcarbamylase-like proteins that have been erroneously annotated as ornithine transcarbamylase (OTCase, EC 2.1.3.3).

Other papers by authors:

Merck.
We developed PolyA-Seq, a strand-specific and quantitative method for high-throughput sequencing of 3' ends of polyadenylated transcripts, and used it to globally map polyA sites in 24 matched tissues in human, rhesus, dog, mouse and rat. We show that PolyA-Seq is as accurate as existing RNA sequencing (RNA-Seq) approaches for digital gene expression (DGE), enabling simultaneous mapping of polyadenylation (polyA) sites and quantitative measurement of their usage. In human, we confirmed 158,533 known sites and discovered 280,857 novel sites (FDR<2.5%). On average 10% of novel human sites were also detected in matched tissues in other species. Most novel sites represent uncharacterized alternative polyA events and extensions of known transcripts in human and mouse, but primarily delineate novel transcripts in the other three species. 69.1% of known human genes that we detected have multiple polyA sites in their 3'UTRs, with 49.3% having three or more. We also detected polyadenylation of noncoding and antisense transcripts, including constitutive and tissue-specific primary microRNAs. The canonical polyA signal was strongly enriched and positionally conserved in all species. In general, usage of polyA sites is more similar within the same tissues across different species than within a species. These quantitative maps of polyA usage in evolutionarily and functionally related samples constitute a resource for understanding the regulatory mechanisms underlying alternative polyadenylation.
Department of Biochemistry, University of Washington, Seattle, Washington, USA.
Computational enzyme design holds promise for the production of renewable fuels, drugs and chemicals. De novo enzyme design has generated catalysts for several reactions, but with lower catalytic efficiencies than naturally occurring enzymes. Here we report the use of game-driven crowdsourcing to enhance the activity of a computationally designed enzyme through the functional remodeling of its structure. Players of the online game Foldit were challenged to remodel the backbone of a computationally designed bimolecular Diels-Alderase to enable additional interactions with substrates. Several iterations of design and characterization generated a 24-residue helix-turn-helix motif, including a 13-residue insertion, that increased enzyme activity >18-fold. X-ray crystallography showed that the large insertion adopts a helix-turn-helix structure positioned as in the Foldit model. These results demonstrate that human creativity can extend beyond the macroscopic challenges encountered in everyday life to molecular-scale design problems.
Department of Crystallography, Faculty of Chemistry, A. Mickiewicz University, 60-780 Poznan, Poland.
Mason-Pfizer monkey virus (M-PMV), a D-type retrovirus assembling in the cytoplasm, causes simian acquired immunodeficiency syndrome (SAIDS) in rhesus monkeys. Its pepsin-like aspartic protease (retropepsin) is an integral part of the expressed retroviral polyproteins. As in all retroviral life cycles, release and dimerization of the protease (PR) are strictly required for polyprotein processing and virion maturation. Biophysical and NMR studies have indicated that in the absence of substrates or inhibitors M-PMV PR should fold into a stable monomer, but the crystal structure of this protein could not be solved by molecular replacement despite countless attempts. Ultimately, a solution was obtained in mr-rosetta using a model constructed by players of the online protein-folding game Foldit. The structure indeed shows a monomeric protein, with the N- and C-termini completely disordered. On the other hand, the flap loop, which normally gates access to the active site of homodimeric retropepsins, is clearly traceable in the electron density. The flap has an unusual curled shape and a different orientation from both the open and closed states known from dimeric retropepsins. The overall fold of the protein follows the retropepsin canon, but the Cα deviations are large and the active-site 'DTG' loop (here NTG) deviates up to 2.7 Å from the standard conformation. This structure of a monomeric retropepsin determined at high resolution (1.6 Å) provides important extra information for the design of dimerization inhibitors that might be developed as drugs for the treatment of retroviral infections, including AIDS.
Department of Biochemistry, University of Washington, Box 357370, Seattle, WA 98195, USA.
Foldit is a multiplayer online game in which players collaborate and compete to create accurate protein structure models. For specific hard problems, Foldit player solutions can in some cases outperform state-of-the-art computational methods. However, very little is known about how collaborative gameplay produces these results and whether Foldit player strategies can be formalized and structured so that they can be used by computers. To determine whether high performing player strategies could be collectively codified, we augmented the Foldit gameplay mechanics with tools for players to encode their folding strategies as "recipes" and to share their recipes with other players, who are able to further modify and redistribute them. Here we describe the rapid social evolution of player-developed folding algorithms that took place in the year following the introduction of these tools. Players developed over 5,400 different recipes, both by creating new algorithms and by modifying and recombining successful recipes developed by other players. The most successful recipes rapidly spread through the Foldit player population, and two of the recipes became particularly dominant. Examination of the algorithms encoded in these two recipes revealed a striking similarity to an unpublished algorithm developed by scientists over the same period. Benchmark calculations show that the new algorithm independently discovered by scientists and by Foldit players outperforms previously published methods. Thus, online scientific game frameworks have the potential not only to solve hard scientific problems, but also to discover and formalize effective new strategies and algorithms.
Department of Biochemistry, University of Washington, Seattle, Washington, USA.
Following the failure of a wide range of attempts to solve the crystal structure of M-PMV retroviral protease by molecular replacement, we challenged players of the protein folding game Foldit to produce accurate models of the protein. Remarkably, Foldit players were able to generate models of sufficient quality for successful molecular replacement and subsequent structure determination. The refined structure provides new insights for the design of antiretroviral drugs.
Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada M5S 1A8.
H-NS and Lsr2 are nucleoid-associated proteins from Gram-negative bacteria and Mycobacteria, respectively, that play an important role in the silencing of horizontally acquired foreign DNA that is more AT-rich than the resident genome. Despite the fact that Lsr2 and H-NS proteins are dissimilar in sequence and structure, they serve apparently similar functions and can functionally complement one another. The mechanism by which these xenogeneic silencers selectively target AT-rich DNA has been enigmatic. We performed high-resolution protein binding microarray analysis to simultaneously assess the binding preference of H-NS and Lsr2 for all possible 8-base sequences. Concurrently, we performed a detailed structure-function relationship analysis of their C-terminal DNA binding domains by NMR. Unexpectedly, we found that H-NS and Lsr2 use a common DNA binding mechanism in which a short loop containing a "Q/RGR" motif selectively interacts with the DNA minor groove, with the highest affinity for AT-rich sequences that lack A-tracts. Mutations of the Q/RGR motif abolished DNA binding activity. Netropsin, a DNA minor groove-binding molecule, effectively outcompeted H-NS and Lsr2 for binding to AT-rich sequences. These results provide a unified molecular mechanism to explain findings related to xenogeneic silencing proteins, including their lack of apparent sequence specificity but preference for AT-rich sequences. Our findings also suggest that structural information contained within the DNA minor groove is deciphered by xenogeneic silencing proteins to distinguish genetic material that is self from nonself.
Banting and Best Department of Medical Research, Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, M5S 3E1, Canada, matt.weirauch@utoronto.ca.
Transcription factors (TFs) play key roles in the regulation of gene expression by binding in a sequence-specific manner to genomic DNA. In eukaryotes, DNA binding is achieved by a wide range of structural forms and motifs. TFs are typically classified by their DNA-binding domain (DBD) type. In this chapter, we catalogue and survey 91 different TF DBD types in metazoa, plants, fungi, and protists. We briefly discuss well-characterized TF families representing the major DBD superclasses. We also examine the species distributions and inferred evolutionary histories of the various families, and the potential roles played by TF family expansion and dimerization.
Department of Biomedical Sciences, Chang Gung University, Kwei-Shan, Taiwan. jmeir@mail.cgu.edu.tw
DNA transposons have emerged as indispensable tools for manipulating vertebrate genomes with applications ranging from insertional mutagenesis and transgenesis to gene therapy. To fully explore the potential of two highly active DNA transposons, piggyBac and Tol2, as mammalian genetic tools, we have conducted a side-by-side comparison of the two transposon systems in the same setting to evaluate their advantages and disadvantages for use in gene therapy and gene discovery. We have observed that (1) the Tol2 transposase (but not piggyBac) is highly sensitive to molecular engineering; (2) the piggyBac donor with only the 40 bp 3'- and 67 bp 5'-terminal repeat domain is sufficient for effective transposition; and (3) a small amount of piggyBac transposase results in robust transposition, suggesting that the piggyBac transposase is highly active. Performing genome-wide target profiling on data sets obtained by retrieving chromosomal targeting sequences from individual clones, we have identified several piggyBac and Tol2 hotspots and observed that (4) piggyBac and Tol2 display a clear difference in targeting preferences in the human genome. Finally, we have observed that (5) only sites with a particular sequence context can be targeted by either piggyBac or Tol2. The non-overlapping targeting preference of piggyBac and Tol2 makes them complementary research tools for manipulating mammalian genomes. PiggyBac is the most promising transposon-based vector system for achieving site-specific targeting of therapeutic genes due to the flexibility of its transposase for being molecularly engineered. Insights from this study will provide a basis for engineering piggyBac transposases to achieve site-specific therapeutic gene targeting.

Latest similar papers:

Department of Chemistry, Seoul National University, Seoul 151-747, Republic of Korea.
Contemporary template-based modeling techniques allow applications of modeling methods to vast biological problems. However, they tend to fail to provide accurate structures for less-conserved local regions in sequence even when the overall structure can be modeled reliably. We call these regions unreliable local regions (ULRs). Accurate modeling of ULRs is of enormous value because they are frequently involved in functional specificity. In this article, we introduce a new method for modeling ULRs in template-based models by employing a sophisticated loop modeling technique. Combined with our previous study on protein termini, the method is applicable to refinement of both loop and terminus ULRs. A large-scale test carried out in a blind fashion in CASP9 (the 9th Critical Assessment of techniques for Protein Structure prediction) shows that ULR structures are improved over initial template-based models by refinement in more than 70% of the successfully detected ULRs. It is also notable that successful modeling of several long ULRs over 12 residues is achieved. Overall, the current results show that a careful application of loop and terminus modeling can be a promising tool for model refinement in template-based modeling. Proteins 2012. © 2012 Wiley-Liss, Inc.
National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101.
MOTIVATION: Modeling of side-chain conformations constitutes an indispensable effort in protein structure modeling, protein-protein docking and protein design. Thanks to intensive attention to this field, many of the existing programs can achieve reasonably good and comparable prediction accuracy. Moreover, in our previous work on CIS-RR, we argued that predictions with few atomic clashes can complement the existing methods for subsequent analysis and refinement of protein structures. However, these recent efforts to enhance the quality of predicted side chains have been accompanied by a significant increase in computational cost. RESULTS: In this study, focusing mainly on improving the speed of side-chain conformation prediction, we present a RApid Side-chain Predictor, called RASP. To achieve a much higher speed with accuracy comparable to the best existing methods, we not only employ the clash elimination strategy of CIS-RR, but also carefully optimize energy terms and integrate different search algorithms. In comprehensive benchmark tests, RASP is over one order of magnitude faster (~40 times over CIS-RR) than the recently developed methods, while achieving comparable or even better accuracy. AVAILABILITY: RASP is available to non-commercial users at our website: http://jianglab.ibp.ac.cn/lims/rasp/rasp. CONTACT: taijiao@moon.ibp.ac.cn. SUPPLEMENTARY INFORMATION: Supplementary information is available at Bioinformatics online.
Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA. tex@uw.edu
Prediction of protein structures from sequences is a fundamental problem in computational biology. Algorithms that attempt to predict a structure from sequence primarily use two sources of information. The first source is physical in nature: proteins fold into their lowest energy state. Given an energy function that describes the interactions governing folding, a method for constructing models of protein structures, and the amino acid sequence of a protein of interest, the structure prediction problem becomes a search for the lowest energy structure. Evolution provides an orthogonal source of information: proteins of similar sequences have similar structure, and therefore proteins of known structure can guide modeling. The relatively successful Rosetta approach takes advantage of the first, but not the second source of information during model optimization. Following the classic work by Andrej Sali and colleagues, we develop a probabilistic approach to derive spatial restraints from proteins of known structure using advances in alignment technology and the growth in the number of structures in the Protein Data Bank. These restraints define a region of conformational space that is high-probability, given the template information, and we incorporate them into Rosetta's comparative modeling protocol. The combined approach performs considerably better on a benchmark based on previous CASP experiments. Incorporating evolutionary information into Rosetta is analogous to incorporating sparse experimental data: in both cases, the additional information eliminates large regions of conformational space and increases the probability that energy-based refinement will home in on the deep energy minimum at the native state.
Department of Chemical, Food, Pharmaceutical and Pharmacological Sciences (DiSCAFF), University of Piemonte Orientale Amedeo Avogadro, Novara, Italy.
Polymers can be modeled as open polygonal paths, and their closure generates knots. Detection of knotted proteins is currently achieved via high-throughput methods based on a common framework that is insensitive to the handedness of knots. Here we propose a topological framework for the computation of the HOMFLY polynomial, a handedness-sensitive invariant. Our approach couples a multi-component reduction scheme with the polynomial computation. After validation on tabulated knots and links, the framework was applied to the entire Protein Data Bank along with a set of selected topological checks that allowed artificially entangled structures to be discarded. This led to an up-to-date table of knotted proteins that also includes two newly detected right-handed trefoil knots in recently deposited protein structures. The application range of our framework is not limited to proteins: it can be extended to the topological analysis of biological and synthetic polymers and, more generally, to arbitrary polygonal paths.
Laboratoire de Mathematiques et Physique Theorique CNRS UMR, Fédération Denis Poisson, Université de Tours, France.
We introduce a novel generalization of the discrete nonlinear Schrödinger equation. It supports solitons that we utilize to model chiral polymers in the collapsed phase and, in particular, proteins in their native state. As an example we consider the villin headpiece HP35, an archetypal protein for testing both experimental and theoretical approaches to protein folding. We use its backbone as a template to explicitly construct a two-soliton configuration. Each of the two solitons describes well over 7,000 supersecondary structures of folded proteins in the Protein Data Bank with sub-angstrom accuracy, suggesting that these solitons are common in nature.
Department of Bioengineering and Therapeutic Sciences, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA.
The RosettaBackrub server (http://kortemmelab.ucsf.edu/backrub) implements the Backrub method, derived from observations of alternative conformations in high-resolution protein crystal structures, for flexible backbone protein modeling. Backrub modeling is applied to three related applications using the Rosetta program for structure prediction and design: (I) modeling of structures of point mutations, (II) generating protein conformational ensembles and designing sequences consistent with these conformations, and (III) predicting tolerated sequences at protein-protein interfaces. The three protocols have been validated on experimental data. Starting from a user-provided single input protein structure in PDB format, the server generates near-native conformational ensembles. The predicted conformations and sequences can be used for different applications, such as to guide mutagenesis experiments, for ensemble-docking approaches or to generate sequence libraries for protein design.
go to Publishergo to Pubmedgo to Scholargo to Googleshow EndNote Citationshow BibTex Citation
Department of Bioinformatics and Telemedicine, Collegium Medicum, Jagiellonian University, Lazarza 16, PL-31-530 Krakow, Poland.
The three-dimensional structures of a set of 'never born proteins' (NBPs; proteins with random amino acid sequences and no significant homology to known proteins) were predicted using two methods: Rosetta and one based on the 'fuzzy-oil-drop' (FOD) model. More than 3000 different random amino acid sequences were generated, filtered against the non-redundant protein sequence database to remove sequences with significant homology to known proteins, and subjected to three-dimensional structure prediction. Comparison of the Rosetta and FOD predictions allowed selection of the ten top (highest structural similarity) and the ten bottom (lowest structural similarity) structures from the list ranked by RMSD value. The selected structures were taken for detailed analysis to define the scale of structural accordance and discrepancy between the two methods. The structural similarity measurements revealed discrepancies between the structures generated by the two methods. Their potential biological functions appeared to be quite different as well. The ten bottom structures appeared to be 'unfoldable' for the FOD model. Some aspects of the general characteristics of the NBPs are also discussed. The calculations were performed on the EUChinaGRID grid platform to test the performance of this infrastructure for massive protein structure predictions.
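The ranking step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `rmsd` and `rank_by_rmsd` are hypothetical, and the sketch assumes each pair of predicted structures has already been superimposed and reduced to equal-length lists of (x, y, z) coordinates.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length coordinate lists.

    Assumption: the two structures are already superimposed, so no
    optimal rotation or translation is searched for here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def rank_by_rmsd(pairs, n=10):
    """Rank sequences by the RMSD between their two predicted structures.

    pairs maps a sequence name to (coords_from_method_1, coords_from_method_2).
    Returns (top, bottom): the n most similar and the n least similar names.
    """
    ranked = sorted(pairs, key=lambda name: rmsd(*pairs[name]))
    return ranked[:n], ranked[-n:]
```

With a low RMSD read as high structural similarity, the first returned list corresponds to the "ten top" structures and the second to the "ten bottom" ones.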
Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA. firas@u.washington.edu
MOTIVATION: Our focus has been on detecting topological properties that are rare in real proteins, but occur more frequently in models generated by protein structure prediction methods such as Rosetta. We previously created the Knotfind algorithm, successfully decreasing the frequency of knotted Rosetta models during CASP6. We observed an additional class of knot-like loops that appeared to be equally un-protein-like and yet do not contain a mathematical knot. These topological features are commonly referred to as slip-knots and are caused by the same mechanisms that result in knotted models. Slip-knots are undetectable by the original Knotfind algorithm. We have generalized our algorithm to detect them, and analyzed CASP6 models built using the Rosetta loop modeling method. RESULTS: After analyzing known protein structures in the PDB, we found that slip-knots do occur in certain proteins, but are rare and fall into a small number of specific classes. Our group used this new Pokefind algorithm to distinguish between these rare real slip-knots and the numerous classes of slip-knots that we discovered in Rosetta models and models submitted by the various CASP7 servers. The goal of this work is to improve future models created by protein structure prediction methods. Both algorithms are able to detect un-protein-like features that current metrics such as GDT are unable to identify, so these topological filters can also be used as additional assessment tools.
Forschungszentrum Karlsruhe, Institute for Nanotechnology, PO Box 3640, 76021 Karlsruhe, Germany.
Biophysical forcefields have contributed less than originally anticipated to recent progress in protein structure prediction. Here, we have investigated the selectivity of a recently developed all-atom free-energy forcefield for protein structure prediction and quality assessment (QA). Using a heuristic method, but excluding homology, we generated decoy sets for all targets of the CASP7 protein structure prediction assessment with <150 amino acids. The decoys in each set were then ranked by energy in short relaxation simulations and the best low-energy cluster was submitted as a prediction. For four of nine template-free targets, this approach generated high-ranking predictions within the top 10 models submitted in CASP7 for the respective targets. For these targets, our de novo predictions had an average GDT_S score of 42.81, significantly above the average of all groups. The refinement protocol has difficulty with oligomeric targets and when no near-native decoys are generated in the decoy library. For targets with high-quality decoy sets the refinement approach was highly selective. Motivated by this observation, we rescored all server submissions up to 200 amino acids using a similar refinement protocol, but without clustering, in a QA exercise. We found an excellent correlation between the best server models and those with the lowest energy in the forcefield. The free-energy refinement protocol may thus be an efficient tool for relative QA and protein structure prediction. Proteins 2009. © 2009 Wiley-Liss, Inc.
School of Computing, SASTRA University, Thanjavur, India.
Genetic algorithms (GA) are often well suited to optimisation problems involving several conflicting objectives. Protein structure prediction is more suitably modelled as a multi-objective optimisation problem, because the potential energy functions used in the literature to evaluate a protein conformation are based on two different interaction energies, local (bonded atoms) and non-local (non-bonded atoms), and experiments using the Chemistry at Harvard Macromolecular Mechanics (CHARMM) potential energy function have shown that these two types of interaction are in conflict. In this paper, we have modified the immune inspired Pareto archived evolutionary strategy (I-PAES) algorithm and denoted it as MI-PAES. It can effectively exploit prior knowledge about hydrophobic interactions, one of the most important driving forces in protein folding, to make vaccines. The proposed MI-PAES is comparable with other evolutionary algorithms proposed in the literature, in terms of both the best solution found and the computational time, and often has much better search ability than the canonical GA.
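The idea of treating two conflicting energies as separate objectives can be sketched with a Pareto archive, the data structure that gives Pareto archived evolutionary strategies their name. This is only an illustration and not the MI-PAES algorithm itself: the function names are hypothetical, and each candidate is assumed to be a tuple of (local energy, non-local energy) values to be minimised.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def update_archive(archive, candidate):
    """Offer a candidate to an archive of mutually non-dominated vectors.

    The candidate is rejected if an archived vector dominates or equals it;
    otherwise it is added and every vector it dominates is dropped.
    """
    if any(dominates(kept, candidate) or kept == candidate for kept in archive):
        return list(archive)
    return [kept for kept in archive if not dominates(candidate, kept)] + [candidate]
```

Keeping every non-dominated conformation, instead of collapsing both energies into one weighted sum, is what lets a multi-objective search retain trade-offs between the two interaction types.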
2012-05-24 06:33:23 © BioInfoBank Institute