BioInfoBank Library


 
BioInfoBank Institute, ul. Limanowskiego 24A, 60-744 Poznan, Poland. leszek@bioinfo.pl
MOTIVATION: The Ligand-Info system is based on the assumption that small molecules with similar structure have similar functional (binding) properties. The developed system enables a fast and sensitive index-based search for similar compounds in large databases. Index profiles, constructed by averaging the indexes of related molecules, are used to increase the specificity of the search. Using index profiles helps to focus on frequent, common features of a family of compounds. RESULTS: A Java-based tool for clustering and scanning of small molecules has been created. The tool can interactively cluster sets of molecules and create index profiles on the user side, and automatically download similar molecules from a database of 250,000 compounds. The results of applying index profiles demonstrate that the profile-based search strategy can increase the quality of the selection process. AVAILABILITY: The system is available at http://Ligand.Info. The application requires the Java Runtime Environment 1.4, which can be installed automatically during the first use on desktop systems that support it. A standalone version of the program is available from the authors upon request.
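The index-profile idea described above can be illustrated with a minimal sketch. The abstract does not say how an index is encoded or which similarity measure Ligand-Info uses, so the numeric feature vectors, the cosine score, and all names below (`build_profile`, `rank_by_profile`, the `cmpd_*` entries) are assumptions made only for illustration.

```python
from math import sqrt

def build_profile(indexes):
    """Average equally long index vectors element by element."""
    n = len(indexes)
    return [sum(column) / n for column in zip(*indexes)]

def cosine(a, b):
    """Cosine similarity (an assumed measure, not taken from the source)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_by_profile(profile, database):
    """Return (name, score) pairs sorted by decreasing similarity to the profile."""
    scored = [(name, cosine(profile, index)) for name, index in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical family of related molecules, each described by four features.
family = [[1, 1, 0, 1], [1, 1, 0, 0], [1, 1, 1, 1]]
profile = build_profile(family)

# Hypothetical database entries to be scanned with the profile.
database = {
    "cmpd_a": [1, 1, 0, 1],
    "cmpd_b": [0, 0, 1, 0],
    "cmpd_c": [1, 0, 0, 0],
}
ranking = rank_by_profile(profile, database)
```

Features shared by every family member keep a weight of 1 in the profile, while features present in only some members are down-weighted, which is one way to read the statement that profiles focus on frequent, common features.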

Latest citations:

Bioinformatics Research Centre, Department of Computing Science, University of Glasgow, Glasgow G12 8QQ, UK.
We introduce TOPS+ strings, a highly abstract string-based model of protein topology that permits efficient computation of structure comparison, and can optionally represent ligand information. In this model we consider loops as secondary structure elements (SSEs) as well as helices and strands; in addition we represent ligands as first class objects. Interactions between SSEs and between SSEs and ligands are described by incoming/outgoing arcs and ligand arcs respectively; and SSEs are annotated with arc interaction direction and type. We are able to abstract away from the ligands themselves, to give a model characterized by a regular grammar rather than the context sensitive grammar of the original TOPS model (Gilbert, et al., 2000; Gilbert, et al., 2001; Viksna and Gilbert, 2001). Our TOPS+ strings model is sufficiently descriptive to obtain biologically meaningful results and has the advantage of permitting fast string-based structure matching and comparison as well as avoiding issues of NP-completeness associated with graph problems. Our structure comparison method is computationally more efficient in identifying distantly related proteins than BLAST, CLUSTALW, SSAP and TOPS because of the compact and abstract string-based representation of protein structure which records both topological and biochemical information including the functionally important loop regions of the protein structures. The accuracy of our comparison method is comparable with that of TOPS. Also, we have demonstrated that our TOPS+ strings method out-performs the TOPS method for the ligand-dependent protein structures and provides biologically meaningful results. AVAILABILITY: The TOPS+ strings comparison server is available from http://www.dcs.gla.ac.uk/~mallika/topsplus.html. CONTACT: mallikav@burnham.org.
Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Pawińskiego 5a, 02-106, Warsaw, Poland, D.Plewczynski@icm.edu.pl.
The term Interactome describes the set of all molecular interactions in cells, especially in the context of protein-protein interactions. These interactions are crucial for most cellular processes, so a full representation of the interaction repertoire is needed to understand the molecular machinery of the cell at the systems biology level. In this short review, we compare various methods for predicting protein-protein interactions using sequence and structure information. The ultimate goal of those approaches is to present a complete methodology for the automatic selection of interaction partners using their amino acid sequences and/or three-dimensional structures, if known. Apart from a description of each method, details of the software or web interface needed for high-throughput prediction on the whole-genome scale are also provided. The proposed validation of the theoretical methods using experimental data would be a better assessment of their accuracy.
Istituto di Chimica Organica “Alessandro Marchesini”, Facoltà di Farmacia, Università degli Studi di Milano, Via Venezian 21, 20133 Milano, Italy.
REST/NRSF is a multifunctional transcription factor that represses or silences many neuron-specific genes in both neural and non-neural cells by recruitment to its cognate RE1/NRSE regulatory sites. An increase in RE1/NRSE genomic binding is found in Huntington's disease (HD), resulting in the repression of REST/NRSF-regulated gene transcription, including that of BDNF, and thus representing one of the possible detrimental effectors in HD. Three 2-aminothiazole derivatives were recently identified as potent modulators of the RE1/NRSE silencing activity through a cell-based gene reporter assay. In this study, the structure-activity relationships (SAR) of a library of commercially available 2-aminoisothiazoles diversely substituted at the amino group or at position 4 have been evaluated. A quantitative structure-activity relationship analysis performed using the Phase strategy yielded a highly predictive 3D-QSAR pharmacophore model for in silico drug screening.
In today's research environment, a wealth of experimental/theoretical structural data is available and the number of therapeutically relevant macromolecular structures is growing rapidly. This, coupled with the huge number of easily available small non-peptide potential drug candidates (over 7 million compounds), highlights the need for computer-aided techniques for the efficient identification and optimization of novel hit compounds. Virtual (or in silico) ligand screening based on the three-dimensional structure of macromolecular targets (SB-VLS) is firmly established as an important approach to identify chemical entities that have a high likelihood of binding to a target molecule to elicit desired biological responses. A myriad of free applications and services facilitating the drug discovery process have been posted on the Web. In this review, we cite over 350 URLs that are useful for SB-VLS projects and essentially free for academic groups. We attempt to provide links for in silico ADME/tox prediction tools, compound collections, some ligand-based methods, characterization/simulation of 3D targets and homology modeling tools, druggable pocket predictions, active site comparisons, analysis of macromolecular interfaces, protein docking tools to help identify binding pockets and protein-ligand docking/scoring methods. As such, we aim to provide both methods pertaining to the field of Structural Bioinformatics (defined here as tools to study macromolecules) and methods pertaining to the field of Chemoinformatics (defined here as tools to make better decisions faster in the arena of drug/lead identification and optimization). We also report several recent success stories using these free computer methods. This review should help readers find free computer tools useful for their projects. Overall, we are confident that these tools will facilitate rapid and cost-effective identification of new hit compounds.
The URLs presented in this review will be updated regularly in the coming months at www.vls3d.com, in the "Links" section.
Interdisciplinary Centre for Mathematical and Computational Modeling, University of Warsaw, Pawinskiego 5a Street, 02-106 Warsaw, Poland.
A structure-based in silico virtual drug discovery procedure was assessed with severe acute respiratory syndrome coronavirus main protease serving as a case study. First, potential compounds were extracted from protein-ligand complexes selected from the Protein Data Bank based on structural similarity to the target protein. Next, the set of compounds was ranked by docking scores using an Electronic High-Throughput Screening flexible docking procedure to select the most promising molecules. The set of best performing compounds was then used for a similarity search over the 1 million entries in the Ligand.Info Meta-Database. Selected molecules with a close structural relationship to 2-methyl-2,4-pentanediol may provide candidate lead compounds toward the development of novel allosteric severe acute respiratory syndrome protease inhibitors.
INSERM U648, University Paris V, 45 rue des Sts Peres, 75006 Paris, France. bruno.villoutreix@univ-paris5.fr.
The processes used by academic and industrial scientists to discover new drugs have recently experienced a true renaissance with many new and exciting techniques. The number of protein structures and/or chemical ligands is constantly growing through the use of parallel chemistry, X-ray crystallography, NMR or homology modeling methods, and so is the theoretical understanding of protein-ligand interactions. As such, structure-based approaches to drug design and in silico screening are becoming a routine part of most modern lead discovery programs. Prioritization of compound libraries is an extremely important task that aims at the rapid identification of tight-binding ligands and ultimately new therapeutic compounds. These in silico approaches, combined with other experimental methods, facilitate the design of new medicines to treat cardiovascular, degenerative, infectious, and neoplastic diseases, among others. Here, we review key concepts and specific features of several selected ligand-receptor docking/scoring methods, while several other topics pertaining to the field of in silico screening are reviewed in the following articles of this special issue of Current Protein and Peptide Science.
Inserm U648, Paris 5 University, 45 rue des Sts Peres, 75006 Paris, France.
In silico screening based on the structures of the ligands or of the receptors has become an essential tool to facilitate the drug discovery process but compound collections are needed to carry out such in silico experiments. It has been recognized that absorption, distribution, metabolism, excretion and toxicity (ADME/tox) are key properties that need to be considered early on, even during the database preparation stage. FAF-Drugs is an online service based on Frowns (a chemoinformatics toolkit) that allows users to process their own compound collections via simple ADME/Tox filtering rules such as molecular weight, polar surface area, logP or number of rotatable bonds. SMILES (Simplified Molecular Input Line Entry System), CANSMILES (canonical smiles) or SDF (structure data file) files are required as input and molecules that pass or do not pass the filters are sent back in CANSMILES format. This service should thus help scientists engaging in drug discovery campaigns. Other utilities and several compound collections suitable for in silico screening are available at our site. FAF-Drugs can be accessed at http://bioserv.rpbs.jussieu.fr/FAFDrugs.html.
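The kind of rule-based ADME/Tox filtering described above can be sketched as follows. This is not the FAF-Drugs implementation: the cut-off values are illustrative assumptions, the property values are taken as precomputed inputs, and the names `Compound`, `LIMITS`, `passes`, and `split` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Compound:
    name: str
    mol_weight: float          # g/mol
    logp: float                # calculated logP
    polar_surface_area: float  # square angstroms
    rotatable_bonds: int

# Illustrative upper limits only; a real campaign would choose its own.
LIMITS = {
    "mol_weight": 500.0,
    "logp": 5.0,
    "polar_surface_area": 140.0,
    "rotatable_bonds": 10,
}

def passes(compound, limits=LIMITS):
    """True if every filtered property is at or below its upper limit."""
    return all(getattr(compound, prop) <= maximum for prop, maximum in limits.items())

def split(compounds, limits=LIMITS):
    """Separate compounds into those that pass and those that fail the filters."""
    accepted = [c for c in compounds if passes(c, limits)]
    rejected = [c for c in compounds if not passes(c, limits)]
    return accepted, rejected

# Hypothetical compounds with made-up property values.
compounds = [
    Compound("mol_1", 320.4, 2.1, 75.0, 4),
    Compound("mol_2", 612.7, 6.3, 160.0, 14),
    Compound("mol_3", 480.0, 4.9, 139.0, 10),
]
accepted, rejected = split(compounds)
```

Returning both lists mirrors the service's behaviour of sending back the molecules that pass and those that do not pass the filters.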
Department of Bioengineering, University of California San Diego, 9500 Gilman Drive, La Jolla, California 92093-0412, USA.
This paper describes a virtual screening methodology that generates a ranked list of high-binding small molecule ligands for orphan G protein-coupled receptors (oGPCRs), circumventing the requirement for receptor three-dimensional structure determination. Features representing the receptor are based only on physicochemical properties of primary amino acid sequence, and ligand features use the two-dimensional atomic connection topology and atomic properties. An experimental screen comprised nearly 2 million hypothetical oGPCR-ligand complexes, from which it was observed that the top 1.96% predicted affinity scores corresponded to "highly active" ligands against orphan receptors. Results representing predicted high-scoring novel ligands for many oGPCRs are presented here. Validation of the method was carried out in several ways: (1) A random permutation of the structure-activity relationship of the training data was carried out; by comparing test statistic values of the randomized and nonshuffled data, we conclude that the value obtained with nonshuffled data is unlikely to have been encountered by chance. (2) Biological activities linked to the compounds with high cross-target binding affinity were analyzed using computed log-odds from a structure-based program. This information was correlated with literature citations where GPCR-related pathways or processes were linked to the bioactivity in question. (3) Anecdotal, out-of-sample predictions for nicotinic targets and known ligands were performed, with good accuracy in the low-to-high "active" binding range. (4) An out-of-sample consistency check using the commercial antipsychotic drug olanzapine produced "active" to "highly-active" predicted affinities for all oGPCRs in our study, an observation that is consistent with documented findings of cross-target affinity of this compound for many different GPCRs.
It is suggested that this virtual screening approach may be used in support of the functional characterization of oGPCRs by identifying potential cognate ligands. Ultimately, this approach may have implications for pharmaceutical therapies to modulate the activity of faulty or disease-related cellular signaling pathways. In addition to application to cell surface receptors, this approach is a generalized strategy for discovery of small molecules that may bind intracellular enzymes and involve protein-protein interactions.
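Validation step (1) in the abstract above is a permutation test. The following is a minimal sketch of that general technique, not the authors' procedure: the abstract does not name the test statistic, so the absolute Pearson correlation used here, the descriptor and activity values, and the function names are all assumptions.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0

def permutation_p_value(x, y, n_perm=999, seed=0):
    """Estimate how often shuffled activities match or beat the observed statistic."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks the structure-activity pairing
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Made-up descriptor values and activities with a deliberate linear relationship.
descriptor = [0.1, 0.4, 0.7, 1.0, 1.3, 1.6, 1.9, 2.2, 2.5, 2.8]
activity = [2.0 * d + 1.0 for d in descriptor]
p_value = permutation_p_value(descriptor, activity)
```

A small estimated p-value indicates that the statistic obtained with the unshuffled pairing is unlikely to arise by chance, which is the conclusion the abstract draws.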
MOTIVATION: Different resources exist for experimentally determined and computed three-dimensional (3D) structures of low molecular weight compounds, but for approved drugs no free, publicly accessible source of 3D structures and conformers is available. Furthermore, for selection purposes or for correlation of structural similarity with medical application, the assignment of the Anatomical Therapeutic Chemical (ATC) classification codes to each structure according to the WHO scheme would be desirable. RESULTS: The database contains approximately 2500 3D structures of active ingredients of essential marketed drugs. To account for structural flexibility, they are represented by 10^5 structural conformers. Here we present a web-query system enabling searches for drug name, synonyms, trade name, trivial name, formula, CAS number, ATC code, etc. 2D-similarity screening (Tanimoto coefficients) and an automatic 3D-superposition procedure based on conformational representation are implemented. Drug structures above a similarity threshold as well as superimposed conformers can be retrieved in the mol file format via a graphical interface. AVAILABILITY: For academic use the system is accessible at http://bioinf.charite.de/superdrug. The retrieval system requires the free browser plugin 'Chime' from MDL for visualization.
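The Tanimoto coefficient mentioned above is the number of features two molecules share divided by the number of features present in either. A minimal sketch of threshold-based retrieval follows; the fingerprints are represented as sets of "on" bit positions, and the fingerprint contents, the `drug_*` names, and the 0.0 convention for two empty fingerprints are assumptions for illustration, not details of the SuperDrug system.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    a, b = set(a), set(b)
    union = len(a | b)
    # Two empty fingerprints are scored 0.0 here by convention (an assumption).
    return len(a & b) / union if union else 0.0

def similar_above(query, library, threshold):
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query, bits)) for name, bits in library.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Hypothetical fingerprints.
query = {1, 4, 7, 9}
library = {
    "drug_a": {1, 4, 7, 8},      # shares 3 of 5 distinct bits
    "drug_b": {2, 3},            # shares nothing
    "drug_c": {1, 4, 7, 9, 10},  # shares 4 of 5 distinct bits
}
hits = similar_above(query, library, threshold=0.5)
```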
Department of Pharmaceutical Chemistry, University of California San Francisco, Genentech Hall, 600 16th Street, San Francisco, California 94143, USA.
A critical barrier to entry into structure-based virtual screening is the lack of a suitable, easy to access database of purchasable compounds. We have therefore prepared a library of 727,842 molecules, each with 3D structure, using catalogs of compounds from vendors (the size of this library continues to grow). The molecules have been assigned biologically relevant protonation states and are annotated with properties such as molecular weight, calculated LogP, and number of rotatable bonds. Each molecule in the library contains vendor and purchasing information and is ready for docking using a number of popular docking programs. Within certain limits, the molecules are prepared in multiple protonation states and multiple tautomeric forms. In one format, multiple conformations are available for the molecules. This database is available for free download (http://zinc.docking.org) in several common file formats including SMILES, mol2, 3D SDF, and DOCK flexibase format. A Web-based query tool incorporating a molecular drawing interface enables the database to be searched and browsed and subsets to be created. Users can process their own molecules by uploading them to a server. Our hope is that this database will bring virtual screening libraries to a wide community of structural biologists and medicinal chemists.

Other papers by authors:

BioInfoBank Institute, ul. Limanowskiego 24A, 60-744 Poznan, Poland.
Ligand.Info is a compilation of various publicly available databases of small molecules. The total size of the Meta-Database is over 1 million entries. The compound records contain calculated three-dimensional coordinates and sometimes information about biological activity. Some molecules have information about FDA drug approval status or about anti-HIV activity. The Meta-Database can be downloaded from the http://Ligand.Info web page. The database can also be screened using a Java-based tool. The tool can interactively cluster sets of molecules on the user side and automatically download similar molecules from the server. The application requires the Java Runtime Environment 1.4 or higher, which can be automatically downloaded from Sun Microsystems or Apple Computer and installed during the first use of Ligand.Info on desktop systems that support Java (MS Windows, Mac OS, Solaris, and Linux). The Ligand.Info Meta-Database can be used for virtual high-throughput screening of new potential drugs. The presented examples showed that, using a known antiviral drug as a query, the system was able to find other antiviral drugs and inhibitors.
BioInfoBank Institute, ul. Limanowskiego 24A, 60-744 Poznan, Poland. kuba@bioinfo.pl
Cytokinins are plant hormones involved in the essential processes of plant growth and development. They bind with receptors known as CRE1/WOL/AHK4, AHK2, and AHK3, which possess histidine kinase activity. Recently, the sensor domain cyclases/histidine kinases associated sensory extracellular (CHASE) was identified in those proteins but little is known about its structure and interaction with ligands. Distant homology detection methods developed in our laboratory and molecular phylogeny enabled the prediction of the structure of the CHASE domain as similar to the photoactive yellow protein-like sensor domain. We have identified the active site pocket and amino acids that are involved in receptor-ligand interactions. We also show that fold evolution of cytokinin receptors is very important for a full understanding of the signal transduction mechanism in plants.
Interdisciplinary Center for Mathematical and Computational Modeling Warsaw University Warszawa, Poland.
We present here a simple method for fast and accurate comparison of proteins using their structures. The algorithm is based on structural alignment of segments of Calpha chains (with size of 99 or 199 residues). The method is optimized in terms of speed and accuracy. We test it on 97 representative proteins with the similarity measure based on the SCOP classification. We compare our algorithm with the LGscore2 automatic method. Our method has the same accuracy as the LGscore2 algorithm with much faster processing of the whole test set, which is promising. A second test is done using the ToolShop structure prediction evaluation program and shows that our tool is on average slightly less sensitive than the DALI server. Both algorithms give a similar number of correct models, however, the final alignment quality is better in the case of DALI. Our method was implemented under the name 3D-Hit as a web server at http://3dhit.bioinfo.pl/ free for academic use, with a weekly updated database containing a set of 5000 structures from the Protein Data Bank with non-homologous sequences.
BioInfoBank Institute, Poznan, Poland.
In CASP5, the BioInfo.PL group has used the structure prediction Meta Server and the associated newly developed flexible meta-predictor, called 3D-Jury, as the main structure prediction tools. The most important feature of the meta-predictor is a high (86%) correlation between the reported confidence score and the quality of the selected model. The Gene Relational Database (GRDB) was used to confirm the fold recognition results by selecting distant homologues and subsequent structure prediction with the Meta Server. A fragment-splicing procedure was performed as a final processing step with large fragments extracted from selected models using model quality control provided by Verify3D. The comparison of submitted models with the native structure conducted after the CASP meeting showed that the GRDB-supported structure prediction led to a satisfactory template fold selection, whereas the fragment-splicing procedure must be improved in the future.
Bioinformatics Laboratory, BioInfoBank Institute, ul. Limanowskiego 24A, 60-744 Poznan, Poland.
ORFeus is a fully automated, sensitive protein sequence similarity search server available to the academic community via the Structure Prediction Meta Server (http://BioInfo.PL/Meta/). The goal of the development of ORFeus was to increase the sensitivity of the detection of distantly related protein families. Predicted secondary structure information was added to the information about sequence conservation and variability, a technique known from hybrid threading approaches. The accuracy of the meta profiles created this way is compared with profiles containing only sequence information and with the standard approach of aligning a single sequence with a profile. Additionally, the alignment of meta profiles is more sensitive in detecting remote homology between protein families than aligning two sequence-only profiles or aligning a profile with a sequence. The specificity of the alignment score is improved in the lower specificity range compared with the robust sequence-only profiles.
Structural and Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany, Biocomputing Group, Department of Biochemical Sciences,'A. Rossi-Fanelli', Sapienza Universita' di Roma, P.le Aldo Moro, 5, 00185 Rome, Italy, Computational Biology Unit, Bergen Centre for Computational Science, Høyteknologisenteret, Thormøhlensgate 55, Sars Centre for Marine Molecular Biology, University of Bergen, 5008 Bergen, Department of Molecular Biology, University of Bergen, HIB, Thormøhlensgt. 55, 5020 Bergen, Norway, BioInfoBank Institute, Limanowskiego 24A16 60-744, Poznań, Poland, ESBS, 1, Bld Sébastien Brandt, BP10413, 67412 Illkirch, France, Centre for Molecular Bioinformatics, Department of Biology, University of Rome 'Tor Vergata', Via della Ricerca Scientifica, 00133 Rome, Italy and Cellular & Molecular Logic Team, The Institute of Cancer Research (ICR), Section of Cell and Molecular Biology, SW3 6JB London, UK.
Linear motifs are short segments of multidomain proteins that provide regulatory functions independently of protein tertiary structure. Much of intracellular signalling passes through protein modifications at linear motifs. Many thousands of linear motif instances, most notably phosphorylation sites, have now been reported. Although clearly very abundant, linear motifs are difficult to predict de novo in protein sequences due to the difficulty of obtaining robust statistical assessments. The ELM resource at http://elm.eu.org/ provides an expanding knowledge base, currently covering 146 known motifs, with annotation that includes >1300 experimentally reported instances. ELM is also an exploratory tool for suggesting new candidates of known linear motifs in proteins of interest. Information about protein domains, protein structure and native disorder, cellular and taxonomic contexts is used to reduce or deprecate false positive matches. Results are graphically displayed in a 'Bar Code' format, which also displays known instances from homologous proteins through a novel 'Instance Mapper' protocol based on PHI-BLAST. ELM server output provides links to the ELM annotation as well as to a number of remote resources. Using the links, researchers can explore the motifs, proteins, complex structures and associated literature to evaluate whether candidate motifs might be worth experimental investigation.
Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Pawinskiego 5a, 02-106 Warsaw, Poland. D.Plewczynski@icm.edu.pl
We present here the random forest supervised machine learning algorithm applied to flexible docking results from five typical virtual high throughput screening (HTS) studies. Our approach is aimed at: i) reducing the number of compounds to be tested experimentally against the given protein target and ii) extending results of flexible docking experiments performed only on a subset of a chemical library in order to select promising inhibitors from the whole dataset. The random forest (RF) method is applied and tested here on compounds from the MDL drug data report (MDDR). The recall values for five selected diverse protein targets are over 90% and the performance reaches 100%. This machine learning method combined with flexible docking is capable of finding 60% of the active compounds for most protein targets by docking only 10% of screened ligands. Therefore our in silico approach is able to scan very large databases rapidly in order to predict biological activity of small molecule inhibitors and provides an effective alternative to more computationally demanding methods in virtual HTS.
CMBI, NCMLS, Radboud University Nijmegen Medical Centre, Geert Grooteplein 26-28, 6525 GA Nijmegen, The Netherlands.
The 'omics' revolution is causing a flurry of data that all needs to be annotated for it to become useful. Sequences of proteins of unknown function can be annotated with a putative function by comparing them with proteins of known function. This form of annotation is typically performed with BLAST or similar software. Structural genomics is nowadays also bringing us three dimensional structures of proteins with unknown function. We present here software that can be used when sequence comparisons fail to determine the function of a protein with known structure but unknown function. The software, called 3D-Fun, is implemented as a server that runs at several European institutes and is freely available for everybody at all these sites. The 3D-Fun servers accept protein coordinates in the standard PDB format and compare them with all known protein structures by 3D structural superposition using the 3D-Hit software. If structural hits are found with proteins with known function, these are listed together with their function and some vital comparison statistics. This is conceptually very similar in 3D to what BLAST does in 1D. Additionally, the superposition results are displayed using interactive graphics facilities. Currently, the 3D-Fun system only predicts enzyme function but an expanded version with Gene Ontology predictions will be available soon. The server can be accessed at http://3dfun.bioinfo.pl/ or at http://3dfun.cmbi.ru.nl/.

Latest similar papers:

Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA.
INTRODUCTION: Shotgun metagenomics creates millions of fragments of short DNA reads, which are meaningless unless analyzed appropriately. The Metagenomics RAST server (MG-RAST) is a web-based, open source system that offers a unique suite of tools for analyzing these data sets. After de-replication and quality control, fragments are mapped against a comprehensive nonredundant database (NR). Phylogenetic and metabolic reconstructions are computed from the set of hits against the NR. The resulting data are made available for browsing, download, and most importantly, comparison against a comprehensive collection of public metagenomes. A submitted metagenome is visible only to the user, unless the user makes it public or shares it with other registered users. Public metagenomes are available to all.
Centre de Recherches de Biochimie Macromoléculaire UMR 5237, CNRS, University of Montpellier 1 and 2, Montpellier, France.
MOTIVATION: In recent years, considerable evidence has accumulated for the high incidence of tandem repeats in proteins that carry out fundamental biological functions and are related to a number of human diseases. At the same time, protein repeats frequently degenerate strongly during evolution and therefore cannot be easily identified. To solve this problem, several computer programs based on different algorithms have been developed. Nevertheless, our tests showed that there is still room for improvement in methods for accurate and rapid detection of tandem repeats in proteins. RESULTS: We developed a new program called T-REKS for ab initio identification of tandem repeats. It is based on clustering the lengths between identical short strings using a K-means algorithm. A benchmark of the existing programs and T-REKS on several sequence datasets is presented. Our program, linked to the Protein Repeat DataBase, opens the way for large-scale analysis of protein tandem repeats. T-REKS can also be applied to nucleotide sequences. AVAILABILITY: The algorithm has been implemented in Java; the program is available upon request at http://bioinfo.montp.cnrs.fr/?r=t-reks. The Protein Repeat DataBase generated using T-REKS is accessible at http://bioinfo.montp.cnrs.fr/?r=repeatDB. CONTACT: julien.jorda@crbm.cnrs.fr; andrey.kajava@crbm.cnrs.fr.
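The core idea stated in the abstract, clustering the lengths between identical short strings with K-means, can be sketched in simplified form. This is not the T-REKS algorithm itself: the word length, the use of gaps between consecutive occurrences, the hand-picked starting centers, and the names `kmer_gaps` and `kmeans_1d` are assumptions for illustration.

```python
def kmer_gaps(sequence, k=2):
    """Distances between consecutive occurrences of each identical k-letter word."""
    last_seen = {}
    gaps = []
    for i in range(len(sequence) - k + 1):
        word = sequence[i:i + k]
        if word in last_seen:
            gaps.append(i - last_seen[word])
        last_seen[word] = i
    return gaps

def kmeans_1d(values, centers, iterations=20):
    """Plain one-dimensional K-means starting from the given centers."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)), key=lambda j: abs(value - centers[j]))
            groups[nearest].append(value)
        # An empty cluster keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# A toy sequence made of four exact copies of a five-letter unit.
gaps = kmer_gaps("ABCDE" * 4, k=2)

# Made-up gap lengths with two apparent periods, clustered from assumed centers.
centers, groups = kmeans_1d([5, 5, 5, 6, 12, 12, 13], centers=[4, 14])
```

In the toy sequence every gap equals the repeat length, so a cluster center near 5 would suggest a candidate period of five residues; degenerate repeats would spread the gaps around such a center rather than hitting it exactly.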
GeneGo, Inc., Saint Joseph, MI, USA.
Analysis of microarray, SNP, proteomics, and other high-throughput (OMICs) data is challenging because of its biological complexity and high level of technical and biological noise. One way to deal with both problems is to perform analysis with a high-fidelity annotated knowledge base of protein interactions, pathways, and functional ontologies. This knowledge base has to be structured in a computer-readable format and must include software tools for managing experimental data, analysis, and reporting. Here we present MetaDiscovery, an integrated platform for functional data analysis which has been developed at GeneGo over the past 8 years. On the content side, MetaDiscovery encompasses a comprehensive database of protein interactions of different types, pathways, network models and 10 functional ontologies covering human, mouse, and rat proteins. The analytical toolkit includes tools for gene/protein list enrichment analysis, a statistical "interactome" tool for identification of over- and under-connected proteins in the data set, and a network module made up of network generation algorithms and filters. The suite also features MetaSearch, an application for combinatorial search of the database content, as well as a Java-based tool called MapEditor for drawing and editing custom pathway maps. Applications of MetaDiscovery include identification of potential biomarkers and drug targets, pathway hypothesis generation, analysis of biological effects for novel small molecule compounds, and clinical applications (analysis of large cohorts of patients and translational and personalized medicine).
Center for Computational Pharmacology, University of Colorado Denver, Aurora, CO 80045, USA. karin.verspoor@ucdenver.edu
MOTIVATION: It is important for the quality of biological ontologies that similar concepts be expressed consistently, or univocally. Univocality is relevant for the usability of the ontology for humans, as well as for computational tools that rely on regularity in the structure of terms. However, in practice terms are not always expressed consistently, and we must develop methods for identifying terms that are not univocal so that they can be corrected. RESULTS: We developed an automated transformation-based clustering methodology for detecting terms that use different linguistic conventions for expressing similar semantics. These term sets represent occurrences of univocality violations. Our method was able to identify 67 examples of univocality violations in the Gene Ontology. AVAILABILITY: The identified univocality violations are available upon request. We are preparing a release of an open source version of the software to be available at http://bionlp.sourceforge.net.
School of Computing and Mathematical Sciences, Auckland University of Technology, Private Bag 92006, Auckland 1142, New Zealand.
SUMMARY: A phenotypic character has a convex mapping to a given phylogenetic tree if each character state can be assigned a single point of origin in the tree. However, phenomena such as convergent evolution and lateral genetic transfer can lead to intermingling of character states and a consequent non-convex mapping. The phylogenetic heterogeneity of different characters can identify subsets of states that are non-randomly associated or which may have been transferred from one lineage to another. We have developed Radié, an interactive 3D Java application for mapping character states to the leaves and internal edges of a radial phylogenetic tree. In Radié each state of a given character is associated with a unique color, and internal edges with many descendant character states can be represented in a number of different ways to illustrate the diversity within each group. AVAILABILITY: Radié is freely available for download at http://kiwi.cs.dal.ca/~beiko/software-and-data/radie; source code is available upon request from the authors. Supplementary Material: Supplementary figures and user documentation (including the extended NEXUS format definition) are available at the journal website. CONTACT: beiko@cs.dal.ca.
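The notion of a convex mapping can be made concrete with a small Python check. This is a sketch under stated assumptions, not part of Radié: it represents a tree as nested tuples whose leaves are state labels (a hypothetical encoding), and it uses the standard characterization that a character is convex when the minimal subtrees connecting the leaves of each state share no node.

```python
from collections import Counter


def leaf_states(tree):
    """Yield the state label of every leaf in a nested-tuple tree."""
    if isinstance(tree, str):
        yield tree
    else:
        for child in tree:
            yield from leaf_states(child)


def is_convex(tree):
    """True if no internal node is needed to connect the leaves of two states."""
    total = Counter(leaf_states(tree))
    convex = True

    def visit(node):
        nonlocal convex
        if isinstance(node, str):
            return Counter([node])
        child_counts = [visit(child) for child in node]
        below = sum(child_counts, Counter())
        states_using_node = 0
        for state, n_below in below.items():
            # Count the branches around this node that contain the state.
            parts = sum(1 for c in child_counts if state in c)
            if total[state] > n_below:
                parts += 1  # the state also occurs outside this subtree
            if parts >= 2:
                states_using_node += 1
        if states_using_node > 1:
            convex = False
        return below

    visit(tree)
    return convex


print(is_convex((("red", "red"), ("blue", "blue"))))   # prints True
print(is_convex((("red", "blue"), ("red", "blue"))))   # prints False
```

The second tree is non-convex because both states must pass through the same internal nodes, the pattern expected after convergent evolution or lateral transfer.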
Laboratory for Neuroinformatics, RIKEN Brain Science Institute, Japan.
This article introduces a desktop application, named Concierge, for managing personal digital research resources. Using simple operations, it enables storage of various types of files and indexes them based on content descriptions. A key feature of the software is a high level of extensibility. By installing optional plug-ins, users can customize and extend the usability of the software based on their needs. In this paper, we also introduce a few optional plug-ins: literature management, electronic laboratory notebook, and XooNIps client plug-ins. XooNIps is a content management system developed to share digital research resources among neuroscience communities. It has been adopted as the standard database system in Japanese neuroinformatics projects. Concierge, therefore, offers comprehensive support from management of personal digital research resources to their sharing in open-access neuroinformatics databases such as XooNIps. This interaction between personal and open-access neuroinformatics databases is expected to enhance the dissemination of digital research resources. Concierge is developed as an open source project; Mac OS X and Windows XP versions have been released at the official site (http://concierge.sourceforge.jp).
State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Zhangjiang Hi-Tech Park, Pudong, Shanghai, China 201203.
SUMMARY: DrugViz is a Cytoscape plugin that is designed to visualize and analyze small molecules within the framework of the interactome. DrugViz can import drug-target network information in an extended SIF file format to Cytoscape and display the two-dimensional (2D) structures of small molecule nodes in a unified visualization environment. It can also identify small molecule nodes by means of three different 2D structure searching methods, namely isomorphism, substructure, and fingerprint-based similarity searches. After selection, users can also conduct a two-side clustering analysis on drugs and targets, which allows for a detailed analysis of the active compounds in the network and helps elucidate relationships between these drugs and targets. DrugViz represents a new tool for the analysis of data from chemogenomics, metabolomics and systems biology. AVAILABILITY: DrugViz and the data set used in Application are freely available for download at http://202.127.30.184:8080/software.html. CONTACT: JingKang Shen, Email: jkshen@mail.shcnc.ac.cn.
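A fingerprint-based similarity search compares molecules through sets of structural features. The abstract does not state which similarity measure DrugViz uses; the Python sketch below assumes the widely used Tanimoto coefficient, with fingerprints given as sets of "on" bit positions. The function name and the example fingerprints are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


# Hypothetical fingerprints for a query molecule and two candidates.
query = {1, 2, 3, 4}
candidates = {"mol_A": {3, 4, 5, 6}, "mol_B": {1, 2, 3, 9}}

# Rank candidates by decreasing similarity to the query.
ranked = sorted(candidates, key=lambda name: tanimoto(query, candidates[name]),
                reverse=True)
print(ranked)  # prints ['mol_B', 'mol_A']
```

Here mol_B shares three of five distinct bits with the query (0.6), while mol_A shares two of six (about 0.33), so mol_B ranks first.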
519 Wartik Laboratory, The Pennsylvania State University, University Park, Pennsylvania 16802, USA.
We have built a microarray database, StressDB, for management of microarray data from our studies on stress-modulated genes in Arabidopsis. StressDB provides small user groups with a locally installable web-based relational microarray database. It has a simple and intuitive architecture and has been designed for cDNA microarray technology users. StressDB uses Windows™ 2000 as the centralized database server with Oracle™ 8i as the relational database management system. It allows users to manage microarray data and data-related biological information over the Internet using a web browser. The source code is currently available on request from the authors and will soon be made freely available for downloading from our website at http://arastressdb.cac.psu.edu.
Department of Computer Science, Heriot-Watt University and MRC Human Genetics Unit, Edinburgh, UK.
MOTIVATION: Due to different experimental setups and various interpretations of results, the data contained in online bioinformatics resources can be inconsistent, making it more difficult for users of these resources to assess the suitability and correctness of the answers to their queries. This work investigates the role of argumentation systems to help users evaluate such answers. More specifically, it looks closely at a gene expression case study, creating an appropriate representation of the underlying data and a series of rules that are used by a third-party argumentation engine to reason over the query results provided by the mouse gene expression database EMAGE. RESULTS: A prototype using the ASPIC argumentation engine has been implemented and a preliminary evaluation carried out. This evaluation suggested that argumentation can be used to deal with inconsistent data in biological resources. AVAILABILITY: The ASPIC argumentation engine is available from http://www.argumentation.org. EMAGE gene expression data can be obtained from http://genex.hgu.mrc.ac.uk. The argumentation rules for the gene expression example are available from the lead author upon request. CONTACT: kcm1@hw.ac.uk.
Computer-Aided Drug Design (CADD) Group, Laboratory of Medicinal Chemistry, Center for Cancer Research, National Cancer Institute, National Institutes of Health, DHHS, Frederick, MD, USA.
New data, tools and services recently made available on the web server (http://cactus.nci.nih.gov) of the Computer-Aided Drug Design (CADD) Group, NCI, NIH, developed in the context of chemoinformatics and drug development work, are presented. These tools are designed for searching for structures in very large databases of small molecules. One of them is a web service, the Chemical Structure Lookup Service (CSLS), for very rapid structure lookup in an aggregated collection of more than 80 databases comprising more than 27 million unique structures at the time of this writing. CSLS contains pointers to the entries in toxicology-related databases, catalogues of commercially available samples, drugs, assay results data sets, and databases in several other categories. CSLS allows the user to find out very rapidly in which one(s) of all these databases a given structure occurs independent of the representation of the input structure, by making use of InChIs as well as new CACTVS hashcode-based identifiers. These latter, calculable, identifiers are designed to take into account tautomerism, different resonance structures drawn for charged species, and presence of additional fragments. They make possible fine-tunable yet rapid compound identification and database overlap analyses in very large compound collections.
2010-09-06 05:28:04 © BioInfoBank Institute