Casillas, S (Sònia)
Latest papers:
Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Bellaterra (Barcelona), Spain.
A growing number of haplotypic sequences from resequencing studies are accumulating for Drosophila in the main primary sequence databases, and collectively they can now be used to describe the general pattern of nucleotide variation across species and genes of this genus. The Drosophila Polymorphism Database (DPDB) is a secondary database that provides a collection of all well-annotated polymorphic sequences in Drosophila, together with their associated diversity measures and options for reanalysis of the data. These features greatly facilitate both multi-locus and multi-species diversity studies in one of the most important groups of model organisms. Here we describe the current state of DPDB and provide a step-by-step guide to all its searching and analytic capabilities. Finally, we illustrate its usefulness through selected examples. DPDB is freely available at http://dpdb.uab.cat.
Genomics, Bioinformatics and Evolution Group, Departament de Genètica i Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra (Barcelona), Spain.
The McDonald and Kreitman test (MKT) is one of the most powerful and extensively used tests to detect the signature of natural selection at the molecular level. Here, we present the standard and generalized MKT website, a novel website that allows users to perform MKTs not only for synonymous and nonsynonymous changes, as the test was initially described, but also for other classes of regions and/or several loci. The website has three different interfaces: (i) the standard MKT, where users can analyze several types of sites in a coding region, (ii) the advanced MKT, where users can compare two closely linked regions in the genome that can be either coding or noncoding, and (iii) the multi-locus MKT, where users can analyze many separate loci in a single multi-locus test. The website has already been used to show that selection efficiency is positively correlated with effective population size in the Drosophila genus, and it has been applied to include estimates of selection in DPDB. This website is a timely resource, which will presumably be widely used by researchers in the field and will help to enlarge the catalogue of cases of adaptive evolution. It is available at http://mkt.uab.es.
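As a rough illustration of the standard test described in this abstract, the sketch below compares counts of fixed differences (divergence) and polymorphisms at nonsynonymous and synonymous sites in a 2x2 table. It is a minimal sketch and not the website's implementation: the function names, the choice of Fisher's exact test for the 2x2 table, and the example counts are all assumptions made for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


def standard_mkt(dn, ds, pn, ps):
    """Standard MKT on counts of nonsynonymous/synonymous divergence (dn, ds)
    and nonsynonymous/synonymous polymorphism (pn, ps)."""
    if min(dn, ds, pn, ps) < 0:
        raise ValueError("counts must be non-negative")
    if dn == 0 or ps == 0:
        raise ValueError("neutrality index is undefined when dn or ps is zero")
    # Neutrality index: (Pn/Ps) / (Dn/Ds), written to avoid dividing by ds.
    neutrality_index = (pn * ds) / (ps * dn)
    return {
        "neutrality_index": neutrality_index,
        # Estimated proportion of adaptive substitutions (alpha = 1 - NI).
        "alpha": 1.0 - neutrality_index,
        "p_value": fisher_exact_two_sided(dn, ds, pn, ps),
    }


if __name__ == "__main__":
    # Illustrative counts only; they are not taken from the papers listed here.
    print(standard_mkt(dn=7, ds=17, pn=2, ps=42))
```

A neutrality index below 1 (positive alpha) points to an excess of nonsynonymous divergence, which is the pattern usually read as adaptive evolution; a value above 1 points to an excess of nonsynonymous polymorphism.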
Most cited papers:
The majority of metazoan genomes consist of non-protein coding regions; however, the functional significance of most noncoding DNA sequences remains unknown. Highly conserved noncoding sequences (CNSs) have proven to be reliable indicators of functionally constrained sequences such as cis-regulatory elements and noncoding RNA genes. However, CNSs may arise from non-selective evolutionary processes such as genomic regions with extremely low mutation rates known as mutation "cold spots." Here we combine comparative genomic data from recently completed insect genome projects with population genetic data in D. melanogaster, to test predictions of the mutational cold spot model of CNS evolution in the genus Drosophila. We find that point mutations in intronic and intergenic CNSs exhibit a significant reduction in levels of divergence relative to levels of polymorphism, as well as a significant excess of rare derived alleles, compared with either the non-conserved spacer regions between CNSs or with four-fold silent sites in coding regions. Controlling for the effects of purifying selection, we find no evidence of positive selection acting on Drosophila CNSs, although we do find evidence for the action of recurrent positive selection in the spacer regions between CNSs. We estimate that approximately 85% of sites in Drosophila CNSs are under constraint with selection coefficients (N(e)S) on the order of 10-100, and thus the estimated strength and number of sites under purifying selection is greater for Drosophila CNSs relative to those in the human genome. These patterns of non-neutral molecular evolution are incompatible with the mutational cold spot hypothesis to explain the existence of CNSs in Drosophila and, coupled with similar findings in mammals, argue against the general likelihood that CNSs are generated by mutational cold spots in any metazoan genome.
Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra (Barcelona), Spain.
Polymorphism studies are one of the main research areas of this genomic era. To date, however, no available web server or software package has been designed to automate the process of exploring and estimating nucleotide polymorphism in large DNA databases. Here, we introduce PDA (Pipeline Diversity Analysis), a novel software tool that can automatically (i) search for polymorphic sequences in large databases, and (ii) estimate their genetic diversity. PDA is a collection of modules, mainly written in Perl, which works sequentially as follows: unaligned sequences retrieved from a DNA database are automatically classified by organism and gene, and aligned using the ClustalW algorithm. Sequence sets are regrouped depending on their similarity scores. The main diversity parameters, including polymorphism, synonymous and non-synonymous substitutions, linkage disequilibrium and codon bias, are estimated both for the full length of the sequences and for specific functional regions. Program output includes a database with all sequences and estimations, and HTML pages with summary statistics, the performed alignments and a histogram maker tool. PDA is an essential tool to explore polymorphism in large DNA databases for sequences from different genes, populations or species. It has already been successfully applied to create a secondary database. PDA is available on the web at http://pda.uab.es/.
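To make the diversity-estimation step of such a pipeline concrete, the sketch below computes two common summary measures from an already aligned set of sequences: the number of segregating sites and nucleotide diversity (the average number of pairwise differences per site). It is only an illustration under stated assumptions. PDA itself is written mainly in Perl, and the function name, the rule of discarding alignment columns with gaps or ambiguous bases, and the toy alignment are choices made here, not details taken from the abstract.

```python
from itertools import combinations


def diversity_summary(aligned):
    """Segregating sites and nucleotide diversity for aligned DNA sequences."""
    if len(aligned) < 2:
        raise ValueError("at least two sequences are required")
    seqs = [s.upper() for s in aligned]
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # Assumption: keep only columns made entirely of unambiguous bases.
    columns = [col for col in zip(*seqs) if all(base in "ACGT" for base in col)]
    if not columns:
        raise ValueError("no analysable sites after removing gaps")

    # A site is segregating if more than one base is observed in its column.
    segregating = sum(1 for col in columns if len(set(col)) > 1)

    # Count differences over all sequence pairs, column by column.
    pairwise_diffs = sum(
        sum(1 for x, y in combinations(col, 2) if x != y) for col in columns
    )
    n = len(seqs)
    n_pairs = n * (n - 1) // 2
    return {
        "sites": len(columns),
        "segregating_sites": segregating,
        "pi": pairwise_diffs / n_pairs / len(columns),
    }


if __name__ == "__main__":
    # Toy alignment, invented for illustration.
    print(diversity_summary(["ACGT", "ACGA", "ACGT"]))
```

In a real pipeline the same calculation would be repeated per functional region (for example, exons or introns) by restricting the columns passed to the function.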
Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain.
Pipeline Diversity Analysis (PDA) is an open-source, web-based tool that allows the exploration of polymorphism in large datasets of heterogeneous DNA sequences, and can be used to create secondary polymorphism databases for different taxonomic groups, such as the Drosophila Polymorphism Database (DPDB). A new version of the pipeline presented here, PDA v.2, incorporates substantial improvements, including new methods for data mining and grouping sequences, new criteria for data quality assessment and a better user interface. PDA is a powerful tool to obtain and synthesize existing empirical evidence on genetic diversity in any species or species group. PDA v.2 is available on the web at http://pda.uab.es/.
Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra (Barcelona), Spain.
MOTIVATION: Polymorphism studies are one of the main research areas of this genomic era. To date, however, no comprehensive secondary databases have been designed to provide searchable collections of polymorphic sequences with their associated diversity measures. RESULTS: We define a data model for the storage, representation and analysis of genotypic and haplotypic data. Under this model we have created DPDB, the 'Drosophila Polymorphism Database', a web site that provides a daily updated repository of all well-annotated polymorphic sequences in the Drosophila genus. It allows users to search for any polymorphic set according to different parameter values of nucleotide diversity, linkage disequilibrium and codon bias. For data collection, analysis and updating we use PDA, a pipeline that automates the process of sequence retrieval, grouping, alignment and estimation of nucleotide diversity from GenBank sequences in different functional regions. The web site also includes analysis tools for sequence comparison and the estimation of genetic diversity, a page with real-time statistics of the database contents, a help section and a collection of selected links. AVAILABILITY: DPDB is freely available at http://dpdb.uab.es and can be downloaded via FTP. CONTACT: antonio.barbadilla@uab.es.
Genomics, Bioinformatics and Evolution Group, Departament de Genètica i Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra (Barcelona), Spain.
The McDonald and Kreitman test (MKT) is one of the most powerful and extensively used tests to detect the signature of natural selection at the molecular level. Here, we present the standard and generalized MKT website, a novel website that allows users to perform MKTs not only for synonymous and nonsynonymous changes, as the test was initially described, but also for other classes of regions and/or several loci. The website has three different interfaces: (i) the standard MKT, where users can analyze several types of sites in a coding region, (ii) the advanced MKT, where users can compare two closely linked regions in the genome that can be either coding or noncoding, and (iii) the multi-locus MKT, where users can analyze many separate loci in a single multi-locus test. The website has already been used to show that selection efficiency is positively correlated with effective population size in the Drosophila genus, and it has been applied to include estimates of selection in DPDB. This website is a timely resource, which will presumably be widely used by researchers in the field and will help to enlarge the catalogue of cases of adaptive evolution. It is available at http://mkt.uab.es.
Departament de Genètica i Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain.
Multi-locus and multi-species nucleotide diversity studies would benefit enormously from a public database encompassing high-quality haplotypic sequences with their associated genetic diversity measures. MamPol, the 'Mammalia Polymorphism Database', is a website containing all the well-annotated polymorphic sequences available in GenBank for the Mammalia class, grouped by name of organism and gene. Diversity measures of single nucleotide polymorphisms are provided for each set of haplotypic homologous sequences, including polymorphism at synonymous and non-synonymous sites, linkage disequilibrium and codon bias. Data gathering, calculation of diversity measures and daily updates are automatically performed using PDA software. The MamPol website includes several interfaces for browsing the contents of the database and making customizable comparative searches of different species or taxonomic groups. It also contains a set of tools for simple re-analysis of the available data and a statistics section that is updated daily and summarizes the contents of the database. MamPol is available at http://mampol.uab.es/ and can be downloaded via FTP.
Bárbara Negre,
Sònia Casillas,
Magali Suzanne,
Ernesto Sánchez-Herrero,
Michael Akam,
Michael Nefedov,
Antonio Barbadilla,
Pieter de Jong,
Alfredo Ruiz
Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain.
Homeotic (Hox) genes are usually clustered and arranged in the same order as they are expressed along the anteroposterior body axis of metazoans. The mechanistic explanation for this colinearity has been elusive, and it may well be that a single and universal cause does not exist. The Hox-gene complex (HOM-C) has been rearranged differently in several Drosophila species, producing a striking diversity of Hox gene organizations. We investigated the genomic and functional consequences of the two HOM-C splits present in Drosophila buzzatii. First, we sequenced two regions of the D. buzzatii genome, one containing the genes labial and abdominal A, and another including proboscipedia, and compared their organization with that of D. melanogaster and D. pseudoobscura in order to map the two splits precisely. Then, a plethora of conserved noncoding sequences, which are putative enhancers, were identified around the three Hox genes closest to the splits. The position and order of these enhancers are conserved, with minor exceptions, among the three Drosophila species. Finally, we analyzed the expression patterns of the same three genes in embryos and imaginal discs of four Drosophila species with different Hox-gene organizations. The results show that their expression patterns are conserved despite the HOM-C splits. We conclude that, in Drosophila, Hox-gene clustering is not an absolute requirement for proper function. Rather, the organization of Hox genes is modular, and their clustering seems to be the result of phylogenetic inertia more than functional necessity.
MeSH terms: Animals; Base Sequence; Chromosome Mapping; Chromosomes, Artificial, Bacterial; Comparative Study; Conserved Sequence :: genetics; Drosophila :: genetics; Drosophila Proteins :: genetics; Gene Components; Gene Expression; Genes, Homeobox :: genetics; Homeodomain Proteins :: genetics; Molecular Sequence Data; Nuclear Proteins :: genetics; Regulatory Sequences, Nucleic Acid :: genetics; Research Support, Non-U.S. Gov't; Sequence Analysis, DNA; Transcription Factors :: genetics
