Latest Paper:
Hum Mutat. 2012 May 11;:
22581653
Medical Research Council Clinical Sciences Centre, Imperial College London, London W12 0NN, United Kingdom. j.ware@imperial.ac.uk.
Discriminating between rare benign and pathogenic variation is a key challenge in clinical genetics, particularly as increasing numbers of non-synonymous SNPs are identified in resequencing studies. Here, we describe an approach for the functional annotation of non-synonymous variants that identifies functionally important, disease-causing residues across protein families using multiple sequence alignment. We applied the methodology to long QT syndrome (LQT) genes, which cause sudden death, and their paralogues, which largely cause neurological disease, and accurately classified known LQT disease-causing variants (positive predictive value = 98.4%) with a better performance than established bioinformatic methods. The analysis also identified 1078 new putative disease loci, which we incorporated along with known variants into a comprehensive and freely accessible long QT resource (http://cardiodb.org/Paralogue_Annotation/), based on newly created Locus Reference Genomic sequences (http://www.lrg-sequence.org/). We propose that paralogous annotation is widely applicable for Mendelian human disease genes.
Nat Biotechnol. 2012;30 (4):365
22491292
Emek Demir,
Michael P Cary,
Suzanne Paley,
Ken Fukuda,
Christian Lemer,
Imre Vastrik,
Guanming Wu,
Peter D'Eustachio,
Carl Schaefer,
Joanne Luciano,
Frank Schacherer,
Irma Martinez-Flores,
Zhenjun Hu,
Veronica Jimenez-Jacinto,
Geeta Joshi-Tope,
Kumaran Kandasamy,
Alejandra C Lopez-Fuentes,
Huaiyu Mi,
Elgar Pichler,
Igor Rodchenkov,
Andrea Splendiani,
Sasha Tkachev,
Jeremy Zucker,
Gopal Gopinath,
Harsha Rajasimha,
Ranjani Ramakrishnan,
Imran Shah,
Mustafa Syed,
Nadia Anwar,
Ozgün Babur,
Michael Blinov,
Erik Brauner,
Dan Corwin,
Sylva Donaldson,
Frank Gibbons,
Robert Goldberg,
Peter Hornbeck,
Augustin Luna,
Peter Murray-Rust,
Eric Neumann,
Oliver Reubenacker,
Matthias Samwald,
Martijn van Iersel,
Sarala Wimalaratne,
Keith Allen,
Burk Braun,
Michelle Whirl-Carrillo,
Kei-Hoi Cheung,
Kam Dahlquist,
Andrew Finney,
Marc Gillespie,
Elizabeth Glass,
Li Gong,
Robin Haw,
Michael Honig,
Olivier Hubaut,
David Kane,
Shiva Krupa,
Martina Kutmon,
Julie Leonard,
Debbie Marks,
David Merberg,
Victoria Petri,
Alex Pico,
Dean Ravenscroft,
Liya Ren,
Nigam Shah,
Margot Sunshine,
Rebecca Tang,
Ryan Whaley,
Stan Letovsky,
Kenneth H Buetow,
Andrey Rzhetsky,
Vincent Schachter,
Bruno S Sobral,
Ugur Dogrusoz,
Shannon McWeeney,
Mirit Aladjem,
Ewan Birney,
Julio Collado-Vides,
Susumu Goto,
Michael Hucka,
Nicolas Le Novère,
Natalia Maltsev,
Akhilesh Pandey,
Paul Thomas,
Edgar Wingender,
Peter D Karp,
Chris Sander,
Gary D Bader
Nature. 2012 Apr 5;484 (7392):55-61
22481358
Felicity C Jones,
Manfred G Grabherr,
Yingguang Frank Chan,
Pamela Russell,
Evan Mauceli,
Jeremy Johnson,
Ross Swofford,
Mono Pirun,
Michael C Zody,
Simon White,
Ewan Birney,
Stephen Searle,
Jeremy Schmutz,
Jane Grimwood,
Mark C Dickson,
Richard M Myers,
Craig T Miller,
Brian R Summers,
Anne K Knecht,
Shannon D Brady,
Haili Zhang,
Alex A Pollen,
Timothy Howes,
Chris Amemiya,
Jen Baldwin,
Toby Bloom,
David B Jaffe,
Robert Nicol,
Jane Wilkinson,
Eric S Lander,
Federica Di Palma,
Kerstin Lindblad-Toh,
David M Kingsley
Department of Developmental Biology, Beckman Center B300, Stanford University School of Medicine, Stanford, California 94305, USA.
Marine stickleback fish have colonized and adapted to thousands of streams and lakes formed since the last ice age, providing an exceptional opportunity to characterize genomic mechanisms underlying repeated ecological adaptation in nature. Here we develop a high-quality reference genome assembly for threespine sticklebacks. By sequencing the genomes of twenty additional individuals from a global set of marine and freshwater populations, we identify a genome-wide set of loci that are consistently associated with marine-freshwater divergence. Our results indicate that reuse of globally shared standing genetic variation, including chromosomal inversions, has an important role in repeated evolution of distinct marine and freshwater sticklebacks, and in the maintenance of divergent ecotypes during early stages of reproductive isolation. Both coding and regulatory changes occur in the set of loci underlying marine-freshwater evolution, but regulatory changes appear to predominate in this well known example of repeated adaptive evolution in nature.
Bioinformatics. 2012 Feb 24;:
22368243
Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany.
MOTIVATION: High-throughput sequencing has made the analysis of new model organisms more affordable. Although assembling a new genome can still be costly and difficult, it is possible to use RNA-seq to sequence mRNA. In the absence of a known genome, it is necessary to assemble these sequences de novo, taking into account possible alternative isoforms and the dynamic range of expression values. RESULTS: We present a software package named Oases designed to heuristically assemble RNA-seq reads in the absence of a reference genome, across a broad spectrum of expression values and in presence of alternative isoforms. It achieves this by using an array of hash lengths, a dynamic filtering of noise, a robust resolution of alternative splicing events, and the efficient merging of multiple assemblies. It was tested on human and mouse RNA-seq data and is shown to improve significantly on the transABySS and Trinity de novo transcriptome assemblers. AVAILABILITY: Oases is freely available under the GPL license at www.ebi.ac.uk/~zerbino/oases/ CONTACT: dzerbino@ucsc.edu SUPPLEMENTARY INFORMATION: Supplementary information is available at Bioinformatics online.
Cell. 2012 Feb 3;148 (3):473-86
22304916
Guillaume Junion,
Mikhail Spivakov,
Charles Girardot,
Martina Braun,
E Hilary Gustafson,
Ewan Birney,
Eileen E M Furlong
Genome Biology Unit, European Molecular Biology Laboratory, D-69117 Heidelberg, Germany.
Cell fate decisions are driven through the integration of inductive signals and tissue-specific transcription factors (TFs), although the details on how this information converges in cis remain unclear. Here, we demonstrate that the five genetic components essential for cardiac specification in Drosophila, including the effectors of Wg and Dpp signaling, act as a collective unit to cooperatively regulate heart enhancer activity, both in vivo and in vitro. Their combinatorial binding does not require any specific motif orientation or spacing, suggesting an alternative mode of enhancer function whereby cooperative activity occurs with extensive motif flexibility. A fraction of enhancers co-occupied by cardiogenic TFs had unexpected activity in the neighboring visceral mesoderm but could be rendered active in heart through single-site mutations. Given that cardiac and visceral cells are both derived from the dorsal mesoderm, this "dormant" TF binding signature may represent a molecular footprint of these cells' developmental lineage.
Genome Res. 2012 Jan;22 (1):9-24
22090374
Bum-Kyu Lee,
Akshay A Bhinge,
Anna Battenhouse,
Ryan M McDaniell,
Zheng Liu,
Lingyun Song,
Yunyun Ni,
Ewan Birney,
Jason D Lieb,
Terrence S Furey,
Gregory E Crawford,
Vishwanath R Iyer
Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, Section of Molecular Genetics and Microbiology, University of Texas at Austin, Austin, Texas 78712, USA.
Cell-type diversity is governed in part by differential gene expression programs mediated by transcription factor (TF) binding. However, there are few systematic studies of the genomic binding of different types of TFs across a wide range of human cell types, especially in relation to gene expression. In the ENCODE Project, we have identified the genomic binding locations across 11 different human cell types of CTCF, RNA Pol II (RNAPII), and MYC, three TFs with diverse roles. Our data and analysis revealed how these factors bind in relation to genomic features and shape gene expression and cell-type specificity. CTCF bound predominantly in intergenic regions while RNAPII and MYC preferentially bound to core promoter regions. CTCF sites were relatively invariant across diverse cell types, while MYC showed the greatest cell-type specificity. MYC and RNAPII co-localized at many of their binding sites and putative target genes. Cell-type specific binding sites, in particular for MYC and RNAPII, were associated with cell-type specific functions. Patterns of binding in relation to gene features were generally conserved across different cell types. RNAPII occupancy was higher over exons than adjacent introns, likely reflecting a link between transcriptional elongation and splicing. TF binding was positively correlated with the expression levels of their putative target genes, but combinatorial binding, in particular of MYC and RNAPII, was even more strongly associated with higher gene expression. These data illuminate how combinatorial binding of transcription factors in diverse cell types is associated with gene expression and cell-type specific biology.
Paul Flicek,
M Ridwan Amode,
Daniel Barrell,
Kathryn Beal,
Simon Brent,
Denise Carvalho-Silva,
Peter Clapham,
Guy Coates,
Susan Fairley,
Stephen Fitzgerald,
Laurent Gil,
Leo Gordon,
Maurice Hendrix,
Thibaut Hourlier,
Nathan Johnson,
Andreas K Kähäri,
Damian Keefe,
Stephen Keenan,
Rhoda Kinsella,
Monika Komorowska,
Gautier Koscielny,
Eugene Kulesha,
Pontus Larsson,
Ian Longden,
William McLaren,
Matthieu Muffato,
Bert Overduin,
Miguel Pignatelli,
Bethan Pritchard,
Harpreet Singh Riat,
Graham R S Ritchie,
Magali Ruffier,
Michael Schuster,
Daniel Sobral,
Y Amy Tang,
Kieron Taylor,
Stephen Trevanion,
Jana Vandrovcova,
Simon White,
Mark Wilson,
Steven P Wilder,
Bronwen L Aken,
Ewan Birney,
Fiona Cunningham,
Ian Dunham,
Richard Durbin,
Xosé M Fernández-Suarez,
Jennifer Harrow,
Javier Herrero,
Tim J P Hubbard,
Anne Parker,
Glenn Proctor,
Giulietta Spudich,
Jan Vogel,
Andy Yates,
Amonida Zadissa,
Stephen M J Searle
European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK. flicek@ebi.ac.uk
The Ensembl project (http://www.ensembl.org) provides genome resources for chordate genomes with a particular focus on human genome data as well as data for key model organisms such as mouse, rat and zebrafish. Five additional species were added in the last year including gibbon (Nomascus leucogenys) and Tasmanian devil (Sarcophilus harrisii) bringing the total number of supported species to 61 as of Ensembl release 64 (September 2011). Of these, 55 species appear on the main Ensembl website and six species are provided on the Ensembl preview site (Pre!Ensembl; http://pre.ensembl.org) with preliminary support. The past year has also seen improvements across the project.
Clara Amid,
Ewan Birney,
Lawrence Bower,
Ana Cerdeño-Tárraga,
Ying Cheng,
Iain Cleland,
Nadeem Faruque,
Richard Gibson,
Neil Goodgame,
Christopher Hunter,
Mikyung Jang,
Rasko Leinonen,
Xin Liu,
Arnaud Oisel,
Nima Pakseresht,
Sheila Plaister,
Rajesh Radhakrishnan,
Kethi Reddy,
Stephane Rivière,
Marc Rossello,
Alexander Senf,
Dimitriy Smirnov,
Petra Ten Hoopen,
Daniel Vaughan,
Robert Vaughan,
Vadim Zalunin,
Guy Cochrane
European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK. amid@ebi.ac.uk
The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena), Europe's primary nucleotide sequence resource, captures and presents globally comprehensive nucleic acid sequence and associated information. Covering the spectrum from raw data to assembled and functionally annotated genomes, the ENA has witnessed a dramatic growth resulting from advances in sequencing technology and ever broadening application of the methodology. During 2011, we have continued to operate and extend the broad range of ENA services. In particular, we have released major new functionality in our interactive web submission system, Webin, through developments in template-based submissions for annotated sequences and support for raw next-generation sequence read submissions.
Paul J Kersey,
Daniel M Staines,
Daniel Lawson,
Eugene Kulesha,
Paul Derwent,
Jay C Humphrey,
Daniel S T Hughes,
Stephan Keenan,
Arnaud Kerhornou,
Gautier Koscielny,
Nicholas Langridge,
Mark D McDowall,
Karine Megy,
Uma Maheswari,
Michael Nuhn,
Michael Paulini,
Helder Pedro,
Iliana Toneva,
Derek Wilson,
Andrew Yates,
Ewan Birney
Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK. pkersey@ebi.ac.uk
Ensembl Genomes (http://www.ensemblgenomes.org) is an integrative resource for genome-scale data from non-vertebrate species. The project exploits and extends technology (for genome annotation, analysis and dissemination) developed in the context of the (vertebrate-focused) Ensembl project and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. Since its launch in 2009, Ensembl Genomes has undergone rapid expansion, with the goal of providing coverage of all major experimental organisms, and additionally including taxonomic reference points to provide the evolutionary context in which genes can be understood. Against the backdrop of a continuing increase in genome sequencing activities in all parts of the tree of life, we seek to work, wherever possible, with the communities actively generating and using data, and are participants in a growing range of collaborations involved in the annotation and analysis of genomes.
Nature. 2011 Oct 12;:
21993624
Kerstin Lindblad-Toh,
Manuel Garber,
Or Zuk,
Michael F Lin,
Brian J Parker,
Stefan Washietl,
Pouya Kheradpour,
Jason Ernst,
Gregory Jordan,
Evan Mauceli,
Lucas D Ward,
Craig B Lowe,
Alisha K Holloway,
Michele Clamp,
Sante Gnerre,
Jessica Alföldi,
Kathryn Beal,
Jean Chang,
Hiram Clawson,
James Cuff,
Federica Di Palma,
Stephen Fitzgerald,
Paul Flicek,
Mitchell Guttman,
Melissa J Hubisz,
David B Jaffe,
Irwin Jungreis,
W James Kent,
Dennis Kostka,
Marcia Lara,
Andre L Martins,
Tim Massingham,
Ida Moltke,
Brian J Raney,
Matthew D Rasmussen,
Jim Robinson,
Alexander Stark,
Albert J Vilella,
Jiayu Wen,
Xiaohui Xie,
Michael C Zody,
Jen Baldwin,
Toby Bloom,
Chee Whye Chin,
Dave Heiman,
Robert Nicol,
Chad Nusbaum,
Sarah Young,
Jane Wilkinson,
Kim C Worley,
Christie L Kovar,
Donna M Muzny,
Richard A Gibbs,
Andrew Cree,
Huyen H Dihn,
Gerald Fowler,
Shalili Jhangiani,
Vandita Joshi,
Sandra Lee,
Lora R Lewis,
Lynne V Nazareth,
Geoffrey Okwuonu,
Jireh Santibanez,
Wesley C Warren,
Elaine R Mardis,
George M Weinstock,
Richard K Wilson,
Kim Delehaunty,
David Dooling,
Catrina Fronik,
Lucinda Fulton,
Bob Fulton,
Tina Graves,
Patrick Minx,
Erica Sodergren,
Ewan Birney,
Elliott H Margulies,
Javier Herrero,
Eric D Green,
David Haussler,
Adam Siepel,
Nick Goldman,
Katherine S Pollard,
Jakob S Pedersen,
Eric S Lander,
Manolis Kellis
[1] Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), 7 Cambridge Center, Cambridge, Massachusetts 02142, USA [2] Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Box 582, SE-751 23 Uppsala, Sweden.
The comparison of related genomes has emerged as a powerful lens for genome interpretation. Here we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and locate constrained elements covering ∼4.2% of the genome. We use evolutionary signatures and comparisons with experimental data sets to suggest candidate functions for ∼60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events and over 10,000 regions of overlapping synonymous constraint within protein-coding exons. We find 220 candidate RNA structural families, and nearly a million elements overlapping potential promoter, enhancer and insulator regions. We report specific amino acid residues that have undergone positive selection, 280,000 non-coding elements exapted from mobile elements and more than 1,000 primate- and human-accelerated elements. Overlap with disease-associated variants indicates that our findings will be relevant for studies of human biology, health and disease.