Kulikova, T (Tamara)
Latest papers:
Nucleotide and protein sequence databases are major resources for biological and medical research. This chapter introduces the European Molecular Biology Laboratory (EMBL) Nucleotide Sequence Database, a comprehensive primary data archive for nucleic acid sequences, and Genome Reviews, a secondary database that provides an up-to-date, standardized and comprehensively annotated view of the genomic sequence of selected organisms with completely deciphered genomes. Focusing on plant nucleotide sequences, we demonstrate how these data are accessed, how sequence similarity searches are performed and how we can obtain a wealth of additional information relating to genome sequences using Integr8.
Guy Cochrane,
Ruth Akhtar,
Philippe Aldebert,
Nicola Althorpe,
Alastair Baldwin,
Kirsty Bates,
Sumit Bhattacharyya,
James Bonfield,
Lawrence Bower,
Paul Browne,
Matias Castro,
Tony Cox,
Fehmi Demiralp,
Ruth Eberhardt,
Nadeem Faruque,
Gemma Hoad,
Mikyung Jang,
Tamara Kulikova,
Alberto Labarga,
Rasko Leinonen,
Steven Leonard,
Quan Lin,
Rodrigo Lopez,
Dariusz Lorenc,
Hamish McWilliam,
Gaurab Mukherjee,
Francesco Nardone,
Sheila Plaister,
Stephen Robinson,
Siamak Sobhany,
Robert Vaughan,
Dan Wu,
Weimin Zhu,
Rolf Apweiler,
Tim Hubbard,
Ewan Birney
The Ensembl Trace Archive (http://trace.ensembl.org/) and the EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/), known together as the European Nucleotide Archive, continue to see growth in data volume and diversity. Selected major developments of 2007 are presented briefly, along with data submission and retrieval information. In the face of increasing requirements for nucleotide trace, sequence and annotation data archiving, data capture priority decisions have been taken at the European Nucleotide Archive. Priorities are discussed in terms of how reliably information can be captured, the long-term benefits of its capture and the ease with which it can be captured.
Most cited papers:
Guenter Stoesser,
Wendy Baker,
Alexandra van den Broek,
Maria Garcia-Pastor,
Carola Kanz,
Tamara Kulikova,
Rasko Leinonen,
Quan Lin,
Vincent Lombard,
Rodrigo Lopez,
Renato Mancuso,
Francesco Nardone,
Peter Stoehr,
Mary Ann Tuli,
Katerina Tzouvara,
Robert Vaughan
EMBL Outstation, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK. stoesser@ebi.ac.uk
The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) incorporates, organizes and distributes nucleotide sequences from all available public sources. The database is located and maintained at the European Bioinformatics Institute (EBI) near Cambridge, UK. In an international collaboration with DDBJ (Japan) and GenBank (USA), data are exchanged amongst the collaborating databases on a daily basis to achieve optimal synchronization. Webin is the preferred web-based submission system for individual submitters, while automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via FTP, Email and World Wide Web interfaces. EBI's Sequence Retrieval System (SRS) integrates and links the main nucleotide and protein databases plus many other specialized molecular biology databases. For sequence similarity searching, a variety of tools (e.g. Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT. All resources can be accessed via the EBI home page at http://www.ebi.ac.uk.
Paul Kersey,
Lawrence Bower,
Lorna Morris,
Alan Horne,
Robert Petryszak,
Carola Kanz,
Alexander Kanapin,
Ujjwal Das,
Karine Michoud,
Isabelle Phan,
Alexandre Gattiker,
Tamara Kulikova,
Nadeem Faruque,
Karyn Duggan,
Peter Mclaren,
Britt Reimholz,
Laurent Duret,
Simon Penel,
Ingmar Reuter,
Rolf Apweiler
The EMBL Outstation-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK. pkersey@ebi.ac.uk
Integr8 is a new web portal for exploring the biology of organisms with completely deciphered genomes. For over 190 species, Integr8 provides access to general information, recent publications, and a detailed statistical overview of the genome and proteome of the organism. The preparation of this analysis is supported through Genome Reviews, a new database of bacterial and archaeal DNA sequences in which annotation has been upgraded (compared to the original submission) through the integration of data from many sources, including the EMBL Nucleotide Sequence Database, the UniProt Knowledgebase, InterPro, CluSTr, GOA and HOGENOM. Integr8 also allows the users to customize their own interactive analysis, and to download both customized and prepared datasets for their own use. Integr8 is available at http://www.ebi.ac.uk/integr8.
Guenter Stoesser,
Wendy Baker,
Alexandra van den Broek,
Evelyn Camon,
Maria Garcia-Pastor,
Carola Kanz,
Tamara Kulikova,
Rasko Leinonen,
Quan Lin,
Vincent Lombard,
Rodrigo Lopez,
Nicole Redaschi,
Peter Stoehr,
Mary Ann Tuli,
Katerina Tzouvara,
Robert Vaughan
EMBL Outstation, The European Bioinformatics Institute (EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
The EMBL Nucleotide Sequence Database (aka EMBL-Bank; http://www.ebi.ac.uk/embl/) incorporates, organises and distributes nucleotide sequences from all available public sources. EMBL-Bank is located and maintained at the European Bioinformatics Institute (EBI) near Cambridge, UK. In an international collaboration with DDBJ (Japan) and GenBank (USA), data are exchanged amongst the collaborating databases on a daily basis. Major contributors to the EMBL database are individual scientists and genome project groups. Webin is the preferred web-based submission system for individual submitters, whilst automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via FTP, email and World Wide Web interfaces. EBI's Sequence Retrieval System (SRS), a network browser for databanks in molecular biology, integrates and links the main nucleotide and protein databases plus many other specialized databases. For sequence similarity searching, a variety of tools (e.g. Blitz, Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT. All resources can be accessed via the EBI home page at http://www.ebi.ac.uk.
MeSH terms: Animals; Base Sequence; Confidentiality; Data Collection; Database Management Systems; Databases, Nucleic Acid; Databases, Protein; Europe; Expressed Sequence Tags; Genome; Genome, Human; Humans; Information Storage and Retrieval; Internet; Patents; Sequence Alignment; Sequence Analysis; Systems Integration
Tamara Kulikova,
Philippe Aldebert,
Nicola Althorpe,
Wendy Baker,
Kirsty Bates,
Paul Browne,
Alexandra van den Broek,
Guy Cochrane,
Karyn Duggan,
Ruth Eberhardt,
Nadeem Faruque,
Maria Garcia-Pastor,
Nicola Harte,
Carola Kanz,
Rasko Leinonen,
Quan Lin,
Vincent Lombard,
Rodrigo Lopez,
Renato Mancuso,
Michelle McHale,
Francesco Nardone,
Ville Silventoinen,
Peter Stoehr,
Guenter Stoesser,
Mary Ann Tuli,
Katerina Tzouvara,
Robert Vaughan,
Dan Wu,
Weimin Zhu,
Rolf Apweiler
EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/), maintained at the European Bioinformatics Institute (EBI), incorporates, organizes and distributes nucleotide sequences from public sources. The database is a part of an international collaboration with DDBJ (Japan) and GenBank (USA). Data are exchanged between the collaborating databases on a daily basis to achieve optimal synchrony. The web-based tool, Webin, is the preferred system for individual submission of nucleotide sequences, including Third Party Annotation (TPA) and alignment data. Automatic submission procedures are used for submission of data from large-scale genome sequencing centres and from the European Patent Office. Database releases are produced quarterly. The latest data collection can be accessed via FTP, email and WWW interfaces. The EBI's Sequence Retrieval System (SRS) integrates and links the main nucleotide and protein databases as well as many other specialist molecular biology databases. For sequence similarity searching, a variety of tools (e.g. FASTA and BLAST) are available that allow external users to compare their own sequences against the data in the EMBL Nucleotide Sequence Database, the complete genomic component subsection of the database, the WGS data sets and other databases. All available resources can be accessed via the EBI home page at http://www.ebi.ac.uk.
Carola Kanz,
Philippe Aldebert,
Nicola Althorpe,
Wendy Baker,
Alastair Baldwin,
Kirsty Bates,
Paul Browne,
Alexandra van den Broek,
Matias Castro,
Guy Cochrane,
Karyn Duggan,
Ruth Eberhardt,
Nadeem Faruque,
John Gamble,
Federico Garcia Diez,
Nicola Harte,
Tamara Kulikova,
Quan Lin,
Vincent Lombard,
Rodrigo Lopez,
Renato Mancuso,
Michelle McHale,
Francesco Nardone,
Ville Silventoinen,
Siamak Sobhany,
Peter Stoehr,
Mary Ann Tuli,
Katerina Tzouvara,
Robert Vaughan,
Dan Wu,
Weimin Zhu,
Rolf Apweiler
EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK. ckanz@ebi.ac.uk
The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl), maintained at the European Bioinformatics Institute (EBI) near Cambridge, UK, is a comprehensive collection of nucleotide sequences and annotation from available public sources. The database is part of an international collaboration with DDBJ (Japan) and GenBank (USA). Data are exchanged daily between the collaborating institutes to achieve swift synchrony. Webin is the preferred tool for individual submissions of nucleotide sequences, including Third Party Annotation (TPA) and alignments. Automated procedures are provided for submissions from large-scale sequencing projects and data from the European Patent Office. New and updated data records are distributed daily and the whole EMBL Nucleotide Sequence Database is released four times a year. Access to the sequence data is provided via ftp and several WWW interfaces. With the web-based Sequence Retrieval System (SRS) it is also possible to link nucleotide data to other specialist molecular biology databases maintained at the EBI. Other tools are available for sequence similarity searching (e.g. FASTA and BLAST). Changes over the past year include the removal of the sequence length limit, the launch of the EMBLCDSs dataset, extension of the Sequence Version Archive functionality and the revision of quality rules for TPA data.
Tamara Kulikova,
Ruth Akhtar,
Philippe Aldebert,
Nicola Althorpe,
Mikael Andersson,
Alastair Baldwin,
Kirsty Bates,
Sumit Bhattacharyya,
Lawrence Bower,
Paul Browne,
Matias Castro,
Guy Cochrane,
Karyn Duggan,
Ruth Eberhardt,
Nadeem Faruque,
Gemma Hoad,
Carola Kanz,
Charles Lee,
Rasko Leinonen,
Quan Lin,
Vincent Lombard,
Rodrigo Lopez,
Dariusz Lorenc,
Hamish McWilliam,
Gaurab Mukherjee,
Francesco Nardone,
Maria Pilar Garcia Pastor,
Sheila Plaister,
Siamak Sobhany,
Peter Stoehr,
Robert Vaughan,
Dan Wu,
Weimin Zhu,
Rolf Apweiler
EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl) at the EMBL European Bioinformatics Institute, UK, offers a large and freely accessible collection of nucleotide sequences and accompanying annotation. The database is maintained in collaboration with DDBJ and GenBank. Data are exchanged between the collaborating databases on a daily basis to achieve optimal synchrony. Webin is the preferred tool for individual submissions of nucleotide sequences, including Third Party Annotation, alignments and bulk data. Automated procedures are provided for submissions from large-scale sequencing projects and data from the European Patent Office. In 2006, the volume of data continued to grow exponentially. Access to the data is provided via SRS, ftp and a variety of other methods. Extensive external and internal cross-references enable users to search for related information across other databases and within the database. All available resources can be accessed via the EBI home page at http://www.ebi.ac.uk/. Changes over the past year include changes to the file format, further development of the EMBLCDS dataset and developments to the XML format.
Guy Cochrane,
Ruth Akhtar,
Philippe Aldebert,
Nicola Althorpe,
Alastair Baldwin,
Kirsty Bates,
Sumit Bhattacharyya,
James Bonfield,
Lawrence Bower,
Paul Browne,
Matias Castro,
Tony Cox,
Fehmi Demiralp,
Ruth Eberhardt,
Nadeem Faruque,
Gemma Hoad,
Mikyung Jang,
Tamara Kulikova,
Alberto Labarga,
Rasko Leinonen,
Steven Leonard,
Quan Lin,
Rodrigo Lopez,
Dariusz Lorenc,
Hamish McWilliam,
Gaurab Mukherjee,
Francesco Nardone,
Sheila Plaister,
Stephen Robinson,
Siamak Sobhany,
Robert Vaughan,
Dan Wu,
Weimin Zhu,
Rolf Apweiler,
Tim Hubbard,
Ewan Birney
The Ensembl Trace Archive (http://trace.ensembl.org/) and the EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/), known together as the European Nucleotide Archive, continue to see growth in data volume and diversity. Selected major developments of 2007 are presented briefly, along with data submission and retrieval information. In the face of increasing requirements for nucleotide trace, sequence and annotation data archiving, data capture priority decisions have been taken at the European Nucleotide Archive. Priorities are discussed in terms of how reliably information can be captured, the long-term benefits of its capture and the ease with which it can be captured.
Nucleotide and protein sequence databases are major resources for biological and medical research. This chapter introduces the European Molecular Biology Laboratory (EMBL) Nucleotide Sequence Database, a comprehensive primary data archive for nucleic acid sequences, and Genome Reviews, a secondary database that provides an up-to-date, standardized and comprehensively annotated view of the genomic sequence of selected organisms with completely deciphered genomes. Focusing on plant nucleotide sequences, we demonstrate how these data are accessed, how sequence similarity searches are performed and how we can obtain a wealth of additional information relating to genome sequences using Integr8.
