
Petabyte-scale innovations at the European Nucleotide Archive

Dramatic increases in the throughput of nucleotide sequencing machines, and the promise of ever greater performance, have thrust bioinformatics into the era of petabyte-scale data sets. Sequence repositories, which provide the feed for these data sets into the worldwide computational infrastructure,...


Bibliographic Details
Main authors: Cochrane, Guy, Akhtar, Ruth, Bonfield, James, Bower, Lawrence, Demiralp, Fehmi, Faruque, Nadeem, Gibson, Richard, Hoad, Gemma, Hubbard, Tim, Hunter, Christopher, Jang, Mikyung, Juhos, Szilveszter, Leinonen, Rasko, Leonard, Steven, Lin, Quan, Lopez, Rodrigo, Lorenc, Dariusz, McWilliam, Hamish, Mukherjee, Gaurab, Plaister, Sheila, Radhakrishnan, Rajesh, Robinson, Stephen, Sobhany, Siamak, Hoopen, Petra Ten, Vaughan, Robert, Zalunin, Vadim, Birney, Ewan
Format: Text
Language: English
Published: Oxford University Press 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2686451/
https://www.ncbi.nlm.nih.gov/pubmed/18978013
http://dx.doi.org/10.1093/nar/gkn765
_version_ 1782167410389811200
author Cochrane, Guy
Akhtar, Ruth
Bonfield, James
Bower, Lawrence
Demiralp, Fehmi
Faruque, Nadeem
Gibson, Richard
Hoad, Gemma
Hubbard, Tim
Hunter, Christopher
Jang, Mikyung
Juhos, Szilveszter
Leinonen, Rasko
Leonard, Steven
Lin, Quan
Lopez, Rodrigo
Lorenc, Dariusz
McWilliam, Hamish
Mukherjee, Gaurab
Plaister, Sheila
Radhakrishnan, Rajesh
Robinson, Stephen
Sobhany, Siamak
Hoopen, Petra Ten
Vaughan, Robert
Zalunin, Vadim
Birney, Ewan
author_sort Cochrane, Guy
collection PubMed
description Dramatic increases in the throughput of nucleotide sequencing machines, and the promise of ever greater performance, have thrust bioinformatics into the era of petabyte-scale data sets. Sequence repositories, which provide the feed for these data sets into the worldwide computational infrastructure, are challenged by the impact of these data volumes. The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/embl), comprising the EMBL Nucleotide Sequence Database and the Ensembl Trace Archive, has identified challenges in the storage, movement, analysis, interpretation and visualization of petabyte-scale data sets. We present here our new repository for next generation sequence data, a brief summary of contents of the ENA and provide details of major developments to submission pipelines, high-throughput rule-based validation infrastructure and data integration approaches.
format Text
id pubmed-2686451
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2686451 2009-05-26 Petabyte-scale innovations at the European Nucleotide Archive Cochrane, Guy Akhtar, Ruth Bonfield, James Bower, Lawrence Demiralp, Fehmi Faruque, Nadeem Gibson, Richard Hoad, Gemma Hubbard, Tim Hunter, Christopher Jang, Mikyung Juhos, Szilveszter Leinonen, Rasko Leonard, Steven Lin, Quan Lopez, Rodrigo Lorenc, Dariusz McWilliam, Hamish Mukherjee, Gaurab Plaister, Sheila Radhakrishnan, Rajesh Robinson, Stephen Sobhany, Siamak Hoopen, Petra Ten Vaughan, Robert Zalunin, Vadim Birney, Ewan Nucleic Acids Res Articles Dramatic increases in the throughput of nucleotide sequencing machines, and the promise of ever greater performance, have thrust bioinformatics into the era of petabyte-scale data sets. Sequence repositories, which provide the feed for these data sets into the worldwide computational infrastructure, are challenged by the impact of these data volumes. The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/embl), comprising the EMBL Nucleotide Sequence Database and the Ensembl Trace Archive, has identified challenges in the storage, movement, analysis, interpretation and visualization of petabyte-scale data sets. We present here our new repository for next generation sequence data, a brief summary of contents of the ENA and provide details of major developments to submission pipelines, high-throughput rule-based validation infrastructure and data integration approaches. Oxford University Press 2009-01 2008-10-31 /pmc/articles/PMC2686451/ /pubmed/18978013 http://dx.doi.org/10.1093/nar/gkn765 Text en © 2008 The Author(s) http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Petabyte-scale innovations at the European Nucleotide Archive
topic Articles