
Value, but high costs in post-deposition data curation

Discoverability of sequence data in primary data archives is proportional to the richness of contextual information associated with the data. Here, we describe an exercise in the improvement of contextual information surrounding sample records associated with metagenomics sequence reads available in...


Bibliographic Details
Main authors: ten Hoopen, Petra, Amid, Clara, Luigi Buttigieg, Pier, Pafilis, Evangelos, Bravakos, Panos, Cerdeño-Tárraga, Ana M., Gibson, Richard, Kahlke, Tim, Legaki, Aglaia, Narayana Murthy, Kada, Papastefanou, Gabriella, Pereira, Emiliano, Rossello, Marc, Luisa Toribio, Ana, Cochrane, Guy
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4747322/
https://www.ncbi.nlm.nih.gov/pubmed/26861660
http://dx.doi.org/10.1093/database/bav126
author ten Hoopen, Petra
Amid, Clara
Luigi Buttigieg, Pier
Pafilis, Evangelos
Bravakos, Panos
Cerdeño-Tárraga, Ana M.
Gibson, Richard
Kahlke, Tim
Legaki, Aglaia
Narayana Murthy, Kada
Papastefanou, Gabriella
Pereira, Emiliano
Rossello, Marc
Luisa Toribio, Ana
Cochrane, Guy
author_sort ten Hoopen, Petra
collection PubMed
description Discoverability of sequence data in primary data archives is proportional to the richness of contextual information associated with the data. Here, we describe an exercise in the improvement of contextual information surrounding sample records associated with metagenomics sequence reads available in the European Nucleotide Archive. We outline the annotation process and summarize findings of this effort aimed at increasing usability of publicly available environmental data. Furthermore, we emphasize the benefits of such an exercise and detail its costs. We conclude that such a third party annotation approach is expensive and has value as an element of curation, but should form only part of a more sustainable submitter-driven approach. Database URL: http://www.ebi.ac.uk/ena
format Online
Article
Text
id pubmed-4747322
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-47473222016-02-10 Value, but high costs in post-deposition data curation ten Hoopen, Petra Amid, Clara Luigi Buttigieg, Pier Pafilis, Evangelos Bravakos, Panos Cerdeño-Tárraga, Ana M. Gibson, Richard Kahlke, Tim Legaki, Aglaia Narayana Murthy, Kada Papastefanou, Gabriella Pereira, Emiliano Rossello, Marc Luisa Toribio, Ana Cochrane, Guy Database (Oxford) Original Article Discoverability of sequence data in primary data archives is proportional to the richness of contextual information associated with the data. Here, we describe an exercise in the improvement of contextual information surrounding sample records associated with metagenomics sequence reads available in the European Nucleotide Archive. We outline the annotation process and summarize findings of this effort aimed at increasing usability of publicly available environmental data. Furthermore, we emphasize the benefits of such an exercise and detail its costs. We conclude that such a third party annotation approach is expensive and has value as an element of curation, but should form only part of a more sustainable submitter-driven approach. Database URL: http://www.ebi.ac.uk/ena Oxford University Press 2016-02-09 /pmc/articles/PMC4747322/ /pubmed/26861660 http://dx.doi.org/10.1093/database/bav126 Text en © The Author(s) 2016. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Value, but high costs in post-deposition data curation
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4747322/
https://www.ncbi.nlm.nih.gov/pubmed/26861660
http://dx.doi.org/10.1093/database/bav126