Value, but high costs in post-deposition data curation
Discoverability of sequence data in primary data archives is proportional to the richness of contextual information associated with the data. Here, we describe an exercise in the improvement of contextual information surrounding sample records associated with metagenomics sequence reads available in...
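Main Authors: | ten Hoopen, Petra; Amid, Clara; Luigi Buttigieg, Pier; Pafilis, Evangelos; Bravakos, Panos; Cerdeño-Tárraga, Ana M.; Gibson, Richard; Kahlke, Tim; Legaki, Aglaia; Narayana Murthy, Kada; Papastefanou, Gabriella; Pereira, Emiliano; Rossello, Marc; Luisa Toribio, Ana; Cochrane, Guy |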
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2016 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4747322/ https://www.ncbi.nlm.nih.gov/pubmed/26861660 http://dx.doi.org/10.1093/database/bav126 |
_version_ | 1782414958509686784 |
---|---|
author | ten Hoopen, Petra; Amid, Clara; Luigi Buttigieg, Pier; Pafilis, Evangelos; Bravakos, Panos; Cerdeño-Tárraga, Ana M.; Gibson, Richard; Kahlke, Tim; Legaki, Aglaia; Narayana Murthy, Kada; Papastefanou, Gabriella; Pereira, Emiliano; Rossello, Marc; Luisa Toribio, Ana; Cochrane, Guy |
author_facet | ten Hoopen, Petra; Amid, Clara; Luigi Buttigieg, Pier; Pafilis, Evangelos; Bravakos, Panos; Cerdeño-Tárraga, Ana M.; Gibson, Richard; Kahlke, Tim; Legaki, Aglaia; Narayana Murthy, Kada; Papastefanou, Gabriella; Pereira, Emiliano; Rossello, Marc; Luisa Toribio, Ana; Cochrane, Guy |
author_sort | ten Hoopen, Petra |
collection | PubMed |
description | Discoverability of sequence data in primary data archives is proportional to the richness of contextual information associated with the data. Here, we describe an exercise in the improvement of contextual information surrounding sample records associated with metagenomics sequence reads available in the European Nucleotide Archive. We outline the annotation process and summarize findings of this effort aimed at increasing usability of publicly available environmental data. Furthermore, we emphasize the benefits of such an exercise and detail its costs. We conclude that such a third party annotation approach is expensive and has value as an element of curation, but should form only part of a more sustainable submitter-driven approach. Database URL: http://www.ebi.ac.uk/ena |
format | Online Article Text |
id | pubmed-4747322 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-47473222016-02-10 Value, but high costs in post-deposition data curation ten Hoopen, Petra Amid, Clara Luigi Buttigieg, Pier Pafilis, Evangelos Bravakos, Panos Cerdeño-Tárraga, Ana M. Gibson, Richard Kahlke, Tim Legaki, Aglaia Narayana Murthy, Kada Papastefanou, Gabriella Pereira, Emiliano Rossello, Marc Luisa Toribio, Ana Cochrane, Guy Database (Oxford) Original Article Discoverability of sequence data in primary data archives is proportional to the richness of contextual information associated with the data. Here, we describe an exercise in the improvement of contextual information surrounding sample records associated with metagenomics sequence reads available in the European Nucleotide Archive. We outline the annotation process and summarize findings of this effort aimed at increasing usability of publicly available environmental data. Furthermore, we emphasize the benefits of such an exercise and detail its costs. We conclude that such a third party annotation approach is expensive and has value as an element of curation, but should form only part of a more sustainable submitter-driven approach. Database URL: http://www.ebi.ac.uk/ena Oxford University Press 2016-02-09 /pmc/articles/PMC4747322/ /pubmed/26861660 http://dx.doi.org/10.1093/database/bav126 Text en © The Author(s) 2016. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Article ten Hoopen, Petra Amid, Clara Luigi Buttigieg, Pier Pafilis, Evangelos Bravakos, Panos Cerdeño-Tárraga, Ana M. Gibson, Richard Kahlke, Tim Legaki, Aglaia Narayana Murthy, Kada Papastefanou, Gabriella Pereira, Emiliano Rossello, Marc Luisa Toribio, Ana Cochrane, Guy Value, but high costs in post-deposition data curation |
title | Value, but high costs in post-deposition data curation |
title_full | Value, but high costs in post-deposition data curation |
title_fullStr | Value, but high costs in post-deposition data curation |
title_full_unstemmed | Value, but high costs in post-deposition data curation |
title_short | Value, but high costs in post-deposition data curation |
title_sort | value, but high costs in post-deposition data curation |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4747322/ https://www.ncbi.nlm.nih.gov/pubmed/26861660 http://dx.doi.org/10.1093/database/bav126 |
work_keys_str_mv | AT tenhoopenpetra valuebuthighcostsinpostdepositiondatacuration AT amidclara valuebuthighcostsinpostdepositiondatacuration AT luigibuttigiegpier valuebuthighcostsinpostdepositiondatacuration AT pafilisevangelos valuebuthighcostsinpostdepositiondatacuration AT bravakospanos valuebuthighcostsinpostdepositiondatacuration AT cerdenotarragaanam valuebuthighcostsinpostdepositiondatacuration AT gibsonrichard valuebuthighcostsinpostdepositiondatacuration AT kahlketim valuebuthighcostsinpostdepositiondatacuration AT legakiaglaia valuebuthighcostsinpostdepositiondatacuration AT narayanamurthykada valuebuthighcostsinpostdepositiondatacuration AT papastefanougabriella valuebuthighcostsinpostdepositiondatacuration AT pereiraemiliano valuebuthighcostsinpostdepositiondatacuration AT rossellomarc valuebuthighcostsinpostdepositiondatacuration AT luisatoribioana valuebuthighcostsinpostdepositiondatacuration AT cochraneguy valuebuthighcostsinpostdepositiondatacuration |