
MARES, a replicable pipeline and curated reference database for marine eukaryote metabarcoding


Bibliographic Details
Main Authors: Arranz, Vanessa; Pearman, William S.; Aguirre, J. David; Liggins, Libby
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7334202/
https://www.ncbi.nlm.nih.gov/pubmed/32620910
http://dx.doi.org/10.1038/s41597-020-0549-9
author Arranz, Vanessa
Pearman, William S.
Aguirre, J. David
Liggins, Libby
collection PubMed
description The use of DNA metabarcoding to characterise the biodiversity of environmental and community samples has exploded in recent years. However, taxonomic inferences from these studies are contingent on the quality and completeness of the sequence reference database used to characterise sample species-composition. In response, studies often develop custom reference databases to improve species assignment. The disadvantage of this approach is that it limits the potential for database re-use, and the transferability of inferences across studies. Here, we present the MARine Eukaryote Species (MARES) reference database for use in marine metabarcoding studies, created using a transparent and reproducible pipeline. MARES includes all COI sequences available in GenBank and BOLD for marine taxa, unified into a single taxonomy. Our pipeline facilitates the curation of sequences, synonymization of taxonomic identifiers used by different repositories, and formatting these data for use in taxonomic assignment tools. Overall, MARES provides a benchmark COI reference database for marine eukaryotes, and a standardised pipeline for (re)producing reference databases enabling integration and fair comparison of marine DNA metabarcoding results.
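As a rough illustration of the three steps named in the description (curating sequences, synonymizing taxonomic names across repositories, and formatting the result for assignment tools), a minimal Python sketch might look like the following. It is not the MARES pipeline itself: the `Record` type, the `SYNONYMS` table, the placeholder taxon names, and the FASTA header layout are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    source: str     # repository the record came from, e.g. "GenBank" or "BOLD"
    accession: str  # repository-specific identifier
    taxon: str      # taxon name as given by the repository
    sequence: str   # COI nucleotide sequence


# Hypothetical synonym table (placeholder names, not real taxa):
# repository name -> name accepted in the unified taxonomy.
SYNONYMS = {
    "Examplea antiqua": "Examplea nova",
}


def accepted_name(name: str) -> str:
    """Normalise whitespace, then map a synonym to its accepted name."""
    cleaned = " ".join(name.split())
    return SYNONYMS.get(cleaned, cleaned)


def merge_records(records):
    """Unify names and drop records that repeat a (taxon, sequence) pair."""
    seen = set()
    merged = []
    for rec in records:
        name = accepted_name(rec.taxon)
        seq = rec.sequence.upper()
        key = (name, seq)
        if key in seen:
            continue  # same sequence already kept under the same accepted name
        seen.add(key)
        merged.append(Record(rec.source, rec.accession, name, seq))
    return merged


def to_fasta(records) -> str:
    """Write records as FASTA text with 'accession taxon' headers (assumed layout)."""
    return "".join(f">{r.accession} {r.taxon}\n{r.sequence}\n" for r in records)
```

For example, passing one hypothetical GenBank record and one hypothetical BOLD record that carry the same sequence under synonymous names to `merge_records` keeps only the first of the two, filed under the accepted name; `to_fasta` then serialises whatever remains.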
format Online
Article
Text
id pubmed-7334202
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7334202 2020-07-09 MARES, a replicable pipeline and curated reference database for marine eukaryote metabarcoding. Arranz, Vanessa; Pearman, William S.; Aguirre, J. David; Liggins, Libby. Sci Data, Data Descriptor. Nature Publishing Group UK, 2020-07-03. /pmc/articles/PMC7334202/ /pubmed/32620910 http://dx.doi.org/10.1038/s41597-020-0549-9 Text en © The Author(s) 2020. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article.
title MARES, a replicable pipeline and curated reference database for marine eukaryote metabarcoding
topic Data Descriptor