MARES, a replicable pipeline and curated reference database for marine eukaryote metabarcoding
Main Authors: | Arranz, Vanessa; Pearman, William S.; Aguirre, J. David; Liggins, Libby |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2020 |
Subjects: | Data Descriptor |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7334202/ https://www.ncbi.nlm.nih.gov/pubmed/32620910 http://dx.doi.org/10.1038/s41597-020-0549-9 |
_version_ | 1783553888410927104 |
---|---|
author | Arranz, Vanessa Pearman, William S. Aguirre, J. David Liggins, Libby |
author_facet | Arranz, Vanessa Pearman, William S. Aguirre, J. David Liggins, Libby |
author_sort | Arranz, Vanessa |
collection | PubMed |
description | The use of DNA metabarcoding to characterise the biodiversity of environmental and community samples has exploded in recent years. However, taxonomic inferences from these studies are contingent on the quality and completeness of the sequence reference database used to characterise sample species-composition. In response, studies often develop custom reference databases to improve species assignment. The disadvantage of this approach is that it limits the potential for database re-use, and the transferability of inferences across studies. Here, we present the MARine Eukaryote Species (MARES) reference database for use in marine metabarcoding studies, created using a transparent and reproducible pipeline. MARES includes all COI sequences available in GenBank and BOLD for marine taxa, unified into a single taxonomy. Our pipeline facilitates the curation of sequences, synonymization of taxonomic identifiers used by different repositories, and formatting these data for use in taxonomic assignment tools. Overall, MARES provides a benchmark COI reference database for marine eukaryotes, and a standardised pipeline for (re)producing reference databases enabling integration and fair comparison of marine DNA metabarcoding results. |
format | Online Article Text |
id | pubmed-7334202 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-73342022020-07-09 MARES, a replicable pipeline and curated reference database for marine eukaryote metabarcoding Arranz, Vanessa Pearman, William S. Aguirre, J. David Liggins, Libby Sci Data Data Descriptor The use of DNA metabarcoding to characterise the biodiversity of environmental and community samples has exploded in recent years. However, taxonomic inferences from these studies are contingent on the quality and completeness of the sequence reference database used to characterise sample species-composition. In response, studies often develop custom reference databases to improve species assignment. The disadvantage of this approach is that it limits the potential for database re-use, and the transferability of inferences across studies. Here, we present the MARine Eukaryote Species (MARES) reference database for use in marine metabarcoding studies, created using a transparent and reproducible pipeline. MARES includes all COI sequences available in GenBank and BOLD for marine taxa, unified into a single taxonomy. Our pipeline facilitates the curation of sequences, synonymization of taxonomic identifiers used by different repositories, and formatting these data for use in taxonomic assignment tools. Overall, MARES provides a benchmark COI reference database for marine eukaryotes, and a standardised pipeline for (re)producing reference databases enabling integration and fair comparison of marine DNA metabarcoding results. Nature Publishing Group UK 2020-07-03 /pmc/articles/PMC7334202/ /pubmed/32620910 http://dx.doi.org/10.1038/s41597-020-0549-9 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article. |
spellingShingle | Data Descriptor Arranz, Vanessa Pearman, William S. Aguirre, J. David Liggins, Libby MARES, a replicable pipeline and curated reference database for marine eukaryote metabarcoding |
title | MARES, a replicable pipeline and curated reference database for marine eukaryote metabarcoding |
topic | Data Descriptor |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7334202/ https://www.ncbi.nlm.nih.gov/pubmed/32620910 http://dx.doi.org/10.1038/s41597-020-0549-9 |