GenBank is a reliable resource for 21st century biodiversity research
Traditional methods of characterizing biodiversity are increasingly being supplemented and replaced by approaches based on DNA sequencing alone. These approaches commonly involve extraction and high-throughput sequencing of bulk samples from biologically complex communities or samples of environment...
Main Authors: | Leray, Matthieu; Knowlton, Nancy; Ho, Shian-Lei; Nguyen, Bryan N.; Machida, Ryuji J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | National Academy of Sciences 2019 |
Subjects: | Biological Sciences |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6842603/ https://www.ncbi.nlm.nih.gov/pubmed/31636175 http://dx.doi.org/10.1073/pnas.1911714116 |
_version_ | 1783468071511392256 |
---|---|
author | Leray, Matthieu; Knowlton, Nancy; Ho, Shian-Lei; Nguyen, Bryan N.; Machida, Ryuji J. |
author_sort | Leray, Matthieu |
collection | PubMed |
description | Traditional methods of characterizing biodiversity are increasingly being supplemented and replaced by approaches based on DNA sequencing alone. These approaches commonly involve extraction and high-throughput sequencing of bulk samples from biologically complex communities or samples of environmental DNA (eDNA). In such cases, vouchers for individual organisms are rarely obtained, often unidentifiable, or unavailable. Thus, identifying these sequences typically relies on comparisons with sequences from genetic databases, particularly GenBank. While concerns have been raised about biases and inaccuracies in laboratory and analytical methods, comparatively little attention has been paid to the taxonomic reliability of GenBank itself. Here we analyze the metazoan mitochondrial sequences of GenBank using a combination of distance-based clustering and phylogenetic analysis. Because of their comparatively rapid evolutionary rates and consequent high taxonomic resolution, mitochondrial sequences represent an invaluable resource for the detection of the many small and often undescribed organisms that represent the bulk of animal diversity. We show that metazoan identifications in GenBank are surprisingly accurate, even at low taxonomic levels (likely <1% error rate at the genus level). This stands in contrast to previously voiced concerns based on limited analyses of particular groups and the fact that individual researchers currently submit annotated sequences to GenBank without significant external taxonomic validation. Our encouraging results suggest that the rapid uptake of DNA-based approaches is supported by a bioinformatic infrastructure capable of assessing both the losses to biodiversity caused by global change and the effectiveness of conservation efforts aimed at slowing or reversing these losses. |
format | Online Article Text |
id | pubmed-6842603 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | National Academy of Sciences |
record_format | MEDLINE/PubMed |
spelling | pubmed-6842603 2019-11-15 Proc Natl Acad Sci U S A National Academy of Sciences 2019-11-05 2019-10-21 /pmc/articles/PMC6842603/ /pubmed/31636175 http://dx.doi.org/10.1073/pnas.1911714116 Text en Copyright © 2019 the Author(s). Published by PNAS. This open access article is distributed under Creative Commons Attribution License 4.0 (CC BY) (https://creativecommons.org/licenses/by/4.0/). |
title | GenBank is a reliable resource for 21st century biodiversity research |
topic | Biological Sciences |