BOLD and GenBank revisited – Do identification errors arise in the lab or in the sequence libraries?
Applications of biological knowledge, such as forensics, often require the determination of biological materials to a species level. As such, DNA-based approaches to identification, particularly DNA barcoding, are attracting increased interest. The capacity of DNA barcodes to assign newly encountere...
Main authors: | Pentinsaari, Mikko; Ratnasingham, Sujeevan; Miller, Scott E.; Hebert, Paul D. N. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2020 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7162515/ https://www.ncbi.nlm.nih.gov/pubmed/32298363 http://dx.doi.org/10.1371/journal.pone.0231814 |
_version_ | 1783523047945273344 |
---|---|
author | Pentinsaari, Mikko Ratnasingham, Sujeevan Miller, Scott E. Hebert, Paul D. N. |
author_facet | Pentinsaari, Mikko Ratnasingham, Sujeevan Miller, Scott E. Hebert, Paul D. N. |
author_sort | Pentinsaari, Mikko |
collection | PubMed |
description | Applications of biological knowledge, such as forensics, often require the determination of biological materials to a species level. As such, DNA-based approaches to identification, particularly DNA barcoding, are attracting increased interest. The capacity of DNA barcodes to assign newly encountered specimens to a species relies upon access to informatics platforms, such as BOLD and GenBank, which host libraries of reference sequences and support the comparison of new sequences to them. As parameterization of these libraries expands, DNA barcoding has the potential to make valuable contributions in diverse applied contexts. However, a recent publication called for caution after finding that both platforms performed poorly in identifying specimens of 17 common insect species. This study follows up on this concern by asking if the misidentifications reflected problems in the reference libraries or in the query sequences used to test them. Because this reanalysis revealed that missteps in acquiring and analyzing the query sequences were responsible for most misidentifications, a workflow is described to minimize such errors in future investigations. The present study also revealed the limitations imposed by the lack of a polished species-level taxonomy for many groups. In such cases, applications can be strengthened by mapping the geographic distributions of sequence-based species proxies rather than waiting for the maturation of formal taxonomic systems based on morphology. |
format | Online Article Text |
id | pubmed-7162515 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-7162515 2020-04-21 BOLD and GenBank revisited – Do identification errors arise in the lab or in the sequence libraries? Pentinsaari, Mikko Ratnasingham, Sujeevan Miller, Scott E. Hebert, Paul D. N. PLoS One Research Article Applications of biological knowledge, such as forensics, often require the determination of biological materials to a species level. As such, DNA-based approaches to identification, particularly DNA barcoding, are attracting increased interest. The capacity of DNA barcodes to assign newly encountered specimens to a species relies upon access to informatics platforms, such as BOLD and GenBank, which host libraries of reference sequences and support the comparison of new sequences to them. As parameterization of these libraries expands, DNA barcoding has the potential to make valuable contributions in diverse applied contexts. However, a recent publication called for caution after finding that both platforms performed poorly in identifying specimens of 17 common insect species. This study follows up on this concern by asking if the misidentifications reflected problems in the reference libraries or in the query sequences used to test them. Because this reanalysis revealed that missteps in acquiring and analyzing the query sequences were responsible for most misidentifications, a workflow is described to minimize such errors in future investigations. The present study also revealed the limitations imposed by the lack of a polished species-level taxonomy for many groups. In such cases, applications can be strengthened by mapping the geographic distributions of sequence-based species proxies rather than waiting for the maturation of formal taxonomic systems based on morphology. Public Library of Science 2020-04-16 /pmc/articles/PMC7162515/ /pubmed/32298363 http://dx.doi.org/10.1371/journal.pone.0231814 Text en © 2020 Pentinsaari et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Pentinsaari, Mikko Ratnasingham, Sujeevan Miller, Scott E. Hebert, Paul D. N. BOLD and GenBank revisited – Do identification errors arise in the lab or in the sequence libraries? |
title | BOLD and GenBank revisited – Do identification errors arise in the lab or in the sequence libraries? |
title_full | BOLD and GenBank revisited – Do identification errors arise in the lab or in the sequence libraries? |
title_fullStr | BOLD and GenBank revisited – Do identification errors arise in the lab or in the sequence libraries? |
title_full_unstemmed | BOLD and GenBank revisited – Do identification errors arise in the lab or in the sequence libraries? |
title_short | BOLD and GenBank revisited – Do identification errors arise in the lab or in the sequence libraries? |
title_sort | bold and genbank revisited – do identification errors arise in the lab or in the sequence libraries? |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7162515/ https://www.ncbi.nlm.nih.gov/pubmed/32298363 http://dx.doi.org/10.1371/journal.pone.0231814 |
work_keys_str_mv | AT pentinsaarimikko boldandgenbankrevisiteddoidentificationerrorsariseinthelaborinthesequencelibraries AT ratnasinghamsujeevan boldandgenbankrevisiteddoidentificationerrorsariseinthelaborinthesequencelibraries AT millerscotte boldandgenbankrevisiteddoidentificationerrorsariseinthelaborinthesequencelibraries AT hebertpauldn boldandgenbankrevisiteddoidentificationerrorsariseinthelaborinthesequencelibraries |