
Identifying Insects with Incomplete DNA Barcode Libraries, African Fruit Flies (Diptera: Tephritidae) as a Test Case

We propose a general working strategy to deal with incomplete reference libraries in the DNA barcoding identification of species. Considering that (1) queries with a large genetic distance with their best DNA barcode match are more likely to be misidentified and (2) imposing a distance threshold pro...


Bibliographic Details
Main Authors: Virgilio, Massimiliano, Jordaens, Kurt, Breman, Floris C., Backeljau, Thierry, De Meyer, Marc
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3281081/
https://www.ncbi.nlm.nih.gov/pubmed/22359600
http://dx.doi.org/10.1371/journal.pone.0031581
_version_ 1782223916166545408
author Virgilio, Massimiliano
Jordaens, Kurt
Breman, Floris C.
Backeljau, Thierry
De Meyer, Marc
collection PubMed
description We propose a general working strategy to deal with incomplete reference libraries in the DNA barcoding identification of species. Considering that (1) queries with a large genetic distance with their best DNA barcode match are more likely to be misidentified and (2) imposing a distance threshold profitably reduces identification errors, we modelled relationships between identification performances and distance thresholds in four DNA barcode libraries of Diptera (n = 4270), Lepidoptera (n = 7577), Hymenoptera (n = 2067) and Tephritidae (n = 602 DNA barcodes). In all cases, more restrictive distance thresholds produced a gradual increase in the proportion of true negatives, a gradual decrease of false positives and more abrupt variations in the proportions of true positives and false negatives. More restrictive distance thresholds improved precision, yet negatively affected accuracy due to the higher proportions of queries discarded (viz. having a distance query-best match above the threshold). Using a simple linear regression we calculated an ad hoc distance threshold for the tephritid library producing an estimated relative identification error <0.05. According to the expectations, when we used this threshold for the identification of 188 independently collected tephritids, less than 5% of queries with a distance query-best match below the threshold were misidentified. Ad hoc thresholds can be calculated for each particular reference library of DNA barcodes and should be used as cut-off mark defining whether we can proceed identifying the query with a known estimated error probability (e.g. 5%) or whether we should discard the query and consider alternative/complementary identification methods.
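The thresholding idea in the description can be illustrated with a minimal sketch. It is not the authors' code: the `Query` record, the function names, and the exact metric definitions (precision as TP / (TP + FP), accuracy as (TP + TN) / all queries, relative identification error as FP / (TP + FP)) are assumptions made for illustration, based on one reading of the abstract. A query is kept when its distance to the best match is at or below the threshold and discarded otherwise.

```python
from dataclasses import dataclass


@dataclass
class Query:
    distance: float  # genetic distance between the query and its best match
    correct: bool    # True if the best match is the query's true species


def confusion(queries, threshold):
    """Count (TP, FP, TN, FN) for one distance threshold (assumed definitions)."""
    tp = fp = tn = fn = 0
    for q in queries:
        if q.distance <= threshold:      # query kept and identified
            if q.correct:
                tp += 1
            else:
                fp += 1
        else:                            # query discarded
            if q.correct:
                fn += 1                  # a correct match was thrown away
            else:
                tn += 1                  # a wrong match was rightly rejected
    return tp, fp, tn, fn


def metrics(queries, threshold):
    """Return (precision, accuracy, relative_error) under the assumed definitions."""
    tp, fp, tn, fn = confusion(queries, threshold)
    kept = tp + fp
    total = tp + fp + tn + fn
    precision = tp / kept if kept else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    relative_error = fp / kept if kept else 0.0
    return precision, accuracy, relative_error


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ad_hoc_threshold(queries, thresholds, target_error=0.05):
    """Regress relative error on threshold and solve for the target error."""
    errors = [metrics(queries, t)[2] for t in thresholds]
    slope, intercept = fit_line(thresholds, errors)
    if slope == 0:
        raise ValueError("relative error does not vary with the threshold")
    return (target_error - intercept) / slope
```

In this sketch the reference library would be queried against itself over a range of candidate thresholds, and `ad_hoc_threshold` would return the distance at which the fitted line crosses the target error (0.05 in the description). Queries above that distance would be discarded and passed to alternative identification methods.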
format Online
Article
Text
id pubmed-3281081
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-32810812012-02-22 Identifying Insects with Incomplete DNA Barcode Libraries, African Fruit Flies (Diptera: Tephritidae) as a Test Case Virgilio, Massimiliano Jordaens, Kurt Breman, Floris C. Backeljau, Thierry De Meyer, Marc PLoS One Research Article We propose a general working strategy to deal with incomplete reference libraries in the DNA barcoding identification of species. Considering that (1) queries with a large genetic distance with their best DNA barcode match are more likely to be misidentified and (2) imposing a distance threshold profitably reduces identification errors, we modelled relationships between identification performances and distance thresholds in four DNA barcode libraries of Diptera (n = 4270), Lepidoptera (n = 7577), Hymenoptera (n = 2067) and Tephritidae (n = 602 DNA barcodes). In all cases, more restrictive distance thresholds produced a gradual increase in the proportion of true negatives, a gradual decrease of false positives and more abrupt variations in the proportions of true positives and false negatives. More restrictive distance thresholds improved precision, yet negatively affected accuracy due to the higher proportions of queries discarded (viz. having a distance query-best match above the threshold). Using a simple linear regression we calculated an ad hoc distance threshold for the tephritid library producing an estimated relative identification error <0.05. According to the expectations, when we used this threshold for the identification of 188 independently collected tephritids, less than 5% of queries with a distance query-best match below the threshold were misidentified. Ad hoc thresholds can be calculated for each particular reference library of DNA barcodes and should be used as cut-off mark defining whether we can proceed identifying the query with a known estimated error probability (e.g. 5%) or whether we should discard the query and consider alternative/complementary identification methods. 
Public Library of Science 2012-02-16 /pmc/articles/PMC3281081/ /pubmed/22359600 http://dx.doi.org/10.1371/journal.pone.0031581 Text en Virgilio et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Virgilio, Massimiliano
Jordaens, Kurt
Breman, Floris C.
Backeljau, Thierry
De Meyer, Marc
Identifying Insects with Incomplete DNA Barcode Libraries, African Fruit Flies (Diptera: Tephritidae) as a Test Case
title Identifying Insects with Incomplete DNA Barcode Libraries, African Fruit Flies (Diptera: Tephritidae) as a Test Case
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3281081/
https://www.ncbi.nlm.nih.gov/pubmed/22359600
http://dx.doi.org/10.1371/journal.pone.0031581