Adhoc: an R package to calculate ad hoc distance thresholds for DNA barcoding identification

Abstract. Identification by DNA barcoding is more likely to be erroneous when it is based on a large distance between the query (the barcode sequence of the specimen to identify) and its best match in a reference barcode library. The number of such false positive identifications can be decreased by...

Bibliographic Details
Main authors: Sonet, Gontran, Jordaens, Kurt, Nagy, Zoltán T., Breman, Floris C., De Meyer, Marc, Backeljau, Thierry, Virgilio, Massimiliano
Format: Online Article Text
Language: English
Published: Pensoft Publishers 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3890685/
https://www.ncbi.nlm.nih.gov/pubmed/24453565
http://dx.doi.org/10.3897/zookeys.365.6034
_version_ 1782299303645020160
author Sonet, Gontran
Jordaens, Kurt
Nagy, Zoltán T.
Breman, Floris C.
De Meyer, Marc
Backeljau, Thierry
Virgilio, Massimiliano
author_facet Sonet, Gontran
Jordaens, Kurt
Nagy, Zoltán T.
Breman, Floris C.
De Meyer, Marc
Backeljau, Thierry
Virgilio, Massimiliano
author_sort Sonet, Gontran
collection PubMed
description Abstract. Identification by DNA barcoding is more likely to be erroneous when it is based on a large distance between the query (the barcode sequence of the specimen to identify) and its best match in a reference barcode library. The number of such false positive identifications can be decreased by setting a distance threshold above which identification has to be rejected. To this end, we proposed recently to use an ad hoc distance threshold producing identifications with an estimated relative error probability that can be fixed by the user (e.g. 5%). Here we introduce two R functions that automate the calculation of ad hoc distance thresholds for reference libraries of DNA barcodes. The scripts of both functions, a user manual and an example file are available on the JEMU website (http://jemu.myspecies.info/computer-programs) as well as on the comprehensive R archive network (CRAN, http://cran.r-project.org).
format Online
Article
Text
id pubmed-3890685
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Pensoft Publishers
record_format MEDLINE/PubMed
spelling pubmed-3890685
2014-01-16
Adhoc: an R package to calculate ad hoc distance thresholds for DNA barcoding identification
Sonet, Gontran
Jordaens, Kurt
Nagy, Zoltán T.
Breman, Floris C.
De Meyer, Marc
Backeljau, Thierry
Virgilio, Massimiliano
Zookeys
Article
Abstract. Identification by DNA barcoding is more likely to be erroneous when it is based on a large distance between the query (the barcode sequence of the specimen to identify) and its best match in a reference barcode library. The number of such false positive identifications can be decreased by setting a distance threshold above which identification has to be rejected. To this end, we proposed recently to use an ad hoc distance threshold producing identifications with an estimated relative error probability that can be fixed by the user (e.g. 5%). Here we introduce two R functions that automate the calculation of ad hoc distance thresholds for reference libraries of DNA barcodes. The scripts of both functions, a user manual and an example file are available on the JEMU website (http://jemu.myspecies.info/computer-programs) as well as on the comprehensive R archive network (CRAN, http://cran.r-project.org).
Pensoft Publishers
2013-12-30
/pmc/articles/PMC3890685/
/pubmed/24453565
http://dx.doi.org/10.3897/zookeys.365.6034
Text
en
Gontran Sonet, Kurt Jordaens, Zoltán T. Nagy, Floris C. Breman, Marc De Meyer, Thierry Backeljau, Massimiliano Virgilio
http://creativecommons.org/licenses/by/4.0
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Article
Sonet, Gontran
Jordaens, Kurt
Nagy, Zoltán T.
Breman, Floris C.
De Meyer, Marc
Backeljau, Thierry
Virgilio, Massimiliano
Adhoc: an R package to calculate ad hoc distance thresholds for DNA barcoding identification
title Adhoc: an R package to calculate ad hoc distance thresholds for DNA barcoding identification
title_full Adhoc: an R package to calculate ad hoc distance thresholds for DNA barcoding identification
title_fullStr Adhoc: an R package to calculate ad hoc distance thresholds for DNA barcoding identification
title_full_unstemmed Adhoc: an R package to calculate ad hoc distance thresholds for DNA barcoding identification
title_short Adhoc: an R package to calculate ad hoc distance thresholds for DNA barcoding identification
title_sort adhoc: an r package to calculate ad hoc distance thresholds for dna barcoding identification
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3890685/
https://www.ncbi.nlm.nih.gov/pubmed/24453565
http://dx.doi.org/10.3897/zookeys.365.6034