
DNA Barcode Sequence Identification Incorporating Taxonomic Hierarchy and within Taxon Variability


Bibliographic Details
Main author: Little, Damon P.
Format: Online Article Text
Language: English
Published: Public Library of Science 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3156709/
https://www.ncbi.nlm.nih.gov/pubmed/21857897
http://dx.doi.org/10.1371/journal.pone.0020552
author Little, Damon P.
collection PubMed
description For DNA barcoding to succeed as a scientific endeavor, an accurate and expeditious method for identifying query sequences is needed. Although a global multiple-sequence alignment can be generated for some barcoding markers (e.g. COI, rbcL), not all barcoding markers are as structurally conserved (e.g. matK). Thus, algorithms that depend on global multiple-sequence alignments are not universally applicable. Some sequence identification methods that use local pairwise alignments (e.g. BLAST) cannot accurately differentiate between highly similar sequences and are not designed to cope with hierarchical phylogenetic relationships or within-taxon variability. Here, I present a novel alignment-free sequence identification algorithm, BRONX, that accounts for observed within-taxon variability and hierarchical relationships among taxa. BRONX identifies short variable segments and the corresponding invariant flanking regions in reference sequences. These flanking regions are used to score variable regions in the query sequence without producing a global multiple-sequence alignment. By incorporating observed within-taxon variability into the scoring procedure, misidentifications arising from shared alleles/haplotypes are minimized. An explicit treatment of more inclusive terminals allows separate identifications to be made for each taxonomic level and/or for user-defined terminals. BRONX performs better than all other methods when there is imperfect overlap between query and reference sequences (e.g. mini-barcode queries against a full-length barcode database). BRONX consistently produced better identifications at the genus level for all query types.
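The flank-anchored scoring idea in the description can be illustrated with a toy Python sketch. This is not the published BRONX implementation: it assumes the invariant flank pairs have already been identified and are supplied by hand, and the sequences, taxon names (Aus bus, Aus cus, Xus yus), flanks, and function names are all made up for illustration. Each taxon is scored by counting the flanked regions where the query's variable segment matches any variant observed in that taxon; regions the query does not cover are skipped.

```python
from collections import defaultdict


def variable_segment(seq, left, right):
    """Return the segment between two flanks, or None if either flank is absent."""
    start = seq.find(left)
    if start == -1:
        return None
    start += len(left)
    end = seq.find(right, start)
    if end == -1:
        return None
    return seq[start:end]


def build_index(references, flanks):
    """Map each flank pair to {taxon: set of segments observed in that taxon}."""
    index = {pair: defaultdict(set) for pair in flanks}
    for taxon, seqs in references.items():
        for seq in seqs:
            for pair in flanks:
                seg = variable_segment(seq, *pair)
                if seg is not None:
                    index[pair][taxon].add(seg)
    return index


def score_query(query, index):
    """Count, per taxon, the regions where the query matches an observed variant."""
    scores = defaultdict(int)
    for pair, by_taxon in index.items():
        seg = variable_segment(query, *pair)
        if seg is None:
            continue  # query does not cover this region (e.g. a mini-barcode)
        for taxon, observed in by_taxon.items():
            if seg in observed:
                scores[taxon] += 1
    return dict(scores)


def pool_by_genus(references):
    """Merge species into genus-level terminals (genus = first word of the name)."""
    pooled = defaultdict(list)
    for taxon, seqs in references.items():
        pooled[taxon.split()[0]].extend(seqs)
    return dict(pooled)


# Hypothetical flank pairs and reference sequences (invented data).
flanks = [("ACGT", "GGCC"), ("TTAA", "CCGG")]
references = {
    "Aus bus": ["ACGTAAGGCCTTAATCCGG", "ACGTATGGCCTTAATCCGG"],  # two variants
    "Aus cus": ["ACGTCAGGCCTTAATCCGG"],
    "Xus yus": ["ACGTAAGGCCTTAAGCCGG"],
}

species_index = build_index(references, flanks)
genus_index = build_index(pool_by_genus(references), flanks)

full_query = "ACGTAAGGCCTTAATCCGG"  # covers both regions
mini_query = "ACGTATGGCC"           # covers only the first region

species_scores = score_query(full_query, species_index)
mini_scores = score_query(mini_query, species_index)
genus_scores = score_query("ACGTCAGGCC", genus_index)
```

In this toy data the first region alone cannot separate Aus bus from Xus yus because both carry the segment AA, but the second region does. Pooling species into a genus-level terminal before indexing gives a separate identification at that level, loosely mirroring the "more inclusive terminals" mentioned in the description.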
format Online
Article
Text
id pubmed-3156709
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3156709 2011-08-19 DNA Barcode Sequence Identification Incorporating Taxonomic Hierarchy and within Taxon Variability Little, Damon P. PLoS One Research Article Public Library of Science 2011-08-16 /pmc/articles/PMC3156709/ /pubmed/21857897 http://dx.doi.org/10.1371/journal.pone.0020552 Text en Damon P. Little. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title DNA Barcode Sequence Identification Incorporating Taxonomic Hierarchy and within Taxon Variability
topic Research Article