PTIGS-IdIt, a system for species identification by DNA sequences of the psbA-trnH intergenic spacer region

BACKGROUND: DNA barcoding technology, which uses a short piece of DNA sequence to identify species, has wide ranges of applications. Until today, a universal DNA barcode marker for plants remains elusive. The rbcL and matK regions have been proposed as the “core barcode” for plants and the ITS2 and...

Bibliographic Details
Main authors: Liu, Chang, Liang, Dong, Gao, Ting, Pang, Xiaohui, Song, Jingyuan, Yao, Hui, Han, Jianping, Liu, Zhihua, Guan, Xiaojun, Jiang, Kun, Li, Huan, Chen, Shilin
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3278844/
https://www.ncbi.nlm.nih.gov/pubmed/22373238
http://dx.doi.org/10.1186/1471-2105-12-S13-S4
_version_ 1782223614736596992
author Liu, Chang
Liang, Dong
Gao, Ting
Pang, Xiaohui
Song, Jingyuan
Yao, Hui
Han, Jianping
Liu, Zhihua
Guan, Xiaojun
Jiang, Kun
Li, Huan
Chen, Shilin
author_facet Liu, Chang
Liang, Dong
Gao, Ting
Pang, Xiaohui
Song, Jingyuan
Yao, Hui
Han, Jianping
Liu, Zhihua
Guan, Xiaojun
Jiang, Kun
Li, Huan
Chen, Shilin
author_sort Liu, Chang
collection PubMed
description BACKGROUND: DNA barcoding technology, which uses a short piece of DNA sequence to identify species, has wide ranges of applications. Until today, a universal DNA barcode marker for plants remains elusive. The rbcL and matK regions have been proposed as the “core barcode” for plants and the ITS2 and psbA-trnH intergenic spacer (PTIGS) regions were later added as supplemental barcodes. The use of PTIGS region as a supplemental barcode has been limited by the lack of computational tools that can handle significant insertions and deletions in the PTIGS sequences. Here, we compared the most commonly used alignment-based and alignment-free methods and developed a web server to allow the biologists to carry out PTIGS-based DNA barcoding analyses. RESULTS: First, we compared several alignment-based methods such as BLAST and those calculating P distance and Edit distance, alignment-free methods Di-Nucleotide Frequency Profile (DNFP) and their combinations. We found that the DNFP and Edit-distance methods increased the identification success rate to ~80%, 20% higher than the most commonly used BLAST method. Second, the combined methods showed overall better success rate and performance. Last, we have developed a web server that allows (1) retrieving various sub-regions and the consensus sequences of PTIGS, (2) annotating novel PTIGS sequences, (3) determining species identity by PTIGS sequences using eight methods, and (4) examining identification efficiency and performance of the eight methods for various taxonomy groups. CONCLUSIONS: The Edit distance and the DNFP methods have the highest discrimination powers. Hybrid methods can be used to achieve significant improvement in performance. These methods can be extended to applications using the core barcodes and the other supplemental DNA barcode ITS2. To our knowledge, the web server developed here is the only one that allows species determination based on PTIGS sequences. 
The web server can be accessed at http://psba-trnh-plantidit.dnsalias.org.
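
The two best-performing approaches named in the abstract, the Di-Nucleotide Frequency Profile (DNFP) and the Edit distance, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not say how two profiles are compared, so the Euclidean distance used here is an assumption, and the function names (dnfp, dnfp_distance, edit_distance, identify), the nearest-reference rule, and the toy sequences are all hypothetical.

```python
from collections import Counter
from itertools import product
import math

# The 16 possible dinucleotides over the DNA alphabet, in a fixed order.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dnfp(seq):
    """Return the 16-element relative frequency vector of overlapping dinucleotides."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[d] for d in DINUCLEOTIDES)
    if total == 0:
        return [0.0] * len(DINUCLEOTIDES)
    return [counts[d] / total for d in DINUCLEOTIDES]


def dnfp_distance(a, b):
    """Alignment-free distance between two sequences.

    Assumption: Euclidean distance between the two profiles.
    """
    return math.dist(dnfp(a), dnfp(b))


def edit_distance(a, b):
    """Levenshtein distance: minimum number of substitutions, insertions, deletions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion from a
                           cur[j - 1] + 1,             # insertion into a
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]


def identify(query, references, distance):
    """Return the reference species whose sequence is closest to the query.

    `references` maps a species name to a sequence (hypothetical toy data).
    """
    return min(references, key=lambda species: distance(query, references[species]))


# Toy sequences for illustration only, not real PTIGS data.
references = {
    "species_a": "ACGTACGTAC",
    "species_b": "TTTTGGGGCC",
}
query = "ACGTACGAAC"

print(identify(query, references, edit_distance))
print(identify(query, references, dnfp_distance))
```

Because the profile is built from dinucleotide counts rather than from an alignment, insertions and deletions change the vector only slightly, which is one plausible reason an alignment-free measure suits a region with frequent indels. A combined method could, for example, use one distance to break ties left by the other, but how the paper combines them is not stated in this record.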
format Online
Article
Text
id pubmed-3278844
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-32788442012-02-14 PTIGS-IdIt, a system for species identification by DNA sequences of the psbA-trnH intergenic spacer region Liu, Chang Liang, Dong Gao, Ting Pang, Xiaohui Song, Jingyuan Yao, Hui Han, Jianping Liu, Zhihua Guan, Xiaojun Jiang, Kun Li, Huan Chen, Shilin BMC Bioinformatics Proceedings BACKGROUND: DNA barcoding technology, which uses a short piece of DNA sequence to identify species, has wide ranges of applications. Until today, a universal DNA barcode marker for plants remains elusive. The rbcL and matK regions have been proposed as the “core barcode” for plants and the ITS2 and psbA-trnH intergenic spacer (PTIGS) regions were later added as supplemental barcodes. The use of PTIGS region as a supplemental barcode has been limited by the lack of computational tools that can handle significant insertions and deletions in the PTIGS sequences. Here, we compared the most commonly used alignment-based and alignment-free methods and developed a web server to allow the biologists to carry out PTIGS-based DNA barcoding analyses. RESULTS: First, we compared several alignment-based methods such as BLAST and those calculating P distance and Edit distance, alignment-free methods Di-Nucleotide Frequency Profile (DNFP) and their combinations. We found that the DNFP and Edit-distance methods increased the identification success rate to ~80%, 20% higher than the most commonly used BLAST method. Second, the combined methods showed overall better success rate and performance. Last, we have developed a web server that allows (1) retrieving various sub-regions and the consensus sequences of PTIGS, (2) annotating novel PTIGS sequences, (3) determining species identity by PTIGS sequences using eight methods, and (4) examining identification efficiency and performance of the eight methods for various taxonomy groups. CONCLUSIONS: The Edit distance and the DNFP methods have the highest discrimination powers. 
Hybrid methods can be used to achieve significant improvement in performance. These methods can be extended to applications using the core barcodes and the other supplemental DNA barcode ITS2. To our knowledge, the web server developed here is the only one that allows species determination based on PTIGS sequences. The web server can be accessed at http://psba-trnh-plantidit.dnsalias.org. BioMed Central 2011-11-30 /pmc/articles/PMC3278844/ /pubmed/22373238 http://dx.doi.org/10.1186/1471-2105-12-S13-S4 Text en Copyright ©2011 Liu et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Liu, Chang
Liang, Dong
Gao, Ting
Pang, Xiaohui
Song, Jingyuan
Yao, Hui
Han, Jianping
Liu, Zhihua
Guan, Xiaojun
Jiang, Kun
Li, Huan
Chen, Shilin
PTIGS-IdIt, a system for species identification by DNA sequences of the psbA-trnH intergenic spacer region
title PTIGS-IdIt, a system for species identification by DNA sequences of the psbA-trnH intergenic spacer region
title_full PTIGS-IdIt, a system for species identification by DNA sequences of the psbA-trnH intergenic spacer region
title_fullStr PTIGS-IdIt, a system for species identification by DNA sequences of the psbA-trnH intergenic spacer region
title_full_unstemmed PTIGS-IdIt, a system for species identification by DNA sequences of the psbA-trnH intergenic spacer region
title_short PTIGS-IdIt, a system for species identification by DNA sequences of the psbA-trnH intergenic spacer region
title_sort ptigs-idit, a system for species identification by dna sequences of the psba-trnh intergenic spacer region
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3278844/
https://www.ncbi.nlm.nih.gov/pubmed/22373238
http://dx.doi.org/10.1186/1471-2105-12-S13-S4