PTIGS-IdIt, a system for species identification by DNA sequences of the psbA-trnH intergenic spacer region
BACKGROUND: DNA barcoding technology, which uses a short DNA sequence to identify species, has a wide range of applications. To date, a universal DNA barcode marker for plants remains elusive. The rbcL and matK regions have been proposed as the “core barcode” for plants and the ITS2 and...
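Main authors: | Liu, Chang; Liang, Dong; Gao, Ting; Pang, Xiaohui; Song, Jingyuan; Yao, Hui; Han, Jianping; Liu, Zhihua; Guan, Xiaojun; Jiang, Kun; Li, Huan; Chen, Shilin |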
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3278844/ https://www.ncbi.nlm.nih.gov/pubmed/22373238 http://dx.doi.org/10.1186/1471-2105-12-S13-S4 |
_version_ | 1782223614736596992 |
---|---|
author | Liu, Chang; Liang, Dong; Gao, Ting; Pang, Xiaohui; Song, Jingyuan; Yao, Hui; Han, Jianping; Liu, Zhihua; Guan, Xiaojun; Jiang, Kun; Li, Huan; Chen, Shilin |
author_sort | Liu, Chang |
collection | PubMed |
description | BACKGROUND: DNA barcoding technology, which uses a short DNA sequence to identify species, has a wide range of applications. To date, a universal DNA barcode marker for plants remains elusive. The rbcL and matK regions have been proposed as the “core barcode” for plants, and the ITS2 and psbA-trnH intergenic spacer (PTIGS) regions were later added as supplemental barcodes. The use of the PTIGS region as a supplemental barcode has been limited by the lack of computational tools that can handle the significant insertions and deletions in PTIGS sequences. Here, we compared the most commonly used alignment-based and alignment-free methods and developed a web server that allows biologists to carry out PTIGS-based DNA barcoding analyses. RESULTS: First, we compared several alignment-based methods (BLAST and methods that calculate P distance and Edit distance), the alignment-free Di-Nucleotide Frequency Profile (DNFP) method, and their combinations. We found that the DNFP and Edit-distance methods increased the identification success rate to ~80%, 20% higher than the most commonly used BLAST method. Second, the combined methods showed better overall success rates and performance. Last, we developed a web server that allows (1) retrieving various sub-regions and the consensus sequences of PTIGS, (2) annotating novel PTIGS sequences, (3) determining species identity from PTIGS sequences using eight methods, and (4) examining the identification efficiency and performance of the eight methods for various taxonomic groups. CONCLUSIONS: The Edit distance and DNFP methods have the highest discrimination power. Hybrid methods can be used to achieve significant improvements in performance. These methods can be extended to applications using the core barcodes and the other supplemental DNA barcode, ITS2. To our knowledge, the web server developed here is the only one that allows species determination based on PTIGS sequences. The web server can be accessed at http://psba-trnh-plantidit.dnsalias.org. |
format | Online Article Text |
id | pubmed-3278844 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3278844 2012-02-14 BMC Bioinformatics Proceedings BioMed Central 2011-11-30 /pmc/articles/PMC3278844/ /pubmed/22373238 http://dx.doi.org/10.1186/1471-2105-12-S13-S4 Text en Copyright ©2011 Liu et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | PTIGS-IdIt, a system for species identification by DNA sequences of the psbA-trnH intergenic spacer region |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3278844/ https://www.ncbi.nlm.nih.gov/pubmed/22373238 http://dx.doi.org/10.1186/1471-2105-12-S13-S4 |
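
The description above names two of the compared approaches, Edit distance and the Di-Nucleotide Frequency Profile (DNFP), without giving their definitions. The following is a minimal Python sketch of how such distance-based identification might work, not the authors' implementation. It assumes that Edit distance means the standard Levenshtein distance, that a DNFP is the vector of relative frequencies of the 16 overlapping dinucleotides, that profiles are compared by Euclidean distance, and that a query is assigned to the species of its nearest reference. The function names (`edit_distance`, `dnfp`, `dnfp_distance`, `identify`) and the toy sequences are hypothetical.

```python
from itertools import product
from math import sqrt

# The 16 dinucleotides over the unambiguous DNA alphabet, in a fixed order.
DINUCLEOTIDES = ["".join(pair) for pair in product("ACGT", repeat=2)]


def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-base insertions,
    deletions, and substitutions needed to turn sequence a into b."""
    previous = list(range(len(b) + 1))
    for i, base_a in enumerate(a, start=1):
        current = [i]
        for j, base_b in enumerate(b, start=1):
            cost = 0 if base_a == base_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1]


def dnfp(seq):
    """Relative frequencies of overlapping dinucleotides (assumed DNFP form)."""
    seq = seq.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs with ambiguous bases such as N
            counts[pair] += 1
    total = sum(counts.values())
    return [counts[d] / total if total else 0.0 for d in DINUCLEOTIDES]


def dnfp_distance(a, b):
    """Euclidean distance between two profiles (an assumed choice of metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(dnfp(a), dnfp(b))))


def identify(query, references, distance):
    """Return the species whose reference sequence is nearest to the query."""
    return min(references, key=lambda species: distance(query, references[species]))


# Toy, made-up sequences for illustration only (not real PTIGS data).
references = {"species_a": "ACGTACGTAC", "species_b": "TTTTGGGGCC"}
query = "ACGTACGAC"
by_edit = identify(query, references, edit_distance)
by_dnfp = identify(query, references, dnfp_distance)
```

Because the DNFP comparison needs no alignment, it is unaffected by where insertions and deletions occur, which is the property the description highlights for PTIGS sequences. A hybrid method could, for example, fall back to one distance when the other yields a tie, though the record does not say how the paper combines them.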