Towards a Better Detection of Horizontally Transferred Genes by Combining Unusual Properties Effectively
BACKGROUND: Horizontal gene transfer (HGT) is one of the major mechanisms contributing to microbial genome diversification. A number of computational methods for finding horizontally transferred genes have been proposed in the past decades; however none of them has provided a reliable detector yet....
Main authors: | Xiong, Dapeng; Xiao, Fen; Liu, Li; Hu, Kai; Tan, Yanping; He, Shunmin; Gao, Xieping |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2012 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3419211/ https://www.ncbi.nlm.nih.gov/pubmed/22905214 http://dx.doi.org/10.1371/journal.pone.0043126 |
_version_ | 1782240704212238336 |
---|---|
author | Xiong, Dapeng Xiao, Fen Liu, Li Hu, Kai Tan, Yanping He, Shunmin Gao, Xieping |
author_sort | Xiong, Dapeng |
collection | PubMed |
description | BACKGROUND: Horizontal gene transfer (HGT) is one of the major mechanisms contributing to microbial genome diversification. A number of computational methods for finding horizontally transferred genes have been proposed in the past decades; however, none of them has yet provided a reliable detector. In existing parametric approaches, either only a single compositional property participates in the detection process, or the results obtained from each single property are simply combined. Different properties carry different information, so a single property cannot sufficiently capture the information encoded by gene sequences. In addition, the class imbalance problem in the datasets, which also causes large errors in gene detection, has not been considered by the published methods. Here we developed an effective classifier system (Hgtident) that uses a support vector machine (SVM) and combines unusual properties effectively for HGT detection. RESULTS: Our approach, Hgtident, includes the introduction of more representative datasets, optimization of the SVM model, feature selection, handling of the class imbalance problem in the datasets, and extensive performance evaluation via systematic cross-validation methods. Through feature selection, we found that JS-DN and JS-CB have higher discriminating power for HGT detection, while GC1–GC3 and k-mer (k = 1, 2, …, 7) make the least contribution. Extensive experiments indicated that the new classifier could reduce Mean error dramatically and also improve Recall to some extent. For the testing genomes, compared with the existing popular multiple-threshold approach, on average our Recall improved by 2.81% and our Mean error was reduced by 26.32%, which means that numerous false positives were identified correctly. CONCLUSIONS: Hgtident, introduced here, is an effective approach for better detecting HGT. Combining multiple features of HGT is also essential for detecting a wider range of HGT events. |
format | Online Article Text |
id | pubmed-3419211 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3419211 2012-08-19 Towards a Better Detection of Horizontally Transferred Genes by Combining Unusual Properties Effectively Xiong, Dapeng Xiao, Fen Liu, Li Hu, Kai Tan, Yanping He, Shunmin Gao, Xieping PLoS One Research Article Public Library of Science 2012-08-14 /pmc/articles/PMC3419211/ /pubmed/22905214 http://dx.doi.org/10.1371/journal.pone.0043126 Text en © 2012 Xiong et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | Towards a Better Detection of Horizontally Transferred Genes by Combining Unusual Properties Effectively |
topic | Research Article |
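
The description above names compositional features (GC1–GC3, k-mer, JS-DN) without defining them. The sketch below illustrates how such features might be computed for a single gene against its host genome. It is not the Hgtident implementation: it assumes that GC1–GC3 are the GC fractions at the three codon positions and that JS-DN is the Jensen-Shannon divergence between dinucleotide frequency distributions, neither of which this record confirms. All function names are hypothetical.

```python
from collections import Counter
from math import log2


def gc_by_codon_position(gene):
    """Return (GC1, GC2, GC3): the GC fraction at each codon position."""
    gene = gene.upper()
    usable = len(gene) - len(gene) % 3  # ignore a trailing partial codon
    fractions = []
    for pos in range(3):
        bases = gene[pos:usable:3]
        gc = sum(base in "GC" for base in bases)
        fractions.append(gc / len(bases) if bases else 0.0)
    return tuple(fractions)


def kmer_distribution(seq, k):
    """Relative frequencies of overlapping k-mers in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    keys = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2 over the union of observed k-mers.
    m = {key: 0.5 * (p.get(key, 0.0) + q.get(key, 0.0)) for key in keys}

    def kl(a):
        return sum(pa * log2(pa / m[key]) for key, pa in a.items() if pa > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def feature_vector(gene, genome):
    """Hypothetical feature vector [GC1, GC2, GC3, JS-DN] for one gene."""
    gc1, gc2, gc3 = gc_by_codon_position(gene)
    # Assumption: JS-DN compares the gene's dinucleotide usage with the genome's.
    js_dn = js_divergence(kmer_distribution(gene, 2), kmer_distribution(genome, 2))
    return [gc1, gc2, gc3, js_dn]
```

Vectors of this kind could then be passed to any classifier. The description states that the authors used an SVM with handling for class imbalance, but that step is left out here because the record gives no details of the model or its settings.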