Towards a Better Detection of Horizontally Transferred Genes by Combining Unusual Properties Effectively

BACKGROUND: Horizontal gene transfer (HGT) is one of the major mechanisms contributing to microbial genome diversification. A number of computational methods for finding horizontally transferred genes have been proposed in the past decades; however none of them has provided a reliable detector yet....

Bibliographic Details
Main authors: Xiong, Dapeng; Xiao, Fen; Liu, Li; Hu, Kai; Tan, Yanping; He, Shunmin; Gao, Xieping
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3419211/
https://www.ncbi.nlm.nih.gov/pubmed/22905214
http://dx.doi.org/10.1371/journal.pone.0043126
author Xiong, Dapeng
Xiao, Fen
Liu, Li
Hu, Kai
Tan, Yanping
He, Shunmin
Gao, Xieping
collection PubMed
description BACKGROUND: Horizontal gene transfer (HGT) is one of the major mechanisms contributing to microbial genome diversification. A number of computational methods for finding horizontally transferred genes have been proposed in the past decades; however, none of them has yet provided a reliable detector. In existing parametric approaches, either only a single compositional property participates in the detection process, or the results obtained from each single property are simply combined. Different properties carry different information, so a single property cannot sufficiently capture the information encoded by gene sequences. In addition, the class imbalance problem in the datasets, which also causes large errors in gene detection, has not been considered by published methods. Here we developed an effective classifier system (Hgtident) that uses a support vector machine (SVM) to combine unusual properties effectively for HGT detection. RESULTS: Hgtident includes the introduction of more representative datasets, optimization of the SVM model, feature selection, handling of the class imbalance problem in the datasets, and extensive performance evaluation via systematic cross-validation. Through feature selection, we found that JS-DN and JS-CB have higher discriminating power for HGT detection, while GC1–GC3 and k-mer (k = 1, 2, …, 7) make the least contribution. Extensive experiments indicated that the new classifier could reduce Mean error dramatically and also improve Recall to some extent. For the testing genomes, compared with the existing popular multiple-threshold approach, on average our Recall improved by 2.81% and our Mean error was reduced by 26.32%, which means that numerous false positives were identified correctly. CONCLUSIONS: Hgtident is an effective approach for better detecting HGT. Combining multiple features of HGT is also essential for detecting a wider range of HGT events.
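The abstract names compositional features such as JS-DN and k-mer frequencies without defining them in this record. As a rough illustration only, the sketch below assumes that a "JS" feature is a Jensen-Shannon divergence between the k-mer frequency profile of a gene and that of its host genome (k = 2 for dinucleotides). The function names `kmer_freqs` and `js_divergence` and the example sequences are hypothetical and are not taken from Hgtident.

```python
from collections import Counter
from math import log2


def kmer_freqs(seq, k):
    """Relative frequencies of overlapping k-mers in a nucleotide sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two frequency dictionaries.

    The result lies in [0, 1]: 0 for identical profiles, 1 for profiles
    that share no k-mer at all.
    """
    div = 0.0
    for key in set(p) | set(q):
        pi = p.get(key, 0.0)
        qi = q.get(key, 0.0)
        mi = 0.5 * (pi + qi)  # mixture distribution
        if pi > 0:
            div += 0.5 * pi * log2(pi / mi)
        if qi > 0:
            div += 0.5 * qi * log2(qi / mi)
    return div


# Hypothetical toy sequences; a real analysis would use full genomes.
genome = "ATGCGCGCATATGCGCGCTAGCGCGC"
gene = "ATATATATTTTAAATATA"
score = js_divergence(kmer_freqs(gene, 2), kmer_freqs(genome, 2))
print(f"dinucleotide JS divergence: {score:.3f}")
```

A higher score means the gene's composition departs more from the genome's, which is the kind of unusual property that a classifier could take as one input feature among several.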
format Online
Article
Text
id pubmed-3419211
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3419211 2012-08-19 PLoS One Research Article Public Library of Science 2012-08-14 /pmc/articles/PMC3419211/ /pubmed/22905214 http://dx.doi.org/10.1371/journal.pone.0043126 Text en © 2012 Xiong et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Towards a Better Detection of Horizontally Transferred Genes by Combining Unusual Properties Effectively
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3419211/
https://www.ncbi.nlm.nih.gov/pubmed/22905214
http://dx.doi.org/10.1371/journal.pone.0043126