
Selection of key sequence-based features for prediction of essential genes in 31 diverse bacterial species

Genes that are indispensable for survival are essential genes. Many features have been proposed for computational prediction of essential genes. In this paper, the least absolute shrinkage and selection operator method was used to screen key sequence-based features related to gene essentiality. To a...


Bibliographic Details
Main authors: Liu, Xiao; Wang, Bao-Jin; Xu, Luo; Tang, Hong-Ling; Xu, Guo-Qing
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5373589/
https://www.ncbi.nlm.nih.gov/pubmed/28358836
http://dx.doi.org/10.1371/journal.pone.0174638
_version_ 1782518791103578112
author Liu, Xiao
Wang, Bao-Jin
Xu, Luo
Tang, Hong-Ling
Xu, Guo-Qing
author_sort Liu, Xiao
collection PubMed
description Genes that are indispensable for survival are essential genes. Many features have been proposed for computational prediction of essential genes. In this paper, the least absolute shrinkage and selection operator method was used to screen key sequence-based features related to gene essentiality. To assess the effects, the selected features were used to predict the essential genes from 31 bacterial species based on a support vector machine classifier. For all 31 bacterial objects (21 Gram-negative objects and ten Gram-positive objects), the features in the three datasets were reduced from 57, 59, and 58, to 40, 37, and 38, respectively, without loss of prediction accuracy. Results showed that some features were redundant for gene essentiality, so could be eliminated from future analyses. The selected features contained more complex (or key) biological information for gene essentiality, and could be of use in related research projects, such as gene prediction, synthetic biology, and drug design.
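The description above names a two-step pipeline: LASSO screening of sequence-based features, followed by a support vector machine classifier. The sketch below illustrates only the screening step, on synthetic data, using a small pure-Python cyclic coordinate descent as a stand-in for whatever LASSO implementation the authors used. The data, the six-feature layout, the `alpha` value, and every function name are assumptions made for illustration and do not come from the paper.

```python
import random

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; the core LASSO update."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def standardize(columns):
    """Center each feature column and scale it to unit variance."""
    result = []
    for col in columns:
        mean = sum(col) / len(col)
        var = sum((v - mean) ** 2 for v in col) / len(col)
        std = var ** 0.5 or 1.0  # guard against constant columns
        result.append([(v - mean) / std for v in col])
    return result

def lasso_coordinate_descent(columns, target, alpha, n_iter=100):
    """Minimize (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n = len(target)
    weights = [0.0] * len(columns)
    residual = list(target)  # equals y - Xw while w is all zeros
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Covariance of feature j with the residual that excludes its own term.
            rho = sum(c * (r + weights[j] * c) for c, r in zip(col, residual)) / n
            norm = sum(c * c for c in col) / n
            new_w = soft_threshold(rho, alpha) / norm if norm > 0 else 0.0
            delta = new_w - weights[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                weights[j] = new_w
    return weights

# Synthetic stand-in data (assumption): 400 "genes", 6 features,
# where only features 0 and 3 influence the essential/non-essential label.
random.seed(0)
n_genes, n_features = 400, 6
raw = [[random.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_features)]
labels = [
    1.0 if 2.0 * raw[0][i] - 1.5 * raw[3][i] + random.gauss(0.0, 0.5) > 0 else 0.0
    for i in range(n_genes)
]

features = standardize(raw)
label_mean = sum(labels) / n_genes
centered = [y - label_mean for y in labels]

weights = lasso_coordinate_descent(features, centered, alpha=0.1)
# Features with a nonzero coefficient survive the screening step.
selected = [j for j, w in enumerate(weights) if w != 0.0]
print("selected feature indices:", selected)
```

In a full pipeline, the columns listed in `selected` would then be passed to a classifier such as an SVM, and the accuracy compared against a model trained on all features. That comparison is what the description reports as a reduction in feature count "without loss of prediction accuracy".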
format Online
Article
Text
id pubmed-5373589
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-53735892017-04-07 Selection of key sequence-based features for prediction of essential genes in 31 diverse bacterial species Liu, Xiao Wang, Bao-Jin Xu, Luo Tang, Hong-Ling Xu, Guo-Qing PLoS One Research Article Genes that are indispensable for survival are essential genes. Many features have been proposed for computational prediction of essential genes. In this paper, the least absolute shrinkage and selection operator method was used to screen key sequence-based features related to gene essentiality. To assess the effects, the selected features were used to predict the essential genes from 31 bacterial species based on a support vector machine classifier. For all 31 bacterial objects (21 Gram-negative objects and ten Gram-positive objects), the features in the three datasets were reduced from 57, 59, and 58, to 40, 37, and 38, respectively, without loss of prediction accuracy. Results showed that some features were redundant for gene essentiality, so could be eliminated from future analyses. The selected features contained more complex (or key) biological information for gene essentiality, and could be of use in related research projects, such as gene prediction, synthetic biology, and drug design. Public Library of Science 2017-03-30 /pmc/articles/PMC5373589/ /pubmed/28358836 http://dx.doi.org/10.1371/journal.pone.0174638 Text en © 2017 Liu et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Selection of key sequence-based features for prediction of essential genes in 31 diverse bacterial species
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5373589/
https://www.ncbi.nlm.nih.gov/pubmed/28358836
http://dx.doi.org/10.1371/journal.pone.0174638