An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms
Investigation of essential genes is significant to comprehend the minimal gene sets of cell and discover potential drug targets. In this study, a novel approach based on multiple homology mapping and machine learning method was introduced to predict essential genes. We focused on 25 bacteria which h...
Main Authors: | Hua, Hong-Li; Zhang, Fa-Zhan; Labena, Abraham Alemayehu; Dong, Chuan; Jin, Yan-Ting; Guo, Feng-Biao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi Publishing Corporation 2016 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5021884/ https://www.ncbi.nlm.nih.gov/pubmed/27660763 http://dx.doi.org/10.1155/2016/7639397 |
_version_ | 1782453413851693056 |
---|---|
author | Hua, Hong-Li Zhang, Fa-Zhan Labena, Abraham Alemayehu Dong, Chuan Jin, Yan-Ting Guo, Feng-Biao |
author_facet | Hua, Hong-Li Zhang, Fa-Zhan Labena, Abraham Alemayehu Dong, Chuan Jin, Yan-Ting Guo, Feng-Biao |
author_sort | Hua, Hong-Li |
collection | PubMed |
description | Investigation of essential genes is significant to comprehend the minimal gene sets of cell and discover potential drug targets. In this study, a novel approach based on multiple homology mapping and machine learning method was introduced to predict essential genes. We focused on 25 bacteria which have characterized essential genes. The predictions yielded the highest area under receiver operating characteristic (ROC) curve (AUC) of 0.9716 through tenfold cross-validation test. Proper features were utilized to construct models to make predictions in distantly related bacteria. The accuracy of predictions was evaluated via the consistency of predictions and known essential genes of target species. The highest AUC of 0.9552 and average AUC of 0.8314 were achieved when making predictions across organisms. An independent dataset from Synechococcus elongatus, which was released recently, was obtained for further assessment of the performance of our model. The AUC score of predictions is 0.7855, which is higher than other methods. This research presents that features obtained by homology mapping uniquely can achieve quite great or even better results than those integrated features. Meanwhile, the work indicates that machine learning-based method can assign more efficient weight coefficients than using empirical formula based on biological knowledge. |
format | Online Article Text |
id | pubmed-5021884 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Hindawi Publishing Corporation |
record_format | MEDLINE/PubMed |
spelling | pubmed-50218842016-09-22 An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms Hua, Hong-Li Zhang, Fa-Zhan Labena, Abraham Alemayehu Dong, Chuan Jin, Yan-Ting Guo, Feng-Biao Biomed Res Int Research Article Investigation of essential genes is significant to comprehend the minimal gene sets of cell and discover potential drug targets. In this study, a novel approach based on multiple homology mapping and machine learning method was introduced to predict essential genes. We focused on 25 bacteria which have characterized essential genes. The predictions yielded the highest area under receiver operating characteristic (ROC) curve (AUC) of 0.9716 through tenfold cross-validation test. Proper features were utilized to construct models to make predictions in distantly related bacteria. The accuracy of predictions was evaluated via the consistency of predictions and known essential genes of target species. The highest AUC of 0.9552 and average AUC of 0.8314 were achieved when making predictions across organisms. An independent dataset from Synechococcus elongatus, which was released recently, was obtained for further assessment of the performance of our model. The AUC score of predictions is 0.7855, which is higher than other methods. This research presents that features obtained by homology mapping uniquely can achieve quite great or even better results than those integrated features. Meanwhile, the work indicates that machine learning-based method can assign more efficient weight coefficients than using empirical formula based on biological knowledge. Hindawi Publishing Corporation 2016 2016-08-30 /pmc/articles/PMC5021884/ /pubmed/27660763 http://dx.doi.org/10.1155/2016/7639397 Text en Copyright © 2016 Hong-Li Hua et al. 
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Hua, Hong-Li Zhang, Fa-Zhan Labena, Abraham Alemayehu Dong, Chuan Jin, Yan-Ting Guo, Feng-Biao An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms |
title | An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms |
title_full | An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms |
title_fullStr | An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms |
title_full_unstemmed | An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms |
title_short | An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms |
title_sort | approach for predicting essential genes using multiple homology mapping and machine learning algorithms |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5021884/ https://www.ncbi.nlm.nih.gov/pubmed/27660763 http://dx.doi.org/10.1155/2016/7639397 |