A new computational strategy for identifying essential proteins based on network topological properties and biological information
Main authors: | Qin, Chao; Sun, Yongqi; Dong, Yadong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2017 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5533339/ https://www.ncbi.nlm.nih.gov/pubmed/28753682 http://dx.doi.org/10.1371/journal.pone.0182031 |
_version_ | 1783253604868554752 |
---|---|
author | Qin, Chao Sun, Yongqi Dong, Yadong |
author_facet | Qin, Chao Sun, Yongqi Dong, Yadong |
author_sort | Qin, Chao |
collection | PubMed |
description | Essential proteins are the proteins that are indispensable to the survival and development of an organism. Deleting a single essential protein will cause lethality or infertility. Identifying and analysing essential proteins are key to understanding the molecular mechanisms of living cells. There are two types of methods for predicting essential proteins: experimental methods, which require considerable time and resources, and computational methods, which overcome the shortcomings of experimental methods. However, the prediction accuracy of computational methods for essential proteins requires further improvement. In this paper, we propose a new computational strategy named CoTB for identifying essential proteins based on a combination of topological properties, subcellular localization information and orthologous protein information. First, we introduce several topological properties of the protein-protein interaction (PPI) network. Second, we propose new methods for measuring orthologous information and subcellular localization and a new computational strategy that uses a random forest prediction model to obtain a probability score for the proteins being essential. Finally, we conduct experiments on four different Saccharomyces cerevisiae datasets. The experimental results demonstrate that our strategy for identifying essential proteins outperforms traditional computational methods and the most recently developed method, SON. In particular, our strategy improves the prediction accuracy to 89, 78, 79, and 85 percent on the YDIP, YMIPS, YMBD and YHQ datasets at the top 100 level, respectively. |
format | Online Article Text |
id | pubmed-5533339 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-5533339 2017-08-07 A new computational strategy for identifying essential proteins based on network topological properties and biological information Qin, Chao Sun, Yongqi Dong, Yadong PLoS One Research Article 
Public Library of Science 2017-07-28 /pmc/articles/PMC5533339/ /pubmed/28753682 http://dx.doi.org/10.1371/journal.pone.0182031 Text en © 2017 Qin et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Qin, Chao Sun, Yongqi Dong, Yadong A new computational strategy for identifying essential proteins based on network topological properties and biological information |
title | A new computational strategy for identifying essential proteins based on network topological properties and biological information |
title_sort | new computational strategy for identifying essential proteins based on network topological properties and biological information |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5533339/ https://www.ncbi.nlm.nih.gov/pubmed/28753682 http://dx.doi.org/10.1371/journal.pone.0182031 |
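
The description reports prediction accuracy "at the top 100 level", meaning the share of the 100 highest-scoring proteins that are known to be essential. The sketch below illustrates that evaluation on a toy network. It is not the CoTB method: it scores proteins by degree alone (one simple topological property) rather than by a random forest over topological, orthology and subcellular localization features. The function names, the toy edge list and the essential-protein labels are all hypothetical.

```python
from collections import defaultdict


def degree_scores(edges):
    """Return {protein: degree} for an undirected PPI edge list."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {p: len(ns) for p, ns in neighbours.items()}


def top_k_precision(scores, essential, k):
    """Fraction of the k highest-scoring proteins that are known essential."""
    # Sort by score descending, breaking ties by name for determinism.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(p in essential for p in top) / len(top)


# Hypothetical toy network and labels, for illustration only.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]
toy_essential = {"P1", "P3"}

scores = degree_scores(toy_edges)
precision_at_2 = top_k_precision(scores, toy_essential, k=2)
```

With a real dataset, `k=100` and a set of experimentally confirmed essential proteins would give a figure comparable in kind to the percentages quoted in the description, and the degree score could be swapped for any per-protein probability score.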