
A new computational strategy for identifying essential proteins based on network topological properties and biological information

Essential proteins are the proteins that are indispensable to the survival and development of an organism. Deleting a single essential protein will cause lethality or infertility. Identifying and analysing essential proteins are key to understanding the molecular mechanisms of living cells. There ar...


Bibliographic Details
Main authors: Qin, Chao; Sun, Yongqi; Dong, Yadong
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5533339/
https://www.ncbi.nlm.nih.gov/pubmed/28753682
http://dx.doi.org/10.1371/journal.pone.0182031
author Qin, Chao
Sun, Yongqi
Dong, Yadong
collection PubMed
description Essential proteins are the proteins that are indispensable to the survival and development of an organism. Deleting a single essential protein will cause lethality or infertility. Identifying and analysing essential proteins are key to understanding the molecular mechanisms of living cells. There are two types of methods for predicting essential proteins: experimental methods, which require considerable time and resources, and computational methods, which overcome the shortcomings of experimental methods. However, the prediction accuracy of computational methods for essential proteins requires further improvement. In this paper, we propose a new computational strategy named CoTB for identifying essential proteins based on a combination of topological properties, subcellular localization information and orthologous protein information. First, we introduce several topological properties of the protein-protein interaction (PPI) network. Second, we propose new methods for measuring orthologous information and subcellular localization and a new computational strategy that uses a random forest prediction model to obtain a probability score for the proteins being essential. Finally, we conduct experiments on four different Saccharomyces cerevisiae datasets. The experimental results demonstrate that our strategy for identifying essential proteins outperforms traditional computational methods and the most recently developed method, SON. In particular, our strategy improves the prediction accuracy to 89, 78, 79, and 85 percent on the YDIP, YMIPS, YMBD and YHQ datasets at the top 100 level, respectively.
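The abstract reports accuracy "at the top 100 level", which is commonly read as the fraction of the 100 highest-scoring proteins that are known to be essential. The following is a minimal sketch of that evaluation, not the authors' CoTB implementation. It assumes degree (number of interaction partners) as a stand-in score, whereas CoTB obtains its score from a random forest over several features. The function names, protein names and toy edge list are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Count distinct interaction partners per protein in an undirected PPI edge list."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {protein: len(partners) for protein, partners in neighbours.items()}


def top_k_precision(scores, essential, k):
    """Fraction of the k highest-scoring proteins that are in the known essential set."""
    # Sort by descending score; ties are broken by protein name so the ranking is deterministic.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(p in essential for p in top) / len(top)


# Toy data: hypothetical proteins, not taken from the YDIP, YMIPS, YMBD or YHQ datasets.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]
essential = {"P1", "P3"}

scores = degree_centrality(edges)
precision = top_k_precision(scores, essential, k=2)
```

With a real dataset, `scores` would be replaced by the per-protein probability produced by the trained model and `k` set to 100.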
format Online
Article
Text
id pubmed-5533339
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5533339 2017-08-07 A new computational strategy for identifying essential proteins based on network topological properties and biological information Qin, Chao Sun, Yongqi Dong, Yadong PLoS One Research Article
Public Library of Science 2017-07-28 /pmc/articles/PMC5533339/ /pubmed/28753682 http://dx.doi.org/10.1371/journal.pone.0182031 Text en © 2017 Qin et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A new computational strategy for identifying essential proteins based on network topological properties and biological information
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5533339/
https://www.ncbi.nlm.nih.gov/pubmed/28753682
http://dx.doi.org/10.1371/journal.pone.0182031