A novel essential protein identification method based on PPI networks and gene expression data
BACKGROUND: Some proposed methods for identifying essential proteins achieve better results by using biological information. Gene expression data is commonly used to identify essential proteins. However, gene expression data is prone to fluctuations, which may affect the accuracy of essential protein...
Main Authors: | Zhong, Jiancheng; Tang, Chao; Peng, Wei; Xie, Minzhu; Sun, Yusui; Tang, Qiang; Xiao, Qiu; Yang, Jiahong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8120700/ https://www.ncbi.nlm.nih.gov/pubmed/33985429 http://dx.doi.org/10.1186/s12859-021-04175-8 |
_version_ | 1783692149939765248 |
---|---|
author | Zhong, Jiancheng; Tang, Chao; Peng, Wei; Xie, Minzhu; Sun, Yusui; Tang, Qiang; Xiao, Qiu; Yang, Jiahong |
author_sort | Zhong, Jiancheng |
collection | PubMed |
description | BACKGROUND: Some proposed methods for identifying essential proteins achieve better results by using biological information. Gene expression data is commonly used to identify essential proteins. However, gene expression data is prone to fluctuations, which may affect the accuracy of essential protein identification. Therefore, we propose an essential protein identification method based on gene expression and PPI network data that calculates the similarity of the "active" and "inactive" states of gene expression within a cluster of the PPI network. Our experiments show that the method can improve the accuracy of essential protein prediction. RESULTS: In this paper, we propose a new measure named JDC, which is based on PPI network data and gene expression data. The JDC method uses a dynamic threshold to binarize gene expression data. It then combines degree centrality and the Jaccard similarity index to calculate a JDC score for each protein in the PPI network. We benchmark the JDC method on four organisms and evaluate it using ROC analysis, modular analysis, jackknife analysis, overlapping analysis, top analysis, and accuracy analysis. The results show that JDC performs better than DC, IC, EC, SC, BC, CC, NC, PeC, and WDC. We also compare JDC with the NF-PIN and TS-PIN methods, which predict essential proteins through active PPI networks constructed from dynamic gene expression. CONCLUSIONS: We demonstrate that the new centrality measure, JDC, is more efficient than state-of-the-art prediction methods with the same input. The main ideas behind JDC are as follows: (1) Essential proteins generally lie in densely connected clusters of the PPI network. (2) Binarizing gene expression data can screen out fluctuations in gene expression profiles. (3) The essentiality of a protein depends on the similarity of the "active" and "inactive" states of gene expression within a cluster of the PPI network. |
format | Online Article Text |
id | pubmed-8120700 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8120700 2021-05-17 A novel essential protein identification method based on PPI networks and gene expression data Zhong, Jiancheng; Tang, Chao; Peng, Wei; Xie, Minzhu; Sun, Yusui; Tang, Qiang; Xiao, Qiu; Yang, Jiahong BMC Bioinformatics Research Article BioMed Central 2021-05-13 /pmc/articles/PMC8120700/ /pubmed/33985429 http://dx.doi.org/10.1186/s12859-021-04175-8 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | A novel essential protein identification method based on PPI networks and gene expression data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8120700/ https://www.ncbi.nlm.nih.gov/pubmed/33985429 http://dx.doi.org/10.1186/s12859-021-04175-8 |
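
The description above names three ingredients: a threshold that binarizes each gene's expression profile, the Jaccard similarity index, and degree centrality. The record does not give the JDC formula, so the following is only a minimal sketch of how those ingredients could fit together, not the authors' method. The per-gene mean used as the threshold, the summation over neighbors, and the function names `binarize`, `jaccard`, and `jdc_like_score` are all assumptions made for illustration.

```python
from statistics import mean


def binarize(profile):
    """Mark each time point as active (1) or inactive (0).

    Assumption: the threshold is the mean of the gene's own profile,
    so it adapts per gene. The paper's dynamic threshold may differ.
    """
    threshold = mean(profile)
    return [1 if value > threshold else 0 for value in profile]


def jaccard(a, b):
    """Jaccard index of two binary profiles of equal length."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def jdc_like_score(edges, expression):
    """Sum the Jaccard similarity over every interaction of a protein.

    Summing over neighbors ties the score to degree: a protein with many
    neighbors that share its active time points scores high.
    This is a hypothetical combination, not the published JDC formula.
    """
    binary = {gene: binarize(profile) for gene, profile in expression.items()}
    scores = {gene: 0.0 for gene in binary}
    for u, v in edges:
        if u in binary and v in binary:
            similarity = jaccard(binary[u], binary[v])
            scores[u] += similarity
            scores[v] += similarity
    return scores


# Toy data, invented for illustration only.
expression = {
    "A": [1, 5, 1, 5],
    "B": [2, 6, 2, 6],
    "C": [5, 1, 5, 1],
}
edges = [("A", "B"), ("A", "C")]
scores = jdc_like_score(edges, expression)
```

In this toy network, A and B are active at the same time points while C is active at the opposite ones, so the A-B interaction contributes to the score and the A-C interaction does not. Ranking proteins by such a score and taking the top entries mirrors the "top analysis" mentioned in the description.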