A novel essential protein identification method based on PPI networks and gene expression data

BACKGROUND: Some proposed methods for identifying essential proteins have better results by using biological information. Gene expression data is generally used to identify essential proteins. However, gene expression data is prone to fluctuations, which may affect the accuracy of essential protein...

Bibliographic Details
Main authors: Zhong, Jiancheng; Tang, Chao; Peng, Wei; Xie, Minzhu; Sun, Yusui; Tang, Qiang; Xiao, Qiu; Yang, Jiahong
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8120700/
https://www.ncbi.nlm.nih.gov/pubmed/33985429
http://dx.doi.org/10.1186/s12859-021-04175-8
_version_ 1783692149939765248
author Zhong, Jiancheng
Tang, Chao
Peng, Wei
Xie, Minzhu
Sun, Yusui
Tang, Qiang
Xiao, Qiu
Yang, Jiahong
collection PubMed
description BACKGROUND: Some proposed methods for identifying essential proteins have better results by using biological information. Gene expression data is generally used to identify essential proteins. However, gene expression data is prone to fluctuations, which may affect the accuracy of essential protein identification. Therefore, we propose an essential protein identification method based on gene expression and the PPI network data to calculate the similarity of "active" and "inactive" state of gene expression in a cluster of the PPI network. Our experiments show that the method can improve the accuracy in predicting essential proteins. RESULTS: In this paper, we propose a new measure named JDC, which is based on the PPI network data and gene expression data. The JDC method offers a dynamic threshold method to binarize gene expression data. After that, it combines the degree centrality and Jaccard similarity index to calculate the JDC score for each protein in the PPI network. We benchmark the JDC method on four organisms respectively, and evaluate our method by using ROC analysis, modular analysis, jackknife analysis, overlapping analysis, top analysis, and accuracy analysis. The results show that the performance of JDC is better than DC, IC, EC, SC, BC, CC, NC, PeC, and WDC. We compare JDC with both NF-PIN and TS-PIN methods, which predict essential proteins through active PPI networks constructed from dynamic gene expression. CONCLUSIONS: We demonstrate that the new centrality measure, JDC, is more efficient than state-of-the-art prediction methods with same input. The main ideas behind JDC are as follows: (1) Essential proteins are generally densely connected clusters in the PPI network. (2) Binarizing gene expression data can screen out fluctuations in gene expression profiles. (3) The essentiality of the protein depends on the similarity of "active" and "inactive" state of gene expression in a cluster of the PPI network.
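The abstract names three ingredients: binarizing expression profiles with a threshold, a Jaccard similarity between the resulting "active"/"inactive" states, and degree centrality. The record does not give the published JDC formula or its dynamic threshold, so the following is only a minimal sketch under stated assumptions: the per-gene mean stands in for the threshold, and the score is the sum of Jaccard similarities over a protein's neighbors. The names `binarize`, `jaccard`, and `toy_score` are hypothetical and are not taken from the paper.

```python
from statistics import mean


def binarize(profile):
    """Map an expression time series to 0/1 ("inactive"/"active") states.

    Assumption: the cut-off is the profile's own mean, a stand-in for the
    paper's dynamic threshold, which this record does not specify.
    """
    cut = mean(profile)
    return [1 if value > cut else 0 for value in profile]


def jaccard(a, b):
    """Jaccard index of two equal-length binary vectors (0.0 if both are all zero)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def toy_score(edges, expression):
    """Illustrative score: sum of Jaccard similarities to each PPI neighbor.

    Not the published JDC formula. A protein with many neighbors that share
    its activity pattern scores high, so degree and similarity both count.
    """
    states = {protein: binarize(p) for protein, p in expression.items()}
    neighbors = {protein: set() for protein in expression}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    return {
        protein: sum(jaccard(states[protein], states[q]) for q in nbrs)
        for protein, nbrs in neighbors.items()
    }


# Made-up toy network and expression profiles, for illustration only.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
expression = {
    "A": [1, 5, 1, 5],
    "B": [2, 6, 2, 6],
    "C": [1, 5, 1, 5],
    "D": [5, 1, 5, 1],
}
scores = toy_score(edges, expression)
```

In this toy network, D is active exactly when its only neighbor C is inactive, so it contributes no similarity, while A, B, and C share one activity pattern inside a triangle.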
format Online
Article
Text
id pubmed-8120700
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8120700 2021-05-17 A novel essential protein identification method based on PPI networks and gene expression data Zhong, Jiancheng Tang, Chao Peng, Wei Xie, Minzhu Sun, Yusui Tang, Qiang Xiao, Qiu Yang, Jiahong BMC Bioinformatics Research Article BioMed Central 2021-05-13 /pmc/articles/PMC8120700/ /pubmed/33985429 http://dx.doi.org/10.1186/s12859-021-04175-8 Text en © The Author(s) 2021 Open Access: this article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/). The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title A novel essential protein identification method based on PPI networks and gene expression data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8120700/
https://www.ncbi.nlm.nih.gov/pubmed/33985429
http://dx.doi.org/10.1186/s12859-021-04175-8