A seed expansion-based method to identify essential proteins by integrating protein–protein interaction sub-networks and multiple biological characteristics

Bibliographic Details
Main Authors: Zhao, He; Liu, Guixia; Cao, Xintian
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10688502/
https://www.ncbi.nlm.nih.gov/pubmed/38036960
http://dx.doi.org/10.1186/s12859-023-05583-8
Description
Summary:
BACKGROUND: The identification of essential proteins is of great significance in biology and pathology. However, protein–protein interaction (PPI) data obtained through high-throughput technology include a high number of false positives. To overcome this limitation, numerous computational algorithms based on biological characteristics and topological features have been proposed to identify essential proteins.

RESULTS: In this paper, we propose a novel method named SESN for identifying essential proteins. It is a seed expansion method based on PPI sub-networks and multiple biological characteristics. First, SESN uses gene expression data to construct PPI sub-networks. Second, seed expansion is performed simultaneously in each sub-network, and the expansion process is based on the topological features of predicted essential proteins. Third, an error correction mechanism draws on multiple biological characteristics and the entire PPI network. Finally, SESN analyzes the impact of each biological characteristic, including protein complexes, gene expression data, GO annotations, and subcellular localization, and adopts the biological data with the best experimental results. The output of SESN is a set of predicted essential proteins.

CONCLUSIONS: The analysis of each component of SESN indicates the effectiveness of all components. We conduct comparison experiments using three datasets from two species, and the experimental results demonstrate that SESN achieves superior performance compared to other methods.
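The first two steps in the summary (building PPI sub-networks from gene expression data, then expanding seeds inside each sub-network) can be illustrated with a small Python sketch. This is not the SESN implementation: the summary does not give the construction rule, the expansion score, or the way results are combined. The sketch assumes one sub-network per expression time point, keeps an edge when both proteins reach a hypothetical `threshold`, and expands greedily by the number of neighbours already in the predicted set. The names `build_subnetworks`, `expand_seeds`, `threshold`, and `steps` are illustrative only, and the toy data is made up.

```python
from collections import defaultdict


def build_subnetworks(edges, expression, threshold):
    """Return one adjacency dict per time point (assumed construction rule).

    An edge (u, v) is kept at time point t only if both proteins have an
    expression value >= threshold at t. This rule is an assumption.
    """
    n_points = len(next(iter(expression.values())))
    subnets = []
    for t in range(n_points):
        adj = defaultdict(set)
        for u, v in edges:
            if expression[u][t] >= threshold and expression[v][t] >= threshold:
                adj[u].add(v)
                adj[v].add(u)
        subnets.append(dict(adj))
    return subnets


def expand_seeds(adj, seeds, steps):
    """Greedy seed expansion inside one sub-network (assumed scoring rule).

    At each step, add the candidate with the most neighbours already in the
    predicted set; ties are broken alphabetically for determinism.
    """
    predicted = {s for s in seeds if s in adj}
    for _ in range(steps):
        scores = {}
        for p in predicted:
            for q in adj[p]:
                if q not in predicted:
                    scores[q] = scores.get(q, 0) + 1
        if not scores:
            break  # nothing left to expand into
        best = max(sorted(scores), key=scores.get)
        predicted.add(best)
    return predicted


# Toy data, invented for illustration only.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
expression = {
    "A": [1.0, 0.1],
    "B": [1.0, 1.0],
    "C": [1.0, 1.0],
    "D": [0.2, 1.0],
    "E": [0.2, 1.0],
}

subnets = build_subnetworks(edges, expression, threshold=0.5)
# Expansion runs independently in every sub-network.
results = [expand_seeds(adj, seeds={"C"}, steps=2) for adj in subnets]
```

In this sketch each sub-network produces its own predicted set. How SESN merges the per-network results and applies its error correction step is not described in the summary, so that part is left out.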