
Benchmark Evaluation of Protein–Protein Interaction Prediction Algorithms


Bibliographic Details
Main authors:	Dunham, Brandan; Ganapathiraju, Madhavi K.
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8746451/ https://www.ncbi.nlm.nih.gov/pubmed/35011283 http://dx.doi.org/10.3390/molecules27010041

author	Dunham, Brandan; Ganapathiraju, Madhavi K.
collection	PubMed
description	Protein–protein interactions (PPIs) perform various functions and regulate processes throughout cells. Knowledge of the full network of PPIs is vital to biomedical research, but most PPIs are still unknown. As it is infeasible to discover all of them experimentally due to technical and resource limitations, computational prediction of PPIs is essential, and accurately assessing the performance of algorithms is required before further application or translation. However, many published methods compose their evaluation datasets incorrectly, using a higher proportion of positive class data than occurs naturally, leading to exaggerated performance. We re-implemented various published algorithms and evaluated them on datasets with realistic data compositions and found that their performance is overstated in the original publications, with several methods outperformed by our control models built on ‘illogical’ and random number features. We conclude that these methods are influenced by an over-characterization of some proteins in the literature and by the scale-free nature of the PPI network, and that they fail when tested on all possible protein pairs. Additionally, we found that sequence-only-based algorithms performed worse than those that employ functional and expression features. We present a benchmark evaluation of many published algorithms for PPI prediction. The source code of our implementations and the benchmark datasets created here are made available as open source.
format	Online Article Text
id	pubmed-8746451
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-8746451 2022-01-11 Molecules Article MDPI 2021-12-22 /pmc/articles/PMC8746451/ /pubmed/35011283 http://dx.doi.org/10.3390/molecules27010041 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	Benchmark Evaluation of Protein–Protein Interaction Prediction Algorithms
topic	Article
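
The description's central point, that evaluating on a dataset with more positives than occur naturally inflates reported performance, can be illustrated with a small calculation. The sketch below is not taken from the article: the true-positive rate, false-positive rate, and class ratios are hypothetical values chosen only to show how precision falls as the share of interacting pairs drops, while the classifier itself stays unchanged.

```python
def precision_at_prevalence(tpr: float, fpr: float, prevalence: float) -> float:
    """Precision implied by fixed true/false positive rates at a given
    fraction of positive (interacting) pairs in the evaluation set."""
    true_pos = tpr * prevalence            # expected fraction of all pairs that are true positives
    false_pos = fpr * (1.0 - prevalence)   # expected fraction of all pairs that are false positives
    total = true_pos + false_pos
    return true_pos / total if total > 0 else 0.0


# Hypothetical classifier (assumed values, not results from the article).
TPR, FPR = 0.90, 0.10

# Hypothetical positive:negative ratios for the evaluation set.
scenarios = {
    "balanced 1:1": 1 / 2,
    "skewed 1:100": 1 / 101,
    "skewed 1:1000": 1 / 1001,
}

for name, prevalence in scenarios.items():
    p = precision_at_prevalence(TPR, FPR, prevalence)
    print(f"{name:>14}  precision = {p:.4f}")
```

Under these assumed rates, the same classifier that looks 90% precise on a balanced set would be correct for only a small fraction of its positive predictions once negatives heavily outnumber positives, which is the kind of overstatement the description attributes to unrealistic dataset composition.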
