
Which clustering algorithm is better for predicting protein complexes?

Bibliographic Details
Main authors: Moschopoulos, Charalampos N, Pavlopoulos, Georgios A, Iacucci, Ernesto, Aerts, Jan, Likothanassis, Spiridon, Schneider, Reinhard, Kossida, Sophia
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3267700/
https://www.ncbi.nlm.nih.gov/pubmed/22185599
http://dx.doi.org/10.1186/1756-0500-4-549
author Moschopoulos, Charalampos N
Pavlopoulos, Georgios A
Iacucci, Ernesto
Aerts, Jan
Likothanassis, Spiridon
Schneider, Reinhard
Kossida, Sophia
author_sort Moschopoulos, Charalampos N
collection PubMed
description BACKGROUND: Protein-protein interactions (PPI) play a key role in determining the outcome of most cellular processes. Correctly identifying and characterizing protein interactions, and the networks they comprise, is critical for understanding the molecular mechanisms within the cell. Large-scale techniques such as pull-down assays and tandem affinity purification are used to detect protein interactions in an organism. Today, relatively new high-throughput methods such as yeast two-hybrid, mass spectrometry, microarrays, and phage display are also used to reveal protein interaction networks. RESULTS: In this paper we evaluated four clustering algorithms on six interaction datasets. We parameterized the MCL, Spectral, RNSC and Affinity Propagation algorithms and applied them to six PPI datasets produced experimentally by the Yeast 2 Hybrid (Y2H) and Tandem Affinity Purification (TAP) methods. The predicted clusters, the so-called protein complexes, were then compared and benchmarked against known complexes stored in published databases. CONCLUSIONS: While results may differ with parameterization, the MCL and RNSC algorithms seem to be more promising and more accurate at predicting PPI complexes. Moreover, they predict more complexes than the other reviewed algorithms in absolute numbers. On the other hand, the spectral clustering algorithm achieves the highest valid prediction rate in our experiments. However, it is nearly always outperformed by both RNSC and MCL in terms of geometrical accuracy, and it generates fewer valid clusters than any other reviewed algorithm. The article presents several metrics for evaluating the accuracy of such predictions. Supplementary material can be found at: http://www.bioacademy.gr/bioinformatics/projects/ppireview.htm
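The description compares algorithms by geometrical accuracy but does not define the metric. As a rough illustration only, the sketch below assumes one commonly used definition: the geometric mean of sensitivity (Sn) and positive predictive value (PPV), both derived from a contingency table of known complexes against predicted clusters. The function name `geometric_accuracy` and the toy protein sets are hypothetical and are not taken from the article, whose exact formulation may differ.

```python
from math import sqrt


def geometric_accuracy(complexes, clusters):
    """Return (Sn, PPV, Acc) for known complexes vs. predicted clusters.

    Assumed definition (not confirmed by the record above):
      Sn  = sum_i max_j T[i][j] / sum_i |complex_i|
      PPV = sum_j max_i T[i][j] / sum_j sum_i T[i][j]
      Acc = sqrt(Sn * PPV)
    where T[i][j] counts proteins shared by complex i and cluster j.
    """
    # Contingency table: rows are known complexes, columns are predicted clusters.
    table = [[len(cx & cl) for cl in clusters] for cx in complexes]

    # Sensitivity: how well each known complex is covered by its best cluster.
    total_complex_size = sum(len(cx) for cx in complexes)
    sn = sum(max(row) for row in table) / total_complex_size if total_complex_size else 0.0

    # PPV: how much of each cluster's annotated content falls in its best complex.
    n_clusters = len(clusters)
    col_sums = [sum(row[j] for row in table) for j in range(n_clusters)]
    col_max = [max(row[j] for row in table) for j in range(n_clusters)]
    total_overlap = sum(col_sums)
    ppv = sum(col_max) / total_overlap if total_overlap else 0.0

    return sn, ppv, sqrt(sn * ppv)


# Toy example with hypothetical protein identifiers.
known = [{"A", "B", "C"}, {"D", "E"}]
predicted = [{"A", "B"}, {"C", "D", "E"}]
sn, ppv, acc = geometric_accuracy(known, predicted)
print(round(sn, 3), round(ppv, 3), round(acc, 3))
```

Taking the geometric mean balances the two components: a single cluster containing every protein would maximize Sn, and many singleton clusters would maximize PPV, so neither degenerate clustering scores well on the combined value.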
format Online
Article
Text
id pubmed-3267700
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3267700 2012-01-30 Which clustering algorithm is better for predicting protein complexes? Moschopoulos, Charalampos N; Pavlopoulos, Georgios A; Iacucci, Ernesto; Aerts, Jan; Likothanassis, Spiridon; Schneider, Reinhard; Kossida, Sophia BMC Res Notes Research Article BioMed Central 2011-12-20 /pmc/articles/PMC3267700/ /pubmed/22185599 http://dx.doi.org/10.1186/1756-0500-4-549 Text en Copyright ©2011 Moschopoulos et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Which clustering algorithm is better for predicting protein complexes?
topic Research Article