Which clustering algorithm is better for predicting protein complexes?
BACKGROUND: Protein-Protein interactions (PPI) play a key role in determining the outcome of most cellular processes. The correct identification and characterization of protein interactions and the networks, which they comprise, is critical for understanding the molecular mechanisms within the cell....
Main Authors: | Moschopoulos, Charalampos N; Pavlopoulos, Georgios A; Iacucci, Ernesto; Aerts, Jan; Likothanassis, Spiridon; Schneider, Reinhard; Kossida, Sophia |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3267700/ https://www.ncbi.nlm.nih.gov/pubmed/22185599 http://dx.doi.org/10.1186/1756-0500-4-549 |
_version_ | 1782222308757209088 |
---|---|
author | Moschopoulos, Charalampos N Pavlopoulos, Georgios A Iacucci, Ernesto Aerts, Jan Likothanassis, Spiridon Schneider, Reinhard Kossida, Sophia |
author_facet | Moschopoulos, Charalampos N Pavlopoulos, Georgios A Iacucci, Ernesto Aerts, Jan Likothanassis, Spiridon Schneider, Reinhard Kossida, Sophia |
author_sort | Moschopoulos, Charalampos N |
collection | PubMed |
description | BACKGROUND: Protein-protein interactions (PPI) play a key role in determining the outcome of most cellular processes. The correct identification and characterization of protein interactions and the networks they comprise is critical for understanding the molecular mechanisms within the cell. Large-scale techniques such as pull-down assays and tandem affinity purification are used to detect protein interactions in an organism. Today, relatively new high-throughput methods such as yeast two-hybrid, mass spectrometry, microarrays, and phage display are also used to reveal protein interaction networks. RESULTS: In this paper we evaluated four clustering algorithms using six interaction datasets. We parameterized the MCL, Spectral, RNSC and Affinity Propagation algorithms and applied them to six PPI datasets produced experimentally by Yeast 2 Hybrid (Y2H) and Tandem Affinity Purification (TAP) methods. The predicted clusters, so-called protein complexes, were then compared and benchmarked against known complexes stored in published databases. CONCLUSIONS: While results may differ with parameterization, the MCL and RNSC algorithms seem to be more promising and more accurate at predicting PPI complexes. Moreover, they predict more complexes than the other reviewed algorithms in absolute numbers. On the other hand, the spectral clustering algorithm achieves the highest valid prediction rate in our experiments. However, it is nearly always outperformed by both RNSC and MCL in terms of geometrical accuracy, and it generates fewer valid clusters than any other reviewed algorithm. This article demonstrates various metrics to evaluate the accuracy of such predictions as they are presented in the text below. Supplementary material can be found at: http://www.bioacademy.gr/bioinformatics/projects/ppireview.htm |
format | Online Article Text |
id | pubmed-3267700 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3267700 2012-01-30 Which clustering algorithm is better for predicting protein complexes? Moschopoulos, Charalampos N Pavlopoulos, Georgios A Iacucci, Ernesto Aerts, Jan Likothanassis, Spiridon Schneider, Reinhard Kossida, Sophia BMC Res Notes Research Article BioMed Central 2011-12-20 /pmc/articles/PMC3267700/ /pubmed/22185599 http://dx.doi.org/10.1186/1756-0500-4-549 Text en Copyright ©2011 Moschopoulos et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Moschopoulos, Charalampos N Pavlopoulos, Georgios A Iacucci, Ernesto Aerts, Jan Likothanassis, Spiridon Schneider, Reinhard Kossida, Sophia Which clustering algorithm is better for predicting protein complexes? |
title | Which clustering algorithm is better for predicting protein complexes? |
title_full | Which clustering algorithm is better for predicting protein complexes? |
title_fullStr | Which clustering algorithm is better for predicting protein complexes? |
title_full_unstemmed | Which clustering algorithm is better for predicting protein complexes? |
title_short | Which clustering algorithm is better for predicting protein complexes? |
title_sort | which clustering algorithm is better for predicting protein complexes? |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3267700/ https://www.ncbi.nlm.nih.gov/pubmed/22185599 http://dx.doi.org/10.1186/1756-0500-4-549 |
work_keys_str_mv | AT moschopouloscharalamposn whichclusteringalgorithmisbetterforpredictingproteincomplexes AT pavlopoulosgeorgiosa whichclusteringalgorithmisbetterforpredictingproteincomplexes AT iacucciernesto whichclusteringalgorithmisbetterforpredictingproteincomplexes AT aertsjan whichclusteringalgorithmisbetterforpredictingproteincomplexes AT likothanassisspiridon whichclusteringalgorithmisbetterforpredictingproteincomplexes AT schneiderreinhard whichclusteringalgorithmisbetterforpredictingproteincomplexes AT kossidasophia whichclusteringalgorithmisbetterforpredictingproteincomplexes |
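
The abstract compares the algorithms by "geometrical accuracy" without defining it in this record. The sketch below assumes the definition commonly used in complex-prediction benchmarks: the geometric mean of a complex-wise sensitivity and a cluster-wise positive predictive value (PPV), both computed from a table of proteins shared between each known complex and each predicted cluster. The function name, the set-based input format, and the toy protein labels are illustrative assumptions, not taken from the article.

```python
from math import sqrt


def geometric_accuracy(complexes, clusters):
    """Return (sensitivity, ppv, accuracy) for predicted clusters.

    complexes, clusters: lists of sets of protein identifiers.
    Assumed definition: accuracy = sqrt(sensitivity * ppv).
    """
    if not complexes or not clusters:
        return 0.0, 0.0, 0.0

    # Contingency table: shared[i][j] = proteins common to complex i and cluster j.
    shared = [[len(cx & cl) for cl in clusters] for cx in complexes]

    # Sensitivity: best-matching cluster coverage, summed over complexes
    # and normalised by the total size of all known complexes.
    total_complex_size = sum(len(cx) for cx in complexes)
    best_per_complex = sum(max(row) for row in shared)
    sn = best_per_complex / total_complex_size if total_complex_size else 0.0

    # PPV: best-matching complex overlap, summed over clusters and
    # normalised by the total overlap of each cluster with all complexes.
    columns = list(zip(*shared))
    total_overlap = sum(sum(col) for col in columns)
    best_per_cluster = sum(max(col) for col in columns)
    ppv = best_per_cluster / total_overlap if total_overlap else 0.0

    return sn, ppv, sqrt(sn * ppv)


# Toy example with hypothetical protein labels.
known = [{"A", "B", "C"}, {"D", "E"}]
predicted = [{"A", "B"}, {"C", "D", "E"}]
sn, ppv, acc = geometric_accuracy(known, predicted)
```

In this toy case each known complex has two of its proteins in its best-matching cluster, so sensitivity, PPV and accuracy all come out at approximately 0.8. A cluster that overlaps no known complex contributes nothing to the PPV under this definition, which is one reason such benchmarks also report the number of valid clusters separately.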