Identifying protein complexes directly from high-throughput TAP data with Markov random fields


Bibliographic Details
Main authors:	Rungsarityotin, Wasinee, Krause, Roland, Schödl, Arno, Schliep, Alexander
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2222659/ https://www.ncbi.nlm.nih.gov/pubmed/18093306 http://dx.doi.org/10.1186/1471-2105-8-482

collection	PubMed
description	BACKGROUND: Predicting protein complexes from experimental data remains a challenge due to the limited resolution and stochastic errors of high-throughput methods. Current algorithms to reconstruct the complexes typically rely on a two-step process. First, they construct an interaction graph from the data, predominantly using heuristics, and subsequently cluster its vertices to identify protein complexes. RESULTS: We propose a model-based identification of protein complexes directly from the experimental observations. Our model of protein complexes, based on Markov random fields, explicitly incorporates false negative and false positive errors and exhibits high robustness to noise. A model-based quality score for the resulting clusters allows us to identify reliable predictions in the complete data set. Comparisons with prior work on reference data sets show favorable results, particularly for larger unfiltered data sets. Additional information on predictions, including the source code under the GNU Public License, can be found at http://algorithmics.molgen.mpg.de/Static/Supplements/ProteinComplexes. CONCLUSION: We can identify complexes in the data obtained from high-throughput experiments without prior elimination of proteins or weak interactions. The few parameters of our model, which does not rely on heuristics, can be estimated using maximum likelihood without a reference data set. This is particularly important for protein complex studies in organisms that do not have an established reference frame of known protein complexes.
id	pubmed-2222659
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-2222659 2008-02-02 BMC Bioinformatics Research Article BioMed Central 2007-12-19 Text en Copyright © 2007 Rungsarityotin et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
