Multi-level learning: improving the prediction of protein, domain and residue interactions by allowing information flow between levels

BACKGROUND: Proteins interact through specific binding interfaces that contain many residues in domains. Protein interactions thus occur on three different levels of a concept hierarchy: whole-proteins, domains, and residues. Each level offers a distinct and complementary set of features for computa...

Bibliographic Details
Main authors:	Yip, Kevin Y; Kim, Philip M; McDermott, Drew; Gerstein, Mark
Format:	Text
Language:	English
Published:	BioMed Central 2009
Subjects:	Methodology Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2734556/ https://www.ncbi.nlm.nih.gov/pubmed/19656385 http://dx.doi.org/10.1186/1471-2105-10-241

author	Yip, Kevin Y; Kim, Philip M; McDermott, Drew; Gerstein, Mark
author_sort	Yip, Kevin Y
collection	PubMed
description	BACKGROUND: Proteins interact through specific binding interfaces that contain many residues in domains. Protein interactions thus occur on three different levels of a concept hierarchy: whole-proteins, domains, and residues. Each level offers a distinct and complementary set of features for computationally predicting interactions, including functional genomic features of whole proteins, evolutionary features of domain families and physical-chemical features of individual residues. The predictions at each level could benefit from using the features at all three levels. However, it is not trivial as the features are provided at different granularity. RESULTS: To link up the predictions at the three levels, we propose a multi-level machine-learning framework that allows for explicit information flow between the levels. We demonstrate, using representative yeast interaction networks, that our algorithm is able to utilize complementary feature sets to make more accurate predictions at the three levels than when the three problems are approached independently. To facilitate application of our multi-level learning framework, we discuss three key aspects of multi-level learning and the corresponding design choices that we have made in the implementation of a concrete learning algorithm. 1) Architecture of information flow: we show the greater flexibility of bidirectional flow over independent levels and unidirectional flow; 2) Coupling mechanism of the different levels: We show how this can be accomplished via augmenting the training sets at each level, and discuss the prevention of error propagation between different levels by means of soft coupling; 3) Sparseness of data: We show that the multi-level framework compounds data sparsity issues, and discuss how this can be dealt with by building local models in information-rich parts of the data. Our proof-of-concept learning algorithm demonstrates the advantage of combining levels, and opens up opportunities for further research. AVAILABILITY: The software and a readme file can be downloaded at . The programs are written in Java, and can be run on any platform with Java 1.4 or higher and Apache Ant 1.7.0 or higher installed. The software can be used without a license.
format	Text
id	pubmed-2734556
institution	National Center for Biotechnology Information
language	English
publishDate	2009
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2734556 2009-08-29 BMC Bioinformatics Methodology Article BioMed Central 2009-08-05 /pmc/articles/PMC2734556/ /pubmed/19656385 http://dx.doi.org/10.1186/1471-2105-10-241 Text en Copyright © 2009 Yip et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Multi-level learning: improving the prediction of protein, domain and residue interactions by allowing information flow between levels
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2734556/ https://www.ncbi.nlm.nih.gov/pubmed/19656385 http://dx.doi.org/10.1186/1471-2105-10-241
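
The abstract describes coupling the levels by augmenting the training set at each level, with soft coupling to limit error propagation. The following is a minimal Python sketch of that idea, not the authors' Java implementation. It assumes two adjacent levels (for example, domains as children of proteins), that a confidently predicted child-level interaction implies an interaction between the parents, and that a confidently predicted parent-level non-interaction implies no interaction between any of the children. The names `Example`, `propagate_up`, `propagate_down`, `augment`, and the `threshold` parameter are all hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Example:
    pair: tuple    # canonical (sorted) pair of object identifiers
    label: int     # 1 = interacting, 0 = non-interacting
    weight: float  # 1.0 for known examples, below 1.0 for propagated ones


def canonical(a, b):
    """Order a pair so that (a, b) and (b, a) map to the same key."""
    return tuple(sorted((a, b)))


def propagate_up(child_scores, parent_of, threshold=0.8):
    """Turn confident child-level positives into weighted parent-level positives."""
    out = []
    for (a, b), score in child_scores.items():
        if score >= threshold:
            pair = canonical(parent_of[a], parent_of[b])
            out.append(Example(pair, 1, score))
    return out


def propagate_down(parent_scores, children_of, threshold=0.8):
    """Turn confident parent-level negatives into weighted child-level negatives."""
    out = []
    for (a, b), score in parent_scores.items():
        if score <= 1.0 - threshold:
            for ca, cb in product(children_of[a], children_of[b]):
                out.append(Example(canonical(ca, cb), 0, 1.0 - score))
    return out


def augment(known, propagated):
    """Merge propagated examples into a training set (soft coupling).

    Propagated examples keep their sub-unit weight, and a known example
    always takes precedence over a propagated one for the same pair.
    """
    merged = {}
    for ex in propagated:
        if ex.pair not in merged or ex.weight > merged[ex.pair].weight:
            merged[ex.pair] = ex
    for ex in known:
        merged[ex.pair] = ex
    return sorted(merged.values(), key=lambda ex: ex.pair)


# Toy data (invented for illustration only).
parent_of = {"d1": "P1", "d2": "P2", "d3": "P3"}
children_of = {"P1": ["d1"], "P2": ["d2"], "P3": ["d3"]}
domain_scores = {("d1", "d2"): 0.9, ("d1", "d3"): 0.5}
protein_scores = {("P1", "P3"): 0.1}

known_protein = [Example(("P2", "P3"), 1, 1.0)]
protein_train = augment(known_protein, propagate_up(domain_scores, parent_of))
domain_train = augment([], propagate_down(protein_scores, children_of))

for ex in protein_train + domain_train:
    print(ex)
```

In this sketch the coupling is "soft" only in the sense that propagated examples carry a confidence weight instead of being treated as ground truth, so a learner that supports per-example weights can discount them. Alternating `propagate_up` and `propagate_down` between retraining rounds would give a bidirectional flow.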

Multi-level learning: improving the prediction of protein, domain and residue interactions by allowing information flow between levels

Similar items