Active learning for human protein-protein interaction prediction

BACKGROUND: Biological processes in cells are carried out by means of protein-protein interactions. Determining whether a pair of proteins interacts by wet-lab experiments is resource-intensive; only about 38,000 interactions, out of a few hundred thousand expected interactions, are known today. Act...

Bibliographic Details
Main authors:	Mohamed, Thahir P, Carbonell, Jaime G, Ganapathiraju, Madhavi K
Format:	Text
Language:	English
Published:	BioMed Central 2010
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3009530/ https://www.ncbi.nlm.nih.gov/pubmed/20122232 http://dx.doi.org/10.1186/1471-2105-11-S1-S57

author	Mohamed, Thahir P Carbonell, Jaime G Ganapathiraju, Madhavi K
collection	PubMed
description	BACKGROUND: Biological processes in cells are carried out by means of protein-protein interactions. Determining whether a pair of proteins interacts by wet-lab experiments is resource-intensive; only about 38,000 interactions, out of a few hundred thousand expected interactions, are known today. Active machine learning can guide the selection of pairs of proteins for future experimental characterization in order to accelerate accurate prediction of the human protein interactome. RESULTS: Random forest (RF) has previously been shown to be effective for predicting protein-protein interactions. Here, four different active learning algorithms have been devised for selection of protein pairs to be used to train the RF. With labels for as few as 500 protein pairs selected using any of the four active learning methods described here, the classifier achieved a higher F-score (harmonic mean of Precision and Recall) than with 3000 randomly chosen protein pairs. The F-score of predicted interactions is shown to increase by about 15% with active learning compared with random selection of data. CONCLUSION: Active learning algorithms enable learning more accurate classifiers with far less labelled data and prove useful in applications where manual annotation of data is a formidable task. Active learning techniques demonstrated here can also be applied to other proteomics applications such as protein structure prediction and classification.
format	Text
id	pubmed-3009530
institution	National Center for Biotechnology Information
language	English
publishDate	2010
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3009530 2010-12-23 Active learning for human protein-protein interaction prediction Mohamed, Thahir P Carbonell, Jaime G Ganapathiraju, Madhavi K BMC Bioinformatics Research BioMed Central 2010-01-18 /pmc/articles/PMC3009530/ /pubmed/20122232 http://dx.doi.org/10.1186/1471-2105-11-S1-S57 Text en Copyright ©2010 Mohamed et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Active learning for human protein-protein interaction prediction
topic	Research
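
The abstract reports results as an F-score, which it defines as the harmonic mean of Precision and Recall. As a rough illustration of that definition only, the sketch below computes it from counts of true positives, false positives and false negatives. The function name and the example counts are assumptions made for illustration; they are not taken from the article and do not reproduce its evaluation.

```python
def f_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the harmonic mean of precision and recall.

    Returns 0.0 when precision or recall is undefined or when both are zero.
    """
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    if predicted_positive == 0 or actual_positive == 0:
        return 0.0

    # Precision: fraction of predicted interactions that are real.
    precision = true_positives / predicted_positive
    # Recall: fraction of real interactions that were predicted.
    recall = true_positives / actual_positive

    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to show the arithmetic.
example = f_score(true_positives=30, false_positives=10, false_negatives=30)
```

With these hypothetical counts, precision is 30/40 and recall is 30/60, so a classifier that is fairly precise but misses half of the true interactions is pulled toward the lower of the two values by the harmonic mean.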
