
False positive reduction in protein-protein interaction predictions using gene ontology annotations

BACKGROUND: Many crucial cellular operations such as metabolism, signalling, and regulations are based on protein-protein interactions. However, the lack of robust protein-protein interaction information is a challenge. One reason for the lack of solid protein-protein interaction information is poor...

Bibliographic Details
Main Authors:	Mahdavi, Mahmoud A; Lin, Yen-Han
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1941744/ https://www.ncbi.nlm.nih.gov/pubmed/17645798 http://dx.doi.org/10.1186/1471-2105-8-262

_version_	1782134468371283968
author	Mahdavi, Mahmoud A Lin, Yen-Han
author_facet	Mahdavi, Mahmoud A Lin, Yen-Han
author_sort	Mahdavi, Mahmoud A
collection	PubMed
description	BACKGROUND: Many crucial cellular operations such as metabolism, signalling, and regulation are based on protein-protein interactions. However, the lack of robust protein-protein interaction information is a challenge. One reason for the lack of solid protein-protein interaction information is poor agreement between experimental findings and computational sets, which in turn stems from the large number of false positive predictions in computational approaches. Reducing false positive predictions and enhancing the true positive fraction of computationally predicted protein-protein interaction datasets based on highly confident experimental results have not been adequately investigated. RESULTS: Gene Ontology (GO) annotations were used to reduce false positive protein-protein interaction (PPI) pairs resulting from computational predictions. Using experimentally obtained PPI pairs as a training dataset, eight top-ranking keywords were extracted from GO molecular function annotations. The sensitivity of these keywords is 64.21% in the yeast experimental dataset and 80.83% in the worm experimental dataset. The specificities, a measure of recovery power, of these keywords applied to four predicted PPI datasets for each studied organism are 48.32% and 46.49% (averaged over four datasets) in yeast and worm, respectively. Based on the eight top-ranking keywords and co-localization of interacting proteins, a set of two knowledge rules was deduced and applied to remove false positive protein pairs. The 'strength', a measure of improvement provided by the rules, was defined based on the signal-to-noise ratio and implemented to measure the applicability of the knowledge rules to the predicted PPI datasets. Depending on the PPI-predicting method employed, the strength is between two and ten times that of randomly removing protein pairs from the datasets. CONCLUSION: Gene Ontology annotations along with the deduced knowledge rules could be implemented to partially remove falsely predicted PPI pairs. Removal of false positives from predicted datasets increases the true positive fractions of the datasets and improves the robustness of predicted pairs as compared to random protein pairing, and eventually results in better overlap with experimental results.
format	Text
id	pubmed-1941744
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1941744 2007-08-09 False positive reduction in protein-protein interaction predictions using gene ontology annotations Mahdavi, Mahmoud A Lin, Yen-Han BMC Bioinformatics Research Article BACKGROUND: Many crucial cellular operations such as metabolism, signalling, and regulation are based on protein-protein interactions. However, the lack of robust protein-protein interaction information is a challenge. One reason for the lack of solid protein-protein interaction information is poor agreement between experimental findings and computational sets, which in turn stems from the large number of false positive predictions in computational approaches. Reducing false positive predictions and enhancing the true positive fraction of computationally predicted protein-protein interaction datasets based on highly confident experimental results have not been adequately investigated. RESULTS: Gene Ontology (GO) annotations were used to reduce false positive protein-protein interaction (PPI) pairs resulting from computational predictions. Using experimentally obtained PPI pairs as a training dataset, eight top-ranking keywords were extracted from GO molecular function annotations. The sensitivity of these keywords is 64.21% in the yeast experimental dataset and 80.83% in the worm experimental dataset. The specificities, a measure of recovery power, of these keywords applied to four predicted PPI datasets for each studied organism are 48.32% and 46.49% (averaged over four datasets) in yeast and worm, respectively. Based on the eight top-ranking keywords and co-localization of interacting proteins, a set of two knowledge rules was deduced and applied to remove false positive protein pairs. The 'strength', a measure of improvement provided by the rules, was defined based on the signal-to-noise ratio and implemented to measure the applicability of the knowledge rules to the predicted PPI datasets. Depending on the PPI-predicting method employed, the strength is between two and ten times that of randomly removing protein pairs from the datasets. CONCLUSION: Gene Ontology annotations along with the deduced knowledge rules could be implemented to partially remove falsely predicted PPI pairs. Removal of false positives from predicted datasets increases the true positive fractions of the datasets and improves the robustness of predicted pairs as compared to random protein pairing, and eventually results in better overlap with experimental results. BioMed Central 2007-07-23 /pmc/articles/PMC1941744/ /pubmed/17645798 http://dx.doi.org/10.1186/1471-2105-8-262 Text en Copyright © 2007 Mahdavi and Lin; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Mahdavi, Mahmoud A Lin, Yen-Han False positive reduction in protein-protein interaction predictions using gene ontology annotations
title	False positive reduction in protein-protein interaction predictions using gene ontology annotations
title_full	False positive reduction in protein-protein interaction predictions using gene ontology annotations
title_fullStr	False positive reduction in protein-protein interaction predictions using gene ontology annotations
title_full_unstemmed	False positive reduction in protein-protein interaction predictions using gene ontology annotations
title_short	False positive reduction in protein-protein interaction predictions using gene ontology annotations
title_sort	false positive reduction in protein-protein interaction predictions using gene ontology annotations
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1941744/ https://www.ncbi.nlm.nih.gov/pubmed/17645798 http://dx.doi.org/10.1186/1471-2105-8-262
work_keys_str_mv	AT mahdavimahmouda falsepositivereductioninproteinproteininteractionpredictionsusinggeneontologyannotations AT linyenhan falsepositivereductioninproteinproteininteractionpredictionsusinggeneontologyannotations
