False positive reduction in protein-protein interaction predictions using gene ontology annotations

BACKGROUND: Many crucial cellular operations such as metabolism, signalling, and regulations are based on protein-protein interactions. However, the lack of robust protein-protein interaction information is a challenge. One reason for the lack of solid protein-protein interaction information is poor...

Bibliographic Details
Main Authors: Mahdavi, Mahmoud A; Lin, Yen-Han
Format: Text
Language: English
Published: BioMed Central 2007
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1941744/
https://www.ncbi.nlm.nih.gov/pubmed/17645798
http://dx.doi.org/10.1186/1471-2105-8-262
author Mahdavi, Mahmoud A
Lin, Yen-Han
collection PubMed
description BACKGROUND: Many crucial cellular operations such as metabolism, signalling, and regulation are based on protein-protein interactions. However, the lack of robust protein-protein interaction information is a challenge. One reason for this lack is poor agreement between experimental findings and computational sets, which in turn comes from the large number of false positive predictions in computational approaches. Reducing false positive predictions and enhancing the true positive fraction of computationally predicted protein-protein interaction datasets based on highly confident experimental results have not been adequately investigated. RESULTS: Gene Ontology (GO) annotations were used to reduce false positive protein-protein interaction (PPI) pairs resulting from computational predictions. Using experimentally obtained PPI pairs as a training dataset, eight top-ranking keywords were extracted from GO molecular function annotations. The sensitivity of these keywords is 64.21% in the yeast experimental dataset and 80.83% in the worm experimental dataset. The specificities, a measure of recovery power, of these keywords applied to four predicted PPI datasets for each studied organism are 48.32% and 46.49% (averaged over the four datasets) in yeast and worm, respectively. Based on the eight top-ranking keywords and co-localization of interacting proteins, a set of two knowledge rules was deduced and applied to remove false positive protein pairs. The 'strength', a measure of the improvement provided by the rules, was defined based on the signal-to-noise ratio and used to measure the applicability of the knowledge rules to the predicted PPI datasets. Depending on the PPI-predicting method employed, the strength is between two and ten times that of randomly removing protein pairs from the datasets.
CONCLUSION: Gene Ontology annotations along with the deduced knowledge rules could be implemented to partially remove falsely predicted PPI pairs. Removal of false positives from predicted datasets increases the true positive fractions of the datasets and improves the robustness of predicted pairs as compared to random protein pairing, and eventually results in better overlap with experimental results.
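
The filtering idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not list the eight keywords or state the exact form of the two knowledge rules, so the keyword set, the two rule definitions (both proteins carry a keyword; the proteins share a location), the requirement that a pair satisfy both rules, and all names and toy data below are assumptions made only for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

Pair = Tuple[str, str]

# Hypothetical placeholder keywords; the abstract does not list the eight it used.
KEYWORDS: Set[str] = {"binding", "kinase"}


def has_keyword(protein: str, mf_terms: Dict[str, List[str]],
                keywords: Set[str] = KEYWORDS) -> bool:
    """True if any GO molecular function term of the protein contains a keyword."""
    return any(kw in term.lower()
               for term in mf_terms.get(protein, [])
               for kw in keywords)


def keyword_rule(pair: Pair, mf_terms: Dict[str, List[str]],
                 keywords: Set[str] = KEYWORDS) -> bool:
    """Assumed rule 1: both proteins of the pair carry at least one keyword."""
    return all(has_keyword(p, mf_terms, keywords) for p in pair)


def colocalization_rule(pair: Pair, locations: Dict[str, Set[str]]) -> bool:
    """Assumed rule 2: the two proteins share at least one cellular location."""
    a, b = pair
    return bool(locations.get(a, set()) & locations.get(b, set()))


def filter_pairs(pairs: Iterable[Pair], mf_terms: Dict[str, List[str]],
                 locations: Dict[str, Set[str]],
                 keywords: Set[str] = KEYWORDS) -> List[Pair]:
    """Keep only predicted pairs that satisfy both assumed rules."""
    return [p for p in pairs
            if keyword_rule(p, mf_terms, keywords)
            and colocalization_rule(p, locations)]


def keyword_sensitivity(experimental_pairs: Iterable[Pair],
                        mf_terms: Dict[str, List[str]],
                        keywords: Set[str] = KEYWORDS) -> float:
    """Fraction of experimentally confirmed pairs that pass the keyword rule."""
    pairs = list(experimental_pairs)
    if not pairs:
        return 0.0
    return sum(keyword_rule(p, mf_terms, keywords) for p in pairs) / len(pairs)


# Toy annotations (invented for this example, not real GO data).
mf_terms = {
    "P1": ["ATP binding"],
    "P2": ["protein kinase activity"],
    "P3": ["structural constituent of ribosome"],
}
locations = {
    "P1": {"nucleus"},
    "P2": {"nucleus", "cytoplasm"},
    "P3": {"cytoplasm"},
}
predicted = [("P1", "P2"), ("P2", "P3"), ("P1", "P3")]

kept = filter_pairs(predicted, mf_terms, locations)
sensitivity = keyword_sensitivity([("P1", "P2"), ("P1", "P3")], mf_terms)
```

With the toy data, only the pair whose proteins both carry a keyword and share a location survives the filter. The same `keyword_rule` check applied to a set of experimentally confirmed pairs gives a sensitivity in the sense used by the abstract.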
format Text
id pubmed-1941744
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1941744 2007-08-09 False positive reduction in protein-protein interaction predictions using gene ontology annotations Mahdavi, Mahmoud A Lin, Yen-Han BMC Bioinformatics Research Article BioMed Central 2007-07-23 /pmc/articles/PMC1941744/ /pubmed/17645798 http://dx.doi.org/10.1186/1471-2105-8-262 Text en Copyright © 2007 Mahdavi and Lin; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title False positive reduction in protein-protein interaction predictions using gene ontology annotations
topic Research Article