False positive reduction in protein-protein interaction predictions using gene ontology annotations
BACKGROUND: Many crucial cellular operations such as metabolism, signalling, and regulations are based on protein-protein interactions. However, the lack of robust protein-protein interaction information is a challenge. One reason for the lack of solid protein-protein interaction information is poor...
Main authors: | Mahdavi, Mahmoud A; Lin, Yen-Han |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2007 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1941744/ https://www.ncbi.nlm.nih.gov/pubmed/17645798 http://dx.doi.org/10.1186/1471-2105-8-262 |
_version_ | 1782134468371283968 |
---|---|
author | Mahdavi, Mahmoud A Lin, Yen-Han |
author_facet | Mahdavi, Mahmoud A Lin, Yen-Han |
author_sort | Mahdavi, Mahmoud A |
collection | PubMed |
description | BACKGROUND: Many crucial cellular operations, such as metabolism, signalling, and regulation, are based on protein-protein interactions. However, robust protein-protein interaction information is lacking. One reason is poor agreement between experimental findings and computational sets, which in turn stems from the large number of false positive predictions made by computational approaches. Reducing false positive predictions and enhancing the true positive fraction of computationally predicted protein-protein interaction datasets on the basis of highly confident experimental results has not been adequately investigated. RESULTS: Gene Ontology (GO) annotations were used to reduce false positive protein-protein interaction (PPI) pairs resulting from computational predictions. Using experimentally obtained PPI pairs as a training dataset, eight top-ranking keywords were extracted from GO molecular function annotations. The sensitivity of these keywords is 64.21% in the yeast experimental dataset and 80.83% in the worm experimental dataset. The specificities, a measure of recovery power, of these keywords applied to four predicted PPI datasets for each studied organism are 48.32% and 46.49% (averaged over the four datasets) in yeast and worm, respectively. Based on the eight top-ranking keywords and the co-localization of interacting proteins, a set of two knowledge rules was deduced and applied to remove false positive protein pairs. The 'strength', a measure of the improvement provided by the rules, was defined based on the signal-to-noise ratio and used to measure the applicability of the knowledge rules to the predicted PPI datasets. Depending on the PPI-predicting method employed, the strength is between two and ten times that of randomly removing protein pairs from the datasets. CONCLUSION: Gene Ontology annotations, along with the deduced knowledge rules, can be used to partially remove falsely predicted PPI pairs. Removing false positives from predicted datasets increases their true positive fractions, improves the robustness of predicted pairs compared to random protein pairing, and eventually results in better overlap with experimental results. |
format | Text |
id | pubmed-1941744 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-19417442007-08-09 False positive reduction in protein-protein interaction predictions using gene ontology annotations Mahdavi, Mahmoud A Lin, Yen-Han BMC Bioinformatics Research Article BACKGROUND: Many crucial cellular operations such as metabolism, signalling, and regulations are based on protein-protein interactions. However, the lack of robust protein-protein interaction information is a challenge. One reason for the lack of solid protein-protein interaction information is poor agreement between experimental findings and computational sets that, in turn, comes from huge false positive predictions in computational approaches. Reduction of false positive predictions and enhancing true positive fraction of computationally predicted protein-protein interaction datasets based on highly confident experimental results has not been adequately investigated. RESULTS: Gene Ontology (GO) annotations were used to reduce false positive protein-protein interactions (PPI) pairs resulting from computational predictions. Using experimentally obtained PPI pairs as a training dataset, eight top-ranking keywords were extracted from GO molecular function annotations. The sensitivity of these keywords is 64.21% in the yeast experimental dataset and 80.83% in the worm experimental dataset. The specificities, a measure of recovery power, of these keywords applied to four predicted PPI datasets for each studied organisms, are 48.32% and 46.49% (by average of four datasets) in yeast and worm, respectively. Based on eight top-ranking keywords and co-localization of interacting proteins a set of two knowledge rules were deduced and applied to remove false positive protein pairs. The 'strength', a measure of improvement provided by the rules was defined based on the signal-to-noise ratio and implemented to measure the applicability of knowledge rules applying to the predicted PPI datasets. 
Depending on the employed PPI-predicting methods, the strength varies between two and ten-fold of randomly removing protein pairs from the datasets. CONCLUSION: Gene Ontology annotations along with the deduced knowledge rules could be implemented to partially remove false predicted PPI pairs. Removal of false positives from predicted datasets increases the true positive fractions of the datasets and improves the robustness of predicted pairs as compared to random protein pairing, and eventually results in better overlap with experimental results. BioMed Central 2007-07-23 /pmc/articles/PMC1941744/ /pubmed/17645798 http://dx.doi.org/10.1186/1471-2105-8-262 Text en Copyright © 2007 Mahdavi and Lin; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Mahdavi, Mahmoud A Lin, Yen-Han False positive reduction in protein-protein interaction predictions using gene ontology annotations |
title | False positive reduction in protein-protein interaction predictions using gene ontology annotations |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1941744/ https://www.ncbi.nlm.nih.gov/pubmed/17645798 http://dx.doi.org/10.1186/1471-2105-8-262 |
work_keys_str_mv | AT mahdavimahmouda falsepositivereductioninproteinproteininteractionpredictionsusinggeneontologyannotations AT linyenhan falsepositivereductioninproteinproteininteractionpredictionsusinggeneontologyannotations |
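
The abstract describes filtering predicted PPI pairs with GO keywords and co-localization, then checking whether the true positive fraction improves. The following is a minimal sketch of that idea under stated assumptions, not the authors' implementation: the keywords, protein identifiers, and annotations are hypothetical toy data, and combining the two rules with a logical AND is an assumption, since the record does not say how the rules are applied.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Pair = Tuple[str, str]

# Hypothetical keywords for illustration only; the record does not list the
# eight keywords actually extracted from GO molecular function annotations.
KEYWORDS: Set[str] = {"binding", "kinase"}


def has_keyword(terms: List[str], keywords: Set[str]) -> bool:
    """Return True if any GO molecular function term contains any keyword."""
    return any(kw in term.lower() for term in terms for kw in keywords)


def keep_pair(
    pair: Pair,
    mf: Dict[str, List[str]],
    cc: Dict[str, Set[str]],
    keywords: Set[str],
) -> bool:
    """Assumed rule combination: both proteins carry a keyword AND the two
    proteins share at least one cellular compartment."""
    a, b = pair
    both_have_keyword = has_keyword(mf.get(a, []), keywords) and has_keyword(
        mf.get(b, []), keywords
    )
    co_localized = bool(cc.get(a, set()) & cc.get(b, set()))
    return both_have_keyword and co_localized


def true_positive_fraction(pairs: List[Pair], confirmed: Set[FrozenSet[str]]) -> float:
    """Fraction of pairs that also appear in the experimentally confirmed set."""
    if not pairs:
        return 0.0
    hits = sum(1 for p in pairs if frozenset(p) in confirmed)
    return hits / len(pairs)


# Toy annotations (hypothetical): molecular function terms and compartments.
mf = {
    "P1": ["protein binding"],
    "P2": ["ATP binding"],
    "P3": ["structural constituent of ribosome"],
    "P4": ["protein kinase activity"],
}
cc = {
    "P1": {"nucleus"},
    "P2": {"nucleus"},
    "P3": {"cytosol"},
    "P4": {"nucleus", "cytoplasm"},
}

predicted: List[Pair] = [("P1", "P2"), ("P1", "P3"), ("P2", "P4"), ("P3", "P4")]
confirmed: Set[FrozenSet[str]] = {frozenset(("P1", "P2")), frozenset(("P2", "P4"))}

kept = [p for p in predicted if keep_pair(p, mf, cc, KEYWORDS)]

print("true positive fraction before:", true_positive_fraction(predicted, confirmed))
print("true positive fraction after: ", true_positive_fraction(kept, confirmed))
```

In this toy data the filter drops the two pairs involving `P3`, which has neither a keyword match nor a shared compartment with its partners. A real evaluation would also compare the result against randomly removing the same number of pairs, which is what the 'strength' measure in the abstract captures.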