Integrating experimental and literature protein-protein interaction data for protein complex prediction

Bibliographic Details
Main Authors: Zhang, Yijia, Lin, Hongfei, Yang, Zhihao, Wang, Jian
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4331718/
https://www.ncbi.nlm.nih.gov/pubmed/25708571
http://dx.doi.org/10.1186/1471-2164-16-S2-S4
_version_ 1782357765967052800
author Zhang, Yijia
Lin, Hongfei
Yang, Zhihao
Wang, Jian
author_facet Zhang, Yijia
Lin, Hongfei
Yang, Zhihao
Wang, Jian
author_sort Zhang, Yijia
collection PubMed
description BACKGROUND: Accurate determination of protein complexes is crucial for understanding cellular organization and function. High-throughput experimental techniques have generated a large amount of protein-protein interaction (PPI) data, allowing prediction of protein complexes from PPI networks. However, the high-throughput data often includes false positives and false negatives, making accurate prediction of protein complexes difficult. METHOD: The biomedical literature contains large quantities of PPI data that, along with high-throughput experimental PPI data, are valuable for protein complex prediction. In this study, we employ a natural language processing technique to extract PPI data from the biomedical literature. This data is subsequently integrated with high-throughput PPI and gene ontology data by constructing attributed PPI networks, and a novel method for predicting protein complexes from the attributed PPI networks is proposed. This method allows calculation of the relative contribution of high-throughput and biomedical literature PPI data. RESULTS: Many well-characterized protein complexes are accurately predicted by this method when applied to two different yeast PPI datasets. The results show that (i) biomedical literature PPI data can effectively improve the performance of protein complex prediction; (ii) our method makes good use of high-throughput and biomedical literature PPI data along with gene ontology data to achieve state-of-the-art protein complex prediction capabilities.
format Online
Article
Text
id pubmed-4331718
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4331718 2015-03-19 Integrating experimental and literature protein-protein interaction data for protein complex prediction Zhang, Yijia Lin, Hongfei Yang, Zhihao Wang, Jian BMC Genomics Proceedings BACKGROUND: Accurate determination of protein complexes is crucial for understanding cellular organization and function. High-throughput experimental techniques have generated a large amount of protein-protein interaction (PPI) data, allowing prediction of protein complexes from PPI networks. However, the high-throughput data often includes false positives and false negatives, making accurate prediction of protein complexes difficult. METHOD: The biomedical literature contains large quantities of PPI data that, along with high-throughput experimental PPI data, are valuable for protein complex prediction. In this study, we employ a natural language processing technique to extract PPI data from the biomedical literature. This data is subsequently integrated with high-throughput PPI and gene ontology data by constructing attributed PPI networks, and a novel method for predicting protein complexes from the attributed PPI networks is proposed. This method allows calculation of the relative contribution of high-throughput and biomedical literature PPI data. RESULTS: Many well-characterized protein complexes are accurately predicted by this method when applied to two different yeast PPI datasets. The results show that (i) biomedical literature PPI data can effectively improve the performance of protein complex prediction; (ii) our method makes good use of high-throughput and biomedical literature PPI data along with gene ontology data to achieve state-of-the-art protein complex prediction capabilities. BioMed Central 2015-01-21 /pmc/articles/PMC4331718/ /pubmed/25708571 http://dx.doi.org/10.1186/1471-2164-16-S2-S4 Text en Copyright © 2015 Zhang et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Proceedings
Zhang, Yijia
Lin, Hongfei
Yang, Zhihao
Wang, Jian
Integrating experimental and literature protein-protein interaction data for protein complex prediction
title Integrating experimental and literature protein-protein interaction data for protein complex prediction
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4331718/
https://www.ncbi.nlm.nih.gov/pubmed/25708571
http://dx.doi.org/10.1186/1471-2164-16-S2-S4