Integrating PPI datasets with the PPI data from biomedical literature for protein complex detection

BACKGROUND: Protein complexes are important for understanding principles of cellular organization and function. High-throughput experimental techniques have produced a large amount of protein-protein interactions (PPIs), making it possible to predict protein complexes from protein-protein interactio...

Bibliographic Details
Main Authors: Yang, Zhi Hao; Yu, Feng Ying; Lin, Hong Fei; Wang, Jian
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4243118/
https://www.ncbi.nlm.nih.gov/pubmed/25350598
http://dx.doi.org/10.1186/1755-8794-7-S2-S3
_version_ 1782346063247572992
author Yang, Zhi Hao
Yu, Feng Ying
Lin, Hong Fei
Wang, Jian
author_facet Yang, Zhi Hao
Yu, Feng Ying
Lin, Hong Fei
Wang, Jian
author_sort Yang, Zhi Hao
collection PubMed
description BACKGROUND: Protein complexes are important for understanding principles of cellular organization and function. High-throughput experimental techniques have produced a large amount of protein-protein interactions (PPIs), making it possible to predict protein complexes from protein-protein interaction networks. On the other hand, the rapidly growing biomedical literature provides a significantly large and readily available source of interaction data, which can be integrated into the protein network for better complex detection performance. METHODS: We present an approach of integrating PPI datasets with the PPI data from biomedical literature for protein complex detection. The approach applies a sophisticated natural language processing system, PPIExtractor, to extract PPI data from biomedical literature. These data are then integrated into the PPI datasets for complex detection. RESULTS: The experimental results of the state-of-the-art complex detection method, ClusterONE, on five yeast PPI datasets verify our method's effectiveness: compared with the original PPI datasets, the average improvements of 3.976 and 5.416 percentage units in the maximum matching ratio (MMR) are achieved on the new networks using the MIPS and SGD gold standards, respectively. In addition, our approach also proves to be effective for three other complex detection algorithms proposed in recent years, i.e. CMC, COACH and RRW. CONCLUSIONS: The rapidly growing biomedical literature provides a significantly large, readily available and relatively accurate source of interaction data, which can be integrated into the protein network for better protein complex detection performance.
format Online
Article
Text
id pubmed-4243118
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-42431182014-11-26 Integrating PPI datasets with the PPI data from biomedical literature for protein complex detection Yang, Zhi Hao Yu, Feng Ying Lin, Hong Fei Wang, Jian BMC Med Genomics Research BACKGROUND: Protein complexes are important for understanding principles of cellular organization and function. High-throughput experimental techniques have produced a large amount of protein-protein interactions (PPIs), making it possible to predict protein complexes from protein-protein interaction networks. On the other hand, the rapidly growing biomedical literature provides a significantly large and readily available source of interaction data, which can be integrated into the protein network for better complex detection performance. METHODS: We present an approach of integrating PPI datasets with the PPI data from biomedical literature for protein complex detection. The approach applies a sophisticated natural language processing system, PPIExtractor, to extract PPI data from biomedical literature. These data are then integrated into the PPI datasets for complex detection. RESULTS: The experimental results of the state-of-the-art complex detection method, ClusterONE, on five yeast PPI datasets verify our method's effectiveness: compared with the original PPI datasets, the average improvements of 3.976 and 5.416 percentage units in the maximum matching ratio (MMR) are achieved on the new networks using the MIPS and SGD gold standards, respectively. In addition, our approach also proves to be effective for three other complex detection algorithms proposed in recent years, i.e. CMC, COACH and RRW. CONCLUSIONS: The rapidly growing biomedical literature provides a significantly large, readily available and relatively accurate source of interaction data, which can be integrated into the protein network for better protein complex detection performance. 
BioMed Central 2014-10-22 /pmc/articles/PMC4243118/ /pubmed/25350598 http://dx.doi.org/10.1186/1755-8794-7-S2-S3 Text en Copyright © 2014 Yang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Yang, Zhi Hao
Yu, Feng Ying
Lin, Hong Fei
Wang, Jian
Integrating PPI datasets with the PPI data from biomedical literature for protein complex detection
title Integrating PPI datasets with the PPI data from biomedical literature for protein complex detection
title_full Integrating PPI datasets with the PPI data from biomedical literature for protein complex detection
title_fullStr Integrating PPI datasets with the PPI data from biomedical literature for protein complex detection
title_full_unstemmed Integrating PPI datasets with the PPI data from biomedical literature for protein complex detection
title_short Integrating PPI datasets with the PPI data from biomedical literature for protein complex detection
title_sort integrating ppi datasets with the ppi data from biomedical literature for protein complex detection
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4243118/
https://www.ncbi.nlm.nih.gov/pubmed/25350598
http://dx.doi.org/10.1186/1755-8794-7-S2-S3