
PCDq: human protein complex database with quality index which summarizes different levels of evidences of protein complexes predicted from H-Invitational protein-protein interactions integrative dataset

BACKGROUND: Proteins interact with other proteins or biomolecules in complexes to perform cellular functions. Existing protein-protein interaction (PPI) databases and protein complex databases for human proteins are not organized to provide protein complex information or facilitate the discovery of...


Bibliographic Details
Main authors: Kikugawa, Shingo; Nishikata, Kensaku; Murakami, Katsuhiko; Sato, Yoshiharu; Suzuki, Mami; Altaf-Ul-Amin, Md; Kanaya, Shigehiko; Imanishi, Tadashi
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3521179/
https://www.ncbi.nlm.nih.gov/pubmed/23282181
http://dx.doi.org/10.1186/1752-0509-6-S2-S7
author Kikugawa, Shingo
Nishikata, Kensaku
Murakami, Katsuhiko
Sato, Yoshiharu
Suzuki, Mami
Altaf-Ul-Amin, Md
Kanaya, Shigehiko
Imanishi, Tadashi
author_sort Kikugawa, Shingo
collection PubMed
description BACKGROUND: Proteins interact with other proteins or biomolecules in complexes to perform cellular functions. Existing protein-protein interaction (PPI) databases and protein complex databases for human proteins are not organized to provide protein complex information or facilitate the discovery of novel subunits. Data integration of PPIs focused specifically on protein complexes, subunits, and their functions. Predicted candidate complexes or subunits are also important for experimental biologists. DESCRIPTION: Based on integrated PPI data and literature, we have developed a human protein complex database with a complex quality index (PCDq), which includes both known and predicted complexes and subunits. We integrated six PPI datasets (BIND, DIP, MINT, HPRD, IntAct, and GNP_Y2H) and predicted human protein complexes by finding densely connected regions in the PPI networks. The predicted complexes were curated against the literature: missing proteins were added and some complexes were merged, resulting in 1,264 complexes comprising 9,268 proteins with 32,198 PPIs. The evidence level of each subunit was assigned as a categorical variable. The category indicates whether the protein is a known subunit and whether a specific function is inferable from sequence or network analysis. To summarize the categories of all the subunits in a complex, we devised a complex quality index (CQI) and assigned it to each complex. We examined how consistent Gene Ontology (GO) terms are among the protein subunits of a complex. Next, we compared the expression profiles of the corresponding genes and found that many proteins in larger complexes tend to be expressed cooperatively at the transcript level. We also evaluated the proportion of duplicated genes in a complex. Finally, we identified 78 hypothetical proteins that were annotated as subunits of 82 complexes, which included known complexes. Of these hypothetical proteins, four were reported to be actual subunits of the assigned protein complexes after our prediction had been made. CONCLUSIONS: We constructed a new protein complex database, PCDq, that includes both predicted and curated human protein complexes. CQI is a useful source of experimentally confirmed information about protein complexes and subunits. The predicted protein complexes can provide functional clues about hypothetical proteins. PCDq is freely available at http://h-invitational.jp/hinv/pcdq/.
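The description mentions integrating several PPI datasets and then finding densely connected regions in the resulting network, without naming the algorithm in this record. The sketch below is only an illustration of that general idea, not the authors' method: it merges edge lists from several sources and uses k-core pruning as a stand-in for dense-region detection. The source labels, protein names (P1 to P6), and the threshold k=3 are invented for the example.

```python
from collections import defaultdict


def integrate_ppis(sources):
    """Merge undirected edge lists; map each edge to the sources reporting it."""
    merged = defaultdict(set)
    for name, edges in sources.items():
        for a, b in edges:
            if a != b:  # ignore self-interactions in this toy example
                merged[frozenset((a, b))].add(name)
    return dict(merged)


def build_graph(edges):
    """Build an adjacency map from an iterable of 2-element frozensets."""
    adj = defaultdict(set)
    for edge in edges:
        a, b = tuple(edge)
        adj[a].add(b)
        adj[b].add(a)
    return adj


def k_core(adj, k):
    """Repeatedly drop nodes with fewer than k neighbours (k-core pruning)."""
    adj = {n: set(nb) for n, nb in adj.items()}
    changed = True
    while changed:
        changed = False
        for n in [n for n, nb in adj.items() if len(nb) < k]:
            for m in adj[n]:
                adj[m].discard(n)
            del adj[n]
            changed = True
    return adj


def components(adj):
    """Return the connected components of the graph as a list of node sets."""
    seen, out = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            n = stack.pop()
            if n not in comp:
                comp.add(n)
                stack.extend(adj[n] - comp)
        seen |= comp
        out.append(comp)
    return out


def dense_regions(sources, k=3):
    """Candidate complexes: connected components of the k-core."""
    merged = integrate_ppis(sources)
    return components(k_core(build_graph(merged), k))


# Hypothetical toy data; not taken from any real PPI database.
toy_sources = {
    "source_a": [("P1", "P2"), ("P2", "P3"), ("P1", "P3")],
    "source_b": [("P1", "P3"), ("P3", "P4"), ("P1", "P4"), ("P2", "P4")],
    "source_c": [("P4", "P5"), ("P5", "P6")],
}
print(dense_regions(toy_sources, k=3))
```

In this toy network the sparsely connected proteins P5 and P6 are pruned and the remaining four proteins form one candidate region. A real pipeline would need a dedicated clustering method and the literature curation step that the description refers to.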
format Online
Article
Text
id pubmed-3521179
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3521179 2012-12-14 BMC Syst Biol Proceedings BioMed Central 2012-12-12 /pmc/articles/PMC3521179/ /pubmed/23282181 http://dx.doi.org/10.1186/1752-0509-6-S2-S7 Text en Copyright ©2012 Kikugawa et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title PCDq: human protein complex database with quality index which summarizes different levels of evidences of protein complexes predicted from H-Invitational protein-protein interactions integrative dataset
topic Proceedings