
Integration of Human Protein Sequence and Protein-Protein Interaction Data by Graph Autoencoder to Identify Novel Protein-Abnormal Phenotype Associations

Understanding gene functions and their associated abnormal phenotypes is crucial in the prevention, diagnosis and treatment against diseases. The Human Phenotype Ontology (HPO) is a standardized vocabulary for describing the phenotype abnormalities associated with human diseases. However, the curren...


Bibliographic Details
Main Authors: Liu, Yuan; He, Ruirui; Qu, Yingjie; Zhu, Yuan; Li, Dianke; Ling, Xinping; Xia, Simin; Li, Zhenqiu; Li, Dong
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9406402/
https://www.ncbi.nlm.nih.gov/pubmed/36010562
http://dx.doi.org/10.3390/cells11162485
author Liu, Yuan
He, Ruirui
Qu, Yingjie
Zhu, Yuan
Li, Dianke
Ling, Xinping
Xia, Simin
Li, Zhenqiu
Li, Dong
author_sort Liu, Yuan
collection PubMed
description Understanding gene functions and their associated abnormal phenotypes is crucial in the prevention, diagnosis and treatment of diseases. The Human Phenotype Ontology (HPO) is a standardized vocabulary for describing the phenotype abnormalities associated with human diseases. However, the current HPO annotations are far from complete, and only a small fraction of human protein-coding genes has HPO annotations. Thus, it is necessary to predict protein-phenotype associations using computational methods. Protein sequences can indicate the structure and function of the proteins, and interacting proteins are more likely to have the same function. It is promising to integrate these features for predicting HPO annotations of human proteins. We developed GraphPheno, a semi-supervised method based on graph autoencoders, which does not require feature engineering to capture deep features from protein sequences, while also taking into account the topological properties of the protein–protein interaction network to predict the relationships between human genes/proteins and abnormal phenotypes. Cross validation and independent dataset tests show that GraphPheno has satisfactory prediction performance. The algorithm is further validated on automatic HPO annotation for no-knowledge proteins under the benchmark of the second Critical Assessment of Functional Annotation, 2013–2014 (CAFA2), where GraphPheno surpasses most existing methods. Further bioinformatics analysis shows that genes predicted by GraphPheno to be associated with certain phenotypes share similar biological properties with known ones. In a case study on the phenotype of abnormality of the mitochondrial respiratory chain, top prioritized genes are validated by recent papers. We believe that GraphPheno will help to reveal more associations between genes and phenotypes, and contribute to the discovery of drug targets.
format Online
Article
Text
id pubmed-9406402
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9406402 2022-08-26 Integration of Human Protein Sequence and Protein-Protein Interaction Data by Graph Autoencoder to Identify Novel Protein-Abnormal Phenotype Associations Liu, Yuan He, Ruirui Qu, Yingjie Zhu, Yuan Li, Dianke Ling, Xinping Xia, Simin Li, Zhenqiu Li, Dong Cells Article Understanding gene functions and their associated abnormal phenotypes is crucial in the prevention, diagnosis and treatment of diseases. The Human Phenotype Ontology (HPO) is a standardized vocabulary for describing the phenotype abnormalities associated with human diseases. However, the current HPO annotations are far from complete, and only a small fraction of human protein-coding genes has HPO annotations. Thus, it is necessary to predict protein-phenotype associations using computational methods. Protein sequences can indicate the structure and function of the proteins, and interacting proteins are more likely to have the same function. It is promising to integrate these features for predicting HPO annotations of human proteins. We developed GraphPheno, a semi-supervised method based on graph autoencoders, which does not require feature engineering to capture deep features from protein sequences, while also taking into account the topological properties of the protein–protein interaction network to predict the relationships between human genes/proteins and abnormal phenotypes. Cross validation and independent dataset tests show that GraphPheno has satisfactory prediction performance. The algorithm is further validated on automatic HPO annotation for no-knowledge proteins under the benchmark of the second Critical Assessment of Functional Annotation, 2013–2014 (CAFA2), where GraphPheno surpasses most existing methods. Further bioinformatics analysis shows that genes predicted by GraphPheno to be associated with certain phenotypes share similar biological properties with known ones. In a case study on the phenotype of abnormality of the mitochondrial respiratory chain, top prioritized genes are validated by recent papers. We believe that GraphPheno will help to reveal more associations between genes and phenotypes, and contribute to the discovery of drug targets. MDPI 2022-08-10 /pmc/articles/PMC9406402/ /pubmed/36010562 http://dx.doi.org/10.3390/cells11162485 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Integration of Human Protein Sequence and Protein-Protein Interaction Data by Graph Autoencoder to Identify Novel Protein-Abnormal Phenotype Associations
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9406402/
https://www.ncbi.nlm.nih.gov/pubmed/36010562
http://dx.doi.org/10.3390/cells11162485