
SSLpheno: a self-supervised learning approach for gene–phenotype association prediction using protein–protein interactions and gene ontology data

MOTIVATION: Medical genomics faces significant challenges in interpreting disease phenotype and genetic heterogeneity. Despite the establishment of standardized disease phenotype databases, computational methods for predicting gene–phenotype associations still suffer from imbalanced category distrib...


Bibliographic Details
Main authors: Bi, Xuehua, Liang, Weiyang, Zhao, Qichang, Wang, Jianxin
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10666204/
https://www.ncbi.nlm.nih.gov/pubmed/37941450
http://dx.doi.org/10.1093/bioinformatics/btad662
_version_ 1785138989666140160
author Bi, Xuehua
Liang, Weiyang
Zhao, Qichang
Wang, Jianxin
author_facet Bi, Xuehua
Liang, Weiyang
Zhao, Qichang
Wang, Jianxin
author_sort Bi, Xuehua
collection PubMed
description MOTIVATION: Medical genomics faces significant challenges in interpreting disease phenotype and genetic heterogeneity. Despite the establishment of standardized disease phenotype databases, computational methods for predicting gene–phenotype associations still suffer from imbalanced category distribution and a lack of labeled data in small categories. RESULTS: To address the problem of labeled-data scarcity, we propose a self-supervised learning strategy for gene–phenotype association prediction, called SSLpheno. Our approach utilizes an attributed network that integrates protein–protein interactions and gene ontology data. We apply a Laplacian-based filter to ensure feature smoothness and use self-supervised training to optimize node feature representation. Specifically, we calculate the cosine similarity of feature vectors and select positive and negative sample nodes for reconstruction training labels. We employ a deep neural network for multi-label classification of phenotypes in the downstream task. Our experimental results demonstrate that SSLpheno outperforms state-of-the-art methods, especially in categories with fewer annotations. Moreover, our case studies illustrate the potential of SSLpheno as an effective prescreening tool for gene–phenotype association identification. AVAILABILITY AND IMPLEMENTATION: https://github.com/bixuehua/SSLpheno.
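The preprocessing steps named in the description (Laplacian-based smoothing, cosine similarity, selection of positive and negative node pairs) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the exact filter form or the selection rule, so the filter `I - k * L_sym`, the parameters `k`, `steps`, `n_pos`, `n_neg`, and all function names below are assumptions chosen for illustration, and the toy graph is invented.

```python
import numpy as np

def laplacian_smooth(adj, feats, k=0.5, steps=2):
    """Apply the low-pass filter (I - k * L_sym) to the features `steps` times.

    Assumption: L_sym is the symmetric normalized Laplacian of the graph
    with self-loops added; the source does not specify the filter.
    """
    n = adj.shape[0]
    adj = adj + np.eye(n)                      # self-loops avoid zero degrees
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    norm_adj = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    lap = np.eye(n) - norm_adj                 # symmetric normalized Laplacian
    filt = np.eye(n) - k * lap
    out = feats.astype(float)
    for _ in range(steps):
        out = filt @ out
    return out

def cosine_similarity_matrix(feats):
    """Pairwise cosine similarity between the rows of `feats`."""
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    unit = feats / np.clip(norms, 1e-12, None)
    return unit @ unit.T

def select_pairs(sim, n_pos, n_neg):
    """Return the n_pos most similar and n_neg least similar pairs (i < j)."""
    rows, cols = np.triu_indices(sim.shape[0], k=1)
    order = np.argsort(sim[rows, cols])        # ascending similarity
    neg = [(int(rows[i]), int(cols[i])) for i in order[:n_neg]]
    pos = [(int(rows[i]), int(cols[i])) for i in order[::-1][:n_pos]]
    return pos, neg

# Toy example (invented data): two linked gene pairs, 0-1 and 2-3.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 0, 0],
                [0, 0, 0, 1],
                [0, 0, 1, 0]], dtype=float)
feats = np.array([[1.0, 0.0],
                  [1.0, 0.1],
                  [0.0, 1.0],
                  [0.1, 1.0]])

smoothed = laplacian_smooth(adj, feats)
sim = cosine_similarity_matrix(smoothed)
pos, neg = select_pairs(sim, n_pos=2, n_neg=1)
```

In this toy graph the two linked pairs come out as the positive samples and a cross-component pair as the negative sample; in a self-supervised setup such pairs would serve as pseudo-labels for training the node representation before the downstream multi-label classifier.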
format Online
Article
Text
id pubmed-10666204
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-106662042023-11-06 SSLpheno: a self-supervised learning approach for gene–phenotype association prediction using protein–protein interactions and gene ontology data Bi, Xuehua Liang, Weiyang Zhao, Qichang Wang, Jianxin Bioinformatics Original Paper MOTIVATION: Medical genomics faces significant challenges in interpreting disease phenotype and genetic heterogeneity. Despite the establishment of standardized disease phenotype databases, computational methods for predicting gene–phenotype associations still suffer from imbalanced category distribution and a lack of labeled data in small categories. RESULTS: To address the problem of labeled-data scarcity, we propose a self-supervised learning strategy for gene–phenotype association prediction, called SSLpheno. Our approach utilizes an attributed network that integrates protein–protein interactions and gene ontology data. We apply a Laplacian-based filter to ensure feature smoothness and use self-supervised training to optimize node feature representation. Specifically, we calculate the cosine similarity of feature vectors and select positive and negative sample nodes for reconstruction training labels. We employ a deep neural network for multi-label classification of phenotypes in the downstream task. Our experimental results demonstrate that SSLpheno outperforms state-of-the-art methods, especially in categories with fewer annotations. Moreover, our case studies illustrate the potential of SSLpheno as an effective prescreening tool for gene–phenotype association identification. AVAILABILITY AND IMPLEMENTATION: https://github.com/bixuehua/SSLpheno. Oxford University Press 2023-11-06 /pmc/articles/PMC10666204/ /pubmed/37941450 http://dx.doi.org/10.1093/bioinformatics/btad662 Text en © The Author(s) 2023. Published by Oxford University Press. 
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Paper
Bi, Xuehua
Liang, Weiyang
Zhao, Qichang
Wang, Jianxin
SSLpheno: a self-supervised learning approach for gene–phenotype association prediction using protein–protein interactions and gene ontology data
title SSLpheno: a self-supervised learning approach for gene–phenotype association prediction using protein–protein interactions and gene ontology data
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10666204/
https://www.ncbi.nlm.nih.gov/pubmed/37941450
http://dx.doi.org/10.1093/bioinformatics/btad662