SSLpheno: a self-supervised learning approach for gene–phenotype association prediction using protein–protein interactions and gene ontology data
MOTIVATION: Medical genomics faces significant challenges in interpreting disease phenotype and genetic heterogeneity. Despite the establishment of standardized disease phenotype databases, computational methods for predicting gene–phenotype associations still suffer from imbalanced category distrib...
Main Authors: | Bi, Xuehua, Liang, Weiyang, Zhao, Qichang, Wang, Jianxin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | Original Paper |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10666204/ https://www.ncbi.nlm.nih.gov/pubmed/37941450 http://dx.doi.org/10.1093/bioinformatics/btad662 |
_version_ | 1785138989666140160 |
---|---|
author | Bi, Xuehua Liang, Weiyang Zhao, Qichang Wang, Jianxin |
author_facet | Bi, Xuehua Liang, Weiyang Zhao, Qichang Wang, Jianxin |
author_sort | Bi, Xuehua |
collection | PubMed |
description | MOTIVATION: Medical genomics faces significant challenges in interpreting disease phenotype and genetic heterogeneity. Despite the establishment of standardized disease phenotype databases, computational methods for predicting gene–phenotype associations still suffer from imbalanced category distribution and a lack of labeled data in small categories. RESULTS: To address the problem of labeled-data scarcity, we propose a self-supervised learning strategy for gene–phenotype association prediction, called SSLpheno. Our approach utilizes an attributed network that integrates protein–protein interactions and gene ontology data. We apply a Laplacian-based filter to ensure feature smoothness and use self-supervised training to optimize node feature representation. Specifically, we calculate the cosine similarity of feature vectors and select positive and negative sample nodes for reconstruction training labels. We employ a deep neural network for multi-label classification of phenotypes in the downstream task. Our experimental results demonstrate that SSLpheno outperforms state-of-the-art methods, especially in categories with fewer annotations. Moreover, our case studies illustrate the potential of SSLpheno as an effective prescreening tool for gene–phenotype association identification. AVAILABILITY AND IMPLEMENTATION: https://github.com/bixuehua/SSLpheno. |
format | Online Article Text |
id | pubmed-10666204 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10666204 2023-11-06 SSLpheno: a self-supervised learning approach for gene–phenotype association prediction using protein–protein interactions and gene ontology data Bi, Xuehua Liang, Weiyang Zhao, Qichang Wang, Jianxin Bioinformatics Original Paper MOTIVATION: Medical genomics faces significant challenges in interpreting disease phenotype and genetic heterogeneity. Despite the establishment of standardized disease phenotype databases, computational methods for predicting gene–phenotype associations still suffer from imbalanced category distribution and a lack of labeled data in small categories. RESULTS: To address the problem of labeled-data scarcity, we propose a self-supervised learning strategy for gene–phenotype association prediction, called SSLpheno. Our approach utilizes an attributed network that integrates protein–protein interactions and gene ontology data. We apply a Laplacian-based filter to ensure feature smoothness and use self-supervised training to optimize node feature representation. Specifically, we calculate the cosine similarity of feature vectors and select positive and negative sample nodes for reconstruction training labels. We employ a deep neural network for multi-label classification of phenotypes in the downstream task. Our experimental results demonstrate that SSLpheno outperforms state-of-the-art methods, especially in categories with fewer annotations. Moreover, our case studies illustrate the potential of SSLpheno as an effective prescreening tool for gene–phenotype association identification. AVAILABILITY AND IMPLEMENTATION: https://github.com/bixuehua/SSLpheno. Oxford University Press 2023-11-06 /pmc/articles/PMC10666204/ /pubmed/37941450 http://dx.doi.org/10.1093/bioinformatics/btad662 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Paper Bi, Xuehua Liang, Weiyang Zhao, Qichang Wang, Jianxin SSLpheno: a self-supervised learning approach for gene–phenotype association prediction using protein–protein interactions and gene ontology data |
title | SSLpheno: a self-supervised learning approach for gene–phenotype association prediction using protein–protein interactions and gene ontology data |
title_full | SSLpheno: a self-supervised learning approach for gene–phenotype association prediction using protein–protein interactions and gene ontology data |
title_fullStr | SSLpheno: a self-supervised learning approach for gene–phenotype association prediction using protein–protein interactions and gene ontology data |
title_full_unstemmed | SSLpheno: a self-supervised learning approach for gene–phenotype association prediction using protein–protein interactions and gene ontology data |
title_short | SSLpheno: a self-supervised learning approach for gene–phenotype association prediction using protein–protein interactions and gene ontology data |
title_sort | sslpheno: a self-supervised learning approach for gene–phenotype association prediction using protein–protein interactions and gene ontology data |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10666204/ https://www.ncbi.nlm.nih.gov/pubmed/37941450 http://dx.doi.org/10.1093/bioinformatics/btad662 |
work_keys_str_mv | AT bixuehua sslphenoaselfsupervisedlearningapproachforgenephenotypeassociationpredictionusingproteinproteininteractionsandgeneontologydata AT liangweiyang sslphenoaselfsupervisedlearningapproachforgenephenotypeassociationpredictionusingproteinproteininteractionsandgeneontologydata AT zhaoqichang sslphenoaselfsupervisedlearningapproachforgenephenotypeassociationpredictionusingproteinproteininteractionsandgeneontologydata AT wangjianxin sslphenoaselfsupervisedlearningapproachforgenephenotypeassociationpredictionusingproteinproteininteractionsandgeneontologydata |
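
Illustrative note: the description field in this record mentions a Laplacian-based filter for feature smoothness and a cosine-similarity step for picking positive and negative sample nodes. The sketch below shows one plausible reading of those two steps on a toy graph. It is not taken from the SSLpheno repository. The filter form `I - k*L`, the parameters `k` and `steps`, the two thresholds, and all function names are assumptions made for illustration only.

```python
import math


def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^(-1/2) A D^(-1/2) as dense lists."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    return [
        [(1.0 if i == j else 0.0) - inv_sqrt[i] * adj[i][j] * inv_sqrt[j]
         for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Plain dense matrix product of two list-of-lists matrices."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def smooth(adj, feats, k=0.5, steps=1):
    """Apply the low-pass filter (I - k*L) to the node features `steps` times.

    k and steps are hypothetical hyperparameters, not values from the paper.
    """
    lap = normalized_laplacian(adj)
    n = len(adj)
    filt = [
        [(1.0 if i == j else 0.0) - k * lap[i][j] for j in range(n)]
        for i in range(n)
    ]
    out = feats
    for _ in range(steps):
        out = matmul(filt, out)
    return out


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)


def select_pairs(feats, pos_threshold=0.9, neg_threshold=0.3):
    """Label node pairs as positive (very similar) or negative (dissimilar).

    Pairs whose similarity falls between the two thresholds are left unlabeled.
    The threshold rule is an assumption; the record does not state the criterion.
    """
    pos, neg = [], []
    n = len(feats)
    for i in range(n):
        for j in range(i + 1, n):
            s = cosine(feats[i], feats[j])
            if s >= pos_threshold:
                pos.append((i, j))
            elif s <= neg_threshold:
                neg.append((i, j))
    return pos, neg


# Toy attributed network: edges 0-1 and 2-3, two features per node.
adj = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
feats = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]

smoothed = smooth(adj, feats)
pos, neg = select_pairs(smoothed)
print("positive pairs:", pos)
print("negative pairs:", neg)
```

On this toy graph, smoothing pulls each connected pair of nodes toward a shared feature vector, so the connected pairs come out as positives and the cross-component pairs as negatives. In a full pipeline these pair labels would supervise representation learning before a separate multi-label classifier is trained, which this sketch does not cover.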