Multi-domain knowledge graph embeddings for gene-disease association prediction

BACKGROUND: Predicting gene-disease associations typically requires exploring diverse sources of information as well as sophisticated computational approaches. Knowledge graph embeddings can help tackle these challenges by creating representations of genes and diseases based on the scientific knowle...

Bibliographic Details
Main Authors: Nunes, Susana; Sousa, Rita T.; Pesquita, Catia
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10426189/
https://www.ncbi.nlm.nih.gov/pubmed/37580835
http://dx.doi.org/10.1186/s13326-023-00291-x
author Nunes, Susana
Sousa, Rita T.
Pesquita, Catia
collection PubMed
description BACKGROUND: Predicting gene-disease associations typically requires exploring diverse sources of information as well as sophisticated computational approaches. Knowledge graph embeddings can help tackle these challenges by creating representations of genes and diseases based on the scientific knowledge described in ontologies, which can then be explored by machine learning algorithms. However, state-of-the-art knowledge graph embeddings are produced over a single ontology or multiple but disconnected ones, ignoring the impact that considering multiple interconnected domains can have on complex tasks such as gene-disease association prediction. RESULTS: We propose a novel approach to predict gene-disease associations using rich semantic representations based on knowledge graph embeddings over multiple ontologies linked by logical definitions and compound ontology mappings. The experiments showed that considering richer knowledge graphs significantly improves gene-disease prediction and that different knowledge graph embeddings methods benefit more from distinct types of semantic richness. CONCLUSIONS: This work demonstrated the potential for knowledge graph embeddings across multiple and interconnected biomedical ontologies to support gene-disease prediction. It also paved the way for considering other ontologies or tackling other tasks where multiple perspectives over the data can be beneficial. All software and data are freely available. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13326-023-00291-x.
format Online
Article
Text
id pubmed-10426189
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10426189 2023-08-16 Multi-domain knowledge graph embeddings for gene-disease association prediction Nunes, Susana Sousa, Rita T. Pesquita, Catia J Biomed Semantics Research
BioMed Central 2023-08-14 /pmc/articles/PMC10426189/ /pubmed/37580835 http://dx.doi.org/10.1186/s13326-023-00291-x Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Multi-domain knowledge graph embeddings for gene-disease association prediction
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10426189/
https://www.ncbi.nlm.nih.gov/pubmed/37580835
http://dx.doi.org/10.1186/s13326-023-00291-x