Cargando…
Multi-domain knowledge graph embeddings for gene-disease association prediction
BACKGROUND: Predicting gene-disease associations typically requires exploring diverse sources of information as well as sophisticated computational approaches. Knowledge graph embeddings can help tackle these challenges by creating representations of genes and diseases based on the scientific knowle...
Autores principales: | , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
BioMed Central
2023
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10426189/ https://www.ncbi.nlm.nih.gov/pubmed/37580835 http://dx.doi.org/10.1186/s13326-023-00291-x |
Sumario: | BACKGROUND: Predicting gene-disease associations typically requires exploring diverse sources of information as well as sophisticated computational approaches. Knowledge graph embeddings can help tackle these challenges by creating representations of genes and diseases based on the scientific knowledge described in ontologies, which can then be explored by machine learning algorithms. However, state-of-the-art knowledge graph embeddings are produced over a single ontology or multiple but disconnected ones, ignoring the impact that considering multiple interconnected domains can have on complex tasks such as gene-disease association prediction. RESULTS: We propose a novel approach to predict gene-disease associations using rich semantic representations based on knowledge graph embeddings over multiple ontologies linked by logical definitions and compound ontology mappings. The experiments showed that considering richer knowledge graphs significantly improves gene-disease prediction and that different knowledge graph embeddings methods benefit more from distinct types of semantic richness. CONCLUSIONS: This work demonstrated the potential for knowledge graph embeddings across multiple and interconnected biomedical ontologies to support gene-disease prediction. It also paved the way for considering other ontologies or tackling other tasks where multiple perspectives over the data can be beneficial. All software and data are freely available. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13326-023-00291-x. |
---|