Ensembles of knowledge graph embedding models improve predictions for drug discovery

Recent advances in Knowledge Graphs (KGs) and Knowledge Graph Embedding Models (KGEMs) have led to their adoption in a broad range of fields and applications. The current publishing system in machine learning requires newly introduced KGEMs to achieve state-of-the-art performance, surpassing at leas...

Bibliographic Details
Main authors:	Rivas-Barragan, Daniel; Domingo-Fernández, Daniel; Gadiya, Yojana; Healey, David
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2022
Subjects:	Problem Solving Protocol
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9677479/ https://www.ncbi.nlm.nih.gov/pubmed/36384050 http://dx.doi.org/10.1093/bib/bbac481

_version_	1784833820186378240
author	Rivas-Barragan, Daniel; Domingo-Fernández, Daniel; Gadiya, Yojana; Healey, David
author_facet	Rivas-Barragan, Daniel; Domingo-Fernández, Daniel; Gadiya, Yojana; Healey, David
author_sort	Rivas-Barragan, Daniel
collection	PubMed
description	Recent advances in Knowledge Graphs (KGs) and Knowledge Graph Embedding Models (KGEMs) have led to their adoption in a broad range of fields and applications. The current publishing system in machine learning requires newly introduced KGEMs to achieve state-of-the-art performance, surpassing at least one benchmark in order to be published. Despite this, dozens of novel architectures are published every year, making it challenging for users, even within the field, to deduce the most suitable configuration for a given application. A typical biomedical application of KGEMs is drug–disease prediction in the context of drug discovery, in which a KGEM is trained to predict triples linking drugs and diseases. These predictions can be later tested in clinical trials following extensive experimental validation. However, given the infeasibility of evaluating each of these predictions and that only a minimal number of candidates can be experimentally tested, models that yield higher precision on the top prioritized triples are preferred. In this paper, we apply the concept of ensemble learning on KGEMs for drug discovery to assess whether combining the predictions of several models can lead to an overall improvement in predictive performance. First, we trained and benchmarked 10 KGEMs to predict drug–disease triples on two independent biomedical KGs designed for drug discovery. Following, we applied different ensemble methods that aggregate the predictions of these models by leveraging the distribution or the position of the predicted triple scores. We then demonstrate how the ensemble models can achieve better results than the original KGEMs by benchmarking the precision (i.e., number of true positives prioritized) of their top predictions. Lastly, we released the source code presented in this work at https://github.com/enveda/kgem-ensembles-in-drug-discovery.
format	Online Article Text
id	pubmed-9677479
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-9677479 2022-11-21 Ensembles of knowledge graph embedding models improve predictions for drug discovery Rivas-Barragan, Daniel Domingo-Fernández, Daniel Gadiya, Yojana Healey, David Brief Bioinform Problem Solving Protocol Recent advances in Knowledge Graphs (KGs) and Knowledge Graph Embedding Models (KGEMs) have led to their adoption in a broad range of fields and applications. The current publishing system in machine learning requires newly introduced KGEMs to achieve state-of-the-art performance, surpassing at least one benchmark in order to be published. Despite this, dozens of novel architectures are published every year, making it challenging for users, even within the field, to deduce the most suitable configuration for a given application. A typical biomedical application of KGEMs is drug–disease prediction in the context of drug discovery, in which a KGEM is trained to predict triples linking drugs and diseases. These predictions can be later tested in clinical trials following extensive experimental validation. However, given the infeasibility of evaluating each of these predictions and that only a minimal number of candidates can be experimentally tested, models that yield higher precision on the top prioritized triples are preferred. In this paper, we apply the concept of ensemble learning on KGEMs for drug discovery to assess whether combining the predictions of several models can lead to an overall improvement in predictive performance. First, we trained and benchmarked 10 KGEMs to predict drug–disease triples on two independent biomedical KGs designed for drug discovery. Following, we applied different ensemble methods that aggregate the predictions of these models by leveraging the distribution or the position of the predicted triple scores. We then demonstrate how the ensemble models can achieve better results than the original KGEMs by benchmarking the precision (i.e., number of true positives prioritized) of their top predictions. Lastly, we released the source code presented in this work at https://github.com/enveda/kgem-ensembles-in-drug-discovery. Oxford University Press 2022-11-16 /pmc/articles/PMC9677479/ /pubmed/36384050 http://dx.doi.org/10.1093/bib/bbac481 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Problem Solving Protocol Rivas-Barragan, Daniel Domingo-Fernández, Daniel Gadiya, Yojana Healey, David Ensembles of knowledge graph embedding models improve predictions for drug discovery
title	Ensembles of knowledge graph embedding models improve predictions for drug discovery
title_full	Ensembles of knowledge graph embedding models improve predictions for drug discovery
title_fullStr	Ensembles of knowledge graph embedding models improve predictions for drug discovery
title_full_unstemmed	Ensembles of knowledge graph embedding models improve predictions for drug discovery
title_short	Ensembles of knowledge graph embedding models improve predictions for drug discovery
title_sort	ensembles of knowledge graph embedding models improve predictions for drug discovery
topic	Problem Solving Protocol
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9677479/ https://www.ncbi.nlm.nih.gov/pubmed/36384050 http://dx.doi.org/10.1093/bib/bbac481
work_keys_str_mv	AT rivasbarragandaniel ensemblesofknowledgegraphembeddingmodelsimprovepredictionsfordrugdiscovery AT domingofernandezdaniel ensemblesofknowledgegraphembeddingmodelsimprovepredictionsfordrugdiscovery AT gadiyayojana ensemblesofknowledgegraphembeddingmodelsimprovepredictionsfordrugdiscovery AT healeydavid ensemblesofknowledgegraphembeddingmodelsimprovepredictionsfordrugdiscovery

Similar Items