Identifying disease trajectories with predicate information from a knowledge graph

BACKGROUND: Knowledge graphs can represent the contents of biomedical literature and databases as subject-predicate-object triples, thereby enabling comprehensive analyses that identify e.g. relationships between diseases. Some diseases are often diagnosed in patients in specific temporal sequences,...

Bibliographic Details
Main Authors: Vlietstra, Wytze J., Vos, Rein, van den Akker, Marjan, van Mulligen, Erik M., Kors, Jan A.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7439632/
https://www.ncbi.nlm.nih.gov/pubmed/32819419
http://dx.doi.org/10.1186/s13326-020-00228-8
author Vlietstra, Wytze J.
Vos, Rein
van den Akker, Marjan
van Mulligen, Erik M.
Kors, Jan A.
collection PubMed
description BACKGROUND: Knowledge graphs can represent the contents of biomedical literature and databases as subject-predicate-object triples, thereby enabling comprehensive analyses that identify e.g. relationships between diseases. Some diseases are often diagnosed in patients in specific temporal sequences, which are referred to as disease trajectories. Here, we determine whether a sequence of two diseases forms a trajectory by leveraging the predicate information from paths between (disease) proteins in a knowledge graph. Furthermore, we determine the added value of directional information of predicates for this task. To do so, we create four feature sets, based on two methods for representing indirect paths, and both with and without directional information of predicates (i.e., which protein is considered subject and which object). The added value of the directional information of predicates is quantified by comparing the classification performance of the feature sets that include or exclude it. RESULTS: Our method achieved a maximum area under the ROC curve of 89.8% and 74.5% when evaluated with two different reference sets. Use of directional information of predicates significantly improved performance by 6.5 and 2.0 percentage points respectively. CONCLUSIONS: Our work demonstrates that predicates between proteins can be used to identify disease trajectories. Using the directional information of predicates significantly improved performance over not using this information.
format Online
Article
Text
id pubmed-7439632
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7439632 2020-08-21 J Biomed Semantics Research
BioMed Central 2020-08-20 /pmc/articles/PMC7439632/ /pubmed/32819419 http://dx.doi.org/10.1186/s13326-020-00228-8 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Identifying disease trajectories with predicate information from a knowledge graph
topic Research