
CLEP: a hybrid data- and knowledge-driven framework for generating patient representations

SUMMARY: As machine learning and artificial intelligence increasingly attain a larger number of applications in the biomedical domain, at their core, their utility depends on the data used to train them. Due to the complexity and high dimensionality of biomedical data, there is a need for approaches...


Bibliographic Details
Main Authors: Bharadhwaj, Vinay Srinivas, Ali, Mehdi, Birkenbihl, Colin, Mubeen, Sarah, Lehmann, Jens, Hofmann-Apitius, Martin, Hoyt, Charles Tapley, Domingo-Fernández, Daniel
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8504642/
https://www.ncbi.nlm.nih.gov/pubmed/33964127
http://dx.doi.org/10.1093/bioinformatics/btab340
author Bharadhwaj, Vinay Srinivas
Ali, Mehdi
Birkenbihl, Colin
Mubeen, Sarah
Lehmann, Jens
Hofmann-Apitius, Martin
Hoyt, Charles Tapley
Domingo-Fernández, Daniel
author_sort Bharadhwaj, Vinay Srinivas
collection PubMed
description SUMMARY: As machine learning and artificial intelligence increasingly attain a larger number of applications in the biomedical domain, at their core, their utility depends on the data used to train them. Due to the complexity and high dimensionality of biomedical data, there is a need for approaches that combine prior knowledge around known biological interactions with patient data. Here, we present CLinical Embedding of Patients (CLEP), a novel approach that generates new patient representations by leveraging both prior knowledge and patient-level data. First, given a patient-level dataset and a knowledge graph containing relations across features that can be mapped to the dataset, CLEP incorporates patients into the knowledge graph as new nodes connected to their most characteristic features. Next, CLEP employs knowledge graph embedding models to generate new patient representations that can ultimately be used for a variety of downstream tasks, ranging from clustering to classification. We demonstrate how using new patient representations generated by CLEP significantly improves performance in classifying between patients and healthy controls for a variety of machine learning models, as compared to the use of the original transcriptomics data. Furthermore, we also show how incorporating patients into a knowledge graph can foster the interpretation and identification of biological features characteristic of a specific disease or patient subgroup. Finally, we released CLEP as an open source Python package together with examples and documentation. AVAILABILITY AND IMPLEMENTATION: CLEP is available to the bioinformatics community as an open source Python package at https://github.com/hybrid-kg/clep under the Apache 2.0 License. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
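The first step in the description (adding patients to the knowledge graph as nodes linked to their most characteristic features) can be illustrated with a minimal sketch. This is not the CLEP package's API: the function name `patient_edges`, the z-score criterion against healthy controls, the threshold of 2.0, the relation labels `up` and `down`, and the toy gene names are all assumptions made for illustration. The embedding step is omitted; the resulting triples would be passed to a knowledge graph embedding library of choice.

```python
from statistics import mean, stdev


def patient_edges(patients, controls, threshold=2.0):
    """Return (patient, relation, feature) triples.

    Hypothetical criterion: a feature is "characteristic" of a patient
    when its z-score relative to the control group reaches the threshold.
    """
    features = sorted(next(iter(controls.values())))

    # Mean and sample standard deviation of each feature across controls.
    stats = {}
    for feature in features:
        values = [row[feature] for row in controls.values()]
        stats[feature] = (mean(values), stdev(values))

    triples = []
    for patient_id, row in sorted(patients.items()):
        for feature in features:
            mu, sigma = stats[feature]
            if sigma == 0:
                continue  # no variation among controls; z-score undefined
            z = (row[feature] - mu) / sigma
            if z >= threshold:
                triples.append((patient_id, "up", feature))
            elif z <= -threshold:
                triples.append((patient_id, "down", feature))
    return triples


# Toy prior-knowledge graph and expression data (all values invented).
knowledge_graph = [
    ("GENE_A", "increases", "GENE_B"),
    ("GENE_B", "decreases", "GENE_C"),
]
controls = {
    "c1": {"GENE_A": 1.0, "GENE_B": 5.0, "GENE_C": 3.0},
    "c2": {"GENE_A": 1.2, "GENE_B": 5.2, "GENE_C": 3.1},
    "c3": {"GENE_A": 0.8, "GENE_B": 4.8, "GENE_C": 2.9},
}
patients = {"p1": {"GENE_A": 3.0, "GENE_B": 5.1, "GENE_C": 1.0}}

# Hybrid graph: prior-knowledge triples plus the new patient-feature edges.
hybrid_graph = knowledge_graph + patient_edges(patients, controls)
```

In this toy data, patient `p1` is linked only to the two genes whose values fall far outside the control range, so the patient node enters the graph through a small number of informative edges rather than one edge per measured feature.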
format Online
Article
Text
id pubmed-8504642
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8504642 2021-10-13 CLEP: a hybrid data- and knowledge-driven framework for generating patient representations Bioinformatics Original Papers Oxford University Press 2021-05-08 /pmc/articles/PMC8504642/ /pubmed/33964127 http://dx.doi.org/10.1093/bioinformatics/btab340 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title CLEP: a hybrid data- and knowledge-driven framework for generating patient representations
topic Original Papers