CLEP: a hybrid data- and knowledge-driven framework for generating patient representations
SUMMARY: As machine learning and artificial intelligence increasingly attain a larger number of applications in the biomedical domain, at their core, their utility depends on the data used to train them. Due to the complexity and high dimensionality of biomedical data, there is a need for approaches...
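Main Authors: | Bharadhwaj, Vinay Srinivas; Ali, Mehdi; Birkenbihl, Colin; Mubeen, Sarah; Lehmann, Jens; Hofmann-Apitius, Martin; Hoyt, Charles Tapley; Domingo-Fernández, Daniel |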
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8504642/ https://www.ncbi.nlm.nih.gov/pubmed/33964127 http://dx.doi.org/10.1093/bioinformatics/btab340 |
_version_ | 1784581361562025984 |
---|---|
author | Bharadhwaj, Vinay Srinivas Ali, Mehdi Birkenbihl, Colin Mubeen, Sarah Lehmann, Jens Hofmann-Apitius, Martin Hoyt, Charles Tapley Domingo-Fernández, Daniel |
author_sort | Bharadhwaj, Vinay Srinivas |
collection | PubMed |
description | SUMMARY: As machine learning and artificial intelligence increasingly attain a larger number of applications in the biomedical domain, at their core, their utility depends on the data used to train them. Due to the complexity and high dimensionality of biomedical data, there is a need for approaches that combine prior knowledge around known biological interactions with patient data. Here, we present CLinical Embedding of Patients (CLEP), a novel approach that generates new patient representations by leveraging both prior knowledge and patient-level data. First, given a patient-level dataset and a knowledge graph containing relations across features that can be mapped to the dataset, CLEP incorporates patients into the knowledge graph as new nodes connected to their most characteristic features. Next, CLEP employs knowledge graph embedding models to generate new patient representations that can ultimately be used for a variety of downstream tasks, ranging from clustering to classification. We demonstrate how using new patient representations generated by CLEP significantly improves performance in classifying between patients and healthy controls for a variety of machine learning models, as compared to the use of the original transcriptomics data. Furthermore, we also show how incorporating patients into a knowledge graph can foster the interpretation and identification of biological features characteristic of a specific disease or patient subgroup. Finally, we released CLEP as an open source Python package together with examples and documentation. AVAILABILITY AND IMPLEMENTATION: CLEP is available to the bioinformatics community as an open source Python package at https://github.com/hybrid-kg/clep under the Apache 2.0 License. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-8504642 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8504642 2021-10-13 CLEP: a hybrid data- and knowledge-driven framework for generating patient representations Bharadhwaj, Vinay Srinivas Ali, Mehdi Birkenbihl, Colin Mubeen, Sarah Lehmann, Jens Hofmann-Apitius, Martin Hoyt, Charles Tapley Domingo-Fernández, Daniel Bioinformatics Original Papers SUMMARY: As machine learning and artificial intelligence increasingly attain a larger number of applications in the biomedical domain, at their core, their utility depends on the data used to train them. Due to the complexity and high dimensionality of biomedical data, there is a need for approaches that combine prior knowledge around known biological interactions with patient data. Here, we present CLinical Embedding of Patients (CLEP), a novel approach that generates new patient representations by leveraging both prior knowledge and patient-level data. First, given a patient-level dataset and a knowledge graph containing relations across features that can be mapped to the dataset, CLEP incorporates patients into the knowledge graph as new nodes connected to their most characteristic features. Next, CLEP employs knowledge graph embedding models to generate new patient representations that can ultimately be used for a variety of downstream tasks, ranging from clustering to classification. We demonstrate how using new patient representations generated by CLEP significantly improves performance in classifying between patients and healthy controls for a variety of machine learning models, as compared to the use of the original transcriptomics data. Furthermore, we also show how incorporating patients into a knowledge graph can foster the interpretation and identification of biological features characteristic of a specific disease or patient subgroup. Finally, we released CLEP as an open source Python package together with examples and documentation. AVAILABILITY AND IMPLEMENTATION: CLEP is available to the bioinformatics community as an open source Python package at https://github.com/hybrid-kg/clep under the Apache 2.0 License. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2021-05-08 /pmc/articles/PMC8504642/ /pubmed/33964127 http://dx.doi.org/10.1093/bioinformatics/btab340 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | CLEP: a hybrid data- and knowledge-driven framework for generating patient representations |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8504642/ https://www.ncbi.nlm.nih.gov/pubmed/33964127 http://dx.doi.org/10.1093/bioinformatics/btab340 |
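
The description above says that patients are added to the knowledge graph as new nodes connected to their most characteristic features. The following is a minimal sketch of that idea only, not CLEP's actual API. The function name `patient_edges`, the relation labels `up` and `down`, the z-score criterion against healthy controls, and the toy data are all assumptions made for illustration; the record does not state how CLEP selects characteristic features.

```python
from statistics import mean, stdev


def patient_edges(patients, controls, z_cutoff=1.5):
    """Return (patient, relation, feature) triples for strongly deviating features.

    Hypothetical criterion: a feature is "characteristic" of a patient when its
    value lies at least `z_cutoff` standard deviations from the control mean.
    """
    # Per-feature mean and sample standard deviation across the controls.
    features = sorted({f for values in controls.values() for f in values})
    stats = {}
    for f in features:
        vals = [c[f] for c in controls.values() if f in c]
        if len(vals) >= 2:
            stats[f] = (mean(vals), stdev(vals))

    triples = []
    for pid, values in patients.items():
        for f, x in values.items():
            if f not in stats:
                continue
            mu, sd = stats[f]
            if sd == 0:
                continue  # no spread in controls, z-score undefined
            z = (x - mu) / sd
            if z >= z_cutoff:
                triples.append((pid, "up", f))
            elif z <= -z_cutoff:
                triples.append((pid, "down", f))
    return triples


# Toy prior-knowledge graph and toy measurements (all values invented).
knowledge_graph = [("GENE_A", "interacts_with", "GENE_B")]
controls = {
    "control_1": {"GENE_A": 1.0, "GENE_B": 5.0},
    "control_2": {"GENE_A": 1.2, "GENE_B": 5.2},
    "control_3": {"GENE_A": 0.8, "GENE_B": 4.8},
}
patients = {"patient_1": {"GENE_A": 3.0, "GENE_B": 5.1}}

# Patients become new nodes in the same triple list as the prior knowledge.
combined = knowledge_graph + patient_edges(patients, controls)
print(combined)
```

The combined triple list is the kind of input a knowledge graph embedding model could then be trained on to obtain one vector per patient node; that embedding step is not shown here.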