Automatic Prediction of Recurrence of Major Cardiovascular Events: A Text Mining Study Using Chest X-Ray Reports

METHODS: We used EHR data of patients included in the Second Manifestations of ARTerial disease (SMART) study. We propose a deep learning-based multimodal architecture for our text mining pipeline that integrates neural text representation with preprocessed clinical predictors for the prediction of...

Bibliographic Details
Main Authors: Bagheri, Ayoub; Groenhof, T. Katrien J.; Asselbergs, Folkert W.; Haitjema, Saskia; Bots, Michiel L.; Veldhuis, Wouter B.; de Jong, Pim A.; Oberski, Daniel L.
Format: Online Article Text
Language: English
Published: Hindawi 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8285182/
https://www.ncbi.nlm.nih.gov/pubmed/34306597
http://dx.doi.org/10.1155/2021/6663884
_version_ 1783723505781571584
author Bagheri, Ayoub
Groenhof, T. Katrien J.
Asselbergs, Folkert W.
Haitjema, Saskia
Bots, Michiel L.
Veldhuis, Wouter B.
de Jong, Pim A.
Oberski, Daniel L.
author_sort Bagheri, Ayoub
collection PubMed
description METHODS: We used EHR data of patients included in the Second Manifestations of ARTerial disease (SMART) study. We propose a deep learning-based multimodal architecture for our text mining pipeline that integrates neural text representation with preprocessed clinical predictors for the prediction of recurrence of major cardiovascular events in cardiovascular patients. Text preprocessing, including cleaning and stemming, was first applied to filter out the unwanted texts from X-ray radiology reports. Thereafter, text representation methods were used to numerically represent unstructured radiology reports with vectors. Subsequently, these text representation methods were added to prediction models to assess their clinical relevance. In this step, we applied logistic regression, support vector machine (SVM), multilayer perceptron neural network, convolutional neural network, long short-term memory (LSTM), and bidirectional LSTM deep neural network (BiLSTM). RESULTS: We performed various experiments to evaluate the added value of the text in the prediction of major cardiovascular events. The two main scenarios were the integration of radiology reports (1) with classical clinical predictors and (2) with only age and sex in the case of unavailable clinical predictors. In total, data of 5603 patients were used with 5-fold cross-validation to train the models. In the first scenario, the multimodal BiLSTM (MI-BiLSTM) model achieved an area under the curve (AUC) of 84.7%, misclassification rate of 14.3%, and F1 score of 83.8%. In this scenario, the SVM model, trained on clinical variables and bag-of-words representation, achieved the lowest misclassification rate of 12.2%. In the case of unavailable clinical predictors, the MI-BiLSTM model trained on radiology reports and demographic (age and sex) variables reached an AUC, F1 score, and misclassification rate of 74.5%, 70.8%, and 20.4%, respectively. 
CONCLUSIONS: Using the case study of routine care chest X-ray radiology reports, we demonstrated the clinical relevance of integrating text features and classical predictors in our text mining pipeline for cardiovascular risk prediction. The MI-BiLSTM model with word embedding representation appeared to have a desirable performance when trained on text data integrated with the clinical variables from the SMART study. Our results mined from chest X-ray reports showed that models using text data in addition to laboratory values outperform those using only known clinical predictors.
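The three figures reported in the abstract (AUC, misclassification rate, and F1 score) can be illustrated with a small worked example. This is a minimal sketch, not the authors' evaluation code: the labels and scores are hypothetical toy data, the 0.5 decision threshold is an assumption, and the abstract does not say which F1 averaging was used, so plain binary F1 for the positive class is assumed here.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def misclassification_rate(y_true, y_pred):
    """Fraction of predictions that disagree with the true label."""
    _, fp, _, fn = confusion_counts(y_true, y_pred)
    return (fp + fn) / len(y_true)


def f1_score(y_true, y_pred):
    """Binary F1 for the positive class: 2*TP / (2*TP + FP + FN)."""
    tp, fp, _, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case,
    with ties counted as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = recurrent event, 0 = no recurrent event.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.7]

# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("misclassification rate:", misclassification_rate(y_true, y_pred))
print("F1 score:", round(f1_score(y_true, y_pred), 4))
print("AUC:", auc(y_true, scores))
```

Note that AUC is computed from the raw scores, while the misclassification rate and F1 score depend on the chosen threshold. This is one reason a model can lead on one metric and trail on another, as with the MI-BiLSTM and SVM results above.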
format Online
Article
Text
id pubmed-8285182
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-8285182 2021-07-22 J Healthc Eng Research Article Hindawi 2021-07-09 /pmc/articles/PMC8285182/ /pubmed/34306597 http://dx.doi.org/10.1155/2021/6663884 Text en Copyright © 2021 Ayoub Bagheri et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Automatic Prediction of Recurrence of Major Cardiovascular Events: A Text Mining Study Using Chest X-Ray Reports
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8285182/
https://www.ncbi.nlm.nih.gov/pubmed/34306597
http://dx.doi.org/10.1155/2021/6663884