Predicting Cardiovascular Risk Using Social Media Data: Performance Evaluation of Machine-Learning Models
BACKGROUND: Current atherosclerotic cardiovascular disease (ASCVD) predictive models have limitations; thus, efforts are underway to improve the discriminatory power of ASCVD models. OBJECTIVE: We sought to evaluate the discriminatory power of social media posts to predict the 10-year risk for ASCVD...
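Main Authors: | Andy, Anietie U; Guntuku, Sharath C; Adusumalli, Srinath; Asch, David A; Groeneveld, Peter W; Ungar, Lyle H; Merchant, Raina M |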
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8411430/ https://www.ncbi.nlm.nih.gov/pubmed/33605888 http://dx.doi.org/10.2196/24473 |
_version_ | 1783747295599132672 |
---|---|
author | Andy, Anietie U; Guntuku, Sharath C; Adusumalli, Srinath; Asch, David A; Groeneveld, Peter W; Ungar, Lyle H; Merchant, Raina M |
collection | PubMed |
description | BACKGROUND: Current atherosclerotic cardiovascular disease (ASCVD) predictive models have limitations; thus, efforts are underway to improve the discriminatory power of ASCVD models. OBJECTIVE: We sought to evaluate the discriminatory power of social media posts to predict the 10-year risk for ASCVD as compared to that of pooled cohort risk equations (PCEs). METHODS: We obtained consent from patients receiving care in an urban academic emergency department to share access to their Facebook posts and electronic medical records (EMRs). We retrieved Facebook status updates up to 5 years prior to study enrollment for all consenting patients. We identified patients (N=181) who had no prior history of coronary heart disease, had an ASCVD score in their EMR, and had more than 200 words in their Facebook posts. Using Facebook posts from these patients, we applied a machine-learning model to predict 10-year ASCVD risk scores. Using a machine-learning model and a psycholinguistic dictionary, Linguistic Inquiry and Word Count, we evaluated whether language from posts alone could predict differences in risk scores and the association of certain words with risk categories, respectively. RESULTS: The machine-learning model predicted the 10-year ASCVD risk scores for the categories <5%, 5%-7.4%, 7.5%-9.9%, and ≥10% with area under the curve (AUC) values of 0.78, 0.57, 0.72, and 0.61, respectively. The machine-learning model distinguished between low risk (<10%) and high risk (>10%) with an AUC of 0.69. Additionally, the machine-learning model predicted the ASCVD risk score with Pearson r=0.26. Using Linguistic Inquiry and Word Count, patients with higher ASCVD scores were more likely to use words associated with sadness (r=0.32). CONCLUSIONS: Language used on social media can provide insights about an individual’s ASCVD risk and inform approaches to risk modification. |
format | Online Article Text |
id | pubmed-8411430 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-8411430 2021-09-02 Predicting Cardiovascular Risk Using Social Media Data: Performance Evaluation of Machine-Learning Models. Andy, Anietie U; Guntuku, Sharath C; Adusumalli, Srinath; Asch, David A; Groeneveld, Peter W; Ungar, Lyle H; Merchant, Raina M. JMIR Cardio. Original Paper. JMIR Publications 2021-02-19 /pmc/articles/PMC8411430/ /pubmed/33605888 http://dx.doi.org/10.2196/24473 Text en ©Anietie U Andy, Sharath C Guntuku, Srinath Adusumalli, David A Asch, Peter W Groeneveld, Lyle H Ungar, Raina M Merchant. Originally published in JMIR Cardio (http://cardio.jmir.org), 19.02.2021. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Cardio, is properly cited. The complete bibliographic information, a link to the original publication on http://cardio.jmir.org, as well as this copyright and license information must be included. |
title | Predicting Cardiovascular Risk Using Social Media Data: Performance Evaluation of Machine-Learning Models |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8411430/ https://www.ncbi.nlm.nih.gov/pubmed/33605888 http://dx.doi.org/10.2196/24473 |
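
The abstract in this record reports two kinds of evaluation metrics: an AUC for separating low-risk from high-risk patients, and a Pearson correlation between predicted and recorded ASCVD risk scores. As a rough illustration of what those numbers measure, the sketch below computes both in plain Python. It is not the authors' pipeline: the function names, the toy risk values, and the use of 10% as an inclusive high-risk cutoff are assumptions made only for this example.

```python
from math import sqrt


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data (percent 10-year risk); not taken from the study.
recorded_risk = [2.0, 4.5, 6.0, 8.0, 11.0, 15.0]
predicted_risk = [3.0, 9.5, 5.0, 6.5, 9.0, 12.0]

# Assumed cutoff: treat a recorded risk of 10% or more as "high risk".
high_risk = [1 if r >= 10.0 else 0 for r in recorded_risk]

if __name__ == "__main__":
    print("AUC (low vs high risk):", roc_auc(high_risk, predicted_risk))
    print("Pearson r:", pearson_r(recorded_risk, predicted_risk))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives some context for the reported values between 0.57 and 0.78.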