Can accurate demographic information about people who use prescription medications nonmedically be derived from Twitter?

Traditional substance use (SU) surveillance methods, such as surveys, incur substantial lags. Due to the continuously evolving trends in SU, insights obtained via such methods are often outdated. Social media-based sources have been proposed for obtaining timely insights, but methods leveraging such...

Bibliographic Details
Main Authors: Yang, Yuan-Chi, Al-Garadi, Mohammed Ali, Love, Jennifer S., Cooper, Hannah L. F., Perrone, Jeanmarie, Sarker, Abeed
Format: Online Article Text
Language: English
Published: National Academy of Sciences 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9974473/
https://www.ncbi.nlm.nih.gov/pubmed/36787355
http://dx.doi.org/10.1073/pnas.2207391120
author Yang, Yuan-Chi
Al-Garadi, Mohammed Ali
Love, Jennifer S.
Cooper, Hannah L. F.
Perrone, Jeanmarie
Sarker, Abeed
collection PubMed
description Traditional substance use (SU) surveillance methods, such as surveys, incur substantial lags. Due to the continuously evolving trends in SU, insights obtained via such methods are often outdated. Social media-based sources have been proposed for obtaining timely insights, but methods leveraging such data cannot typically provide fine-grained statistics about subpopulations, unlike traditional approaches. We address this gap by developing methods for automatically characterizing a large Twitter nonmedical prescription medication use (NPMU) cohort (n = 288,562) in terms of age-group, race, and gender. Our natural language processing and machine learning methods for automated cohort characterization achieved 0.88 precision (95% CI: 0.84 to 0.92) for age-group, 0.90 (95% CI: 0.85 to 0.95) for race, and 94% accuracy (95% CI: 92 to 97) for gender, when evaluated against manually annotated gold-standard data. We compared automatically derived statistics for NPMU of tranquilizers, stimulants, and opioids from Twitter with statistics reported in the National Survey on Drug Use and Health (NSDUH) and the National Emergency Department Sample (NEDS). Distributions automatically estimated from Twitter were mostly consistent with the NSDUH [Spearman r: race: 0.98 (P < 0.005); age-group: 0.67 (P < 0.005); gender: 0.66 (P = 0.27)] and NEDS, with 34/65 (52.3%) of the Twitter-based estimates lying within 95% CIs of estimates from the traditional sources. Explainable differences (e.g., overrepresentation of younger people) were found for age-group-related statistics. Our study demonstrates that accurate subpopulation-specific estimates about SU, particularly NPMU, may be automatically derived from Twitter to obtain earlier insights about targeted subpopulations compared to traditional surveillance approaches.
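The two comparison statistics named in the abstract, a Spearman rank correlation between two distributions and the share of estimates that fall inside a reference 95% CI, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the toy proportions and intervals are invented and are not data from the study, and the record does not say how the authors computed their figures. Only the 34/65 ratio is taken from the abstract.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def share_within_ci(estimates, intervals):
    """Count estimates lying inside their paired (low, high) interval."""
    hits = sum(low <= est <= high for est, (low, high) in zip(estimates, intervals))
    return hits, hits / len(estimates)


# Hypothetical toy proportions for four subgroups (NOT data from the study).
twitter = [0.62, 0.21, 0.09, 0.07]
survey = [0.58, 0.25, 0.08, 0.09]
survey_ci = [(0.55, 0.61), (0.20, 0.30), (0.05, 0.11), (0.06, 0.12)]

rho = spearman(twitter, survey)
hits, share = share_within_ci(twitter, survey_ci)
print(f"Spearman r = {rho:.2f}; {hits}/{len(twitter)} estimates within CI ({share:.1%})")

# The ratio reported in the abstract: 34 of 65 estimates.
print(f"34/65 = {34 / 65:.1%}")
```

With only a handful of subgroups, as in the gender comparison, a rank correlation can be sizeable and still not statistically significant, which is consistent with the P = 0.27 reported for gender.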
format Online
Article
Text
id pubmed-9974473
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher National Academy of Sciences
record_format MEDLINE/PubMed
spelling pubmed-9974473 2023-03-02 Proc Natl Acad Sci U S A Physical Sciences National Academy of Sciences 2023-02-14 2023-02-21 /pmc/articles/PMC9974473/ /pubmed/36787355 http://dx.doi.org/10.1073/pnas.2207391120 Text en Copyright © 2023 the Author(s). Published by PNAS. This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND) (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title Can accurate demographic information about people who use prescription medications nonmedically be derived from Twitter?
topic Physical Sciences
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9974473/
https://www.ncbi.nlm.nih.gov/pubmed/36787355
http://dx.doi.org/10.1073/pnas.2207391120