Non-negative matrix factorization temporal topic models and clinical text data identify COVID-19 pandemic effects on primary healthcare and community health in Toronto, Canada
Main authors: | , , , , , , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Published by Elsevier Inc., 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8861144/ https://www.ncbi.nlm.nih.gov/pubmed/35202844 http://dx.doi.org/10.1016/j.jbi.2022.104034 |
Summary: | OBJECTIVE: To demonstrate how non-negative matrix factorization can be used to learn a temporal topic model over a large collection of primary care clinical notes, characterizing diverse COVID-19 pandemic effects on the physical/mental/social health of residents of Toronto, Canada. MATERIALS AND METHODS: The study employs a retrospective open cohort design, consisting of 382,666 primary care progress notes from 44,828 patients, 54 physicians, and 12 clinics collected from January 1, 2017 through December 31, 2020. Non-negative matrix factorization uncovers a meaningful latent topical structure permeating the corpus of primary care notes. The learned latent topical basis is transformed into a multivariate time series data structure. Time series methods and plots showcase the evolution/dynamics of learned topics over the study period and allow the identification of COVID-19 pandemic effects. We perform several post-hoc checks of model robustness to increase trust that descriptive/unsupervised inferences are stable over hyper-parameter configurations and/or data perturbations. RESULTS: Temporal topic modelling uncovers a myriad of pandemic-related effects from the expressive clinical text data. In terms of direct effects on patient health, topics encoding respiratory disease symptoms display altered dynamics during the pandemic year. Further, the pandemic was associated with a multitude of indirect patient-level effects on topical domains representing mental health, sleep, social and familial dynamics, measurement of vitals/labs, uptake of prevention/screening maneuvers, and referrals to medical specialists. Finally, topic models capture changes in primary care practice patterns resulting from the pandemic, including changes in EMR documentation strategies and the uptake of telemedicine.
CONCLUSION: Temporal topic modelling applied to a large corpus of rich primary care clinical text data can identify a meaningful topical/thematic summarization, which can provide policymakers and public health stakeholders with a passive, cost-effective technology for understanding the holistic impacts of the COVID-19 pandemic on the primary healthcare system and community/public health. |
---|
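
The pipeline summarized in the abstract (factorize a note-by-term matrix with non-negative matrix factorization, then turn the per-note topic weights into a time series) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy notes, the choice of two topics, the raw-count weighting, the multiplicative-update solver, and the helper names `build_count_matrix`, `nmf`, and `monthly_topic_series` are all assumptions made for illustration, since the record does not state those details.

```python
import numpy as np
from collections import defaultdict

# Hypothetical toy corpus (NOT study data): (month, note text).
NOTES = [
    ("2019-11", "cough fever sore throat"),
    ("2019-11", "anxiety sleep stress"),
    ("2020-04", "cough fever phone visit"),
    ("2020-04", "anxiety stress sleep phone visit"),
    ("2020-04", "fever cough phone visit"),
]


def build_count_matrix(texts):
    """Build a notes-by-terms matrix of raw word counts (assumed weighting)."""
    vocab = sorted({w for t in texts for w in t.split()})
    index = {w: j for j, w in enumerate(vocab)}
    X = np.zeros((len(texts), len(vocab)))
    for i, t in enumerate(texts):
        for w in t.split():
            X[i, index[w]] += 1.0
    return X, vocab


def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate X ~ W @ H with non-negative W (notes x k) and H (k x terms).

    Uses multiplicative updates for the squared Frobenius loss; this solver
    is an illustrative choice, not necessarily the one used in the study.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + eps
    H = rng.random((k, X.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def monthly_topic_series(months, W):
    """Average the per-note topic weights within each month."""
    groups = defaultdict(list)
    for month, row in zip(months, W):
        groups[month].append(row)
    return {m: np.mean(rows, axis=0) for m, rows in sorted(groups.items())}


months = [m for m, _ in NOTES]
X, vocab = build_count_matrix([t for _, t in NOTES])
W, H = nmf(X, k=2)
series = monthly_topic_series(months, W)

for month, weights in series.items():
    print(month, np.round(weights, 3))

# Top terms per topic, read off the rows of H.
for topic, row in enumerate(H):
    top = [vocab[j] for j in np.argsort(row)[::-1][:3]]
    print("topic", topic, top)
```

Each row of `H` is a topic expressed as term weights, and each row of `W` gives one note's loading on those topics. Averaging `W` by month yields one time series per topic, which is the kind of structure the abstract describes inspecting for pandemic-related shifts.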