
Identifying oral disease variables associated with pneumonia emergence by application of machine learning to integrated medical and dental big data to inform eHealth approaches

BACKGROUND: The objective of this study was to apply supervised machine learning (ML) to medical and oral disease data to build models that define key risk variables contributing to the emergence of any pneumonia or pneumonia subtype. METHOD...


Bibliographic Details
Main authors: Shimpi, Neel; Glurich, Ingrid; Panny, Aloksagar; Hegde, Harshad; Scannapieco, Frank A.; Acharya, Amit
Format: Online Article Text
Language: English
Published: 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9835559/
https://www.ncbi.nlm.nih.gov/pubmed/36643095
http://dx.doi.org/10.3389/fdmed.2022.1005140
_version_ 1784868693062189056
author Shimpi, Neel
Glurich, Ingrid
Panny, Aloksagar
Hegde, Harshad
Scannapieco, Frank A.
Acharya, Amit
collection PubMed
description BACKGROUND: The objective of this study was to apply supervised machine learning (ML) to medical and oral disease data to build models that define key risk variables contributing to the emergence of any pneumonia or pneumonia subtype. METHODS: Retrospective medical and dental data were retrieved from Marshfield Clinic Health System’s data warehouse and integrated electronic medical-dental health records (iEHR). Retrieved data were pre-processed prior to analysis; pre-processing included matching cases to controls (a) by race/ethnicity and (b) at a 1:1 case:control ratio. Variables with >30% missing data were excluded from analysis. Datasets were divided into four subsets: (1) All Pneumonia (all cases and controls); (2) community-acquired (CAP)/healthcare-associated (HCAP) pneumonias; (3) ventilator-associated (VAP)/hospital-acquired (HAP) pneumonias and (4) aspiration pneumonia (AP). Performance of five algorithms was compared across the four subsets: Naïve Bayes, Logistic Regression, Support Vector Machine (SVM), Multi-Layer Perceptron (MLP) and Random Forests. Feature (input variable) selection and ten-fold cross-validation were performed on all datasets. An evaluation set (10%) was extracted from the subsets for further validation. Model performance was evaluated in terms of total accuracy, sensitivity, specificity, F-measure, Matthews correlation coefficient and area under the receiver operating characteristic curve (AUC). RESULTS: In total, 6,034 records (cases and controls) met eligibility for inclusion in the main dataset. After feature selection, the variables retained in the subsets were: All Pneumonia (n = 29 variables), CAP-HCAP (n = 26 variables), VAP-HAP (n = 40 variables) and AP (n = 37 variables). Of the retained variables, 22 were common to all four pneumonia subsets. Of these, the number of missing teeth, periodontal status, periodontal pocket depth greater than 5 mm and number of restored teeth contributed to all the subsets and were retained in the model. MLP outperformed the other predictive models for the All Pneumonia, CAP-HCAP and AP subsets, while SVM outperformed the other models for the VAP-HAP subset. CONCLUSION: This study validates previously described associations between poor oral health and pneumonia. Benefits of an integrated medical-dental record and care delivery environment for modeling pneumonia risk are highlighted. Based on these findings, risk score development could inform referrals and follow-up in an integrated healthcare delivery environment and coordinated patient management.
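The abstract names several threshold-based performance measures (total accuracy, sensitivity, specificity, F-measure and Matthews correlation coefficient). As a minimal sketch of how these are conventionally derived from a binary confusion matrix, the Python below uses the standard textbook formulas. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def _safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fn count cases (pneumonia) predicted correctly/incorrectly;
    tn/fp count controls predicted correctly/incorrectly.
    """
    sensitivity = _safe_div(tp, tp + fn)   # true positive rate (recall)
    specificity = _safe_div(tn, tn + fp)   # true negative rate
    precision = _safe_div(tp, tp + fp)     # positive predictive value
    accuracy = _safe_div(tp + tn, tp + fp + tn + fn)
    # F-measure: harmonic mean of precision and sensitivity
    f_measure = _safe_div(2 * precision * sensitivity, precision + sensitivity)
    # Matthews correlation coefficient, ranging from -1 to +1
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _safe_div(tp * tn - fp * fn, mcc_den)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not results from the study).
example = binary_metrics(tp=80, fp=20, tn=80, fn=20)
```

With a 1:1 case:control design such as the one described, accuracy and the Matthews correlation coefficient are computed on balanced classes, so neither is skewed by class imbalance.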
format Online
Article
Text
id pubmed-9835559
institution National Center for Biotechnology Information
language English
publishDate 2022
record_format MEDLINE/PubMed
spelling pubmed-9835559 2023-01-12 Front Dent Med Article 2022 2022-09-22 /pmc/articles/PMC9835559/ /pubmed/36643095 http://dx.doi.org/10.3389/fdmed.2022.1005140 Text en https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) (https://creativecommons.org/licenses/by/4.0/). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Identifying oral disease variables associated with pneumonia emergence by application of machine learning to integrated medical and dental big data to inform eHealth approaches
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9835559/
https://www.ncbi.nlm.nih.gov/pubmed/36643095
http://dx.doi.org/10.3389/fdmed.2022.1005140