Identifying oral disease variables associated with pneumonia emergence by application of machine learning to integrated medical and dental big data to inform eHealth approaches
BACKGROUND: The objective of this study was to apply supervised machine learning (ML) to integrated medical and oral disease data to build models that identify the key risk variables contributing to pneumonia emergence, for pneumonia overall and for pneumonia subtypes. METHOD...
Main authors: | Shimpi, Neel; Glurich, Ingrid; Panny, Aloksagar; Hegde, Harshad; Scannapieco, Frank A.; Acharya, Amit |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9835559/ https://www.ncbi.nlm.nih.gov/pubmed/36643095 http://dx.doi.org/10.3389/fdmed.2022.1005140 |
_version_ | 1784868693062189056 |
---|---|
author | Shimpi, Neel; Glurich, Ingrid; Panny, Aloksagar; Hegde, Harshad; Scannapieco, Frank A.; Acharya, Amit |
author_facet | Shimpi, Neel; Glurich, Ingrid; Panny, Aloksagar; Hegde, Harshad; Scannapieco, Frank A.; Acharya, Amit |
author_sort | Shimpi, Neel |
collection | PubMed |
description | BACKGROUND: The objective of this study was to apply supervised machine learning (ML) to integrated medical and oral disease data to build models that identify the key risk variables contributing to pneumonia emergence, for pneumonia overall and for pneumonia subtypes. METHODS: Retrospective medical and dental data were retrieved from Marshfield Clinic Health System’s data warehouse and integrated electronic medical-dental health records (iEHR). Retrieved data were pre-processed prior to analysis, which included matching cases to controls (a) by race/ethnicity and (b) at a 1:1 case:control ratio. Variables with >30% missing data were excluded from analysis. The data were divided into four subsets: (1) All Pneumonia (all cases and controls); (2) community-acquired (CAP)/healthcare-associated (HCAP) pneumonias; (3) ventilator-associated (VAP)/hospital-acquired (HAP) pneumonias; and (4) aspiration pneumonia (AP). The performance of five algorithms was compared across the four subsets: Naïve Bayes, Logistic Regression, Support Vector Machine (SVM), Multi-Layer Perceptron (MLP) and Random Forests. Feature (input variable) selection and ten-fold cross-validation were performed on all the datasets. An evaluation set (10%) was extracted from the subsets for further validation. Model performance was evaluated in terms of total accuracy, sensitivity, specificity, F-measure, Matthews correlation coefficient and area under the receiver operating characteristic curve (AUC). RESULTS: In total, 6,034 records (cases and controls) met eligibility for inclusion in the main dataset. After feature selection, the variables retained in the subsets were: All Pneumonia (n = 29 variables), CAP-HCAP (n = 26 variables), VAP-HAP (n = 40 variables) and AP (n = 37 variables). Twenty-two of the retained variables were common to all four pneumonia subsets. 
Of these, the number of missing teeth, periodontal status, periodontal pocket depth greater than 5 mm and number of restored teeth contributed to all the subsets and were retained in the model. MLP outperformed the other predictive models for the All Pneumonia, CAP-HCAP and AP subsets, while SVM outperformed the other models for the VAP-HAP subset. CONCLUSION: This study validates previously described associations between poor oral health and pneumonia. Benefits of an integrated medical-dental record and care delivery environment for modeling pneumonia risk are highlighted. Based on these findings, risk score development could inform referrals and follow-up in an integrated healthcare delivery environment and coordinated patient management. |
format | Online Article Text |
id | pubmed-9835559 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
record_format | MEDLINE/PubMed |
spelling | pubmed-9835559 2023-01-12 Identifying oral disease variables associated with pneumonia emergence by application of machine learning to integrated medical and dental big data to inform eHealth approaches Shimpi, Neel Glurich, Ingrid Panny, Aloksagar Hegde, Harshad Scannapieco, Frank A. Acharya, Amit Front Dent Med Article BACKGROUND: The objective of this study was to apply supervised machine learning (ML) to integrated medical and oral disease data to build models that identify the key risk variables contributing to pneumonia emergence, for pneumonia overall and for pneumonia subtypes. METHODS: Retrospective medical and dental data were retrieved from Marshfield Clinic Health System’s data warehouse and integrated electronic medical-dental health records (iEHR). Retrieved data were pre-processed prior to analysis, which included matching cases to controls (a) by race/ethnicity and (b) at a 1:1 case:control ratio. Variables with >30% missing data were excluded from analysis. The data were divided into four subsets: (1) All Pneumonia (all cases and controls); (2) community-acquired (CAP)/healthcare-associated (HCAP) pneumonias; (3) ventilator-associated (VAP)/hospital-acquired (HAP) pneumonias; and (4) aspiration pneumonia (AP). The performance of five algorithms was compared across the four subsets: Naïve Bayes, Logistic Regression, Support Vector Machine (SVM), Multi-Layer Perceptron (MLP) and Random Forests. Feature (input variable) selection and ten-fold cross-validation were performed on all the datasets. An evaluation set (10%) was extracted from the subsets for further validation. Model performance was evaluated in terms of total accuracy, sensitivity, specificity, F-measure, Matthews correlation coefficient and area under the receiver operating characteristic curve (AUC). RESULTS: In total, 6,034 records (cases and controls) met eligibility for inclusion in the main dataset. 
After feature selection, the variables retained in the subsets were: All Pneumonia (n = 29 variables), CAP-HCAP (n = 26 variables), VAP-HAP (n = 40 variables) and AP (n = 37 variables). Twenty-two of the retained variables were common to all four pneumonia subsets. Of these, the number of missing teeth, periodontal status, periodontal pocket depth greater than 5 mm and number of restored teeth contributed to all the subsets and were retained in the model. MLP outperformed the other predictive models for the All Pneumonia, CAP-HCAP and AP subsets, while SVM outperformed the other models for the VAP-HAP subset. CONCLUSION: This study validates previously described associations between poor oral health and pneumonia. Benefits of an integrated medical-dental record and care delivery environment for modeling pneumonia risk are highlighted. Based on these findings, risk score development could inform referrals and follow-up in an integrated healthcare delivery environment and coordinated patient management. 2022 2022-09-22 /pmc/articles/PMC9835559/ /pubmed/36643095 http://dx.doi.org/10.3389/fdmed.2022.1005140 Text en https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) (https://creativecommons.org/licenses/by/4.0/). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Article Shimpi, Neel Glurich, Ingrid Panny, Aloksagar Hegde, Harshad Scannapieco, Frank A. Acharya, Amit Identifying oral disease variables associated with pneumonia emergence by application of machine learning to integrated medical and dental big data to inform eHealth approaches |
title | Identifying oral disease variables associated with pneumonia emergence by application of machine learning to integrated medical and dental big data to inform eHealth approaches |
title_full | Identifying oral disease variables associated with pneumonia emergence by application of machine learning to integrated medical and dental big data to inform eHealth approaches |
title_fullStr | Identifying oral disease variables associated with pneumonia emergence by application of machine learning to integrated medical and dental big data to inform eHealth approaches |
title_full_unstemmed | Identifying oral disease variables associated with pneumonia emergence by application of machine learning to integrated medical and dental big data to inform eHealth approaches |
title_short | Identifying oral disease variables associated with pneumonia emergence by application of machine learning to integrated medical and dental big data to inform eHealth approaches |
title_sort | identifying oral disease variables associated with pneumonia emergence by application of machine learning to integrated medical and dental big data to inform ehealth approaches |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9835559/ https://www.ncbi.nlm.nih.gov/pubmed/36643095 http://dx.doi.org/10.3389/fdmed.2022.1005140 |
work_keys_str_mv | AT shimpineel identifyingoraldiseasevariablesassociatedwithpneumoniaemergencebyapplicationofmachinelearningtointegratedmedicalanddentalbigdatatoinformehealthapproaches AT glurichingrid identifyingoraldiseasevariablesassociatedwithpneumoniaemergencebyapplicationofmachinelearningtointegratedmedicalanddentalbigdatatoinformehealthapproaches AT pannyaloksagar identifyingoraldiseasevariablesassociatedwithpneumoniaemergencebyapplicationofmachinelearningtointegratedmedicalanddentalbigdatatoinformehealthapproaches AT hegdeharshad identifyingoraldiseasevariablesassociatedwithpneumoniaemergencebyapplicationofmachinelearningtointegratedmedicalanddentalbigdatatoinformehealthapproaches AT scannapiecofranka identifyingoraldiseasevariablesassociatedwithpneumoniaemergencebyapplicationofmachinelearningtointegratedmedicalanddentalbigdatatoinformehealthapproaches AT acharyaamit identifyingoraldiseasevariablesassociatedwithpneumoniaemergencebyapplicationofmachinelearningtointegratedmedicalanddentalbigdatatoinformehealthapproaches |
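
Two steps named in the abstract lend themselves to a small illustration: excluding variables with more than 30% missing data, and scoring a binary case/control classifier by total accuracy, sensitivity, specificity, F-measure and Matthews correlation coefficient. The sketch below is not the authors' code. The function names, the use of `None` to mark a missing value, and the 1 = case / 0 = control label coding are assumptions made for illustration only. AUC is left out because it requires predicted scores rather than hard labels.

```python
from math import sqrt


def drop_sparse_variables(rows, max_missing=0.30):
    """Return names of variables whose share of missing values is <= max_missing.

    `rows` is a list of dicts (one per record); a missing value is assumed
    to be stored as None. Variable names are taken from the first record.
    """
    if not rows:
        return []
    kept = []
    for name in rows[0]:
        missing = sum(1 for row in rows if row.get(name) is None)
        if missing / len(rows) <= max_missing:
            kept.append(name)
    return kept


def confusion_counts(y_true, y_pred):
    """Count (TP, TN, FP, FN) for binary labels, assuming 1 = case, 0 = control."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def _ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, tn, fp, fn):
    """Compute the label-based metrics listed in the abstract."""
    sensitivity = _ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = _ratio(tn, tn + fp)   # true negative rate
    precision = _ratio(tp, tp + fp)
    f_measure = _ratio(2 * precision * sensitivity, precision + sensitivity)
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


if __name__ == "__main__":
    # Toy records with invented values; the variable names are hypothetical.
    records = [
        {"missing_teeth": 4, "pocket_depth_mm": None},
        {"missing_teeth": 2, "pocket_depth_mm": None},
        {"missing_teeth": None, "pocket_depth_mm": 5},
        {"missing_teeth": 1, "pocket_depth_mm": 6},
    ]
    print(drop_sparse_variables(records))

    counts = confusion_counts([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0, 0, 1])
    print(binary_metrics(*counts))
```

In a ten-fold cross-validation of the kind the abstract describes, `binary_metrics` would typically be applied to each held-out fold and the results averaged. The separate 10% evaluation set would be scored once, after model selection.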