
Using random forest machine learning on data from a large, representative cohort of the general population improves clinical spirometry references

INTRODUCTION: Spirometry is associated with several diagnostic difficulties, and as a result, misdiagnosis of chronic obstructive pulmonary disease (COPD) occurs. This study aims to investigate how random forest (RF) can be used to improve the existing clinical FVC and FEV1 reference values in a lar...


Bibliographic Details
Main Authors: Kristensen, Kris; Olesen, Pernille H.; Roerbaek, Anna K.; Nielsen, Louise; Hansen, Helle K.; Cichosz, Simon L.; Jensen, Morten H.; Hejlesen, Ole
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10435934/
https://www.ncbi.nlm.nih.gov/pubmed/37448113
http://dx.doi.org/10.1111/crj.13662
_version_ 1785092215804002304
author Kristensen, Kris
Olesen, Pernille H.
Roerbaek, Anna K.
Nielsen, Louise
Hansen, Helle K.
Cichosz, Simon L.
Jensen, Morten H.
Hejlesen, Ole
author_facet Kristensen, Kris
Olesen, Pernille H.
Roerbaek, Anna K.
Nielsen, Louise
Hansen, Helle K.
Cichosz, Simon L.
Jensen, Morten H.
Hejlesen, Ole
author_sort Kristensen, Kris
collection PubMed
description INTRODUCTION: Spirometry is associated with several diagnostic difficulties, and as a result, misdiagnosis of chronic obstructive pulmonary disease (COPD) occurs. This study aims to investigate how random forest (RF) can be used to improve the existing clinical FVC and FEV1 reference values in a large and representative cohort of the general population of the US without known lung disease. MATERIALS AND METHODS: FVC, FEV1, body measures, and demographic data from 23 433 people were extracted from NHANES. RF was used to develop different prediction models. The accuracy of RF was compared with the existing Danish clinical references, an improved multiple linear regression (MLR) model, and a model from the literature. RESULTS: The correlations between actual and predicted values, with 95% confidence intervals, were FVC = 0.85 (0.85; 0.86) (p < 0.001) and FEV1 = 0.92 (0.92; 0.93) (p < 0.001) for RF, and FVC = 0.66 (0.64; 0.68) (p < 0.001) and FEV1 = 0.69 (0.67; 0.70) (p < 0.001) for the existing clinical references. Slope and intercept for the RF models were FVC: 1.06 and −238.04 (mL), and FEV1: 0.86 and 455.36 (mL); for the MLR models, slope and intercept were FVC: 0.99 and 38.56 (mL), and FEV1: 1.01 and −56.57 (mL). CONCLUSIONS: The results suggest that machine learning models such as RF have the potential to improve the prediction of estimated lung function for individual patients. These predictions are used as reference values and are an important part of assessing spirometry measurements in clinical practice. Further work is necessary to reduce the size of the intercepts obtained through these results.
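The agreement statistics quoted in the abstract (a correlation with a 95% confidence interval, and a slope and intercept relating predicted to actual values) can be illustrated with a minimal sketch. The record does not state how the authors computed them, so everything here is an assumption: the Fisher z-transform for the interval, an ordinary least-squares fit of predicted on actual values, the helper names `pearson_r`, `fisher_ci`, and `slope_intercept`, and the synthetic data, which is neither NHANES data nor the study's models.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for r via the Fisher z-transform (assumes n > 3, |r| < 1)."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


def slope_intercept(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in values in mL; hypothetical, not taken from the study.
    actual = [rng.gauss(4000, 900) for _ in range(500)]
    predicted = [a + rng.gauss(0, 400) for a in actual]

    r = pearson_r(actual, predicted)
    low, high = fisher_ci(r, len(actual))
    # Assumption: predicted values are regressed on actual values.
    slope, intercept = slope_intercept(actual, predicted)
    print(f"r = {r:.2f} ({low:.2f}; {high:.2f})")
    print(f"slope = {slope:.2f}, intercept = {intercept:.2f} mL")
```

Under this reading, a slope near 1 and an intercept near 0 mL would indicate predictions that track measured values without systematic offset, which is presumably why the abstract singles out the size of the intercepts for further work.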
format Online
Article
Text
id pubmed-10435934
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-10435934 2023-08-19 Using random forest machine learning on data from a large, representative cohort of the general population improves clinical spirometry references Kristensen, Kris Olesen, Pernille H. Roerbaek, Anna K. Nielsen, Louise Hansen, Helle K. Cichosz, Simon L. Jensen, Morten H. Hejlesen, Ole Clin Respir J Letters to the Editor INTRODUCTION: Spirometry is associated with several diagnostic difficulties, and as a result, misdiagnosis of chronic obstructive pulmonary disease (COPD) occurs. This study aims to investigate how random forest (RF) can be used to improve the existing clinical FVC and FEV1 reference values in a large and representative cohort of the general population of the US without known lung disease. MATERIALS AND METHODS: FVC, FEV1, body measures, and demographic data from 23 433 people were extracted from NHANES. RF was used to develop different prediction models. The accuracy of RF was compared with the existing Danish clinical references, an improved multiple linear regression (MLR) model, and a model from the literature. RESULTS: The correlations between actual and predicted values, with 95% confidence intervals, were FVC = 0.85 (0.85; 0.86) (p < 0.001) and FEV1 = 0.92 (0.92; 0.93) (p < 0.001) for RF, and FVC = 0.66 (0.64; 0.68) (p < 0.001) and FEV1 = 0.69 (0.67; 0.70) (p < 0.001) for the existing clinical references. Slope and intercept for the RF models were FVC: 1.06 and −238.04 (mL), and FEV1: 0.86 and 455.36 (mL); for the MLR models, slope and intercept were FVC: 0.99 and 38.56 (mL), and FEV1: 1.01 and −56.57 (mL). CONCLUSIONS: The results suggest that machine learning models such as RF have the potential to improve the prediction of estimated lung function for individual patients. These predictions are used as reference values and are an important part of assessing spirometry measurements in clinical practice. Further work is necessary to reduce the size of the intercepts obtained through these results. John Wiley and Sons Inc. 2023-07-13 /pmc/articles/PMC10435934/ /pubmed/37448113 http://dx.doi.org/10.1111/crj.13662 Text en © 2023 The Authors. The Clinical Respiratory Journal published by John Wiley & Sons Ltd. https://creativecommons.org/licenses/by/4.0/ This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
spellingShingle Letters to the Editor
Kristensen, Kris
Olesen, Pernille H.
Roerbaek, Anna K.
Nielsen, Louise
Hansen, Helle K.
Cichosz, Simon L.
Jensen, Morten H.
Hejlesen, Ole
Using random forest machine learning on data from a large, representative cohort of the general population improves clinical spirometry references
title Using random forest machine learning on data from a large, representative cohort of the general population improves clinical spirometry references
title_full Using random forest machine learning on data from a large, representative cohort of the general population improves clinical spirometry references
title_fullStr Using random forest machine learning on data from a large, representative cohort of the general population improves clinical spirometry references
title_full_unstemmed Using random forest machine learning on data from a large, representative cohort of the general population improves clinical spirometry references
title_short Using random forest machine learning on data from a large, representative cohort of the general population improves clinical spirometry references
title_sort using random forest machine learning on data from a large, representative cohort of the general population improves clinical spirometry references
topic Letters to the Editor
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10435934/
https://www.ncbi.nlm.nih.gov/pubmed/37448113
http://dx.doi.org/10.1111/crj.13662
work_keys_str_mv AT kristensenkris usingrandomforestmachinelearningondatafromalargerepresentativecohortofthegeneralpopulationimprovesclinicalspirometryreferences
AT olesenpernilleh usingrandomforestmachinelearningondatafromalargerepresentativecohortofthegeneralpopulationimprovesclinicalspirometryreferences
AT roerbaekannak usingrandomforestmachinelearningondatafromalargerepresentativecohortofthegeneralpopulationimprovesclinicalspirometryreferences
AT nielsenlouise usingrandomforestmachinelearningondatafromalargerepresentativecohortofthegeneralpopulationimprovesclinicalspirometryreferences
AT hansenhellek usingrandomforestmachinelearningondatafromalargerepresentativecohortofthegeneralpopulationimprovesclinicalspirometryreferences
AT cichoszsimonl usingrandomforestmachinelearningondatafromalargerepresentativecohortofthegeneralpopulationimprovesclinicalspirometryreferences
AT jensenmortenh usingrandomforestmachinelearningondatafromalargerepresentativecohortofthegeneralpopulationimprovesclinicalspirometryreferences
AT hejlesenole usingrandomforestmachinelearningondatafromalargerepresentativecohortofthegeneralpopulationimprovesclinicalspirometryreferences