Using random forest machine learning on data from a large, representative cohort of the general population improves clinical spirometry references
INTRODUCTION: Spirometry is associated with several diagnostic difficulties, and as a result, misdiagnosis of chronic obstructive pulmonary disease (COPD) occurs. This study aims to investigate how random forest (RF) can be used to improve the existing clinical FVC and FEV1 reference values in a lar...
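Main authors: | Kristensen, Kris; Olesen, Pernille H.; Roerbaek, Anna K.; Nielsen, Louise; Hansen, Helle K.; Cichosz, Simon L.; Jensen, Morten H.; Hejlesen, Ole |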
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc. 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10435934/ https://www.ncbi.nlm.nih.gov/pubmed/37448113 http://dx.doi.org/10.1111/crj.13662 |
_version_ | 1785092215804002304 |
---|---|
author | Kristensen, Kris Olesen, Pernille H. Roerbaek, Anna K. Nielsen, Louise Hansen, Helle K. Cichosz, Simon L. Jensen, Morten H. Hejlesen, Ole |
author_facet | Kristensen, Kris Olesen, Pernille H. Roerbaek, Anna K. Nielsen, Louise Hansen, Helle K. Cichosz, Simon L. Jensen, Morten H. Hejlesen, Ole |
author_sort | Kristensen, Kris |
collection | PubMed |
description | INTRODUCTION: Spirometry is associated with several diagnostic difficulties, and as a result, misdiagnosis of chronic obstructive pulmonary disease (COPD) occurs. This study aims to investigate how random forest (RF) can be used to improve the existing clinical FVC and FEV1 reference values in a large and representative cohort of the general population of the US without known lung disease. MATERIALS AND METHODS: FVC, FEV1, body measures, and demographic data from 23 433 people were extracted from NHANES. RF was used to develop different prediction models. The accuracy of RF was compared with the existing Danish clinical references, an improved multiple linear regression (MLR) model, and a model from the literature. RESULTS: The correlations between actual and predicted values, with 95% confidence intervals, were FVC = 0.85 (0.85; 0.86) (p < 0.001) and FEV1 = 0.92 (0.92; 0.93) (p < 0.001) for RF, and FVC = 0.66 (0.64; 0.68) (p < 0.001) and FEV1 = 0.69 (0.67; 0.70) (p < 0.001) for the existing clinical references. Slope and intercept for the RF models were FVC: 1.06 and −238.04 (mL), and FEV1: 0.86 and 455.36 (mL); for the MLR models, slope and intercept were FVC: 0.99 and 38.56 (mL), and FEV1: 1.01 and −56.57 (mL). CONCLUSIONS: The results suggest that machine learning models such as RF have the potential to improve the prediction of estimated lung function for individual patients. These predictions are used as reference values and are an important part of assessing spirometry measurements in clinical practice. Further work is necessary to reduce the size of the intercepts obtained in these results. |
format | Online Article Text |
id | pubmed-10435934 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-10435934 2023-08-19 Using random forest machine learning on data from a large, representative cohort of the general population improves clinical spirometry references Kristensen, Kris Olesen, Pernille H. Roerbaek, Anna K. Nielsen, Louise Hansen, Helle K. Cichosz, Simon L. Jensen, Morten H. Hejlesen, Ole Clin Respir J Letters to the Editor INTRODUCTION: Spirometry is associated with several diagnostic difficulties, and as a result, misdiagnosis of chronic obstructive pulmonary disease (COPD) occurs. This study aims to investigate how random forest (RF) can be used to improve the existing clinical FVC and FEV1 reference values in a large and representative cohort of the general population of the US without known lung disease. MATERIALS AND METHODS: FVC, FEV1, body measures, and demographic data from 23 433 people were extracted from NHANES. RF was used to develop different prediction models. The accuracy of RF was compared with the existing Danish clinical references, an improved multiple linear regression (MLR) model, and a model from the literature. RESULTS: The correlations between actual and predicted values, with 95% confidence intervals, were FVC = 0.85 (0.85; 0.86) (p < 0.001) and FEV1 = 0.92 (0.92; 0.93) (p < 0.001) for RF, and FVC = 0.66 (0.64; 0.68) (p < 0.001) and FEV1 = 0.69 (0.67; 0.70) (p < 0.001) for the existing clinical references. Slope and intercept for the RF models were FVC: 1.06 and −238.04 (mL), and FEV1: 0.86 and 455.36 (mL); for the MLR models, slope and intercept were FVC: 0.99 and 38.56 (mL), and FEV1: 1.01 and −56.57 (mL). CONCLUSIONS: The results suggest that machine learning models such as RF have the potential to improve the prediction of estimated lung function for individual patients. These predictions are used as reference values and are an important part of assessing spirometry measurements in clinical practice. Further work is necessary to reduce the size of the intercepts obtained in these results. John Wiley and Sons Inc. 2023-07-13 /pmc/articles/PMC10435934/ /pubmed/37448113 http://dx.doi.org/10.1111/crj.13662 Text en © 2023 The Authors. The Clinical Respiratory Journal published by John Wiley & Sons Ltd. https://creativecommons.org/licenses/by/4.0/ This is an open access article under the terms of the Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Letters to the Editor Kristensen, Kris Olesen, Pernille H. Roerbaek, Anna K. Nielsen, Louise Hansen, Helle K. Cichosz, Simon L. Jensen, Morten H. Hejlesen, Ole Using random forest machine learning on data from a large, representative cohort of the general population improves clinical spirometry references |
title | Using random forest machine learning on data from a large, representative cohort of the general population improves clinical spirometry references |
topic | Letters to the Editor |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10435934/ https://www.ncbi.nlm.nih.gov/pubmed/37448113 http://dx.doi.org/10.1111/crj.13662 |