Identification of biomarkers by machine learning classifiers to assist diagnose rheumatoid arthritis-associated interstitial lung disease

BACKGROUND: This study aimed to search for blood biomarkers among the profiles of patients with RA-ILD by using machine learning classifiers and probe correlations between the markers and the characteristics of RA-ILD. METHODS: A total of 153 RA patients were enrolled, including 75 RA-ILD and 78 RA-...

Bibliographic Details
Main Authors: Qin, Yan; Wang, Yanlin; Meng, Fanxing; Feng, Min; Zhao, Xiangcong; Gao, Chong; Luo, Jing
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9118651/
https://www.ncbi.nlm.nih.gov/pubmed/35590341
http://dx.doi.org/10.1186/s13075-022-02800-2
_version_ 1784710542454161408
author Qin, Yan
Wang, Yanlin
Meng, Fanxing
Feng, Min
Zhao, Xiangcong
Gao, Chong
Luo, Jing
collection PubMed
description BACKGROUND: This study aimed to search for blood biomarkers in the profiles of patients with rheumatoid arthritis-associated interstitial lung disease (RA-ILD) using machine learning classifiers, and to probe correlations between the markers and the characteristics of RA-ILD.
METHODS: A total of 153 RA patients were enrolled, including 75 with RA-ILD and 78 without (RA-non-ILD). Routine laboratory data, the levels of tumor markers and autoantibodies, and clinical manifestations were recorded. Univariate analysis, least absolute shrinkage and selection operator (LASSO), random forest (RF), and partial least squares (PLS) were performed, and receiver operating characteristic (ROC) curves were plotted.
RESULTS: Univariate analysis showed that, compared to RA-non-ILD patients, patients with RA-ILD were older (p < 0.001); had higher white blood cell (p = 0.003) and neutrophil counts (p = 0.017); had a higher erythrocyte sedimentation rate (p = 0.003) and C-reactive protein level (p = 0.003); had higher levels of KL-6 (p < 0.001), D-dimer (p < 0.001), fibrinogen (p < 0.001), fibrinogen degradation products (p < 0.001), lactate dehydrogenase (p < 0.001), hydroxybutyrate dehydrogenase (p < 0.001), carbohydrate antigen (CA) 19-9 (p < 0.001), carcinoembryonic antigen (p = 0.001), and CA242 (p < 0.001); and had a significantly lower albumin level (p = 0.003). The areas under the curves (AUCs) of the LASSO, RF, and PLS models attained 0.95 in differentiating patients with RA-ILD from those without. When data from the univariate analysis and the top 10 indicators of the three machine learning models were combined, the most discriminatory markers were age, KL-6, D-dimer, and CA19-9, with AUCs of 0.814 [95% confidence interval (CI) 0.731–0.880], 0.749 (95% CI 0.660–0.824), 0.749 (95% CI 0.660–0.824), and 0.727 (95% CI 0.637–0.805), respectively. When all four markers were combined, the AUC reached 0.928 (95% CI 0.865–0.968). Notably, neither the KL-6 nor the CA19-9 level correlated with disease activity in the RA-ILD group.
CONCLUSIONS: The levels of KL-6, D-dimer, and tumor markers greatly aided RA-ILD identification. Machine learning algorithms combined with traditional biostatistical analysis can diagnose patients with RA-ILD and identify biomarkers potentially associated with the disease.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13075-022-02800-2.
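For readers unfamiliar with the AUC figures quoted in the description, the following is a minimal sketch of how an area under the ROC curve can be computed for a single marker, using the rank-based (Mann-Whitney) interpretation: the probability that a randomly chosen positive case has a higher marker value than a randomly chosen negative case. It is not the authors' pipeline, which the record describes only as LASSO, RF, and PLS with ROC curves. The function name `roc_auc` and the marker values are hypothetical, invented purely for illustration.

```python
from typing import Sequence


def roc_auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """Return the AUC as P(positive score > negative score); ties count 0.5."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    # Compare every positive case against every negative case.
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical marker values (NOT data from the study).
ild = [5.0, 7.0, 8.0, 9.0]      # assumed RA-ILD group
non_ild = [1.0, 2.0, 6.0, 3.0]  # assumed RA-non-ILD group

auc = roc_auc(ild, non_ild)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to a marker that does not separate the groups at all, and 1.0 to perfect separation. The pairwise loop is quadratic in the group sizes, which is adequate for a cohort of this scale.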
format Online
Article
Text
id pubmed-9118651
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9118651 2022-05-20 Arthritis Res Ther Research BioMed Central 2022-05-19 2022 /pmc/articles/PMC9118651/ /pubmed/35590341 http://dx.doi.org/10.1186/s13075-022-02800-2 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Identification of biomarkers by machine learning classifiers to assist diagnose rheumatoid arthritis-associated interstitial lung disease
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9118651/
https://www.ncbi.nlm.nih.gov/pubmed/35590341
http://dx.doi.org/10.1186/s13075-022-02800-2