An application of machine learning based on real-world data: Mining features of fibrinogen in clinical stages of lung cancer between sexes
BACKGROUND: Lung cancer is the most threatening malignant tumor to human health and life. Using a variety of machine learning algorithms and statistical analyses, this paper explores, discovers and demonstrates new indicators for the early diagnosis of lung cancer and their diagnostic performance fr...
Main Authors: | Yin, Fangtao; Zhu, Hongyu; Hong, Songlin; Sun, Chen; Wang, Jie; Sun, Mengting; Xu, Lin; Wang, Xiaoxiao; Yin, Rong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | AME Publishing Company 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8106088/ https://www.ncbi.nlm.nih.gov/pubmed/33987321 http://dx.doi.org/10.21037/atm-20-4704 |
_version_ | 1783689712813211648 |
---|---|
author | Yin, Fangtao Zhu, Hongyu Hong, Songlin Sun, Chen Wang, Jie Sun, Mengting Xu, Lin Wang, Xiaoxiao Yin, Rong |
author_sort | Yin, Fangtao |
collection | PubMed |
description | BACKGROUND: Lung cancer is the most threatening malignant tumor to human health and life. Using a variety of machine learning algorithms and statistical analyses, this paper explores, discovers and demonstrates new indicators for the early diagnosis of lung cancer and their diagnostic performance from large samples of clinical data in the real world. METHODS: By applying machine learning methods, including minimum description length (MDL), naive Bayesian (NB), K-means (KM), nonnegative matrix factorization (NMF), and decision tree (DT), based on large sample data of 2,502 patients, we built a classification model and systematically explored differences in fibrinogen levels in different clinical stages of lung cancer between the sexes. We also validated the reliability of the model by testing it on a validation cohort of 447 patients. This report adheres to the “Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis” (TRIPOD) statement for the reporting of prediction models. RESULTS: The analysis revealed significant differences in fibrinogen levels, pleural effusion, chlorine levels, A-G ratio, glutamic-oxaloacetic transaminase and alkaline phosphatase levels as well as in sex composition between the early-stage lung cancer group and the middle-late-stage lung cancer group. The classification model created by the combination of fibrinogen, alkaline phosphatase and sex demonstrated good performance with an AUC of 73.5%. In addition, in males, a fibrinogen level of 2.94 g/L could initially serve as the upper limit for determining the early-stage lung cancer group, but a level of 3.91 g/L could be preliminarily used as a reference threshold for the lower limit for middle- to late-stage lung cancer. This latter level could also serve as the upper limit of the critical value for early-stage lung cancer in females. CONCLUSIONS: An integrated application based on supervised and unsupervised machine learning algorithms could effectively explore the potential links contained in the clinical data and reveal the differences in fibrinogen levels in different clinical stages of lung cancer between the sexes, which could provide a new reference basis for lung cancer staging. |
format | Online Article Text |
id | pubmed-8106088 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | AME Publishing Company |
record_format | MEDLINE/PubMed |
spelling | pubmed-8106088 2021-05-12 Ann Transl Med Original Article AME Publishing Company 2021-04 /pmc/articles/PMC8106088/ /pubmed/33987321 http://dx.doi.org/10.21037/atm-20-4704 Text en 2021 Annals of Translational Medicine. All rights reserved. https://creativecommons.org/licenses/by-nc-nd/4.0/ Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/. |
title | An application of machine learning based on real-world data: Mining features of fibrinogen in clinical stages of lung cancer between sexes |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8106088/ https://www.ncbi.nlm.nih.gov/pubmed/33987321 http://dx.doi.org/10.21037/atm-20-4704 |
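The sex-specific fibrinogen reference thresholds quoted in the description (2.94 g/L and 3.91 g/L) can be read as a simple rule. The Python sketch below is only an illustration of that reading, not the authors' classification model: the function name `fibrinogen_stage_hint`, the inclusive boundary handling, and the "indeterminate" label for values the abstract does not cover are all assumptions.

```python
# Illustrative sketch only. The two thresholds come from the abstract;
# the function name, inclusive comparisons, and the "indeterminate"
# label are assumptions made for this example.
MALE_EARLY_UPPER_G_PER_L = 2.94  # males: upper limit for early stage
SHARED_CUTOFF_G_PER_L = 3.91     # males: lower limit for middle-late stage;
                                 # females: upper limit for early stage


def fibrinogen_stage_hint(sex: str, fibrinogen_g_per_l: float) -> str:
    """Map a fibrinogen level (g/L) to the stage group suggested by the abstract."""
    sex = sex.strip().lower()
    if sex == "male":
        if fibrinogen_g_per_l <= MALE_EARLY_UPPER_G_PER_L:
            return "early"
        if fibrinogen_g_per_l >= SHARED_CUTOFF_G_PER_L:
            return "middle-late"
        # Between the two thresholds the abstract gives no assignment.
        return "indeterminate"
    if sex == "female":
        if fibrinogen_g_per_l <= SHARED_CUTOFF_G_PER_L:
            return "early"
        # The abstract states no lower limit for middle-late stage in females.
        return "indeterminate"
    raise ValueError(f"unsupported sex value: {sex!r}")


if __name__ == "__main__":
    for sex, level in [("male", 2.5), ("male", 3.5), ("male", 4.2), ("female", 3.0)]:
        print(sex, level, fibrinogen_stage_hint(sex, level))
```

The published model also combines fibrinogen with alkaline phosphatase and sex, so a single-marker rule like this one should be expected to perform differently from the reported AUC of 73.5%.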