An application of machine learning based on real-world data: Mining features of fibrinogen in clinical stages of lung cancer between sexes

BACKGROUND: Lung cancer is the most threatening malignant tumor to human health and life. Using a variety of machine learning algorithms and statistical analyses, this paper explores, discovers and demonstrates new indicators for the early diagnosis of lung cancer and their diagnostic performance fr...

Bibliographic Details
Main Authors: Yin, Fangtao; Zhu, Hongyu; Hong, Songlin; Sun, Chen; Wang, Jie; Sun, Mengting; Xu, Lin; Wang, Xiaoxiao; Yin, Rong
Format: Online Article Text
Language: English
Published: AME Publishing Company 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8106088/
https://www.ncbi.nlm.nih.gov/pubmed/33987321
http://dx.doi.org/10.21037/atm-20-4704
_version_ 1783689712813211648
author Yin, Fangtao
Zhu, Hongyu
Hong, Songlin
Sun, Chen
Wang, Jie
Sun, Mengting
Xu, Lin
Wang, Xiaoxiao
Yin, Rong
collection PubMed
description
BACKGROUND: Lung cancer is the most threatening malignant tumor to human health and life. Using a variety of machine learning algorithms and statistical analyses, this paper explores, discovers and demonstrates new indicators for the early diagnosis of lung cancer and their diagnostic performance from large samples of clinical data in the real world.
METHODS: By applying machine learning methods, including minimum description length (MDL), naive Bayesian (NB), K-means (KM), nonnegative matrix factorization (NMF), and decision tree (DT), based on large sample data of 2,502 patients, we built a classification model and systematically explored differences in fibrinogen levels in different clinical stages of lung cancer between the sexes. We also validated the reliability of the model by testing it on a validation cohort of 447 patients. This report adheres to the “Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis” (TRIPOD) statement for the reporting of prediction models.
RESULTS: The analysis revealed significant differences in fibrinogen levels, pleural effusion, chlorine levels, A-G ratio, glutamic-oxaloacetic transaminase and alkaline phosphatase levels as well as in sex composition between the early-stage lung cancer group and the middle-late-stage lung cancer group. The classification model created by the combination of fibrinogen, alkaline phosphatase and sex demonstrated good performance with an AUC of 73.5%. In addition, in males, a fibrinogen level of 2.94 g/L could initially serve as the upper limit for determining the early-stage lung cancer group, but a level of 3.91 g/L could be preliminarily used as a reference threshold for the lower limit for middle- to late-stage lung cancer. This latter level could also serve as the upper limit of the critical value for early-stage lung cancer in females.
CONCLUSIONS: An integrated application based on supervised and unsupervised machine learning algorithms could effectively explore the potential links contained in the clinical data and reveal the differences in fibrinogen levels in different clinical stages of lung cancer between the sexes, which could provide a new reference basis for lung cancer staging.
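The sex-specific fibrinogen reference values reported in the RESULTS can be written out as a small lookup, which may make the asymmetry between males and females easier to see. This is only an illustrative sketch of the thresholds as stated in the abstract, not the authors' classification model (which also combines alkaline phosphatase and sex) and not a clinical tool. The function name `fibrinogen_band`, the returned labels, and the choice to treat both limits as inclusive are assumptions made here; the abstract does not say how values between 2.94 and 3.91 g/L in males, or above 3.91 g/L in females, should be read, so those cases are returned as unclassified.

```python
# Illustrative only: encodes the preliminary reference thresholds quoted in
# the abstract. Names, labels, and inclusive boundaries are assumptions.

MALE_EARLY_UPPER_G_L = 2.94      # upper limit for early stage in males
MALE_MID_LATE_LOWER_G_L = 3.91   # lower limit for middle-to-late stage in males
FEMALE_EARLY_UPPER_G_L = 3.91    # upper limit for early stage in females


def fibrinogen_band(sex: str, fibrinogen_g_l: float) -> str:
    """Map a fibrinogen level (g/L) to the reference band named in the abstract."""
    if fibrinogen_g_l < 0:
        raise ValueError("fibrinogen level cannot be negative")

    if sex == "male":
        if fibrinogen_g_l <= MALE_EARLY_UPPER_G_L:
            return "early-stage range"
        if fibrinogen_g_l >= MALE_MID_LATE_LOWER_G_L:
            return "middle-to-late-stage range"
        # The abstract gives no interpretation for the gap between the limits.
        return "between thresholds (not classified by the abstract)"

    if sex == "female":
        if fibrinogen_g_l <= FEMALE_EARLY_UPPER_G_L:
            return "early-stage range"
        # The abstract states no female lower limit for middle-to-late stage.
        return "above early-stage limit (not classified by the abstract)"

    raise ValueError("sex must be 'male' or 'female'")


if __name__ == "__main__":
    for sex, level in [("male", 2.5), ("male", 3.4), ("male", 4.2), ("female", 3.4)]:
        print(sex, level, "->", fibrinogen_band(sex, level))
```

Note that the same level (for example 3.4 g/L) falls in the early-stage range for females but between the two limits for males, which is the sex difference the abstract highlights.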
format Online
Article
Text
id pubmed-8106088
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher AME Publishing Company
record_format MEDLINE/PubMed
spelling pubmed-8106088 2021-05-12 Ann Transl Med Original Article AME Publishing Company 2021-04 /pmc/articles/PMC8106088/ /pubmed/33987321 http://dx.doi.org/10.21037/atm-20-4704 Text en 2021 Annals of Translational Medicine. All rights reserved. https://creativecommons.org/licenses/by-nc-nd/4.0/
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
title An application of machine learning based on real-world data: Mining features of fibrinogen in clinical stages of lung cancer between sexes
topic Original Article