
Colorectal Cancer Detected by Machine Learning Models Using Conventional Laboratory Test Data

Background: Current diagnostic methods for colorectal cancer (CRC) are colonoscopy and sigmoidoscopy, which are invasive and complex procedures with possible complications. This study aimed to determine models for CRC identification that involve minimally invasive, affordable, portable, and accurate...


Bibliographic Details
Main Authors: Li, Hui; Lin, Jianmei; Xiao, Yanhong; Zheng, Wenwen; Zhao, Lu; Yang, Xiangling; Zhong, Minsheng; Liu, Huanliang
Format: Online Article Text
Language: English
Published: SAGE Publications 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8606732/
https://www.ncbi.nlm.nih.gov/pubmed/34806496
http://dx.doi.org/10.1177/15330338211058352
_version_ 1784602397834739712
author Li, Hui
Lin, Jianmei
Xiao, Yanhong
Zheng, Wenwen
Zhao, Lu
Yang, Xiangling
Zhong, Minsheng
Liu, Huanliang
author_facet Li, Hui
Lin, Jianmei
Xiao, Yanhong
Zheng, Wenwen
Zhao, Lu
Yang, Xiangling
Zhong, Minsheng
Liu, Huanliang
author_sort Li, Hui
collection PubMed
description Background: Current diagnostic methods for colorectal cancer (CRC) are colonoscopy and sigmoidoscopy, which are invasive and complex procedures with possible complications. This study aimed to determine models for CRC identification that involve minimally invasive, affordable, portable, and accurate screening variables. Methods: This was a retrospective study that used data from electronic medical records of patients with CRC and healthy individuals between July 2017 and June 2018. Laboratory data, including liver enzymes, lipid profiles, complete blood counts, and tumor biomarkers, were extracted from the electronic medical records. Five machine learning models (logistic regression, random forest, k-nearest neighbors, support vector machine [SVM], and naïve Bayes) were used to identify CRC. The performances were evaluated using the areas under the curve (AUCs), sensitivity, specificity, positive predictive values (PPV), and negative predictive values (NPV). Results: A total of 1164 electronic medical records (CRC patients: 582; healthy controls: 582) were included. The logistic regression model achieved the highest performance in identifying CRC (AUC: 0.865, sensitivity: 89.5%, specificity: 83.5%, PPV: 84.4%, NPV: 88.9%). The first four weighted features in the model were carcinoembryonic antigen (CEA), hemoglobin (HGB), lipoprotein (a) (Lp(a)), and high-density lipoprotein (HDL). A diagnostic model for CRC was established based on the four indicators, with an AUC of 0.849 (0.840-0.860) for identifying all CRC patients, and it performed best in discriminating patients with late colon cancer from healthy individuals with an AUC of 0.905 (0.889-0.929). Conclusions: The logistic regression model based on CEA, HGB, Lp(a), and HDL might be a powerful, noninvasive, and cost-effective method to identify CRC.
format Online
Article
Text
id pubmed-8606732
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-8606732 2021-11-23 Colorectal Cancer Detected by Machine Learning Models Using Conventional Laboratory Test Data Li, Hui Lin, Jianmei Xiao, Yanhong Zheng, Wenwen Zhao, Lu Yang, Xiangling Zhong, Minsheng Liu, Huanliang Technol Cancer Res Treat Original Article Background: Current diagnostic methods for colorectal cancer (CRC) are colonoscopy and sigmoidoscopy, which are invasive and complex procedures with possible complications. This study aimed to determine models for CRC identification that involve minimally invasive, affordable, portable, and accurate screening variables. Methods: This was a retrospective study that used data from electronic medical records of patients with CRC and healthy individuals between July 2017 and June 2018. Laboratory data, including liver enzymes, lipid profiles, complete blood counts, and tumor biomarkers, were extracted from the electronic medical records. Five machine learning models (logistic regression, random forest, k-nearest neighbors, support vector machine [SVM], and naïve Bayes) were used to identify CRC. The performances were evaluated using the areas under the curve (AUCs), sensitivity, specificity, positive predictive values (PPV), and negative predictive values (NPV). Results: A total of 1164 electronic medical records (CRC patients: 582; healthy controls: 582) were included. The logistic regression model achieved the highest performance in identifying CRC (AUC: 0.865, sensitivity: 89.5%, specificity: 83.5%, PPV: 84.4%, NPV: 88.9%). The first four weighted features in the model were carcinoembryonic antigen (CEA), hemoglobin (HGB), lipoprotein (a) (Lp(a)), and high-density lipoprotein (HDL). A diagnostic model for CRC was established based on the four indicators, with an AUC of 0.849 (0.840-0.860) for identifying all CRC patients, and it performed best in discriminating patients with late colon cancer from healthy individuals with an AUC of 0.905 (0.889-0.929). Conclusions: The logistic regression model based on CEA, HGB, Lp(a), and HDL might be a powerful, noninvasive, and cost-effective method to identify CRC. SAGE Publications 2021-11-20 /pmc/articles/PMC8606732/ /pubmed/34806496 http://dx.doi.org/10.1177/15330338211058352 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
spellingShingle Original Article
Li, Hui
Lin, Jianmei
Xiao, Yanhong
Zheng, Wenwen
Zhao, Lu
Yang, Xiangling
Zhong, Minsheng
Liu, Huanliang
Colorectal Cancer Detected by Machine Learning Models Using Conventional Laboratory Test Data
title Colorectal Cancer Detected by Machine Learning Models Using Conventional Laboratory Test Data
title_full Colorectal Cancer Detected by Machine Learning Models Using Conventional Laboratory Test Data
title_fullStr Colorectal Cancer Detected by Machine Learning Models Using Conventional Laboratory Test Data
title_full_unstemmed Colorectal Cancer Detected by Machine Learning Models Using Conventional Laboratory Test Data
title_short Colorectal Cancer Detected by Machine Learning Models Using Conventional Laboratory Test Data
title_sort colorectal cancer detected by machine learning models using conventional laboratory test data
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8606732/
https://www.ncbi.nlm.nih.gov/pubmed/34806496
http://dx.doi.org/10.1177/15330338211058352
work_keys_str_mv AT lihui colorectalcancerdetectedbymachinelearningmodelsusingconventionallaboratorytestdata
AT linjianmei colorectalcancerdetectedbymachinelearningmodelsusingconventionallaboratorytestdata
AT xiaoyanhong colorectalcancerdetectedbymachinelearningmodelsusingconventionallaboratorytestdata
AT zhengwenwen colorectalcancerdetectedbymachinelearningmodelsusingconventionallaboratorytestdata
AT zhaolu colorectalcancerdetectedbymachinelearningmodelsusingconventionallaboratorytestdata
AT yangxiangling colorectalcancerdetectedbymachinelearningmodelsusingconventionallaboratorytestdata
AT zhongminsheng colorectalcancerdetectedbymachinelearningmodelsusingconventionallaboratorytestdata
AT liuhuanliang colorectalcancerdetectedbymachinelearningmodelsusingconventionallaboratorytestdata