Prediction of future gastric cancer risk using a machine learning algorithm and comprehensive medical check-up data: A case-control study

A comprehensive screening method using machine learning and many factors (biological characteristics, Helicobacter pylori infection status, endoscopic findings and blood test results), accumulated daily as data in hospitals, could improve the accuracy of screening to classify patients at high or low...

Bibliographic Details
Main Authors:	Taninaga, Junichi; Nishiyama, Yu; Fujibayashi, Kazutoshi; Gunji, Toshiaki; Sasabe, Noriko; Iijima, Kimiko; Naito, Toshio
Format:	Online Article Text
Language:	English
Published:	Nature Publishing Group UK 2019
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6712020/ https://www.ncbi.nlm.nih.gov/pubmed/31455831 http://dx.doi.org/10.1038/s41598-019-48769-y

author	Taninaga, Junichi; Nishiyama, Yu; Fujibayashi, Kazutoshi; Gunji, Toshiaki; Sasabe, Noriko; Iijima, Kimiko; Naito, Toshio
author_sort	Taninaga, Junichi
collection	PubMed
description	A comprehensive screening method using machine learning and many factors (biological characteristics, Helicobacter pylori infection status, endoscopic findings and blood test results), accumulated daily as data in hospitals, could improve the accuracy of screening to classify patients at high or low risk of developing gastric cancer. We used XGBoost, a classification method known for achieving numerous winning solutions in data analysis competitions, to capture nonlinear relations among many input variables and outcomes using the boosting approach to machine learning. Longitudinal and comprehensive medical check-up data were collected from 25,942 participants who underwent multiple endoscopies from 2006 to 2017 at a single facility in Japan. The participants were classified into a case group (y = 1) or a control group (y = 0) if gastric cancer was or was not detected, respectively, during a 122-month period. Among 1,431 total participants (89 cases and 1,342 controls), 1,144 (80%) were randomly selected for use in training 10 classification models; the remaining 287 (20%) were used to evaluate the models. The results showed that XGBoost outperformed logistic regression and showed the highest area under the curve value (0.899). Accumulating more data in the facility and performing further analyses including other input variables may help expand the clinical utility.
format	Online Article Text
id	pubmed-6712020
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Nature Publishing Group UK
record_format	MEDLINE/PubMed
spelling	pubmed-6712020 2019-09-13 Sci Rep Article
Nature Publishing Group UK 2019-08-27 /pmc/articles/PMC6712020/ /pubmed/31455831 http://dx.doi.org/10.1038/s41598-019-48769-y Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title	Prediction of future gastric cancer risk using a machine learning algorithm and comprehensive medical check-up data: A case-control study
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6712020/ https://www.ncbi.nlm.nih.gov/pubmed/31455831 http://dx.doi.org/10.1038/s41598-019-48769-y
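
The evaluation protocol summarized in the description (a random 80%/20% split of 1,431 participants, followed by comparison of models by area under the curve) can be illustrated with a small sketch. This is not the authors' code: the record does not say how the split was drawn or how the AUC was computed, so the function names `split_indices` and `roc_auc`, the random seed, and the toy labels and scores below are all assumptions made for illustration. The AUC here is computed as the probability that a randomly chosen case (y = 1) receives a higher score than a randomly chosen control (y = 0), with ties counted as one half.

```python
import random


def split_indices(n, train_percent=80, seed=0):
    """Randomly split range(n) into train and test index lists.

    Hypothetical helper: the seed and the rounding rule (floor) are
    assumptions; floor reproduces the 1,144 / 287 counts in the record.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = n * train_percent // 100  # integer arithmetic, rounds down
    return indices[:n_train], indices[n_train:]


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of cases and controls.

    labels: 1 for a case, 0 for a control. scores: predicted risk values.
    """
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    control_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("AUC needs at least one case and one control")
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5  # ties count as half a win
    return wins / (len(case_scores) * len(control_scores))


# Split sizes for the 1,431 participants mentioned in the description.
train_idx, test_idx = split_indices(1431)
print(len(train_idx), len(test_idx))

# Toy example with made-up labels and scores (not study data).
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.1]
print(roc_auc(toy_labels, toy_scores))
```

In practice a fitted classifier would supply the scores for the held-out 20%, and the same AUC computation would be applied to each candidate model so that they can be ranked on equal terms.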

