Identifying Disease of Interest With Deep Learning Using Diagnosis Code

Bibliographic Details
Main Authors: Cho, Yoon-Sik, Kim, Eunsun, Stafford, Patrick L., Oh, Min-hwan, Kwon, Younghoon
Format: Online Article Text
Language: English
Published: The Korean Academy of Medical Sciences 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10027541/
https://www.ncbi.nlm.nih.gov/pubmed/36942391
http://dx.doi.org/10.3346/jkms.2023.38.e77
author Cho, Yoon-Sik
Kim, Eunsun
Stafford, Patrick L.
Oh, Min-hwan
Kwon, Younghoon
collection PubMed
description BACKGROUND: An autoencoder (AE) is a deep learning technique that uses an artificial neural network to reconstruct its input data in the output layer. We constructed a novel supervised AE model and tested its performance in predicting the co-existence of a disease of interest using only diagnostic codes. METHODS: Diagnostic codes of one million randomly sampled patients listed in the Korean National Health Information Database in 2019 were used to train, validate, and test the prediction model. The first model used the AE solely as a feature-engineering tool that supplies the input to a classifier: a supervised multi-layer perceptron (sMLP) was added and trained to predict a binary label from the latent representation (AE + sMLP). The second model simultaneously updated the parameters of the AE and the connected MLP classifier during learning (End-to-End Supervised AE [EEsAE]). We tested the performance of these two models against two baseline models, eXtreme Gradient Boosting (XGB) and naïve Bayes, in predicting a co-existing gastric cancer diagnosis. RESULTS: The proposed EEsAE model yielded the highest F1-score and the highest area under the curve (0.86). The EEsAE and AE + sMLP models gave the highest recall. XGB yielded the highest precision. An ablation study revealed that iron deficiency anemia, gastroesophageal reflux disease, essential hypertension, gastric ulcers, benign prostatic hyperplasia, and shoulder lesion were the six diagnoses with the most influence on performance. CONCLUSION: The novel EEsAE model showed promising performance in predicting a disease of interest.
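The two models in the description differ in what the encoder is trained on. In AE + sMLP the encoder learns from the reconstruction error alone and the classifier is fitted afterwards on the latent vector. In EEsAE both errors would be combined so that the encoder is shaped by the label as well. The record does not give the authors' architecture or loss, so the following is only a minimal sketch of one common way to write such a joint objective. The layer shapes, the tanh and sigmoid activations, the use of binary cross-entropy for both terms, and the weight `alpha` are all assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def bce(target, prob, eps=1e-12):
    """Binary cross-entropy for one target/probability pair."""
    p = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

def init_params(n_codes, n_latent, seed=0):
    """Random toy weights; names and shapes are hypothetical."""
    rng = random.Random(seed)
    u = lambda: rng.uniform(-0.5, 0.5)
    return {
        "W_enc": [[u() for _ in range(n_codes)] for _ in range(n_latent)],
        "W_dec": [[u() for _ in range(n_latent)] for _ in range(n_codes)],
        "w_cls": [u() for _ in range(n_latent)],
    }

def forward(x, params):
    """Encode x, then reconstruct x and predict the label from the latent."""
    z = [math.tanh(a) for a in matvec(params["W_enc"], x)]    # latent vector
    x_hat = [sigmoid(a) for a in matvec(params["W_dec"], z)]  # reconstruction
    y_hat = sigmoid(sum(w * zi for w, zi in zip(params["w_cls"], z)))
    return z, x_hat, y_hat

def joint_loss(x, y, params, alpha=1.0):
    """Reconstruction loss plus alpha times classification loss.

    alpha is an assumed weighting, not a value from the paper. With
    alpha = 0 only the reconstruction term remains, which corresponds
    to the AE stage of the two-stage AE + sMLP setup.
    """
    _, x_hat, y_hat = forward(x, params)
    recon = sum(bce(t, p) for t, p in zip(x, x_hat)) / len(x)
    cls = bce(y, y_hat)
    return recon + alpha * cls, recon, cls

params = init_params(n_codes=6, n_latent=3)
x = [1, 0, 1, 0, 0, 1]  # toy multi-hot vector of diagnosis codes
y = 1                   # toy label: disease of interest present
total, recon, cls = joint_loss(x, y, params, alpha=0.5)
```

The sketch only evaluates the objective. An end-to-end model would minimize `total` with respect to all three weight sets at once, whereas the two-stage variant would first minimize `recon` over the encoder and decoder and then minimize `cls` over the classifier weights with the encoder held fixed.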
format Online
Article
Text
id pubmed-10027541
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher The Korean Academy of Medical Sciences
record_format MEDLINE/PubMed
spelling pubmed-10027541 2023-03-22 Identifying Disease of Interest With Deep Learning Using Diagnosis Code. Cho, Yoon-Sik; Kim, Eunsun; Stafford, Patrick L.; Oh, Min-hwan; Kwon, Younghoon. J Korean Med Sci. Original Article.
The Korean Academy of Medical Sciences 2023-03-03 /pmc/articles/PMC10027541/ /pubmed/36942391 http://dx.doi.org/10.3346/jkms.2023.38.e77 Text en © 2023 The Korean Academy of Medical Sciences. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Identifying Disease of Interest With Deep Learning Using Diagnosis Code
topic Original Article