
Assessment of the Robustness of Convolutional Neural Networks in Labeling Noise by Using Chest X-Ray Images From Multiple Centers

BACKGROUND: Computer-aided diagnosis on chest x-ray images using deep learning is a widely studied modality in medicine. Many studies are based on public datasets, such as the National Institutes of Health (NIH) dataset and the Stanford CheXpert dataset. However, these datasets are preprocessed by c...


Bibliographic Details
Main authors: Jang, Ryoungwoo, Kim, Namkug, Jang, Miso, Lee, Kyung Hwa, Lee, Sang Min, Lee, Kyung Hee, Noh, Han Na, Seo, Joon Beom
Format: Online Article Text
Language: English
Published: JMIR Publications 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7435602/
https://www.ncbi.nlm.nih.gov/pubmed/32749222
http://dx.doi.org/10.2196/18089
author Jang, Ryoungwoo
Kim, Namkug
Jang, Miso
Lee, Kyung Hwa
Lee, Sang Min
Lee, Kyung Hee
Noh, Han Na
Seo, Joon Beom
collection PubMed
description BACKGROUND: Computer-aided diagnosis on chest x-ray images using deep learning is a widely studied modality in medicine. Many studies are based on public datasets, such as the National Institutes of Health (NIH) dataset and the Stanford CheXpert dataset. However, the labels of these datasets are extracted by classical natural language processing, which may introduce a certain amount of label error. OBJECTIVE: This study aimed to investigate the robustness of deep convolutional neural networks (CNNs) for binary classification of posteroanterior chest x-rays when labels are randomly made incorrect. METHODS: We trained and validated the CNN architecture with different levels of label noise in 3 datasets, namely, Asan Medical Center-Seoul National University Bundang Hospital (AMC-SNUBH), NIH, and CheXpert, and tested the models with each test set. Diseases of each chest x-ray in our dataset were confirmed by a thoracic radiologist using computed tomography (CT). Receiver operating characteristic (ROC) curves and the area under the curve (AUC) were evaluated in each test. Randomly chosen chest x-rays from the public datasets were evaluated by 3 physicians and 1 thoracic radiologist. RESULTS: In the public NIH and CheXpert datasets, AUCs did not drop significantly at label noise levels up to 16%, whereas the AUC of the AMC-SNUBH dataset decreased significantly from 2% label noise. Evaluation of the public datasets by 3 physicians and 1 thoracic radiologist showed an accuracy of 65%-80%. CONCLUSIONS: The deep learning–based computer-aided diagnosis model is sensitive to label noise, and computer-aided diagnosis with inaccurate labels is not credible. Furthermore, open datasets such as NIH and CheXpert need to be distilled before being used for deep learning–based computer-aided diagnosis.
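The label-noise procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `flip_binary_labels` and `roc_auc` are hypothetical, the noise is assumed to be symmetric (a fixed fraction of binary labels chosen uniformly at random and flipped), and the rates 0.02 and 0.16 are simply the two noise levels mentioned in the abstract, used here as example inputs.

```python
import random


def flip_binary_labels(labels, noise_rate, seed=0):
    """Return a copy of 0/1 `labels` with a random `noise_rate` fraction flipped.

    Assumption: symmetric noise, i.e. every label is equally likely to be flipped.
    """
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must be in [0, 1]")
    rng = random.Random(seed)
    n_flip = round(noise_rate * len(labels))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [1 - y if i in flip_idx else y for i, y in enumerate(labels)]


def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 50 normal (0) and 50 abnormal (1) studies.
clean = [0] * 50 + [1] * 50
for rate in (0.0, 0.02, 0.16):
    noisy = flip_binary_labels(clean, rate)
    changed = sum(a != b for a, b in zip(clean, noisy))
    print(f"noise rate {rate:.2f}: {changed} labels flipped")
```

In an experiment of this kind, the noisy labels would presumably replace the training labels only, while AUC would be computed against clean test labels; that split is an assumption of this sketch rather than a detail stated in the record.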
format Online
Article
Text
id pubmed-7435602
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-7435602 2020-08-31 JMIR Med Inform Original Paper JMIR Publications 2020-08-04 /pmc/articles/PMC7435602/ /pubmed/32749222 http://dx.doi.org/10.2196/18089 Text en ©Ryoungwoo Jang, Namkug Kim, Miso Jang, Kyung Hwa Lee, Sang Min Lee, Kyung Hee Lee, Han Na Noh, Joon Beom Seo. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 04.08.2020. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title Assessment of the Robustness of Convolutional Neural Networks in Labeling Noise by Using Chest X-Ray Images From Multiple Centers
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7435602/
https://www.ncbi.nlm.nih.gov/pubmed/32749222
http://dx.doi.org/10.2196/18089