Evaluating diagnostic content of AI-generated chest radiography: A multi-center visual Turing test

BACKGROUND: Accurate interpretation of chest radiographs requires years of medical training, and many countries face a shortage of medical professionals to meet such requirements. Recent advancements in artificial intelligence (AI) have aided diagnoses; however, their performance is often limited du...

Bibliographic Details
Main Authors: Myong, Youho; Yoon, Dan; Kim, Byeong Soo; Kim, Young Gyun; Sim, Yongsik; Lee, Suji; Yoon, Jiyoung; Cho, Minwoo; Kim, Sungwan
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10096231/
https://www.ncbi.nlm.nih.gov/pubmed/37043456
http://dx.doi.org/10.1371/journal.pone.0279349
_version_ 1785024284051111936
author Myong, Youho
Yoon, Dan
Kim, Byeong Soo
Kim, Young Gyun
Sim, Yongsik
Lee, Suji
Yoon, Jiyoung
Cho, Minwoo
Kim, Sungwan
author_sort Myong, Youho
collection PubMed
description BACKGROUND: Accurate interpretation of chest radiographs requires years of medical training, and many countries face a shortage of medical professionals to meet such requirements. Recent advancements in artificial intelligence (AI) have aided diagnoses; however, the performance of AI models is often limited by data imbalance. The aim of this study was to augment imbalanced medical data using generative adversarial networks (GANs) and evaluate the clinical quality of the generated images via a multi-center visual Turing test. METHODS: Using six chest radiograph datasets (MIMIC, CheXPert, CXR8, JSRT, VBD, and OpenI), StarGAN v2 generated chest radiographs with specific pathologies. Five board-certified radiologists from three university hospitals, each with at least five years of clinical experience, evaluated the image quality through a visual Turing test. Further evaluations were performed to investigate whether GAN augmentation enhanced convolutional neural network (CNN) classifier performance. RESULTS: In terms of identifying GAN images as artificial, there was no significant difference in sensitivity between radiologists and random guessing (radiologists: 147/275 (53.5%) vs random guessing: 137.5/275 (50%); p = .284). GAN augmentation enhanced CNN classifier performance by 11.7%. CONCLUSION: Radiologists effectively classified chest pathologies with synthesized radiographs, suggesting that the images contained adequate clinical information. Furthermore, GAN augmentation enhanced CNN performance, providing a way to overcome data imbalance in medical AI training. CNN-based methods rely on the amount and quality of training data; the present study showed that GAN augmentation could effectively augment training data for medical AI.
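The comparison reported under RESULTS (147 of 275 GAN images correctly flagged as artificial, against a chance rate of 50%) can be checked approximately with a short calculation. The following is a minimal sketch, assuming an exact two-sided binomial test against a chance rate of 0.5. The abstract does not name the test the authors used, so the value computed here need not match the reported p = .284. The function name `binomial_two_sided_p` is illustrative and does not come from the study.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, p0: float = 0.5) -> float:
    """Exact two-sided binomial p-value (assumed test, not confirmed by the abstract).

    Sums the probabilities of every outcome that is no more likely
    than the observed count under the null hypothesis rate p0.
    """
    probs = [
        comb(trials, k) * p0**k * (1 - p0) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = probs[successes]
    tolerance = observed * 1e-9  # guards against floating-point ties
    return min(1.0, sum(p for p in probs if p <= observed + tolerance))


# Counts taken from the abstract: GAN images identified as artificial.
correct, total = 147, 275
sensitivity = correct / total
p_value = binomial_two_sided_p(correct, total)

print(f"sensitivity = {sensitivity:.1%}")
print(f"two-sided p = {p_value:.3f}")
```

Under this assumed test the p-value is well above .05, which is consistent with the abstract's statement that radiologists did not differ significantly from random guessing.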
format Online
Article
Text
id pubmed-10096231
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10096231 2023-04-13 PLoS One Research Article Public Library of Science 2023-04-12 /pmc/articles/PMC10096231/ /pubmed/37043456 http://dx.doi.org/10.1371/journal.pone.0279349 Text en © 2023 Myong et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Evaluating diagnostic content of AI-generated chest radiography: A multi-center visual Turing test
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10096231/
https://www.ncbi.nlm.nih.gov/pubmed/37043456
http://dx.doi.org/10.1371/journal.pone.0279349