Improved Training Efficiency for Retinopathy of Prematurity Deep Learning Models Using Comparison versus Class Labels

Bibliographic Details
Main authors: Hanif, Adam; Yıldız, İlkay; Tian, Peng; Kalkanlı, Beyza; Erdoğmuş, Deniz; Ioannidis, Stratis; Dy, Jennifer; Kalpathy-Cramer, Jayashree; Ostmo, Susan; Jonas, Karyn; Chan, R. V. Paul; Chiang, Michael F.; Campbell, J. Peter
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9560533/
https://www.ncbi.nlm.nih.gov/pubmed/36249702
http://dx.doi.org/10.1016/j.xops.2022.100122
Description
Summary:
PURPOSE: To compare the efficacy and efficiency of training neural networks for medical image classification using comparison labels indicating relative disease severity versus diagnostic class labels from a retinopathy of prematurity (ROP) image dataset.
DESIGN: Evaluation of diagnostic test or technology.
PARTICIPANTS: Deep learning neural networks trained on expert-labeled wide-angle retinal images from patients undergoing diagnostic ROP examinations as part of the Imaging and Informatics in ROP (i-ROP) cohort study.
METHODS: Neural networks were trained with either class or comparison labels indicating plus disease severity in ROP retinal fundus images from 2 datasets. After training and validation, all networks were evaluated on a separate test dataset in 1 of 2 binary classification tasks: normal versus abnormal or plus versus nonplus.
MAIN OUTCOME MEASURES: Area under the receiver operating characteristic curve (AUC) values were measured to assess network performance.
RESULTS: Given the same number of labels, neural networks learned more efficiently by comparison, generating significantly higher AUCs in both classification tasks across both datasets. Similarly, given the same number of images, comparison learning produced networks with significantly higher AUCs across both classification tasks in 1 of 2 datasets. The difference in efficiency and accuracy between models trained on the two label types decreased as the size of the training set increased.
CONCLUSIONS: Comparison labels individually are more informative and more abundant per sample than class labels. These findings indicate a potential means of overcoming the common obstacle of data variability and scarcity when training neural networks for medical image classification tasks.
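The claim that comparison labels are "more abundant per sample" can be illustrated with a small Python sketch. It is not the authors' code: the integer severity grades, the function names, and the example scores are all hypothetical, and the abstract does not state how image pairs were selected or how the networks were trained. The sketch only shows that N images admit up to N(N-1)/2 pairwise comparisons, and computes AUC in its rank form, which is the probability that a positive case is scored above a negative one.

```python
from itertools import combinations


def count_labels(n_images):
    """Return (class labels, possible pairwise comparison labels) for n_images."""
    return n_images, n_images * (n_images - 1) // 2


def comparison_labels(severity):
    """Build pairwise labels from hypothetical integer severity grades.

    Each label is (i, j, +1) if image i is more severe than image j,
    or (i, j, -1) if it is less severe. Tied pairs are skipped
    (an assumption made for this sketch).
    """
    pairs = []
    for i, j in combinations(range(len(severity)), 2):
        if severity[i] != severity[j]:
            pairs.append((i, j, 1 if severity[i] > severity[j] else -1))
    return pairs


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print(count_labels(100))
    print(comparison_labels([0, 1, 1, 2]))
    print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

Under these assumptions, 100 images yield 100 class labels but up to 4950 comparison labels, which is one way to read the abundance argument in the conclusions.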