
More slices, less truth: effects of different test-set design strategies for magnetic resonance image classification


Bibliographic Details
Main authors:	Glavaški, Mila, Velicki, Lazar
Format:	Online Article Text
Language:	English
Published:	Croatian Medical Schools 2022
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9468729/ https://www.ncbi.nlm.nih.gov/pubmed/36046934 http://dx.doi.org/10.3325/cmj.2022.63.370

Description
Summary:	AIM: To assess the effects of different test-set design strategies for magnetic resonance (MR) image classification using deep learning. METHODS: Error rates in 10 experimental settings were assessed. The use of pretrained models and of data augmentation was examined as possible contributing factors. RESULTS: Error rates in experimental settings using MR images of different patients for training and test sets were more than ten times higher than those in experimental settings using MR images of the same patients (four disease groups with whole-chest images, 46.80% vs 2.06%; four disease groups without whole-chest images, 49.09% vs 1.29%; sex classification with whole-chest images, 16.02% vs 0.96%; and sex classification without whole-chest images, 23.56% vs 0.30%). Error rates were higher when data augmentation was applied to settings that used MR images of different patients for training and test sets. CONCLUSION: When deep learning is applied to MR image classification, training and test sets should consist of MR images of different patients. Models built on training and test sets consisting of images of the same patients yield optimistic error rates and lead to wrong conclusions. MR images of neighboring slices are so similar that they cause a data leakage effect.
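
The recommendation in the conclusion, that training and test sets should contain images of different patients, can be illustrated with a minimal sketch. This is not the authors' code: the function name `split_by_patient`, the `(patient_id, slice_id)` record layout, the 20% test fraction, and the toy data are all assumptions made for illustration. The idea is to shuffle patients rather than individual slices, so that neighboring slices from one patient can never land on both sides of the split.

```python
import random
from collections import defaultdict


def split_by_patient(slices, test_fraction=0.2, seed=0):
    """Split (patient_id, slice_id) records so that no patient
    contributes slices to both the training and the test set."""
    # Group all slice records under the patient they belong to.
    by_patient = defaultdict(list)
    for patient_id, slice_id in slices:
        by_patient[patient_id].append((patient_id, slice_id))

    # Shuffle patients (not slices) with a fixed seed for repeatability.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    # Reserve a fraction of the patients, at least one, for testing.
    n_test = max(1, round(len(patients) * test_fraction))
    held_out = set(patients[:n_test])

    train = [rec for p in patients if p not in held_out for rec in by_patient[p]]
    test = [rec for p in patients if p in held_out for rec in by_patient[p]]
    return train, test


# Hypothetical toy data: 10 patients with 30 slices each.
slices = [(f"patient_{p:02d}", s) for p in range(10) for s in range(30)]
train, test = split_by_patient(slices)

train_patients = {p for p, _ in train}
test_patients = {p for p, _ in test}

# The overlap should be empty; a per-slice random split would
# almost always place the same patient in both sets.
print(len(train), len(test), train_patients & test_patients)
```

Splitting at the slice level instead would correspond to the "same patients" settings in the results, which the abstract reports as giving optimistic error rates.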
