More slices, less truth: effects of different test-set design strategies for magnetic resonance image classification

AIM: To assess the effects of different test-set design strategies for magnetic resonance (MR) image classification using deep learning. METHODS: Error rates in 10 experimental settings were assessed. The performance of pretrained models and data augmentation were examined as possible contributing f...

Bibliographic Details
Main authors: Glavaški, Mila; Velicki, Lazar
Format: Online Article Text
Language: English
Published: Croatian Medical Schools 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9468729/
https://www.ncbi.nlm.nih.gov/pubmed/36046934
http://dx.doi.org/10.3325/cmj.2022.63.370
author Glavaški, Mila
Velicki, Lazar
collection PubMed
description AIM: To assess the effects of different test-set design strategies for magnetic resonance (MR) image classification using deep learning. METHODS: Error rates in 10 experimental settings were assessed. The performance of pretrained models and the use of data augmentation were examined as possible contributing factors. RESULTS: Error rates in experimental settings using MR images of different patients for training and test sets were more than ten times higher than those in experimental settings using MR images of the same patients (four disease groups with whole-chest images, 46.80% vs 2.06%; four disease groups without whole-chest images, 49.09% vs 1.29%; sex classification with whole-chest images, 16.02% vs 0.96%; and sex classification without whole-chest images, 23.56% vs 0.30%). Error rates were higher when data augmentation was applied to settings that used MR images of different patients for training and test sets. CONCLUSION: When deep learning is applied to MR image classification, training and test sets should consist of MR images of different patients. Models built on training and test sets consisting of images of the same patients yield optimistic error rates and lead to wrong conclusions. MR images of neighboring slices are so similar that they cause a data leakage effect.
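The test-set design recommended in the conclusion can be illustrated with a small sketch. It is not the authors' code. The `(patient_id, slice_index)` records, the function names, and the 20% test fraction are all assumptions made for illustration. The sketch contrasts a split that keeps every slice of a patient on one side with a split that shuffles individual slices, which is the design the abstract reports as producing optimistic error rates.

```python
import random
from collections import defaultdict


def split_by_patient(slices, test_fraction=0.2, seed=0):
    """Assign whole patients to either the training or the test set.

    `slices` is a list of (patient_id, slice_index) pairs; this record
    layout is hypothetical and stands in for real MR image files.
    """
    by_patient = defaultdict(list)
    for patient_id, slice_index in slices:
        by_patient[patient_id].append((patient_id, slice_index))

    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)
    n_test = max(1, round(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])

    train = [s for p in patients if p not in test_patients for s in by_patient[p]]
    test = [s for p in patients if p in test_patients for s in by_patient[p]]
    return train, test


def split_by_slice(slices, test_fraction=0.2, seed=0):
    """Shuffle individual slices, ignoring which patient they belong to."""
    shuffled = list(slices)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def shared_patients(train, test):
    """Return the patient IDs that appear on both sides of a split."""
    return {p for p, _ in train} & {p for p, _ in test}


# Synthetic example: 10 patients with 20 neighboring slices each.
slices = [(f"patient_{p:02d}", s) for p in range(10) for s in range(20)]

train, test = split_by_patient(slices)
leaky_train, leaky_test = split_by_slice(slices)

print("patient-level split, shared patients:", len(shared_patients(train, test)))
print("slice-level split, shared patients:", len(shared_patients(leaky_train, leaky_test)))
```

With the patient-level split no patient contributes slices to both sets, so near-identical neighboring slices cannot leak from training into testing. In a scikit-learn workflow, a grouped splitter such as `GroupShuffleSplit` keyed on patient ID would serve the same purpose.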
format Online
Article
Text
id pubmed-9468729
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Croatian Medical Schools
record_format MEDLINE/PubMed
spelling pubmed-9468729 2022-09-29 More slices, less truth: effects of different test-set design strategies for magnetic resonance image classification Glavaški, Mila Velicki, Lazar Croat Med J Research Article Croatian Medical Schools 2022-08 /pmc/articles/PMC9468729/ /pubmed/36046934 http://dx.doi.org/10.3325/cmj.2022.63.370 Text en Copyright © 2022 by the Croatian Medical Journal. All rights reserved. https://creativecommons.org/licenses/by/2.5/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title More slices, less truth: effects of different test-set design strategies for magnetic resonance image classification
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9468729/
https://www.ncbi.nlm.nih.gov/pubmed/36046934
http://dx.doi.org/10.3325/cmj.2022.63.370