Performance of five automated white matter hyperintensity segmentation methods in a multicenter dataset

Bibliographic Details
Main authors: Heinen, Rutger; Steenwijk, Martijn D.; Barkhof, Frederik; Biesbroek, J. Matthijs; van der Flier, Wiesje M.; Kuijf, Hugo J.; Prins, Niels D.; Vrenken, Hugo; Biessels, Geert Jan; de Bresser, Jeroen
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2019
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6856351/
https://www.ncbi.nlm.nih.gov/pubmed/31727919
http://dx.doi.org/10.1038/s41598-019-52966-0
Description
Summary: White matter hyperintensities (WMHs) are a common manifestation of cerebral small vessel disease that is increasingly studied with large, pooled multicenter datasets. This data pooling increases statistical power, but poses challenges for automated WMH segmentation. Although there is extensive literature on the evaluation of automated WMH segmentation methods, such evaluations in a multicenter setting are lacking. We performed WMH segmentations in sixty patients scanned on six different magnetic resonance imaging (MRI) scanners (10 patients per scanner) using five freely available and fully automated WMH segmentation methods (Cascade, kNN-TTP, Lesion-TOADS, LST-LGA and LST-LPA). Different MRI scanner vendors and field strengths were included. We compared these automated WMH segmentations with manual WMH segmentations as a reference. Performance of each method, both within and across scanners, was assessed using spatial and volumetric correspondence with the reference segmentations, measured by Dice's similarity coefficient (DSC) and the intra-class correlation coefficient (ICC), respectively. We found the best performance, both within and across scanners, for kNN-TTP, followed by LST-LPA and LST-LGA, with worse performance for Lesion-TOADS and Cascade. Our findings can serve as a guide for choosing a method and also highlight the importance of further improving and evaluating the consistency of methods in a multicenter setting.
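
The spatial correspondence measure named in the summary, Dice's similarity coefficient, is defined as twice the overlap between two segmentations divided by the sum of their sizes. The following is a minimal sketch of that definition only; the function name `dice_coefficient`, the flat 0/1 mask representation, the toy masks, and the convention for two empty masks are assumptions made for illustration and are not taken from the article.

```python
def dice_coefficient(auto_mask, ref_mask):
    """Dice's similarity coefficient between two binary masks.

    Both masks are flat sequences of 0/1 voxel labels of equal length
    (a hypothetical representation chosen for this sketch).
    """
    if len(auto_mask) != len(ref_mask):
        raise ValueError("masks must have the same number of voxels")

    # Voxels labelled as WMH in both the automated and the reference mask.
    overlap = sum(1 for a, r in zip(auto_mask, ref_mask) if a and r)
    # Total number of WMH voxels in each mask, added together.
    total = sum(1 for a in auto_mask if a) + sum(1 for r in ref_mask if r)

    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * overlap / total


# Toy masks (invented values, not study data).
automated = [0, 1, 1, 1, 0, 0, 1, 0]
manual = [0, 1, 1, 0, 0, 1, 1, 0]
print(dice_coefficient(automated, manual))  # 3 shared voxels, 4 + 4 in total -> 0.75
```

A DSC of 1 means the automated and reference segmentations coincide exactly, and 0 means they share no voxels. Real masks would be 3D image volumes flattened or handled with an array library, which this sketch leaves out.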