Convolutional neural network -based phantom image scoring for mammography quality control

Bibliographic Details
Main Authors: Sundell, Veli-Matti; Mäkelä, Teemu; Vitikainen, Anne-Mari; Kaasalainen, Touko
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9727908/
https://www.ncbi.nlm.nih.gov/pubmed/36476319
http://dx.doi.org/10.1186/s12880-022-00944-w
Description
Summary:
BACKGROUND: Visual evaluation of phantom images is an important but time-consuming part of mammography quality control (QC). Consistent scoring of phantom images over the device’s lifetime is highly desirable. Recently, convolutional neural networks (CNNs) have been applied to a wide range of image classification problems with high accuracy. The purpose of this study was to automate the mammography QC phantom scoring task by training CNN models to mimic a human reviewer.
METHODS: Eight CNN variations consisting of three to ten convolutional layers were trained to detect targets (fibres, microcalcifications and masses) in American College of Radiology (ACR) accreditation phantom images, and the results were compared with human scoring. Regular and artificially degraded/improved QC phantom images from eight mammography devices were visually evaluated by one reviewer. These images were used to train the CNN models. A separate test set consisted of daily QC images from the eight devices and separately acquired images with varying dose levels. These were scored by four reviewers and considered the ground truth for CNN performance testing.
RESULTS: Although the hyper-parameter search space was limited, an optimal network depth was identified, beyond which additional layers decreased accuracy. The highest scoring accuracy (95%) was achieved with the CNN consisting of six convolutional layers. The highest deviation between the CNN and the reviewers was found at the lowest dose levels. No significant difference emerged between the visual reviews and the CNN results, except in the case of the smallest masses.
CONCLUSION: A CNN-based automatic mammography QC phantom scoring system can score phantom images in good agreement with human reviewers and can therefore be of benefit in mammography QC.