Benchmarking Deep Learning Models for Tooth Structure Segmentation

Bibliographic Details
Main authors: Schneider, L., Arsiwala-Scheppach, L., Krois, J., Meyer-Lueckel, H., Bressem, K.K., Niehues, S.M., Schwendicke, F.
Format: Online Article Text
Language: English
Published: SAGE Publications 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9516600/
https://www.ncbi.nlm.nih.gov/pubmed/35686357
http://dx.doi.org/10.1177/00220345221100169
_version_ 1784798742115778560
author Schneider, L.
Arsiwala-Scheppach, L.
Krois, J.
Meyer-Lueckel, H.
Bressem, K.K.
Niehues, S.M.
Schwendicke, F.
collection PubMed
description A wide range of deep learning (DL) architectures with varying depths are available, and developers usually choose one or a few of them for their specific task in a nonsystematic way. Benchmarking (i.e., the systematic comparison of state-of-the-art architectures on a specific task) may provide guidance in the model development process and may allow developers to make better decisions. However, comprehensive benchmarking has not yet been performed in dentistry. We aimed to benchmark a range of architecture designs for 1 specific, exemplary case: tooth structure segmentation on dental bitewing radiographs. We built 72 models for tooth structure (enamel, dentin, pulp, fillings, crowns) segmentation by combining 6 different DL network architectures (U-Net, U-Net++, Feature Pyramid Networks, LinkNet, Pyramid Scene Parsing Network, Mask Attention Network) with 12 encoders from 3 different encoder families (ResNet, VGG, DenseNet) of varying depth (e.g., VGG13, VGG16, VGG19). To each model design, 3 initialization strategies (ImageNet, CheXpert, random initialization) were applied, resulting in 216 models overall, each trained for up to 200 epochs with the Adam optimizer (learning rate = 0.0001) and a batch size of 32. Our data set consisted of 1,625 human-annotated dental bitewing radiographs. We used a 5-fold cross-validation scheme and quantified model performance primarily by the F1-score. Initialization with ImageNet or CheXpert weights significantly outperformed random initialization (P < 0.05). Deeper and more complex models did not necessarily perform better than less complex alternatives. VGG-based models were more robust across model configurations, while more complex models (e.g., from the ResNet family) achieved peak performances. In conclusion, initializing models with pretrained weights may be recommended when training models for dental radiographic analysis. Less complex model architectures may be competitive alternatives if computational resources and training time are limiting factors. Models developed and found superior on nondental data sets may not show this behavior for dental domain-specific tasks.
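
The arithmetic behind the benchmark grid in the description (6 architectures x 12 encoders = 72 model designs, x 3 initializations = 216 trained models) can be illustrated with a minimal Python sketch. It is not the authors' code. The architecture names, initialization strategies, and training settings are taken from the abstract. The encoder identifiers are placeholders, because the abstract states only the count (12) and names VGG13, VGG16, and VGG19 as examples. The `f1_score` helper is an assumed pixelwise, single-class formulation, and the record does not say how the authors aggregated the F1-score across classes or folds.

```python
from itertools import product

# Named in the abstract.
ARCHITECTURES = [
    "U-Net", "U-Net++", "Feature Pyramid Networks", "LinkNet",
    "Pyramid Scene Parsing Network", "Mask Attention Network",
]
INITIALIZATIONS = ["ImageNet", "CheXpert", "random"]

# ASSUMPTION: placeholder names. The abstract gives only the count (12)
# and the families (ResNet, VGG, DenseNet), not the full encoder list.
ENCODERS = [f"encoder_{i:02d}" for i in range(1, 13)]

# Training settings as stated in the abstract.
TRAINING = {
    "optimizer": "Adam",
    "learning_rate": 1e-4,
    "batch_size": 32,
    "max_epochs": 200,
    "cv_folds": 5,
}


def build_grid():
    """Return (model designs, training runs) for the benchmark grid."""
    designs = list(product(ARCHITECTURES, ENCODERS))
    runs = [(arch, enc, init)
            for arch, enc in designs
            for init in INITIALIZATIONS]
    return designs, runs


def f1_score(pred, target):
    """Pixelwise F1 = 2*TP / (2*TP + FP + FN) for two binary sequences.

    ASSUMPTION: single-class, flattened masks; returns 1.0 when both
    masks are empty (a convention chosen here, not taken from the source).
    """
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


if __name__ == "__main__":
    designs, runs = build_grid()
    print(len(designs), "model designs,", len(runs), "training runs")
    print("F1 on a toy mask:", f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

Enumerating the grid up front makes the size of the experiment explicit before any training starts; with 5-fold cross-validation, each of the runs would be fitted once per fold.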
format Online
Article
Text
id pubmed-9516600
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-9516600 2022-09-29
Benchmarking Deep Learning Models for Tooth Structure Segmentation
Schneider, L. Arsiwala-Scheppach, L. Krois, J. Meyer-Lueckel, H. Bressem, K.K. Niehues, S.M. Schwendicke, F.
J Dent Res
Research Reports
SAGE Publications 2022-06-09 2022-10
/pmc/articles/PMC9516600/ /pubmed/35686357 http://dx.doi.org/10.1177/00220345221100169
Text en © International Association for Dental Research and American Association for Dental, Oral, and Craniofacial Research 2022
https://creativecommons.org/licenses/by-nc/4.0/
This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
title Benchmarking Deep Learning Models for Tooth Structure Segmentation
topic Research Reports
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9516600/
https://www.ncbi.nlm.nih.gov/pubmed/35686357
http://dx.doi.org/10.1177/00220345221100169