Novel loss functions for ensemble-based medical image classification

Bibliographic Details
Main authors: Rajaraman, Sivaramakrishnan; Zamzmi, Ghada; Antani, Sameer K.
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8718001/
https://www.ncbi.nlm.nih.gov/pubmed/34968393
http://dx.doi.org/10.1371/journal.pone.0261307
Description
Summary: Medical images commonly exhibit multiple abnormalities. Predicting them requires multi-class classifiers whose training and reliability can be affected by a combination of factors, such as dataset size, data source, distribution, and the loss function used to train deep neural networks. Currently, the cross-entropy loss remains the de facto loss function for training deep learning classifiers. This loss function, however, asserts equal learning from all classes, which leads to a bias toward the majority class. Although the choice of loss function affects model performance, to the best of our knowledge no literature exists that performs a comprehensive analysis and selection of an appropriate loss function for the classification task under study. In this work, we benchmark various state-of-the-art loss functions, critically analyze model performance, and propose improved loss functions for a multi-class classification task. We select a pediatric chest X-ray (CXR) dataset that includes images with no abnormality (normal) and images exhibiting manifestations consistent with bacterial and viral pneumonia. We construct prediction-level and model-level ensembles to improve classification performance. Our results show that, compared to the individual models and the state-of-the-art literature, the weighted averaging of the predictions for top-3 and top-5 model-level ensembles delivered significantly superior classification performance (p < 0.05) in terms of the Matthews correlation coefficient (MCC: 0.9068, 95% confidence interval (0.8839, 0.9297)). Finally, we performed localization studies to interpret model behavior and confirm that the individual models and ensembles learned task-specific features and highlighted disease-specific regions of interest. The code is available at https://github.com/sivaramakrishnan-rajaraman/multiloss_ensemble_models.
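
The summary mentions weighted averaging of model predictions and evaluation by MCC. The following is a minimal illustrative sketch of those two ideas in Python with NumPy, not the authors' implementation (their code is in the repository linked above). The function names `weighted_average_ensemble` and `multiclass_mcc`, the array shapes, and the example weights are assumptions made for illustration; the MCC is computed with the standard multi-class generalization based on the confusion matrix.

```python
import numpy as np


def weighted_average_ensemble(probs, weights):
    """Combine per-model class probabilities by weighted averaging.

    probs:   array-like of shape (n_models, n_samples, n_classes)
    weights: array-like of shape (n_models,), hypothetical model weights
    Returns an array of shape (n_samples, n_classes).
    """
    probs = np.asarray(probs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    # Contract the model axis: sum_m weights[m] * probs[m]
    return np.tensordot(weights, probs, axes=1)


def multiclass_mcc(y_true, y_pred, n_classes):
    """Multi-class Matthews correlation coefficient from a confusion matrix."""
    c = np.zeros((n_classes, n_classes), dtype=float)
    for t, p in zip(y_true, y_pred):
        c[t, p] += 1
    t_k = c.sum(axis=1)   # number of samples truly in each class
    p_k = c.sum(axis=0)   # number of samples predicted as each class
    correct = np.trace(c)
    s = c.sum()
    num = correct * s - t_k @ p_k
    den = np.sqrt((s ** 2 - p_k @ p_k) * (s ** 2 - t_k @ t_k))
    # Convention: return 0 when the denominator vanishes
    return float(num / den) if den > 0 else 0.0


if __name__ == "__main__":
    # Toy data: 3 hypothetical models, 4 samples, 3 classes
    # (e.g. normal, bacterial pneumonia, viral pneumonia).
    rng = np.random.default_rng(0)
    probs = rng.dirichlet(np.ones(3), size=(3, 4))
    y_true = [0, 1, 2, 1]
    ensemble = weighted_average_ensemble(probs, weights=[0.5, 0.3, 0.2])
    y_pred = ensemble.argmax(axis=1)
    print("MCC:", multiclass_mcc(y_true, y_pred, n_classes=3))
```

In practice the weights would be chosen on a validation set; the fixed values above are placeholders only.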