
Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports

Bibliographic Details
Main Authors: De Angeli, Kevin, Gao, Shang, Blanchard, Andrew, Durbin, Eric B, Wu, Xiao-Cheng, Stroup, Antoinette, Doherty, Jennifer, Schwartz, Stephen M, Wiggins, Charles, Coyle, Linda, Penberthy, Lynne, Tourassi, Georgia, Yoon, Hong-Jun
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: Research and Applications
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9469924/
https://www.ncbi.nlm.nih.gov/pubmed/36110150
http://dx.doi.org/10.1093/jamiaopen/ooac075
author De Angeli, Kevin
Gao, Shang
Blanchard, Andrew
Durbin, Eric B
Wu, Xiao-Cheng
Stroup, Antoinette
Doherty, Jennifer
Schwartz, Stephen M
Wiggins, Charles
Coyle, Linda
Penberthy, Lynne
Tourassi, Georgia
Yoon, Hong-Jun
collection PubMed
description
OBJECTIVE: We aim to reduce overfitting and model overconfidence by distilling the knowledge of an ensemble of deep learning models into a single model for the classification of cancer pathology reports.
MATERIALS AND METHODS: We consider a text classification problem that involves 5 individual tasks. The baseline model consists of a multitask convolutional neural network (MtCNN), and the implemented ensemble (teacher) consists of 1000 MtCNNs. We performed knowledge transfer by training a single model (student) with soft labels derived through the aggregation of ensemble predictions. We evaluated performance based on accuracy and abstention rates using softmax thresholding.
RESULTS: The student model outperforms the baseline MtCNN in terms of abstention rates and accuracy, thereby allowing the model to be used with a larger volume of documents when deployed. The largest gains were observed for subsite and histology, for which the student model classified an additional 1.81% and 3.33% of reports, respectively.
DISCUSSION: Ensemble predictions provide a useful strategy for quantifying the uncertainty inherent in labeled data and thereby enable the construction of soft labels with estimated probabilities for multiple classes for a given document. Training models with the derived soft labels reduces model confidence in difficult-to-classify documents, thereby leading to a reduction in the number of highly confident wrong predictions.
CONCLUSIONS: Ensemble model distillation is a simple tool to reduce model overconfidence in problems with extreme class imbalance and noisy datasets. These methods can facilitate the deployment of deep learning models in high-risk domains with low computational resources where minimizing inference time is required.
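
The abstract above outlines three mechanisms: aggregating the predictions of an ensemble of teacher models into soft labels, training a single student model on those soft labels, and abstaining at deployment time when the student's softmax confidence falls below a threshold. The following is a minimal sketch of those steps for a single classification task; PyTorch, the toy tensor shapes, the 0.9 threshold, and the function names are illustrative assumptions, not the authors' released code.

import torch
import torch.nn.functional as F

def soft_labels_from_ensemble(teacher_logits: torch.Tensor) -> torch.Tensor:
    """Aggregate ensemble predictions into per-document soft labels.

    teacher_logits: (n_teachers, n_docs, n_classes) raw logits from each
    teacher model. Averaging the softmax distributions spreads probability
    mass over plausible classes for hard-to-classify documents.
    """
    return F.softmax(teacher_logits, dim=-1).mean(dim=0)  # (n_docs, n_classes)

def distillation_loss(student_logits: torch.Tensor, soft_labels: torch.Tensor) -> torch.Tensor:
    """Cross-entropy of the student's predictions against the ensemble's soft labels."""
    log_probs = F.log_softmax(student_logits, dim=-1)
    return -(soft_labels * log_probs).sum(dim=-1).mean()

def predict_with_abstention(student_logits: torch.Tensor, threshold: float = 0.9):
    """Softmax thresholding: return a class only when the maximum softmax
    probability clears the threshold; otherwise abstain (label -1) and leave
    the document for manual review."""
    probs = F.softmax(student_logits, dim=-1)
    conf, pred = probs.max(dim=-1)
    pred[conf < threshold] = -1  # -1 marks an abstained document
    return pred, conf

# Toy usage: 5 teachers (the paper's ensemble uses 1000 MtCNNs), 4 documents, 3 classes.
teacher_logits = torch.randn(5, 4, 3)
soft = soft_labels_from_ensemble(teacher_logits)
student_logits = torch.randn(4, 3, requires_grad=True)
loss = distillation_loss(student_logits, soft)
loss.backward()
preds, confidences = predict_with_abstention(student_logits.detach(), threshold=0.9)

In the multitask setting described in the abstract, a loss of this form would be computed per task and summed; the abstention threshold then trades off accuracy against the fraction of documents the model declines to classify.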
format Online
Article
Text
id pubmed-9469924
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9469924 2022-09-14
JAMIA Open (Research and Applications). Oxford University Press, published online 2022-09-13.
/pmc/articles/PMC9469924/ https://www.ncbi.nlm.nih.gov/pubmed/36110150 http://dx.doi.org/10.1093/jamiaopen/ooac075
Text en © The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports
topic Research and Applications
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9469924/
https://www.ncbi.nlm.nih.gov/pubmed/36110150
http://dx.doi.org/10.1093/jamiaopen/ooac075
work_keys_str_mv AT deangelikevin usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports
AT gaoshang usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports
AT blanchardandrew usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports
AT durbinericb usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports
AT wuxiaocheng usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports
AT stroupantoinette usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports
AT dohertyjennifer usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports
AT schwartzstephenm usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports
AT wigginscharles usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports
AT coylelinda usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports
AT penberthylynne usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports
AT tourassigeorgia usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports
AT yoonhongjun usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports