Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports
OBJECTIVE: We aim to reduce overfitting and model overconfidence by distilling the knowledge of an ensemble of deep learning models into a single model for the classification of cancer pathology reports. MATERIALS AND METHODS: We consider the text classification problem that involves 5 individual tasks. The baseline model consists of a multitask convolutional neural network (MtCNN), and the implemented ensemble (teacher) consists of 1000 MtCNNs. We performed knowledge transfer by training a single model (student) with soft labels derived through the aggregation of ensemble predictions. We evaluate performance based on accuracy and abstention rates by using softmax thresholding. RESULTS: The student model outperforms the baseline MtCNN in terms of abstention rates and accuracy, thereby allowing the model to be used with a larger volume of documents when deployed. The highest boost was observed for subsite and histology, for which the student model classified an additional 1.81% of reports for subsite and 3.33% of reports for histology. DISCUSSION: Ensemble predictions provide a useful strategy for quantifying the uncertainty inherent in labeled data and thereby enable the construction of soft labels with estimated probabilities for multiple classes for a given document. Training models with the derived soft labels reduces model confidence in difficult-to-classify documents, thereby leading to a reduction in the number of highly confident wrong predictions. CONCLUSIONS: Ensemble model distillation is a simple tool to reduce model overconfidence in problems with extreme class imbalance and noisy datasets. These methods can facilitate the deployment of deep learning models in high-risk domains with low computational resources where minimizing inference time is required.
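The methods described above reduce to a short recipe: average the softmax outputs of the ensemble members into soft labels, train a single student model against those soft labels, and apply a softmax confidence threshold at deployment so the model abstains on documents it is unsure about. The sketch below illustrates that recipe in PyTorch on synthetic data; the tiny feed-forward classifier, ensemble size, number of classes, and 0.7 threshold are illustrative assumptions standing in for the paper's multitask CNNs over pathology report text, not the authors' implementation.

```python
# Hypothetical sketch of ensemble distillation with softmax-threshold abstention.
# Architecture, data, and hyperparameters are illustrative stand-ins, not the
# authors' multitask CNN (MtCNN) implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

NUM_CLASSES = 4       # stand-in for one of the 5 classification tasks
NUM_FEATURES = 32     # stand-in for document features/embeddings
NUM_DOCS = 256
ENSEMBLE_SIZE = 10    # the paper's teacher ensemble uses 1000 MtCNNs
THRESHOLD = 0.7       # softmax confidence below which the model abstains

def make_model() -> nn.Module:
    # Tiny classifier standing in for a single ensemble member.
    return nn.Sequential(nn.Linear(NUM_FEATURES, 64), nn.ReLU(), nn.Linear(64, NUM_CLASSES))

# Synthetic documents and hard labels, only so the sketch runs end to end.
x = torch.randn(NUM_DOCS, NUM_FEATURES)
y = torch.randint(0, NUM_CLASSES, (NUM_DOCS,))

# 1) Teacher ensemble: train each member on the hard labels.
ensemble = []
for _ in range(ENSEMBLE_SIZE):
    model = make_model()
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(50):
        opt.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        opt.step()
    ensemble.append(model)

# 2) Soft labels: average the members' softmax outputs for each document.
with torch.no_grad():
    soft_labels = torch.stack([F.softmax(m(x), dim=1) for m in ensemble]).mean(dim=0)

# 3) Student: train one model against the soft labels (cross-entropy with soft targets).
student = make_model()
opt = torch.optim.Adam(student.parameters(), lr=1e-2)
for _ in range(50):
    opt.zero_grad()
    log_probs = F.log_softmax(student(x), dim=1)
    loss = -(soft_labels * log_probs).sum(dim=1).mean()
    loss.backward()
    opt.step()

# 4) Deployment: abstain whenever the student's top softmax score falls below the threshold.
with torch.no_grad():
    probs = F.softmax(student(x), dim=1)
confidence, predicted = probs.max(dim=1)
retained = confidence >= THRESHOLD
print(f"abstention rate: {(~retained).float().mean().item():.2%}")
if retained.any():
    accuracy = (predicted[retained] == y[retained]).float().mean().item()
    print(f"accuracy on retained documents: {accuracy:.2%}")
```

Averaging many members' probability estimates turns noisy hard labels into graded targets; a student trained on those targets tends to be less confident on ambiguous documents, which matches the behavior the abstract reports (fewer highly confident wrong predictions) while keeping only a single model's inference cost at deployment.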
Main Authors: | De Angeli, Kevin; Gao, Shang; Blanchard, Andrew; Durbin, Eric B; Wu, Xiao-Cheng; Stroup, Antoinette; Doherty, Jennifer; Schwartz, Stephen M; Wiggins, Charles; Coyle, Linda; Penberthy, Lynne; Tourassi, Georgia; Yoon, Hong-Jun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2022 |
Subjects: | Research and Applications |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9469924/ https://www.ncbi.nlm.nih.gov/pubmed/36110150 http://dx.doi.org/10.1093/jamiaopen/ooac075 |
_version_ | 1784788739476684800 |
---|---|
author | De Angeli, Kevin Gao, Shang Blanchard, Andrew Durbin, Eric B Wu, Xiao-Cheng Stroup, Antoinette Doherty, Jennifer Schwartz, Stephen M Wiggins, Charles Coyle, Linda Penberthy, Lynne Tourassi, Georgia Yoon, Hong-Jun |
author_facet | De Angeli, Kevin Gao, Shang Blanchard, Andrew Durbin, Eric B Wu, Xiao-Cheng Stroup, Antoinette Doherty, Jennifer Schwartz, Stephen M Wiggins, Charles Coyle, Linda Penberthy, Lynne Tourassi, Georgia Yoon, Hong-Jun |
author_sort | De Angeli, Kevin |
collection | PubMed |
description | OBJECTIVE: We aim to reduce overfitting and model overconfidence by distilling the knowledge of an ensemble of deep learning models into a single model for the classification of cancer pathology reports. MATERIALS AND METHODS: We consider the text classification problem that involves 5 individual tasks. The baseline model consists of a multitask convolutional neural network (MtCNN), and the implemented ensemble (teacher) consists of 1000 MtCNNs. We performed knowledge transfer by training a single model (student) with soft labels derived through the aggregation of ensemble predictions. We evaluate performance based on accuracy and abstention rates by using softmax thresholding. RESULTS: The student model outperforms the baseline MtCNN in terms of abstention rates and accuracy, thereby allowing the model to be used with a larger volume of documents when deployed. The highest boost was observed for subsite and histology, for which the student model classified an additional 1.81% of reports for subsite and 3.33% of reports for histology. DISCUSSION: Ensemble predictions provide a useful strategy for quantifying the uncertainty inherent in labeled data and thereby enable the construction of soft labels with estimated probabilities for multiple classes for a given document. Training models with the derived soft labels reduces model confidence in difficult-to-classify documents, thereby leading to a reduction in the number of highly confident wrong predictions. CONCLUSIONS: Ensemble model distillation is a simple tool to reduce model overconfidence in problems with extreme class imbalance and noisy datasets. These methods can facilitate the deployment of deep learning models in high-risk domains with low computational resources where minimizing inference time is required. |
format | Online Article Text |
id | pubmed-9469924 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-94699242022-09-14 Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports De Angeli, Kevin Gao, Shang Blanchard, Andrew Durbin, Eric B Wu, Xiao-Cheng Stroup, Antoinette Doherty, Jennifer Schwartz, Stephen M Wiggins, Charles Coyle, Linda Penberthy, Lynne Tourassi, Georgia Yoon, Hong-Jun JAMIA Open Research and Applications OBJECTIVE: We aim to reduce overfitting and model overconfidence by distilling the knowledge of an ensemble of deep learning models into a single model for the classification of cancer pathology reports. MATERIALS AND METHODS: We consider the text classification problem that involves 5 individual tasks. The baseline model consists of a multitask convolutional neural network (MtCNN), and the implemented ensemble (teacher) consists of 1000 MtCNNs. We performed knowledge transfer by training a single model (student) with soft labels derived through the aggregation of ensemble predictions. We evaluate performance based on accuracy and abstention rates by using softmax thresholding. RESULTS: The student model outperforms the baseline MtCNN in terms of abstention rates and accuracy, thereby allowing the model to be used with a larger volume of documents when deployed. The highest boost was observed for subsite and histology, for which the student model classified an additional 1.81% of reports for subsite and 3.33% of reports for histology. DISCUSSION: Ensemble predictions provide a useful strategy for quantifying the uncertainty inherent in labeled data and thereby enable the construction of soft labels with estimated probabilities for multiple classes for a given document. Training models with the derived soft labels reduces model confidence in difficult-to-classify documents, thereby leading to a reduction in the number of highly confident wrong predictions. CONCLUSIONS: Ensemble model distillation is a simple tool to reduce model overconfidence in problems with extreme class imbalance and noisy datasets. These methods can facilitate the deployment of deep learning models in high-risk domains with low computational resources where minimizing inference time is required. Oxford University Press 2022-09-13 /pmc/articles/PMC9469924/ /pubmed/36110150 http://dx.doi.org/10.1093/jamiaopen/ooac075 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research and Applications De Angeli, Kevin Gao, Shang Blanchard, Andrew Durbin, Eric B Wu, Xiao-Cheng Stroup, Antoinette Doherty, Jennifer Schwartz, Stephen M Wiggins, Charles Coyle, Linda Penberthy, Lynne Tourassi, Georgia Yoon, Hong-Jun Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports |
title | Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports |
title_full | Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports |
title_fullStr | Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports |
title_full_unstemmed | Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports |
title_short | Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports |
title_sort | using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports |
topic | Research and Applications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9469924/ https://www.ncbi.nlm.nih.gov/pubmed/36110150 http://dx.doi.org/10.1093/jamiaopen/ooac075 |
work_keys_str_mv | AT deangelikevin usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT gaoshang usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT blanchardandrew usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT durbinericb usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT wuxiaocheng usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT stroupantoinette usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT dohertyjennifer usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT schwartzstephenm usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT wigginscharles usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT coylelinda usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT penberthylynne usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT tourassigeorgia usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT yoonhongjun usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports |