Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports
OBJECTIVE: We aim to reduce overfitting and model overconfidence by distilling the knowledge of an ensemble of deep learning models into a single model for the classification of cancer pathology reports. MATERIALS AND METHODS: We consider the text classification problem that involves 5 individual tasks. The baseline model consists of a multitask convolutional neural network (MtCNN), and the implemented ensemble (teacher) consists of 1000 MtCNNs. We performed knowledge transfer by training a single model (student) with soft labels derived through the aggregation of ensemble predictions. We evaluate performance based on accuracy and abstention rates by using softmax thresholding. RESULTS: The student model outperforms the baseline MtCNN in terms of abstention rates and accuracy, thereby allowing the model to be used with a larger volume of documents when deployed. The highest boost was observed for subsite and histology, for which the student model classified an additional 1.81% of reports for subsite and 3.33% of reports for histology. DISCUSSION: Ensemble predictions provide a useful strategy for quantifying the uncertainty inherent in labeled data and thereby enable the construction of soft labels with estimated probabilities for multiple classes for a given document. Training models with the derived soft labels reduces model confidence in difficult-to-classify documents, thereby leading to a reduction in the number of highly confident wrong predictions. CONCLUSIONS: Ensemble model distillation is a simple tool to reduce model overconfidence in problems with extreme class imbalance and noisy datasets. These methods can facilitate the deployment of deep learning models in high-risk domains with low computational resources where minimizing inference time is required.
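The methods described above reduce to a short recipe: average the softmax outputs of the ensemble members into soft labels, train a single student model against those soft labels, and apply a softmax confidence threshold at deployment so the model abstains on documents it is unsure about. The sketch below illustrates that recipe in PyTorch on synthetic data; the tiny feed-forward classifier, ensemble size, number of classes, and 0.7 threshold are illustrative assumptions standing in for the paper's multitask CNNs over pathology report text, not the authors' implementation.

```python
# Hypothetical sketch of ensemble distillation with softmax-threshold abstention.
# Architecture, data, and hyperparameters are illustrative stand-ins, not the
# authors' multitask CNN (MtCNN) implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

NUM_CLASSES = 4       # stand-in for one of the 5 classification tasks
NUM_FEATURES = 32     # stand-in for document features/embeddings
NUM_DOCS = 256
ENSEMBLE_SIZE = 10    # the paper's teacher ensemble uses 1000 MtCNNs
THRESHOLD = 0.7       # softmax confidence below which the model abstains

def make_model() -> nn.Module:
    # Tiny classifier standing in for a single ensemble member.
    return nn.Sequential(nn.Linear(NUM_FEATURES, 64), nn.ReLU(), nn.Linear(64, NUM_CLASSES))

# Synthetic documents and hard labels, only so the sketch runs end to end.
x = torch.randn(NUM_DOCS, NUM_FEATURES)
y = torch.randint(0, NUM_CLASSES, (NUM_DOCS,))

# 1) Teacher ensemble: train each member on the hard labels.
ensemble = []
for _ in range(ENSEMBLE_SIZE):
    model = make_model()
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(50):
        opt.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        opt.step()
    ensemble.append(model)

# 2) Soft labels: average the members' softmax outputs for each document.
with torch.no_grad():
    soft_labels = torch.stack([F.softmax(m(x), dim=1) for m in ensemble]).mean(dim=0)

# 3) Student: train one model against the soft labels (cross-entropy with soft targets).
student = make_model()
opt = torch.optim.Adam(student.parameters(), lr=1e-2)
for _ in range(50):
    opt.zero_grad()
    log_probs = F.log_softmax(student(x), dim=1)
    loss = -(soft_labels * log_probs).sum(dim=1).mean()
    loss.backward()
    opt.step()

# 4) Deployment: abstain whenever the student's top softmax score falls below the threshold.
with torch.no_grad():
    probs = F.softmax(student(x), dim=1)
confidence, predicted = probs.max(dim=1)
retained = confidence >= THRESHOLD
print(f"abstention rate: {(~retained).float().mean().item():.2%}")
if retained.any():
    accuracy = (predicted[retained] == y[retained]).float().mean().item()
    print(f"accuracy on retained documents: {accuracy:.2%}")
```

Averaging many members' probability estimates turns noisy hard labels into graded targets; a student trained on those targets tends to be less confident on ambiguous documents, which matches the behavior the abstract reports (fewer highly confident wrong predictions) while keeping only a single model's inference cost at deployment.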
Main Authors: | De Angeli, Kevin; Gao, Shang; Blanchard, Andrew; Durbin, Eric B; Wu, Xiao-Cheng; Stroup, Antoinette; Doherty, Jennifer; Schwartz, Stephen M; Wiggins, Charles; Coyle, Linda; Penberthy, Lynne; Tourassi, Georgia; Yoon, Hong-Jun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2022 |
Subjects: | Research and Applications |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9469924/ https://www.ncbi.nlm.nih.gov/pubmed/36110150 http://dx.doi.org/10.1093/jamiaopen/ooac075 |
_version_ | 1784788739476684800 |
---|---|
author | De Angeli, Kevin Gao, Shang Blanchard, Andrew Durbin, Eric B Wu, Xiao-Cheng Stroup, Antoinette Doherty, Jennifer Schwartz, Stephen M Wiggins, Charles Coyle, Linda Penberthy, Lynne Tourassi, Georgia Yoon, Hong-Jun |
author_facet | De Angeli, Kevin Gao, Shang Blanchard, Andrew Durbin, Eric B Wu, Xiao-Cheng Stroup, Antoinette Doherty, Jennifer Schwartz, Stephen M Wiggins, Charles Coyle, Linda Penberthy, Lynne Tourassi, Georgia Yoon, Hong-Jun |
author_sort | De Angeli, Kevin |
collection | PubMed |
description | OBJECTIVE: We aim to reduce overfitting and model overconfidence by distilling the knowledge of an ensemble of deep learning models into a single model for the classification of cancer pathology reports. MATERIALS AND METHODS: We consider the text classification problem that involves 5 individual tasks. The baseline model consists of a multitask convolutional neural network (MtCNN), and the implemented ensemble (teacher) consists of 1000 MtCNNs. We performed knowledge transfer by training a single model (student) with soft labels derived through the aggregation of ensemble predictions. We evaluate performance based on accuracy and abstention rates by using softmax thresholding. RESULTS: The student model outperforms the baseline MtCNN in terms of abstention rates and accuracy, thereby allowing the model to be used with a larger volume of documents when deployed. The highest boost was observed for subsite and histology, for which the student model classified an additional 1.81% of reports for subsite and 3.33% of reports for histology. DISCUSSION: Ensemble predictions provide a useful strategy for quantifying the uncertainty inherent in labeled data and thereby enable the construction of soft labels with estimated probabilities for multiple classes for a given document. Training models with the derived soft labels reduces model confidence in difficult-to-classify documents, thereby leading to a reduction in the number of highly confident wrong predictions. CONCLUSIONS: Ensemble model distillation is a simple tool to reduce model overconfidence in problems with extreme class imbalance and noisy datasets. These methods can facilitate the deployment of deep learning models in high-risk domains with low computational resources where minimizing inference time is required. |
format | Online Article Text |
id | pubmed-9469924 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-94699242022-09-14 Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports De Angeli, Kevin Gao, Shang Blanchard, Andrew Durbin, Eric B Wu, Xiao-Cheng Stroup, Antoinette Doherty, Jennifer Schwartz, Stephen M Wiggins, Charles Coyle, Linda Penberthy, Lynne Tourassi, Georgia Yoon, Hong-Jun JAMIA Open Research and Applications OBJECTIVE: We aim to reduce overfitting and model overconfidence by distilling the knowledge of an ensemble of deep learning models into a single model for the classification of cancer pathology reports. MATERIALS AND METHODS: We consider the text classification problem that involves 5 individual tasks. The baseline model consists of a multitask convolutional neural network (MtCNN), and the implemented ensemble (teacher) consists of 1000 MtCNNs. We performed knowledge transfer by training a single model (student) with soft labels derived through the aggregation of ensemble predictions. We evaluate performance based on accuracy and abstention rates by using softmax thresholding. RESULTS: The student model outperforms the baseline MtCNN in terms of abstention rates and accuracy, thereby allowing the model to be used with a larger volume of documents when deployed. The highest boost was observed for subsite and histology, for which the student model classified an additional 1.81% of reports for subsite and 3.33% of reports for histology. DISCUSSION: Ensemble predictions provide a useful strategy for quantifying the uncertainty inherent in labeled data and thereby enable the construction of soft labels with estimated probabilities for multiple classes for a given document. Training models with the derived soft labels reduces model confidence in difficult-to-classify documents, thereby leading to a reduction in the number of highly confident wrong predictions. CONCLUSIONS: Ensemble model distillation is a simple tool to reduce model overconfidence in problems with extreme class imbalance and noisy datasets. These methods can facilitate the deployment of deep learning models in high-risk domains with low computational resources where minimizing inference time is required. Oxford University Press 2022-09-13 /pmc/articles/PMC9469924/ /pubmed/36110150 http://dx.doi.org/10.1093/jamiaopen/ooac075 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research and Applications De Angeli, Kevin Gao, Shang Blanchard, Andrew Durbin, Eric B Wu, Xiao-Cheng Stroup, Antoinette Doherty, Jennifer Schwartz, Stephen M Wiggins, Charles Coyle, Linda Penberthy, Lynne Tourassi, Georgia Yoon, Hong-Jun Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports |
title | Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports |
title_full | Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports |
title_fullStr | Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports |
title_full_unstemmed | Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports |
title_short | Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports |
title_sort | using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports |
topic | Research and Applications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9469924/ https://www.ncbi.nlm.nih.gov/pubmed/36110150 http://dx.doi.org/10.1093/jamiaopen/ooac075 |
work_keys_str_mv | AT deangelikevin usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT gaoshang usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT blanchardandrew usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT durbinericb usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT wuxiaocheng usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT stroupantoinette usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT dohertyjennifer usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT schwartzstephenm usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT wigginscharles usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT coylelinda usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT penberthylynne usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT tourassigeorgia usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports AT yoonhongjun usingensemblesanddistillationtooptimizethedeploymentofdeeplearningmodelsfortheclassificationofelectroniccancerpathologyreports |