Improved Fine-Tuning of In-Domain Transformer Model for Inferring COVID-19 Presence in Multi-Institutional Radiology Reports
Main Authors: | Chambon, Pierre, Cook, Tessa S., Langlotz, Curtis P. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing, 2022 |
Subjects: | Original Paper |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9629758/ https://www.ncbi.nlm.nih.gov/pubmed/36323915 http://dx.doi.org/10.1007/s10278-022-00714-8 |
_version_ | 1784823462015008768 |
---|---|
author | Chambon, Pierre Cook, Tessa S. Langlotz, Curtis P. |
author_facet | Chambon, Pierre Cook, Tessa S. Langlotz, Curtis P. |
author_sort | Chambon, Pierre |
collection | PubMed |
description | Building a document-level classifier for COVID-19 on radiology reports could assist providers in their daily clinical routine, as well as create large numbers of labels for computer vision models. We have developed such a classifier by fine-tuning a BERT-like model initialized from RadBERT, a version of BERT continuously pre-trained on radiology reports that can be used for all radiology-related tasks. RadBERT outperforms all biomedical pre-trainings on this COVID-19 task (P<0.01) and helps our fine-tuned model achieve an 88.9 macro-averaged F1-score when evaluated on both X-ray and CT reports. To build this model, we rely on a multi-institutional dataset re-sampled and enriched with concurrent lung diseases, helping the model resist distribution shifts. In addition, we explore a variety of fine-tuning and hyperparameter optimization techniques that accelerate fine-tuning convergence, stabilize performance, and improve accuracy, especially when data or computational resources are limited. Finally, we provide a set of visualization tools and explainability methods to better understand the performance of the model and support its practical use in the clinical setting. Our approach offers a ready-to-use COVID-19 classifier and can be applied similarly to other radiology report classification tasks. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s10278-022-00714-8. |
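The macro-averaged F1-score reported in the abstract weights every class equally, regardless of how many reports each class contains. A minimal sketch of the metric in plain Python (the label vectors below are invented for illustration and are not drawn from the paper's dataset):

```python
from collections import defaultdict

def macro_f1(y_true, y_pred):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # predicted p, but true label was t
            fn[t] += 1          # missed an instance of class t
    f1_scores = []
    for c in classes:
        prec = tp[c] / (tp[c] + fp[c]) if (tp[c] + fp[c]) else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if (tp[c] + fn[c]) else 0.0
        f1_scores.append(2 * prec * rec / (prec + rec) if (prec + rec) else 0.0)
    # Every class contributes equally, so rare classes matter as much as common ones.
    return sum(f1_scores) / len(f1_scores)

# Hypothetical report labels: 1 = COVID-19 positive, 0 = negative.
y_true = [1, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 1, 1]
print(round(macro_f1(y_true, y_pred), 3))  # → 0.667
```

The same number can be obtained with scikit-learn's `f1_score(y_true, y_pred, average="macro")`; macro averaging is a common choice for imbalanced clinical datasets because it does not let the majority (e.g., COVID-negative) class dominate the score.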
format | Online Article Text |
id | pubmed-9629758 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-96297582022-11-03 Improved Fine-Tuning of In-Domain Transformer Model for Inferring COVID-19 Presence in Multi-Institutional Radiology Reports Chambon, Pierre Cook, Tessa S. Langlotz, Curtis P. J Digit Imaging Original Paper Building a document-level classifier for COVID-19 on radiology reports could assist providers in their daily clinical routine, as well as create large numbers of labels for computer vision models. We have developed such a classifier by fine-tuning a BERT-like model initialized from RadBERT, a version of BERT continuously pre-trained on radiology reports that can be used for all radiology-related tasks. RadBERT outperforms all biomedical pre-trainings on this COVID-19 task (P<0.01) and helps our fine-tuned model achieve an 88.9 macro-averaged F1-score when evaluated on both X-ray and CT reports. To build this model, we rely on a multi-institutional dataset re-sampled and enriched with concurrent lung diseases, helping the model resist distribution shifts. In addition, we explore a variety of fine-tuning and hyperparameter optimization techniques that accelerate fine-tuning convergence, stabilize performance, and improve accuracy, especially when data or computational resources are limited. Finally, we provide a set of visualization tools and explainability methods to better understand the performance of the model and support its practical use in the clinical setting. Our approach offers a ready-to-use COVID-19 classifier and can be applied similarly to other radiology report classification tasks. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s10278-022-00714-8. Springer International Publishing 2022-11-02 2023-02 /pmc/articles/PMC9629758/ /pubmed/36323915 http://dx.doi.org/10.1007/s10278-022-00714-8 Text en © The Author(s) under exclusive licence to Society for Imaging Informatics in Medicine 2022, Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. |
spellingShingle | Original Paper Chambon, Pierre Cook, Tessa S. Langlotz, Curtis P. Improved Fine-Tuning of In-Domain Transformer Model for Inferring COVID-19 Presence in Multi-Institutional Radiology Reports |
title | Improved Fine-Tuning of In-Domain Transformer Model for Inferring COVID-19 Presence in Multi-Institutional Radiology Reports |
title_full | Improved Fine-Tuning of In-Domain Transformer Model for Inferring COVID-19 Presence in Multi-Institutional Radiology Reports |
title_fullStr | Improved Fine-Tuning of In-Domain Transformer Model for Inferring COVID-19 Presence in Multi-Institutional Radiology Reports |
title_full_unstemmed | Improved Fine-Tuning of In-Domain Transformer Model for Inferring COVID-19 Presence in Multi-Institutional Radiology Reports |
title_short | Improved Fine-Tuning of In-Domain Transformer Model for Inferring COVID-19 Presence in Multi-Institutional Radiology Reports |
title_sort | improved fine-tuning of in-domain transformer model for inferring covid-19 presence in multi-institutional radiology reports |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9629758/ https://www.ncbi.nlm.nih.gov/pubmed/36323915 http://dx.doi.org/10.1007/s10278-022-00714-8 |
work_keys_str_mv | AT chambonpierre improvedfinetuningofindomaintransformermodelforinferringcovid19presenceinmultiinstitutionalradiologyreports AT cooktessas improvedfinetuningofindomaintransformermodelforinferringcovid19presenceinmultiinstitutionalradiologyreports AT langlotzcurtisp improvedfinetuningofindomaintransformermodelforinferringcovid19presenceinmultiinstitutionalradiologyreports |