Improved Fine-Tuning of In-Domain Transformer Model for Inferring COVID-19 Presence in Multi-Institutional Radiology Reports

Building a document-level classifier for COVID-19 on radiology reports could help providers in their daily clinical routine, as well as create large numbers of labels for computer vision models. We have developed such a classifier by fine-tuning a BERT-like model initialized from RadBERT, a model continuously pre-trained on radiology reports that can be used for all radiology-related tasks. RadBERT outperforms all biomedical pre-trainings on this COVID-19 task (P<0.01) and helps our fine-tuned model achieve an 88.9 macro-averaged F1-score when evaluated on both X-ray and CT reports. To build this model, we rely on a multi-institutional dataset re-sampled and enriched with concurrent lung diseases, helping the model resist distribution shifts. In addition, we explore a variety of fine-tuning and hyperparameter optimization techniques that accelerate fine-tuning convergence, stabilize performance, and improve accuracy, especially when data or computational resources are limited. Finally, we provide a set of visualization tools and explainability methods to better understand the performance of the model and support its practical use in the clinical setting. Our approach offers a ready-to-use COVID-19 classifier and can be applied similarly to other radiology report classification tasks.

Bibliographic Details
Main Authors: Chambon, Pierre, Cook, Tessa S., Langlotz, Curtis P.
Format: Online Article Text
Language: English
Published: Springer International Publishing 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9629758/
https://www.ncbi.nlm.nih.gov/pubmed/36323915
http://dx.doi.org/10.1007/s10278-022-00714-8
_version_ 1784823462015008768
author Chambon, Pierre
Cook, Tessa S.
Langlotz, Curtis P.
author_facet Chambon, Pierre
Cook, Tessa S.
Langlotz, Curtis P.
author_sort Chambon, Pierre
collection PubMed
description Building a document-level classifier for COVID-19 on radiology reports could help providers in their daily clinical routine, as well as create large numbers of labels for computer vision models. We have developed such a classifier by fine-tuning a BERT-like model initialized from RadBERT, a model continuously pre-trained on radiology reports that can be used for all radiology-related tasks. RadBERT outperforms all biomedical pre-trainings on this COVID-19 task (P<0.01) and helps our fine-tuned model achieve an 88.9 macro-averaged F1-score when evaluated on both X-ray and CT reports. To build this model, we rely on a multi-institutional dataset re-sampled and enriched with concurrent lung diseases, helping the model resist distribution shifts. In addition, we explore a variety of fine-tuning and hyperparameter optimization techniques that accelerate fine-tuning convergence, stabilize performance, and improve accuracy, especially when data or computational resources are limited. Finally, we provide a set of visualization tools and explainability methods to better understand the performance of the model and support its practical use in the clinical setting. Our approach offers a ready-to-use COVID-19 classifier and can be applied similarly to other radiology report classification tasks. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s10278-022-00714-8.
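Since the description names the core technique (fine-tuning an in-domain BERT-like encoder for document-level classification of radiology reports, evaluated with macro-averaged F1), a minimal sketch of that setup follows. It assumes the Hugging Face transformers and datasets libraries plus scikit-learn; the checkpoint path, the two-label scheme, and the toy reports are illustrative placeholders, not the authors' released pipeline or data.

```python
# Minimal sketch: document-level fine-tuning of an in-domain BERT-like model
# for COVID-19 report classification, in the spirit of the record's description.
# CHECKPOINT, the label scheme, and the reports are assumptions for illustration.
import numpy as np
from datasets import Dataset
from sklearn.metrics import f1_score
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

CHECKPOINT = "path/to/radbert-checkpoint"  # hypothetical in-domain checkpoint

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT, num_labels=2)

# Toy stand-in for the multi-institutional report corpus
# (label 1 = COVID-19 suspected, 0 = not suspected).
reports = Dataset.from_dict({
    "text": [
        "Patchy peripheral ground-glass opacities in both lower lobes.",
        "Lungs are clear. No acute cardiopulmonary process.",
    ],
    "label": [1, 0],
})

def tokenize(batch):
    # Truncate long reports to the encoder's maximum input length.
    return tokenizer(batch["text"], truncation=True, max_length=512)

reports = reports.map(tokenize, batched=True)

def macro_f1(eval_pred):
    # Macro-averaged F1, the metric reported in the abstract.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"macro_f1": f1_score(labels, preds, average="macro")}

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="covid-report-classifier",
        num_train_epochs=3,
        per_device_train_batch_size=8,
        learning_rate=2e-5,  # common starting point for BERT fine-tuning
    ),
    train_dataset=reports,
    eval_dataset=reports,  # toy reuse; use a held-out split in practice
    tokenizer=tokenizer,   # enables dynamic padding of batches
    compute_metrics=macro_f1,
)
trainer.train()
print(trainer.evaluate())
```

The hyperparameter optimization and explainability steps mentioned in the description would layer on top of this skeleton and are omitted here.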
format Online
Article
Text
id pubmed-9629758
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-9629758 2022-11-03 Improved Fine-Tuning of In-Domain Transformer Model for Inferring COVID-19 Presence in Multi-Institutional Radiology Reports Chambon, Pierre Cook, Tessa S. Langlotz, Curtis P. J Digit Imaging Original Paper Building a document-level classifier for COVID-19 on radiology reports could help providers in their daily clinical routine, as well as create large numbers of labels for computer vision models. We have developed such a classifier by fine-tuning a BERT-like model initialized from RadBERT, a model continuously pre-trained on radiology reports that can be used for all radiology-related tasks. RadBERT outperforms all biomedical pre-trainings on this COVID-19 task (P<0.01) and helps our fine-tuned model achieve an 88.9 macro-averaged F1-score when evaluated on both X-ray and CT reports. To build this model, we rely on a multi-institutional dataset re-sampled and enriched with concurrent lung diseases, helping the model resist distribution shifts. In addition, we explore a variety of fine-tuning and hyperparameter optimization techniques that accelerate fine-tuning convergence, stabilize performance, and improve accuracy, especially when data or computational resources are limited. Finally, we provide a set of visualization tools and explainability methods to better understand the performance of the model and support its practical use in the clinical setting. Our approach offers a ready-to-use COVID-19 classifier and can be applied similarly to other radiology report classification tasks. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s10278-022-00714-8. Springer International Publishing 2022-11-02 2023-02 /pmc/articles/PMC9629758/ /pubmed/36323915 http://dx.doi.org/10.1007/s10278-022-00714-8 Text en © The Author(s) under exclusive licence to Society for Imaging Informatics in Medicine 2022. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
spellingShingle Original Paper
Chambon, Pierre
Cook, Tessa S.
Langlotz, Curtis P.
Improved Fine-Tuning of In-Domain Transformer Model for Inferring COVID-19 Presence in Multi-Institutional Radiology Reports
title Improved Fine-Tuning of In-Domain Transformer Model for Inferring COVID-19 Presence in Multi-Institutional Radiology Reports
title_full Improved Fine-Tuning of In-Domain Transformer Model for Inferring COVID-19 Presence in Multi-Institutional Radiology Reports
title_fullStr Improved Fine-Tuning of In-Domain Transformer Model for Inferring COVID-19 Presence in Multi-Institutional Radiology Reports
title_full_unstemmed Improved Fine-Tuning of In-Domain Transformer Model for Inferring COVID-19 Presence in Multi-Institutional Radiology Reports
title_short Improved Fine-Tuning of In-Domain Transformer Model for Inferring COVID-19 Presence in Multi-Institutional Radiology Reports
title_sort improved fine-tuning of in-domain transformer model for inferring covid-19 presence in multi-institutional radiology reports
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9629758/
https://www.ncbi.nlm.nih.gov/pubmed/36323915
http://dx.doi.org/10.1007/s10278-022-00714-8
work_keys_str_mv AT chambonpierre improvedfinetuningofindomaintransformermodelforinferringcovid19presenceinmultiinstitutionalradiologyreports
AT cooktessas improvedfinetuningofindomaintransformermodelforinferringcovid19presenceinmultiinstitutionalradiologyreports
AT langlotzcurtisp improvedfinetuningofindomaintransformermodelforinferringcovid19presenceinmultiinstitutionalradiologyreports