Multi-task vision transformer using low-level chest X-ray feature corpus for COVID-19 diagnosis and severity quantification
Main authors: | Park, Sangjoon; Kim, Gwanghyun; Oh, Yujin; Seo, Joon Beom; Lee, Sang Min; Kim, Jin Hwan; Moon, Sungjun; Lim, Jae-Kwang; Ye, Jong Chul |
Format: | Online Article Text |
Language: | English |
Published: | Elsevier B.V., 2022 |
Subjects: | Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8566090/ https://www.ncbi.nlm.nih.gov/pubmed/34814058 http://dx.doi.org/10.1016/j.media.2021.102299 |
_version_ | 1784593940984365056 |
author | Park, Sangjoon; Kim, Gwanghyun; Oh, Yujin; Seo, Joon Beom; Lee, Sang Min; Kim, Jin Hwan; Moon, Sungjun; Lim, Jae-Kwang; Ye, Jong Chul |
author_facet | Park, Sangjoon; Kim, Gwanghyun; Oh, Yujin; Seo, Joon Beom; Lee, Sang Min; Kim, Jin Hwan; Moon, Sungjun; Lim, Jae-Kwang; Ye, Jong Chul |
author_sort | Park, Sangjoon |
collection | PubMed |
description | Developing a robust algorithm to diagnose and quantify the severity of the novel coronavirus disease 2019 (COVID-19) using chest X-ray (CXR) requires a large number of well-curated COVID-19 datasets, which are difficult to collect during the global COVID-19 pandemic. On the other hand, CXR data with other findings are abundant. This situation is ideally suited for the Vision Transformer (ViT) architecture, in which large amounts of unlabeled data can be exploited through structural modeling by the self-attention mechanism. However, existing ViT may not be optimal, as the feature embedding by direct patch flattening or a ResNet backbone in the standard ViT is not designed for CXR. To address this problem, we propose a novel multi-task ViT that leverages a low-level CXR feature corpus obtained from a backbone network that extracts common CXR findings. Specifically, the backbone network is first trained on large public datasets to detect common abnormal findings such as consolidation, opacity, and edema. The embedded features from the backbone network are then used as a corpus for a versatile Transformer model for both the diagnosis and the severity quantification of COVID-19. We evaluate our model on external test datasets from entirely different institutions to assess its generalization capability. The experimental results confirm that our model achieves state-of-the-art performance in both the diagnosis and severity quantification tasks, with outstanding generalization capability, which is a sine qua non of widespread deployment. |
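The pipeline the abstract describes — a pretrained backbone that turns a CXR into a grid of low-level feature vectors (the "corpus"), followed by a self-attention model with two task heads (diagnosis and severity) — can be sketched in miniature. This is a hypothetical NumPy-only illustration, not the authors' actual model: all shapes, layer sizes, pooling choices, and random weights here are assumptions made purely for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 32           # feature dimension per spatial token (illustrative)
N = 16           # backbone feature-map tokens, e.g. a 4x4 grid (illustrative)
N_CLASSES = 3    # e.g. normal / other infection / COVID-19 (illustrative)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def backbone(cxr_image):
    """Stand-in for the pretrained abnormality-detection backbone:
    maps an image to N low-level feature tokens of dimension D."""
    flat = cxr_image.reshape(-1)
    W = rng.standard_normal((flat.size, N * D)) / np.sqrt(flat.size)
    return (flat @ W).reshape(N, D)

def self_attention(tokens, Wq, Wk, Wv):
    """One single-head self-attention layer over the feature corpus."""
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    return attn @ v

# Illustrative random weights for the attention layer and the two task heads.
Wq, Wk, Wv = (rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(3))
W_cls = rng.standard_normal((D, N_CLASSES)) / np.sqrt(D)   # diagnosis head
w_sev = rng.standard_normal(D) / np.sqrt(D)                # severity head

image = rng.standard_normal((8, 8))        # toy stand-in for a CXR
corpus = backbone(image)                   # (N, D) low-level feature corpus
ctx = self_attention(corpus, Wq, Wk, Wv)   # (N, D) contextualized tokens
pooled = ctx.mean(axis=0)                  # simple mean pooling over tokens

diagnosis_probs = softmax(pooled @ W_cls)  # multi-class diagnosis probabilities
severity = float(pooled @ w_sev)           # scalar severity score
```

The multi-task aspect is captured by the two heads sharing one contextualized representation: a classification head for diagnosis and a regression head for severity, both reading from the same attention output over the backbone's feature corpus.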
format | Online Article Text |
id | pubmed-8566090 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Elsevier B.V. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8566090 2021-11-04 Multi-task vision transformer using low-level chest X-ray feature corpus for COVID-19 diagnosis and severity quantification. Park, Sangjoon; Kim, Gwanghyun; Oh, Yujin; Seo, Joon Beom; Lee, Sang Min; Kim, Jin Hwan; Moon, Sungjun; Lim, Jae-Kwang; Ye, Jong Chul. Med Image Anal, Article. (Abstract identical to the description field.) Elsevier B.V. |
2022-01 2021-11-04 /pmc/articles/PMC8566090/ /pubmed/34814058 http://dx.doi.org/10.1016/j.media.2021.102299 Text en © 2021 Elsevier B.V. All rights reserved. Made freely available through the Elsevier COVID-19 resource centre, with rights for unrestricted research re-use and analyses in any form with acknowledgement of the original source, for as long as the resource centre remains active. |
spellingShingle | Article. Park, Sangjoon; Kim, Gwanghyun; Oh, Yujin; Seo, Joon Beom; Lee, Sang Min; Kim, Jin Hwan; Moon, Sungjun; Lim, Jae-Kwang; Ye, Jong Chul. Multi-task vision transformer using low-level chest X-ray feature corpus for COVID-19 diagnosis and severity quantification |
title | Multi-task vision transformer using low-level chest X-ray feature corpus for COVID-19 diagnosis and severity quantification |
title_full | Multi-task vision transformer using low-level chest X-ray feature corpus for COVID-19 diagnosis and severity quantification |
title_fullStr | Multi-task vision transformer using low-level chest X-ray feature corpus for COVID-19 diagnosis and severity quantification |
title_full_unstemmed | Multi-task vision transformer using low-level chest X-ray feature corpus for COVID-19 diagnosis and severity quantification |
title_short | Multi-task vision transformer using low-level chest X-ray feature corpus for COVID-19 diagnosis and severity quantification |
title_sort | multi-task vision transformer using low-level chest x-ray feature corpus for covid-19 diagnosis and severity quantification |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8566090/ https://www.ncbi.nlm.nih.gov/pubmed/34814058 http://dx.doi.org/10.1016/j.media.2021.102299 |
work_keys_str_mv | AT parksangjoon multitaskvisiontransformerusinglowlevelchestxrayfeaturecorpusforcovid19diagnosisandseverityquantification AT kimgwanghyun multitaskvisiontransformerusinglowlevelchestxrayfeaturecorpusforcovid19diagnosisandseverityquantification AT ohyujin multitaskvisiontransformerusinglowlevelchestxrayfeaturecorpusforcovid19diagnosisandseverityquantification AT seojoonbeom multitaskvisiontransformerusinglowlevelchestxrayfeaturecorpusforcovid19diagnosisandseverityquantification AT leesangmin multitaskvisiontransformerusinglowlevelchestxrayfeaturecorpusforcovid19diagnosisandseverityquantification AT kimjinhwan multitaskvisiontransformerusinglowlevelchestxrayfeaturecorpusforcovid19diagnosisandseverityquantification AT moonsungjun multitaskvisiontransformerusinglowlevelchestxrayfeaturecorpusforcovid19diagnosisandseverityquantification AT limjaekwang multitaskvisiontransformerusinglowlevelchestxrayfeaturecorpusforcovid19diagnosisandseverityquantification AT yejongchul multitaskvisiontransformerusinglowlevelchestxrayfeaturecorpusforcovid19diagnosisandseverityquantification |