
Mix-and-Interpolate: A Training Strategy to Deal With Source-Biased Medical Data

Bibliographic Details
Format: Online Article Text
Language: English
Published: IEEE 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8908883/
https://www.ncbi.nlm.nih.gov/pubmed/34637384
http://dx.doi.org/10.1109/JBHI.2021.3119325
collection PubMed
description As of March 31, 2021, coronavirus disease 2019 (COVID-19) had reportedly infected more than 127 million people and caused over 2.5 million deaths worldwide. Timely diagnosis of COVID-19 is crucial for the management of individual patients as well as for containment of this highly contagious disease. Given the clinical value of non-contrast chest computed tomography (CT) for diagnosing COVID-19, deep learning (DL) based automated methods have been proposed to help radiologists read the large number of CT exams resulting from the pandemic. In this work, we address an overlooked problem in training deep convolutional neural networks for COVID-19 classification on real-world multi-source data: the data source bias problem. Data source bias arises when certain data sources contain only a single class, so that training on such source-biased data may lead DL models to distinguish data sources rather than COVID-19. To overcome this problem, we propose MIx-aNd-Interpolate (MINI), a conceptually simple, easy-to-implement, and efficient yet effective training strategy. MINI generates volumes of the absent class by combining samples collected from different hospitals, which enlarges the sample space of the original source-biased dataset. Experimental results on a large collection of real patient data (1,221 COVID-19 and 1,520 negative CT images, the latter comprising 786 community-acquired pneumonia and 734 non-pneumonia cases) from eight hospitals and health institutions show that 1) MINI can improve COVID-19 classification performance over the baseline (which does not address the source bias), and 2) MINI yields larger improvements than competing methods.
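The data source bias condition described in the abstract (a source that contributes only one class) can be checked mechanically before training. The following is a minimal sketch under stated assumptions, not the authors' code: the function name `single_class_sources`, the source names, and the toy records are all hypothetical, and the record does not describe how MINI itself combines volumes, so that step is not attempted here.

```python
from collections import defaultdict


def single_class_sources(records):
    """Return {source: label} for every source that contributes only one label.

    `records` is an iterable of (source, label) pairs. Both the function
    and this record layout are assumptions made for illustration.
    """
    labels_by_source = defaultdict(set)
    for source, label in records:
        labels_by_source[source].add(label)
    # A source is "biased" in the abstract's sense if it holds a single class.
    return {
        source: next(iter(labels))
        for source, labels in labels_by_source.items()
        if len(labels) == 1
    }


# Hypothetical toy data: hospital_a and hospital_b each hold one class only.
toy_records = [
    ("hospital_a", "covid"),
    ("hospital_a", "covid"),
    ("hospital_b", "negative"),
    ("hospital_c", "covid"),
    ("hospital_c", "negative"),
]

biased = single_class_sources(toy_records)
print(biased)
```

With data like this, a classifier could separate `hospital_a` from `hospital_b` by scanner or protocol cues alone and still appear to classify COVID-19 correctly, which is the failure mode the abstract describes.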
format Online
Article
Text
id pubmed-8908883
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher IEEE
record_format MEDLINE/PubMed
spelling pubmed-8908883 2022-05-13 Mix-and-Interpolate: A Training Strategy to Deal With Source-Biased Medical Data IEEE J Biomed Health Inform Article
IEEE 2021-10-12 /pmc/articles/PMC8908883/ /pubmed/34637384 http://dx.doi.org/10.1109/JBHI.2021.3119325 Text en This article is free to access and download, along with rights for full text and data mining, re-use and analysis.
title Mix-and-Interpolate: A Training Strategy to Deal With Source-Biased Medical Data
topic Article
work_keys_str_mv AT mixandinterpolateatrainingstrategytodealwithsourcebiasedmedicaldata