Reducing Annotation Burden Through Multimodal Learning

Choosing an optimal data fusion technique is essential when performing machine learning with multimodal data. In this study, we examined deep learning-based multimodal fusion techniques for the combined classification of radiological images and associated text reports. In our analysis, we (1) compar...

Bibliographic Details
Main authors: Lopez, Kevin; Fodeh, Samah J.; Allam, Ahmed; Brandt, Cynthia A.; Krauthammer, Michael
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7931886/
https://www.ncbi.nlm.nih.gov/pubmed/33693393
http://dx.doi.org/10.3389/fdata.2020.00019
author Lopez, Kevin
Fodeh, Samah J.
Allam, Ahmed
Brandt, Cynthia A.
Krauthammer, Michael
collection PubMed
description Choosing an optimal data fusion technique is essential when performing machine learning with multimodal data. In this study, we examined deep learning-based multimodal fusion techniques for the combined classification of radiological images and associated text reports. In our analysis, we (1) compared the classification performance of three prototypical multimodal fusion techniques: Early, Late, and Model fusion; (2) assessed the performance of multimodal compared to unimodal learning; and finally (3) investigated the amount of labeled data needed by multimodal vs. unimodal models to yield comparable classification performance. Our experiments demonstrate the potential of multimodal fusion methods to yield competitive results using less training data (labeled data) than their unimodal counterparts. This effect was more pronounced with the Early fusion approach and less so with the Model and Late fusion approaches. With increasing amounts of training data, unimodal models achieved comparable results to multimodal models. Overall, our results suggest the potential of multimodal learning to decrease the need for labeled training data, resulting in a lower annotation burden for domain experts.
format Online
Article
Text
id pubmed-7931886
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7931886 2021-03-09
Reducing Annotation Burden Through Multimodal Learning
Lopez, Kevin
Fodeh, Samah J.
Allam, Ahmed
Brandt, Cynthia A.
Krauthammer, Michael
Front Big Data
Big Data
Choosing an optimal data fusion technique is essential when performing machine learning with multimodal data. In this study, we examined deep learning-based multimodal fusion techniques for the combined classification of radiological images and associated text reports. In our analysis, we (1) compared the classification performance of three prototypical multimodal fusion techniques: Early, Late, and Model fusion; (2) assessed the performance of multimodal compared to unimodal learning; and finally (3) investigated the amount of labeled data needed by multimodal vs. unimodal models to yield comparable classification performance. Our experiments demonstrate the potential of multimodal fusion methods to yield competitive results using less training data (labeled data) than their unimodal counterparts. This effect was more pronounced with the Early fusion approach and less so with the Model and Late fusion approaches. With increasing amounts of training data, unimodal models achieved comparable results to multimodal models. Overall, our results suggest the potential of multimodal learning to decrease the need for labeled training data, resulting in a lower annotation burden for domain experts.
Frontiers Media S.A. 2020-06-02
/pmc/articles/PMC7931886/ /pubmed/33693393 http://dx.doi.org/10.3389/fdata.2020.00019
Text en
Copyright © 2020 Lopez, Fodeh, Allam, Brandt and Krauthammer.
http://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Reducing Annotation Burden Through Multimodal Learning
topic Big Data