Exploring Deep Transfer Learning Techniques for Alzheimer’s Dementia Detection
Examination of speech datasets for detecting dementia, collected via various speech tasks, has revealed links between speech and cognitive abilities. However, the speech dataset available for this research is extremely limited because the collection process of speech and baseline data from patients...
Main Authors: | Zhu, Youxiang, Liang, Xiaohui, Batsis, John A., Roth, Robert M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8153512/ https://www.ncbi.nlm.nih.gov/pubmed/34046588 http://dx.doi.org/10.3389/fcomp.2021.624683 |
_version_ | 1783698815143903232 |
---|---|
author | Zhu, Youxiang Liang, Xiaohui Batsis, John A. Roth, Robert M. |
author_facet | Zhu, Youxiang Liang, Xiaohui Batsis, John A. Roth, Robert M. |
author_sort | Zhu, Youxiang |
collection | PubMed |
description | Examination of speech datasets for detecting dementia, collected via various speech tasks, has revealed links between speech and cognitive abilities. However, the speech dataset available for this research is extremely limited because the collection process of speech and baseline data from patients with dementia in clinical settings is expensive. In this paper, we study the spontaneous speech dataset from a recent ADReSS challenge, a Cookie Theft Picture (CTP) dataset with balanced groups of participants in age, gender, and cognitive status. We explore state-of-the-art deep transfer learning techniques from image, audio, speech, and language domains. We envision that one advantage of transfer learning is to eliminate the design of handcrafted features based on the tasks and datasets. Transfer learning further mitigates the limited dementia-relevant speech data problem by inheriting knowledge from similar but much larger datasets. Specifically, we built a variety of transfer learning models using commonly employed MobileNet (image), YAMNet (audio), Mockingjay (speech), and BERT (text) models. Results indicated that the transfer learning models of text data showed significantly better performance than those of audio data. Performance gains of the text models may be due to the high similarity between the pre-training text dataset and the CTP text dataset. Our multi-modal transfer learning introduced a slight improvement in accuracy, demonstrating that audio and text data provide limited complementary information. Multi-task transfer learning resulted in limited improvements in classification and a negative impact in regression. By analyzing the meaning behind the AD/non-AD labels and Mini-Mental State Examination (MMSE) scores, we observed that the inconsistency between labels and scores could limit the performance of the multi-task learning, especially when the outputs of the single-task models are highly consistent with the corresponding labels/scores. In sum, we conducted a large comparative analysis of varying transfer learning models focusing less on model customization but more on pre-trained models and pre-training datasets. We revealed insightful relations among models, data types, and data labels in this research area. |
format | Online Article Text |
id | pubmed-8153512 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
record_format | MEDLINE/PubMed |
spelling | pubmed-8153512 2021-05-26 Exploring Deep Transfer Learning Techniques for Alzheimer’s Dementia Detection Zhu, Youxiang Liang, Xiaohui Batsis, John A. Roth, Robert M. Front Comput Sci Article Examination of speech datasets for detecting dementia, collected via various speech tasks, has revealed links between speech and cognitive abilities. However, the speech dataset available for this research is extremely limited because the collection process of speech and baseline data from patients with dementia in clinical settings is expensive. In this paper, we study the spontaneous speech dataset from a recent ADReSS challenge, a Cookie Theft Picture (CTP) dataset with balanced groups of participants in age, gender, and cognitive status. We explore state-of-the-art deep transfer learning techniques from image, audio, speech, and language domains. We envision that one advantage of transfer learning is to eliminate the design of handcrafted features based on the tasks and datasets. Transfer learning further mitigates the limited dementia-relevant speech data problem by inheriting knowledge from similar but much larger datasets. Specifically, we built a variety of transfer learning models using commonly employed MobileNet (image), YAMNet (audio), Mockingjay (speech), and BERT (text) models. Results indicated that the transfer learning models of text data showed significantly better performance than those of audio data. Performance gains of the text models may be due to the high similarity between the pre-training text dataset and the CTP text dataset. Our multi-modal transfer learning introduced a slight improvement in accuracy, demonstrating that audio and text data provide limited complementary information. Multi-task transfer learning resulted in limited improvements in classification and a negative impact in regression. By analyzing the meaning behind the AD/non-AD labels and Mini-Mental State Examination (MMSE) scores, we observed that the inconsistency between labels and scores could limit the performance of the multi-task learning, especially when the outputs of the single-task models are highly consistent with the corresponding labels/scores. In sum, we conducted a large comparative analysis of varying transfer learning models focusing less on model customization but more on pre-trained models and pre-training datasets. We revealed insightful relations among models, data types, and data labels in this research area. 2021-05-12 2021-05 /pmc/articles/PMC8153512/ /pubmed/34046588 http://dx.doi.org/10.3389/fcomp.2021.624683 Text en https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Article Zhu, Youxiang Liang, Xiaohui Batsis, John A. Roth, Robert M. Exploring Deep Transfer Learning Techniques for Alzheimer’s Dementia Detection |
title | Exploring Deep Transfer Learning Techniques for Alzheimer’s Dementia Detection |
title_full | Exploring Deep Transfer Learning Techniques for Alzheimer’s Dementia Detection |
title_fullStr | Exploring Deep Transfer Learning Techniques for Alzheimer’s Dementia Detection |
title_full_unstemmed | Exploring Deep Transfer Learning Techniques for Alzheimer’s Dementia Detection |
title_short | Exploring Deep Transfer Learning Techniques for Alzheimer’s Dementia Detection |
title_sort | exploring deep transfer learning techniques for alzheimer’s dementia detection |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8153512/ https://www.ncbi.nlm.nih.gov/pubmed/34046588 http://dx.doi.org/10.3389/fcomp.2021.624683 |
work_keys_str_mv | AT zhuyouxiang exploringdeeptransferlearningtechniquesforalzheimersdementiadetection AT liangxiaohui exploringdeeptransferlearningtechniquesforalzheimersdementiadetection AT batsisjohna exploringdeeptransferlearningtechniquesforalzheimersdementiadetection AT rothrobertm exploringdeeptransferlearningtechniquesforalzheimersdementiadetection |
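The abstract above describes a text-based transfer learning pipeline built on pre-trained BERT, fine-tuned to classify AD vs. non-AD from Cookie Theft Picture transcripts. The sketch below shows one plausible way such a fine-tuning run could be set up with the Hugging Face Transformers library; the example transcripts, label encoding, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): fine-tuning a pre-trained BERT
# classifier on picture-description transcripts for AD vs. non-AD detection.
# Example transcripts, labels, and hyperparameters are hypothetical.
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import BertTokenizer, BertForSequenceClassification


class TranscriptDataset(Dataset):
    """Tokenized (transcript, label) pairs; label 1 = AD, 0 = non-AD."""

    def __init__(self, texts, labels, tokenizer, max_len=256):
        self.enc = tokenizer(texts, truncation=True, padding="max_length",
                             max_length=max_len, return_tensors="pt")
        self.labels = torch.tensor(labels)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = self.labels[i]
        return item


# Hypothetical stand-ins for ADReSS picture-description transcripts.
texts = ["the boy is taking cookies from the cookie jar",
         "um the water is uh spilling over the sink"]
labels = [0, 1]

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased",
                                                      num_labels=2)
loader = DataLoader(TranscriptDataset(texts, labels, tokenizer),
                    batch_size=2, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for batch in loader:
    out = model(**batch)   # loss is computed internally from the labels
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

An MMSE regression head, as discussed in the abstract's multi-task analysis, would follow the same pattern with a single output unit (num_labels=1) and a mean-squared-error objective instead of cross-entropy.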