
Revealing the Roles of Part-of-Speech Taggers in Alzheimer Disease Detection: Scientific Discovery Using One-Intervention Causal Explanation

BACKGROUND: Recently, rich computational methods that use deep learning or machine learning have been developed using linguistic biomarkers for the diagnosis of early-stage Alzheimer disease (AD). Moreover, some qualitative and quantitative studies have indicated that certain part-of-speech (PoS) fe...


Bibliographic Details
Main Authors: Wen, Bingyang; Wang, Ning; Subbalakshmi, Koduvayur; Chandramouli, Rajarathnam
Format: Online Article Text
Language: English
Published: JMIR Publications 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10189619/
https://www.ncbi.nlm.nih.gov/pubmed/37129944
http://dx.doi.org/10.2196/36590
author Wen, Bingyang
Wang, Ning
Subbalakshmi, Koduvayur
Chandramouli, Rajarathnam
collection PubMed
description BACKGROUND: Recently, rich computational methods that use deep learning or machine learning have been developed using linguistic biomarkers for the diagnosis of early-stage Alzheimer disease (AD). Moreover, some qualitative and quantitative studies have indicated that certain part-of-speech (PoS) features or tags could be good indicators of AD. However, there has not been a systematic attempt to discover the underlying relationships between PoS features and AD. Moreover, there has not been any attempt to quantify the relative importance of PoS features in detecting AD. OBJECTIVE: Our goal was to disclose the underlying relationship between PoS features and AD, understand whether PoS features are useful in AD diagnosis, and explore which PoS features play a vital role in the diagnosis. METHODS: The DementiaBank, containing 1049 transcripts from 208 patients with AD and 243 transcripts from 104 older control individuals, was used. A total of 27 PoS features were extracted from each record. Then, the relationship between AD and each of the PoS features was explored. A transformer-based deep learning model for AD prediction using PoS features was trained. Then, a global explainable artificial intelligence method was proposed and used to discover which PoS features were the most important in AD diagnosis using the transformer-based predictor. A global (model-level) feature importance measure was derived as a summary from the local (example-level) feature importance metric, which was obtained using the proposed causally aware counterfactual explanation method. The unique feature of this method is that it considers causal relations among PoS features and can, hence, preclude counterfactuals that are improbable and result in more reliable explanations. RESULTS: The deep learning–based AD predictor achieved an accuracy of 92.2% and an F(1)-score of 0.955 when distinguishing patients with AD from healthy controls. 
The proposed explanation method identified 12 PoS features as being important for distinguishing patients with AD from healthy controls. Of these 12 features, 3 (25%) have been identified by other researchers in previous works in psychology and natural language processing. The remaining 75% (9/12) of PoS features have not been previously identified. We believe that this is an interesting finding that can be used in creating tests that might aid in the diagnosis of AD. Note that although our method is focused on PoS features, it should be possible to extend it to more types of features, perhaps even those derived from other biomarkers, such as syntactic features. CONCLUSIONS: The high classification accuracy of the proposed deep learner indicates that PoS features are strong clues in AD diagnosis. There are 12 PoS features that are strongly tied to AD, and because language is a noninvasive and potentially cheap method for detecting AD, this work shows some promising directions in this field.
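The METHODS passage above says a global (model-level) importance measure was derived as a summary of local (example-level) importance scores. The record does not state the summary formula, so the following is only a minimal sketch under a stated assumption: the global score of a feature is taken to be the mean absolute value of its local scores. The feature names and numbers are hypothetical and are not data from the article.

```python
from statistics import mean


def global_importance(local_scores):
    """Summarize local (per-example) importances into a global ranking.

    local_scores: list of dicts mapping feature name -> local importance.
    ASSUMPTION: the summary is the mean absolute local score; the article's
    actual aggregation rule is not given in this record.
    """
    # Collect every feature name that appears in any example.
    features = sorted({name for row in local_scores for name in row})
    summary = {
        # A feature missing from an example is treated as having score 0.
        name: mean(abs(row.get(name, 0.0)) for row in local_scores)
        for name in features
    }
    # Most important feature first.
    return sorted(summary.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical local scores for two transcripts (illustrative values only).
local = [
    {"noun": 0.40, "pronoun": -0.10, "verb": 0.05},
    {"noun": 0.20, "pronoun": -0.30, "verb": 0.00},
]
ranking = global_importance(local)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking absolute values before averaging keeps features whose local effects change sign from cancelling out; other summaries (for example, the fraction of examples in which a feature ranks among the top few) would be equally plausible readings of the abstract.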
format Online
Article
Text
id pubmed-10189619
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-10189619 2023-05-18 JMIR Form Res Original Paper JMIR Publications 2023-05-02 /pmc/articles/PMC10189619/ /pubmed/37129944 http://dx.doi.org/10.2196/36590 Text en
©Bingyang Wen, Ning Wang, Koduvayur Subbalakshmi, Rajarathnam Chandramouli. Originally published in JMIR Formative Research (https://formative.jmir.org), 02.05.2023.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on https://formative.jmir.org, as well as this copyright and license information must be included.
title Revealing the Roles of Part-of-Speech Taggers in Alzheimer Disease Detection: Scientific Discovery Using One-Intervention Causal Explanation
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10189619/
https://www.ncbi.nlm.nih.gov/pubmed/37129944
http://dx.doi.org/10.2196/36590