Revealing the Roles of Part-of-Speech Taggers in Alzheimer Disease Detection: Scientific Discovery Using One-Intervention Causal Explanation
BACKGROUND: Recently, rich computational methods that use deep learning or machine learning have been developed using linguistic biomarkers for the diagnosis of early-stage Alzheimer disease (AD). Moreover, some qualitative and quantitative studies have indicated that certain part-of-speech (PoS) fe...
Main authors: | Wen, Bingyang; Wang, Ning; Subbalakshmi, Koduvayur; Chandramouli, Rajarathnam |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10189619/ https://www.ncbi.nlm.nih.gov/pubmed/37129944 http://dx.doi.org/10.2196/36590 |
_version_ | 1785043123183812608 |
---|---|
author | Wen, Bingyang Wang, Ning Subbalakshmi, Koduvayur Chandramouli, Rajarathnam |
author_sort | Wen, Bingyang |
collection | PubMed |
description | BACKGROUND: Recently, rich computational methods that use deep learning or machine learning have been developed using linguistic biomarkers for the diagnosis of early-stage Alzheimer disease (AD). Moreover, some qualitative and quantitative studies have indicated that certain part-of-speech (PoS) features or tags could be good indicators of AD. However, there has not been a systematic attempt to discover the underlying relationships between PoS features and AD. Moreover, there has not been any attempt to quantify the relative importance of PoS features in detecting AD. OBJECTIVE: Our goal was to disclose the underlying relationship between PoS features and AD, understand whether PoS features are useful in AD diagnosis, and explore which PoS features play a vital role in the diagnosis. METHODS: The DementiaBank, containing 1049 transcripts from 208 patients with AD and 243 transcripts from 104 older control individuals, was used. A total of 27 PoS features were extracted from each record. Then, the relationship between AD and each of the PoS features was explored. A transformer-based deep learning model for AD prediction using PoS features was trained. Then, a global explainable artificial intelligence method was proposed and used to discover which PoS features were the most important in AD diagnosis using the transformer-based predictor. A global (model-level) feature importance measure was derived as a summary from the local (example-level) feature importance metric, which was obtained using the proposed causally aware counterfactual explanation method. The unique feature of this method is that it considers causal relations among PoS features and can, hence, preclude counterfactuals that are improbable and result in more reliable explanations. RESULTS: The deep learning–based AD predictor achieved an accuracy of 92.2% and an F1-score of 0.955 when distinguishing patients with AD from healthy controls. The proposed explanation method identified 12 PoS features as being important for distinguishing patients with AD from healthy controls. Of these 12 features, 3 (25%) have been identified by other researchers in previous works in psychology and natural language processing. The remaining 75% (9/12) of PoS features have not been previously identified. We believe that this is an interesting finding that can be used in creating tests that might aid in the diagnosis of AD. Note that although our method is focused on PoS features, it should be possible to extend it to more types of features, perhaps even those derived from other biomarkers, such as syntactic features. CONCLUSIONS: The high classification accuracy of the proposed deep learner indicates that PoS features are strong clues in AD diagnosis. There are 12 PoS features that are strongly tied to AD, and because language is a noninvasive and potentially cheap method for detecting AD, this work shows some promising directions in this field. |
format | Online Article Text |
id | pubmed-10189619 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-10189619 2023-05-18 Revealing the Roles of Part-of-Speech Taggers in Alzheimer Disease Detection: Scientific Discovery Using One-Intervention Causal Explanation Wen, Bingyang Wang, Ning Subbalakshmi, Koduvayur Chandramouli, Rajarathnam JMIR Form Res Original Paper JMIR Publications 2023-05-02 /pmc/articles/PMC10189619/ /pubmed/37129944 http://dx.doi.org/10.2196/36590 Text en ©Bingyang Wen, Ning Wang, Koduvayur Subbalakshmi, Rajarathnam Chandramouli. Originally published in JMIR Formative Research (https://formative.jmir.org), 02.05.2023. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on https://formative.jmir.org, as well as this copyright and license information must be included. |
title | Revealing the Roles of Part-of-Speech Taggers in Alzheimer Disease Detection: Scientific Discovery Using One-Intervention Causal Explanation |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10189619/ https://www.ncbi.nlm.nih.gov/pubmed/37129944 http://dx.doi.org/10.2196/36590 |
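
The abstract states that a global (model-level) feature importance was derived as a summary of local (example-level) importance scores, but it does not say how the summary was computed. The sketch below illustrates one common choice, the mean absolute local score per feature. The aggregation rule, the function names, the PoS tag names, and all numbers are assumptions made for illustration; they are not taken from the paper.

```python
from statistics import mean


def global_importance(local_scores):
    """Summarize per-example importance into one score per feature.

    local_scores: list of dicts, one per transcript, mapping a PoS
    feature name to its local importance for that transcript.
    Assumption: the summary is the mean absolute local score; the
    record does not specify the authors' actual aggregation.
    """
    features = sorted({name for row in local_scores for name in row})
    return {
        name: mean(abs(row.get(name, 0.0)) for row in local_scores)
        for name in features
    }


def rank_features(local_scores):
    """Return feature names ordered from most to least important."""
    scores = global_importance(local_scores)
    return sorted(scores, key=lambda name: (-scores[name], name))


# Hypothetical local importances for three transcripts (made-up values).
example = [
    {"NOUN": 0.5, "PRON": -0.25, "VERB": 0.0},
    {"NOUN": 0.25, "PRON": 0.5, "VERB": 0.0},
    {"NOUN": 0.75, "PRON": -0.5, "VERB": 0.25},
]
ranking = rank_features(example)
print(ranking)
```

Taking absolute values before averaging keeps features whose local effect changes sign across transcripts from cancelling out; a signed mean would instead summarize the direction of the effect.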