How Much Speech Data Is Needed for Tracking Language Change in Alzheimer’s Disease? A Comparison of Random Length, 5-Min, and 1-Min Spontaneous Speech Samples

Bibliographic Details
Main authors: Petti, Ulla, Baker, Simon, Korhonen, Anna, Robin, Jessica
Format: Online Article Text
Language: English
Published: S. Karger AG 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10673351/
https://www.ncbi.nlm.nih.gov/pubmed/38029002
http://dx.doi.org/10.1159/000533423
_version_ 1785149600554811392
author Petti, Ulla
Baker, Simon
Korhonen, Anna
Robin, Jessica
author_facet Petti, Ulla
Baker, Simon
Korhonen, Anna
Robin, Jessica
author_sort Petti, Ulla
collection PubMed
description INTRODUCTION: Changes in speech can act as biomarkers of cognitive decline in Alzheimer’s disease (AD). While shorter speech samples would promote data collection and analysis, the minimum length of informative speech samples remains debated. This study aims to provide insight into the effect of sample length in analyzing longitudinal recordings of spontaneous speech in AD by comparing the original random length, 5- and 1-minute-long samples. We hope to understand whether capping the audio improves the accuracy of the analysis, and whether an extra 4 min conveys necessary information. METHODS: 110 spontaneous speech samples were collected from decades of YouTube videos of 17 public figures, 9 of whom eventually developed AD. 456 language features were extracted and their text-length-sensitivity, comparability, and ability to capture change over time were analyzed across three different sample lengths. RESULTS: Capped audio files had advantages over the random length ones. While most extracted features were statistically comparable or highly correlated across the datasets, potential effects of sample length should be acknowledged for some features. The 5-min dataset presented the highest reliability in tracking the evolution of the disease, suggesting that the 4 extra minutes do convey informative data. CONCLUSION: Sample length seems to play an important role in extracting the language feature values from speech and tracking disease progress over time. We highlight the importance of further research into optimal sample length and standardization of methods when studying speech in AD.
format Online
Article
Text
id pubmed-10673351
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher S. Karger AG
record_format MEDLINE/PubMed
spelling pubmed-10673351 2023-11-24 How Much Speech Data Is Needed for Tracking Language Change in Alzheimer’s Disease? A Comparison of Random Length, 5-Min, and 1-Min Spontaneous Speech Samples Petti, Ulla Baker, Simon Korhonen, Anna Robin, Jessica Digit Biomark Research Reports – Research Article INTRODUCTION: Changes in speech can act as biomarkers of cognitive decline in Alzheimer’s disease (AD). While shorter speech samples would promote data collection and analysis, the minimum length of informative speech samples remains debated. This study aims to provide insight into the effect of sample length in analyzing longitudinal recordings of spontaneous speech in AD by comparing the original random length, 5- and 1-minute-long samples. We hope to understand whether capping the audio improves the accuracy of the analysis, and whether an extra 4 min conveys necessary information. METHODS: 110 spontaneous speech samples were collected from decades of YouTube videos of 17 public figures, 9 of whom eventually developed AD. 456 language features were extracted and their text-length-sensitivity, comparability, and ability to capture change over time were analyzed across three different sample lengths. RESULTS: Capped audio files had advantages over the random length ones. While most extracted features were statistically comparable or highly correlated across the datasets, potential effects of sample length should be acknowledged for some features. The 5-min dataset presented the highest reliability in tracking the evolution of the disease, suggesting that the 4 extra minutes do convey informative data. CONCLUSION: Sample length seems to play an important role in extracting the language feature values from speech and tracking disease progress over time. We highlight the importance of further research into optimal sample length and standardization of methods when studying speech in AD. S. Karger AG 2023-11-24 /pmc/articles/PMC10673351/ /pubmed/38029002 http://dx.doi.org/10.1159/000533423 Text en © 2023 The Author(s). Published by S. Karger AG, Basel https://creativecommons.org/licenses/by-nc/4.0/ This article is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC) (http://www.karger.com/Services/OpenAccessLicense). Usage and distribution for commercial purposes requires written permission.
spellingShingle Research Reports – Research Article
Petti, Ulla
Baker, Simon
Korhonen, Anna
Robin, Jessica
How Much Speech Data Is Needed for Tracking Language Change in Alzheimer’s Disease? A Comparison of Random Length, 5-Min, and 1-Min Spontaneous Speech Samples
title How Much Speech Data Is Needed for Tracking Language Change in Alzheimer’s Disease? A Comparison of Random Length, 5-Min, and 1-Min Spontaneous Speech Samples
title_sort how much speech data is needed for tracking language change in alzheimer’s disease? a comparison of random length, 5-min, and 1-min spontaneous speech samples
topic Research Reports – Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10673351/
https://www.ncbi.nlm.nih.gov/pubmed/38029002
http://dx.doi.org/10.1159/000533423