Investigating the Impact of Environment and Data Aggregation by Walking Bout Duration on Parkinson’s Disease Classification Using Machine Learning

Parkinson’s disease (PD) is a common neurodegenerative disease. PD misdiagnosis can occur in early stages. Gait impairment in PD is typical and is linked with an increased fall risk and poorer quality of life. Applying machine learning (ML) models to real-world gait has the potential to be more sens...

Bibliographic Details
Main Authors: Rehman, Rana Zia Ur, Guan, Yu, Shi, Jian Qing, Alcock, Lisa, Yarnall, Alison J., Rochester, Lynn, Del Din, Silvia
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8981298/
https://www.ncbi.nlm.nih.gov/pubmed/35391750
http://dx.doi.org/10.3389/fnagi.2022.808518
author Rehman, Rana Zia Ur
Guan, Yu
Shi, Jian Qing
Alcock, Lisa
Yarnall, Alison J.
Rochester, Lynn
Del Din, Silvia
author_facet Rehman, Rana Zia Ur
Guan, Yu
Shi, Jian Qing
Alcock, Lisa
Yarnall, Alison J.
Rochester, Lynn
Del Din, Silvia
author_sort Rehman, Rana Zia Ur
collection PubMed
description Parkinson’s disease (PD) is a common neurodegenerative disease. PD misdiagnosis can occur in early stages. Gait impairment in PD is typical and is linked with an increased fall risk and poorer quality of life. Applying machine learning (ML) models to real-world gait has the potential to be more sensitive to classify PD compared to laboratory data. Real-world gait yields multiple walking bouts (WBs), and selecting the optimal method to aggregate the data (e.g., different WB durations) is essential as this may influence classification performance. The objective of this study was to investigate the impact of environment (laboratory vs. real world) and data aggregation on ML performance for optimizing sensitivity of PD classification. Gait assessment was performed on 47 people with PD (age: 68 ± 9 years) and 52 controls [Healthy controls (HCs), age: 70 ± 7 years]. In the laboratory, participants walked at their normal pace for 2 min, while in the real world, participants were assessed over 7 days. In both environments, 14 gait characteristics were evaluated from one tri-axial accelerometer attached to the lower back. The ability of individual gait characteristics to differentiate PD from HC was evaluated using the Area Under the Curve (AUC). ML models (i.e., support vector machine, random forest, and ensemble models) applied to real-world gait showed better classification performance compared to laboratory data. Real-world gait characteristics aggregated over longer WBs (WB 30–60 s, WB > 60 s, WB > 120 s) resulted in superior discriminative performance (PD vs. HC) compared to laboratory gait characteristics (0.51 ≤ AUC ≤ 0.77). Real-world gait speed showed the highest AUC of 0.77. Overall, random forest trained on 14 gait characteristics aggregated over WBs > 60 s gave better performance (F1 score = 77.20 ± 5.51%) as compared to laboratory results (F1 Score = 68.75 ± 12.80%). 
Findings from this study suggest that the choice of environment and data aggregation are important to achieve maximum discrimination performance and have direct impact on ML performance for PD classification. This study highlights the importance of a harmonized approach to data analysis in order to drive future implementation and clinical use. CLINICAL TRIAL REGISTRATION: [09/H0906/82].
format Online
Article
Text
id pubmed-8981298
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-89812982022-04-06 Investigating the Impact of Environment and Data Aggregation by Walking Bout Duration on Parkinson’s Disease Classification Using Machine Learning Rehman, Rana Zia Ur Guan, Yu Shi, Jian Qing Alcock, Lisa Yarnall, Alison J. Rochester, Lynn Del Din, Silvia Front Aging Neurosci Neuroscience Parkinson’s disease (PD) is a common neurodegenerative disease. PD misdiagnosis can occur in early stages. Gait impairment in PD is typical and is linked with an increased fall risk and poorer quality of life. Applying machine learning (ML) models to real-world gait has the potential to be more sensitive to classify PD compared to laboratory data. Real-world gait yields multiple walking bouts (WBs), and selecting the optimal method to aggregate the data (e.g., different WB durations) is essential as this may influence classification performance. The objective of this study was to investigate the impact of environment (laboratory vs. real world) and data aggregation on ML performance for optimizing sensitivity of PD classification. Gait assessment was performed on 47 people with PD (age: 68 ± 9 years) and 52 controls [Healthy controls (HCs), age: 70 ± 7 years]. In the laboratory, participants walked at their normal pace for 2 min, while in the real world, participants were assessed over 7 days. In both environments, 14 gait characteristics were evaluated from one tri-axial accelerometer attached to the lower back. The ability of individual gait characteristics to differentiate PD from HC was evaluated using the Area Under the Curve (AUC). ML models (i.e., support vector machine, random forest, and ensemble models) applied to real-world gait showed better classification performance compared to laboratory data. Real-world gait characteristics aggregated over longer WBs (WB 30–60 s, WB > 60 s, WB > 120 s) resulted in superior discriminative performance (PD vs. HC) compared to laboratory gait characteristics (0.51 ≤ AUC ≤ 0.77). 
Real-world gait speed showed the highest AUC of 0.77. Overall, random forest trained on 14 gait characteristics aggregated over WBs > 60 s gave better performance (F1 score = 77.20 ± 5.51%) as compared to laboratory results (F1 Score = 68.75 ± 12.80%). Findings from this study suggest that the choice of environment and data aggregation are important to achieve maximum discrimination performance and have direct impact on ML performance for PD classification. This study highlights the importance of a harmonized approach to data analysis in order to drive future implementation and clinical use. CLINICAL TRIAL REGISTRATION: [09/H0906/82]. Frontiers Media S.A. 2022-03-22 /pmc/articles/PMC8981298/ /pubmed/35391750 http://dx.doi.org/10.3389/fnagi.2022.808518 Text en Copyright © 2022 Rehman, Guan, Shi, Alcock, Yarnall, Rochester and Del Din. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Neuroscience
Rehman, Rana Zia Ur
Guan, Yu
Shi, Jian Qing
Alcock, Lisa
Yarnall, Alison J.
Rochester, Lynn
Del Din, Silvia
Investigating the Impact of Environment and Data Aggregation by Walking Bout Duration on Parkinson’s Disease Classification Using Machine Learning
title Investigating the Impact of Environment and Data Aggregation by Walking Bout Duration on Parkinson’s Disease Classification Using Machine Learning
title_full Investigating the Impact of Environment and Data Aggregation by Walking Bout Duration on Parkinson’s Disease Classification Using Machine Learning
title_fullStr Investigating the Impact of Environment and Data Aggregation by Walking Bout Duration on Parkinson’s Disease Classification Using Machine Learning
title_full_unstemmed Investigating the Impact of Environment and Data Aggregation by Walking Bout Duration on Parkinson’s Disease Classification Using Machine Learning
title_short Investigating the Impact of Environment and Data Aggregation by Walking Bout Duration on Parkinson’s Disease Classification Using Machine Learning
title_sort investigating the impact of environment and data aggregation by walking bout duration on parkinson’s disease classification using machine learning
topic Neuroscience
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8981298/
https://www.ncbi.nlm.nih.gov/pubmed/35391750
http://dx.doi.org/10.3389/fnagi.2022.808518