Investigating the Impact of Environment and Data Aggregation by Walking Bout Duration on Parkinson’s Disease Classification Using Machine Learning
Parkinson’s disease (PD) is a common neurodegenerative disease. PD misdiagnosis can occur in early stages. Gait impairment in PD is typical and is linked with an increased fall risk and poorer quality of life. Applying machine learning (ML) models to real-world gait has the potential to be more sens...
Main authors: | Rehman, Rana Zia Ur; Guan, Yu; Shi, Jian Qing; Alcock, Lisa; Yarnall, Alison J.; Rochester, Lynn; Del Din, Silvia |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Neuroscience |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8981298/ https://www.ncbi.nlm.nih.gov/pubmed/35391750 http://dx.doi.org/10.3389/fnagi.2022.808518 |
_version_ | 1784681573743853568 |
---|---|
author | Rehman, Rana Zia Ur Guan, Yu Shi, Jian Qing Alcock, Lisa Yarnall, Alison J. Rochester, Lynn Del Din, Silvia |
author_facet | Rehman, Rana Zia Ur Guan, Yu Shi, Jian Qing Alcock, Lisa Yarnall, Alison J. Rochester, Lynn Del Din, Silvia |
author_sort | Rehman, Rana Zia Ur |
collection | PubMed |
description | Parkinson’s disease (PD) is a common neurodegenerative disease. PD misdiagnosis can occur in early stages. Gait impairment in PD is typical and is linked with an increased fall risk and poorer quality of life. Applying machine learning (ML) models to real-world gait has the potential to be more sensitive to classify PD compared to laboratory data. Real-world gait yields multiple walking bouts (WBs), and selecting the optimal method to aggregate the data (e.g., different WB durations) is essential as this may influence classification performance. The objective of this study was to investigate the impact of environment (laboratory vs. real world) and data aggregation on ML performance for optimizing sensitivity of PD classification. Gait assessment was performed on 47 people with PD (age: 68 ± 9 years) and 52 controls [Healthy controls (HCs), age: 70 ± 7 years]. In the laboratory, participants walked at their normal pace for 2 min, while in the real world, participants were assessed over 7 days. In both environments, 14 gait characteristics were evaluated from one tri-axial accelerometer attached to the lower back. The ability of individual gait characteristics to differentiate PD from HC was evaluated using the Area Under the Curve (AUC). ML models (i.e., support vector machine, random forest, and ensemble models) applied to real-world gait showed better classification performance compared to laboratory data. Real-world gait characteristics aggregated over longer WBs (WB 30–60 s, WB > 60 s, WB > 120 s) resulted in superior discriminative performance (PD vs. HC) compared to laboratory gait characteristics (0.51 ≤ AUC ≤ 0.77). Real-world gait speed showed the highest AUC of 0.77. Overall, random forest trained on 14 gait characteristics aggregated over WBs > 60 s gave better performance (F1 score = 77.20 ± 5.51%) as compared to laboratory results (F1 Score = 68.75 ± 12.80%). Findings from this study suggest that the choice of environment and data aggregation are important to achieve maximum discrimination performance and have direct impact on ML performance for PD classification. This study highlights the importance of a harmonized approach to data analysis in order to drive future implementation and clinical use. CLINICAL TRIAL REGISTRATION: [09/H0906/82]. |
format | Online Article Text |
id | pubmed-8981298 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8981298 2022-04-06 Investigating the Impact of Environment and Data Aggregation by Walking Bout Duration on Parkinson’s Disease Classification Using Machine Learning Rehman, Rana Zia Ur Guan, Yu Shi, Jian Qing Alcock, Lisa Yarnall, Alison J. Rochester, Lynn Del Din, Silvia Front Aging Neurosci Neuroscience Parkinson’s disease (PD) is a common neurodegenerative disease. PD misdiagnosis can occur in early stages. Gait impairment in PD is typical and is linked with an increased fall risk and poorer quality of life. Applying machine learning (ML) models to real-world gait has the potential to be more sensitive to classify PD compared to laboratory data. Real-world gait yields multiple walking bouts (WBs), and selecting the optimal method to aggregate the data (e.g., different WB durations) is essential as this may influence classification performance. The objective of this study was to investigate the impact of environment (laboratory vs. real world) and data aggregation on ML performance for optimizing sensitivity of PD classification. Gait assessment was performed on 47 people with PD (age: 68 ± 9 years) and 52 controls [Healthy controls (HCs), age: 70 ± 7 years]. In the laboratory, participants walked at their normal pace for 2 min, while in the real world, participants were assessed over 7 days. In both environments, 14 gait characteristics were evaluated from one tri-axial accelerometer attached to the lower back. The ability of individual gait characteristics to differentiate PD from HC was evaluated using the Area Under the Curve (AUC). ML models (i.e., support vector machine, random forest, and ensemble models) applied to real-world gait showed better classification performance compared to laboratory data. Real-world gait characteristics aggregated over longer WBs (WB 30–60 s, WB > 60 s, WB > 120 s) resulted in superior discriminative performance (PD vs. HC) compared to laboratory gait characteristics (0.51 ≤ AUC ≤ 0.77). Real-world gait speed showed the highest AUC of 0.77. Overall, random forest trained on 14 gait characteristics aggregated over WBs > 60 s gave better performance (F1 score = 77.20 ± 5.51%) as compared to laboratory results (F1 Score = 68.75 ± 12.80%). Findings from this study suggest that the choice of environment and data aggregation are important to achieve maximum discrimination performance and have direct impact on ML performance for PD classification. This study highlights the importance of a harmonized approach to data analysis in order to drive future implementation and clinical use. CLINICAL TRIAL REGISTRATION: [09/H0906/82]. Frontiers Media S.A. 2022-03-22 /pmc/articles/PMC8981298/ /pubmed/35391750 http://dx.doi.org/10.3389/fnagi.2022.808518 Text en Copyright © 2022 Rehman, Guan, Shi, Alcock, Yarnall, Rochester and Del Din. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Investigating the Impact of Environment and Data Aggregation by Walking Bout Duration on Parkinson’s Disease Classification Using Machine Learning |
topic | Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8981298/ https://www.ncbi.nlm.nih.gov/pubmed/35391750 http://dx.doi.org/10.3389/fnagi.2022.808518 |
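
The description above refers to aggregating real-world gait characteristics by walking bout (WB) duration and scoring how well a single characteristic separates PD from HC using the AUC and the F1 score. The following is a minimal sketch of that idea, not the authors' pipeline: the bout records, participant IDs, speeds, the boundary handling of the duration bins, and the 1.08 m/s threshold are all hypothetical, and a simple threshold rule stands in for the support vector machine, random forest, and ensemble models named in the record.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical walking bouts: (participant, group, duration in s, gait speed in m/s).
BOUTS = [
    ("p1", "PD", 45, 0.95), ("p1", "PD", 90, 1.00), ("p1", "PD", 150, 1.05),
    ("p2", "PD", 35, 0.90), ("p2", "PD", 70, 1.10), ("p2", "PD", 20, 0.80),
    ("h1", "HC", 50, 1.20), ("h1", "HC", 100, 1.30),
    ("h2", "HC", 65, 1.00), ("h2", "HC", 130, 1.10), ("h2", "HC", 10, 1.00),
]


def aggregate(bouts, min_s, max_s=float("inf")):
    """Mean gait speed per participant over bouts with min_s < duration <= max_s.

    The exact bin boundaries are an assumption made for this sketch.
    """
    speeds = defaultdict(list)
    groups = {}
    for pid, group, duration, speed in bouts:
        if min_s < duration <= max_s:
            speeds[pid].append(speed)
            groups[pid] = group
    return {pid: (groups[pid], mean(vals)) for pid, vals in speeds.items()}


def auc_lower_is_pd(per_participant):
    """AUC as the fraction of (PD, HC) pairs where the PD value is lower; ties count 0.5."""
    pd_vals = [v for g, v in per_participant.values() if g == "PD"]
    hc_vals = [v for g, v in per_participant.values() if g == "HC"]
    wins = sum(
        1.0 if p < h else 0.5 if p == h else 0.0
        for p in pd_vals
        for h in hc_vals
    )
    return wins / (len(pd_vals) * len(hc_vals))


def f1_score(per_participant, threshold):
    """F1 for the rule 'value below threshold means PD' (a stand-in classifier)."""
    tp = sum(1 for g, v in per_participant.values() if g == "PD" and v < threshold)
    fp = sum(1 for g, v in per_participant.values() if g == "HC" and v < threshold)
    fn = sum(1 for g, v in per_participant.values() if g == "PD" and v >= threshold)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    for label, (lo, hi) in {"WB 30-60 s": (30, 60), "WB > 60 s": (60, float("inf"))}.items():
        agg = aggregate(BOUTS, lo, hi)
        print(label, "AUC:", auc_lower_is_pd(agg), "F1:", f1_score(agg, 1.08))
```

Note that a participant with no bouts in a given duration bin (here `h2` for WB 30-60 s) simply drops out of that aggregation, which is one reason the choice of bin can change the measured discrimination.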