Automated time activity classification based on global positioning system (GPS) tracking data

BACKGROUND: Air pollution epidemiological studies are increasingly using global positioning system (GPS) to collect time-location data because they offer continuous tracking, high temporal resolution, and minimum reporting burden for participants. However, substantial uncertainties in the processing...

Bibliographic Details
Main authors: Wu, Jun; Jiang, Chengsheng; Houston, Douglas; Baker, Dean; Delfino, Ralph
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3256108/
https://www.ncbi.nlm.nih.gov/pubmed/22082316
http://dx.doi.org/10.1186/1476-069X-10-101
author Wu, Jun
Jiang, Chengsheng
Houston, Douglas
Baker, Dean
Delfino, Ralph
collection PubMed
description BACKGROUND: Air pollution epidemiological studies are increasingly using global positioning system (GPS) to collect time-location data because they offer continuous tracking, high temporal resolution, and minimum reporting burden for participants. However, substantial uncertainties in the processing and classifying of raw GPS data create challenges for reliably characterizing time activity patterns. We developed and evaluated models to classify people's major time activity patterns from continuous GPS tracking data.
METHODS: We developed and evaluated two automated models to classify major time activity patterns (i.e., indoor, outdoor static, outdoor walking, and in-vehicle travel) based on GPS time activity data collected under free living conditions for 47 participants (N = 131 person-days) from the Harbor Communities Time Location Study (HCTLS) in 2008 and supplemental GPS data collected from three UC-Irvine research staff (N = 21 person-days) in 2010. Time activity patterns used for model development were manually classified by research staff using information from participant GPS recordings, activity logs, and follow-up interviews. We evaluated two models: (a) a rule-based model that developed user-defined rules based on time, speed, and spatial location, and (b) a random forest decision tree model.
RESULTS: Indoor, outdoor static, outdoor walking and in-vehicle travel activities accounted for 82.7%, 6.1%, 3.2% and 7.2% of manually-classified time activities in the HCTLS dataset, respectively. The rule-based model classified indoor and in-vehicle travel periods reasonably well (Indoor: sensitivity > 91%, specificity > 80%, and precision > 96%; in-vehicle travel: sensitivity > 71%, specificity > 99%, and precision > 88%), but the performance was moderate for outdoor static and outdoor walking predictions. No striking differences in performance were observed between the rule-based and the random forest models. The random forest model was fast and easy to execute, but was likely less robust than the rule-based model under the condition of biased or poor quality training data.
CONCLUSIONS: Our models can successfully identify indoor and in-vehicle travel points from the raw GPS data, but challenges remain in developing models to distinguish outdoor static points and walking. Accurate training data are essential in developing reliable models in classifying time-activity patterns.
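The abstract describes a rule-based classifier built on time, speed, and spatial location, evaluated per class with sensitivity, specificity, and precision against manually classified labels. The record does not give the actual rules or thresholds, so the following Python is only a minimal sketch under stated assumptions: the GpsPoint fields, the speed cut-offs (10 km/h and 1 km/h), and the inside_building flag are all hypothetical and are not taken from the paper, and the time-based rules are omitted. Only the three metric formulas are standard definitions.

```python
from dataclasses import dataclass

# The four activity classes named in the abstract.
CLASSES = ("indoor", "outdoor_static", "outdoor_walking", "in_vehicle")


@dataclass
class GpsPoint:
    speed_kmh: float        # speed reported by the GPS logger (assumed field)
    inside_building: bool   # hypothetical flag: point lies within a building footprint


def classify_point(p: GpsPoint) -> str:
    """Toy speed/location rules. Thresholds are illustrative assumptions,
    not the values used by the authors."""
    if p.speed_kmh >= 10.0:
        return "in_vehicle"
    if p.inside_building:
        return "indoor"
    if p.speed_kmh >= 1.0:
        return "outdoor_walking"
    return "outdoor_static"


def one_vs_rest_metrics(truth, predicted, cls):
    """Sensitivity, specificity, and precision for one class,
    treating every other class as the negative label."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    tn = sum(t != cls and p != cls for t, p in pairs)

    def ratio(num, den):
        # Undefined when the denominator is zero (class never seen/predicted).
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # TP / (TP + FN)
        "specificity": ratio(tn, tn + fp),  # TN / (TN + FP)
        "precision": ratio(tp, tp + fp),    # TP / (TP + FP)
    }


if __name__ == "__main__":
    # Made-up points and manual labels, purely to exercise the functions.
    points = [GpsPoint(0.2, True), GpsPoint(40.0, False), GpsPoint(4.0, False)]
    manual = ["indoor", "in_vehicle", "outdoor_walking"]
    predicted = [classify_point(p) for p in points]
    for cls in CLASSES:
        print(cls, one_vs_rest_metrics(manual, predicted, cls))
```

Because indoor time dominates the manually classified data (82.7% in the HCTLS dataset), per-class metrics of this kind are more informative than a single overall accuracy figure, which a classifier could inflate by labelling most points as indoor.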
format Online
Article
Text
id pubmed-3256108
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-32561082012-01-12 Automated time activity classification based on global positioning system (GPS) tracking data Wu, Jun Jiang, Chengsheng Houston, Douglas Baker, Dean Delfino, Ralph Environ Health Research BACKGROUND: Air pollution epidemiological studies are increasingly using global positioning system (GPS) to collect time-location data because they offer continuous tracking, high temporal resolution, and minimum reporting burden for participants. However, substantial uncertainties in the processing and classifying of raw GPS data create challenges for reliably characterizing time activity patterns. We developed and evaluated models to classify people's major time activity patterns from continuous GPS tracking data. METHODS: We developed and evaluated two automated models to classify major time activity patterns (i.e., indoor, outdoor static, outdoor walking, and in-vehicle travel) based on GPS time activity data collected under free living conditions for 47 participants (N = 131 person-days) from the Harbor Communities Time Location Study (HCTLS) in 2008 and supplemental GPS data collected from three UC-Irvine research staff (N = 21 person-days) in 2010. Time activity patterns used for model development were manually classified by research staff using information from participant GPS recordings, activity logs, and follow-up interviews. We evaluated two models: (a) a rule-based model that developed user-defined rules based on time, speed, and spatial location, and (b) a random forest decision tree model. RESULTS: Indoor, outdoor static, outdoor walking and in-vehicle travel activities accounted for 82.7%, 6.1%, 3.2% and 7.2% of manually-classified time activities in the HCTLS dataset, respectively. 
The rule-based model classified indoor and in-vehicle travel periods reasonably well (Indoor: sensitivity > 91%, specificity > 80%, and precision > 96%; in-vehicle travel: sensitivity > 71%, specificity > 99%, and precision > 88%), but the performance was moderate for outdoor static and outdoor walking predictions. No striking differences in performance were observed between the rule-based and the random forest models. The random forest model was fast and easy to execute, but was likely less robust than the rule-based model under the condition of biased or poor quality training data. CONCLUSIONS: Our models can successfully identify indoor and in-vehicle travel points from the raw GPS data, but challenges remain in developing models to distinguish outdoor static points and walking. Accurate training data are essential in developing reliable models in classifying time-activity patterns. BioMed Central 2011-11-14 /pmc/articles/PMC3256108/ /pubmed/22082316 http://dx.doi.org/10.1186/1476-069X-10-101 Text en Copyright ©2011 Wu et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Automated time activity classification based on global positioning system (GPS) tracking data
topic Research