Machine learned daily life history classification using low frequency tracking data and automated modelling pipelines: application to North American waterfowl
BACKGROUND: Identifying animal behaviors, life history states, and movement patterns is a prerequisite for many animal behavior analyses and effective management of wildlife and habitats. Most approaches classify short-term movement patterns with high frequency location or accelerometry data. Howeve...
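Main authors: | Overton, Cory; Casazza, Michael; Bretz, Joseph; McDuie, Fiona; Matchett, Elliott; Mackell, Desmond; Lorenz, Austen; Mott, Andrea; Herzog, Mark; Ackerman, Josh |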
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9109391/ https://www.ncbi.nlm.nih.gov/pubmed/35578372 http://dx.doi.org/10.1186/s40462-022-00324-7 |
_version_ | 1784708890653360128 |
---|---|
author | Overton, Cory Casazza, Michael Bretz, Joseph McDuie, Fiona Matchett, Elliott Mackell, Desmond Lorenz, Austen Mott, Andrea Herzog, Mark Ackerman, Josh |
author_sort | Overton, Cory |
collection | PubMed |
description | BACKGROUND: Identifying animal behaviors, life history states, and movement patterns is a prerequisite for many animal behavior analyses and effective management of wildlife and habitats. Most approaches classify short-term movement patterns with high frequency location or accelerometry data. However, patterns reflecting life history across longer time scales can have greater relevance to species biology or management needs, especially when available in near real-time. Given limitations in collecting and using such data to accurately classify complex behaviors over the long term, we used hourly GPS data from 5 waterfowl species to produce daily activity classifications with machine-learned models using “automated modelling pipelines”. METHODS: Automated pipelines are computer-generated code that completes many tasks, including feature engineering, multi-framework model development, training, validation, and hyperparameter tuning, to produce daily classifications from eight activity patterns reflecting waterfowl life history or movement states. We developed several input features for modeling, grouped into three broad categories, hereafter “feature sets”: GPS locations, habitat information, and movement history. Each feature set used different data sources or data collected across different time intervals to develop the “features” (independent variables) used in models. RESULTS: Automated modelling pipelines rapidly developed easily reproducible data preprocessing and analysis steps, identified and optimized the best performing model, and provided outputs for interpreting feature importance. Unequal expression of life history states caused unbalanced classes, so we evaluated feature set importance using a weighted F1-score to balance model recall and precision among individual classes. Although the best model using the least restrictive feature set (only 24 hourly relocations in a day) produced effective classifications (weighted F1 = 0.887), models using all feature sets performed substantially better (weighted F1 = 0.95), particularly for rarer but demographically more impactful life history states (i.e., nesting). CONCLUSIONS: Automated pipelines generated models producing highly accurate classifications of complex daily activity patterns using relatively low frequency GPS and incorporating more classes than previous GPS studies. Near real-time classification is possible, which is ideal for time-sensitive needs such as identifying reproduction. Including habitat and longer sequences of spatial information produced more accurate classifications but incurred slight delays in processing. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s40462-022-00324-7. |
format | Online Article Text |
id | pubmed-9109391 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9109391 2022-05-17 Mov Ecol Methodology Article BioMed Central 2022-05-16 /pmc/articles/PMC9109391/ /pubmed/35578372 http://dx.doi.org/10.1186/s40462-022-00324-7 Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Machine learned daily life history classification using low frequency tracking data and automated modelling pipelines: application to North American waterfowl |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9109391/ https://www.ncbi.nlm.nih.gov/pubmed/35578372 http://dx.doi.org/10.1186/s40462-022-00324-7 |
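
The description above reports model performance as a weighted F1-score because the life history classes were unbalanced. As a minimal sketch of how such a metric is commonly computed (this is not the authors' pipeline code), the Python below averages per-class F1 scores, weighting each class by its share of the true labels. The function name `weighted_f1` and the toy class labels are assumptions made for illustration only; they are not taken from the article.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Return the support-weighted mean of per-class F1 scores.

    Illustrative helper (hypothetical name): each class's F1 is weighted
    by the fraction of true labels that belong to that class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for cls, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = n - tp  # true members of cls that were predicted as something else
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


# Toy, made-up labels: a common class and a rare one (assumed names).
y_true = ["local"] * 8 + ["nesting"] * 2
y_pred = ["local"] * 8 + ["local", "nesting"]
print(round(weighted_f1(y_true, y_pred), 3))
```

Because each class contributes in proportion to its frequency, a rare class such as nesting moves the overall score only slightly, so per-class scores are still worth inspecting alongside the weighted value.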