
Machine learned daily life history classification using low frequency tracking data and automated modelling pipelines: application to North American waterfowl

BACKGROUND: Identifying animal behaviors, life history states, and movement patterns is a prerequisite for many animal behavior analyses and effective management of wildlife and habitats. Most approaches classify short-term movement patterns with high frequency location or accelerometry data. Howeve...


Bibliographic Details
Main authors: Overton, Cory; Casazza, Michael; Bretz, Joseph; McDuie, Fiona; Matchett, Elliott; Mackell, Desmond; Lorenz, Austen; Mott, Andrea; Herzog, Mark; Ackerman, Josh
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9109391/
https://www.ncbi.nlm.nih.gov/pubmed/35578372
http://dx.doi.org/10.1186/s40462-022-00324-7
author Overton, Cory
Casazza, Michael
Bretz, Joseph
McDuie, Fiona
Matchett, Elliott
Mackell, Desmond
Lorenz, Austen
Mott, Andrea
Herzog, Mark
Ackerman, Josh
collection PubMed
description BACKGROUND: Identifying animal behaviors, life history states, and movement patterns is a prerequisite for many animal behavior analyses and effective management of wildlife and habitats. Most approaches classify short-term movement patterns with high frequency location or accelerometry data. However, patterns reflecting life history across longer time scales can have greater relevance to species biology or management needs, especially when available in near real-time. Given limitations in collecting and using such data to accurately classify complex behaviors in the long-term, we used hourly GPS data from 5 waterfowl species to produce daily activity classifications with machine-learned models using “automated modelling pipelines”. METHODS: Automated pipelines are computer-generated code that complete many tasks including feature engineering, multi-framework model development, training, validation, and hyperparameter tuning to produce daily classifications from eight activity patterns reflecting waterfowl life history or movement states. We developed several input features for modeling grouped into three broad categories, hereafter “feature sets”: GPS locations, habitat information, and movement history. Each feature set used different data sources or data collected across different time intervals to develop the “features” (independent variables) used in models. RESULTS: Automated modelling pipelines rapidly developed easily reproducible data preprocessing and analysis steps, identification and optimization of the best performing model and provided outputs for interpreting feature importance. Unequal expression of life history states caused unbalanced classes, so we evaluated feature set importance using a weighted F1-score to balance model recall and precision among individual classes. 
Although the best model using the least restrictive feature set (only 24 hourly relocations in a day) produced effective classifications (weighted F1 = 0.887), models using all feature sets performed substantially better (weighted F1 = 0.95), particularly for rarer but demographically more impactful life history states (i.e., nesting). CONCLUSIONS: Automated pipelines generated models producing highly accurate classifications of complex daily activity patterns using relatively low frequency GPS and incorporating more classes than previous GPS studies. Near real-time classification is possible which is ideal for time-sensitive needs such as identifying reproduction. Including habitat and longer sequences of spatial information produced more accurate classifications but incurred slight delays in processing. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s40462-022-00324-7.
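The weighted F1-score mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common support-weighted definition (per-class F1 averaged with weights proportional to each class's share of the true labels); the abstract does not spell out the exact weighting used. The class names and label lists below are hypothetical toy values, not data from the study.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (assumed definition)."""
    support = Counter(y_true)          # number of true samples per class
    total = len(y_true)
    score = 0.0
    for cls, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = n - tp
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += (n / total) * f1      # weight each class by its support
    return score


# Hypothetical unbalanced example: a common class and a rare class.
y_true = ["local"] * 8 + ["nesting"] * 2
y_pred = ["local"] * 8 + ["local", "nesting"]
print(round(weighted_f1(y_true, y_pred), 3))
```

Because each class contributes in proportion to its frequency, a single missed day in the rare class lowers that class's recall sharply while moving the overall score only modestly, which is why per-class scores are usually inspected alongside the weighted value.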
format Online
Article
Text
id pubmed-9109391
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9109391 2022-05-17 Mov Ecol Methodology Article BioMed Central 2022-05-16 /pmc/articles/PMC9109391/ /pubmed/35578372 http://dx.doi.org/10.1186/s40462-022-00324-7 Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Machine learned daily life history classification using low frequency tracking data and automated modelling pipelines: application to North American waterfowl
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9109391/
https://www.ncbi.nlm.nih.gov/pubmed/35578372
http://dx.doi.org/10.1186/s40462-022-00324-7