Zero problems with compositional data of physical behaviors: a comparison of three zero replacement methods

BACKGROUND: Researchers applying compositional data analysis to time-use data (e.g., time spent in physical behaviors) often face the problem of zeros, that is, recordings of zero time spent in any of the studied behaviors. Zeros hinder the application of compositional data analysis because the anal...

Bibliographic Details
Main authors: Rasmussen, Charlotte Lund; Palarea-Albaladejo, Javier; Johansson, Melker Staffan; Crowley, Patrick; Stevens, Matthew Leigh; Gupta, Nidhi; Karstad, Kristina; Holtermann, Andreas
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7542467/
https://www.ncbi.nlm.nih.gov/pubmed/33023619
http://dx.doi.org/10.1186/s12966-020-01029-z
_version_ 1783591556763090944
author Rasmussen, Charlotte Lund
Palarea-Albaladejo, Javier
Johansson, Melker Staffan
Crowley, Patrick
Stevens, Matthew Leigh
Gupta, Nidhi
Karstad, Kristina
Holtermann, Andreas
collection PubMed
description BACKGROUND: Researchers applying compositional data analysis to time-use data (e.g., time spent in physical behaviors) often face the problem of zeros, that is, recordings of zero time spent in any of the studied behaviors. Zeros hinder the application of compositional data analysis because the analysis is based on log-ratios. One way to overcome this challenge is to replace the zeros with sensible small values. The aim of this study was to compare the performance of three existing replacement methods used within physical behavior time-use epidemiology: simple replacement, multiplicative replacement, and log-ratio expectation-maximization (lrEM) algorithm. Moreover, we assessed the consequence of choosing replacement values higher than the lowest observed value for a given behavior. METHOD: Using a complete dataset based on accelerometer data from 1310 Danish adults as reference, multiple datasets were simulated across six scenarios of zeros (5–30% zeros in 5% increments). Moreover, four examples were produced based on real data, in which, 10 and 20% zeros were imposed and replaced using a replacement value of 0.5 min, 65% of the observation threshold, or an estimated value below the observation threshold. For the simulation study and the examples, the zeros were replaced using the three replacement methods and the degree of distortion introduced was assessed by comparison with the complete dataset. RESULTS: The lrEM method outperformed the other replacement methods as it had the smallest influence on the structure of relative variation of the datasets. Both the simple and multiplicative replacements introduced higher distortion, particularly in scenarios with more than 10% zeros; although the latter, like the lrEM, does preserve the ratios between behaviors with no zeros. The examples revealed that replacing zeros with a value higher than the observation threshold severely affected the structure of relative variation. 
CONCLUSIONS: Given our findings, we encourage the use of replacement methods that preserve the relative structure of physical behavior data, as achieved by the multiplicative and lrEM replacements, and to avoid simple replacement. Moreover, we do not recommend replacing zeros with values higher than the lowest observed value for a behavior.
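The ratio-preserving property mentioned in the description can be illustrated with a minimal sketch of multiplicative replacement as it is commonly defined in the compositional data literature. This is not code from the article: the function name, the example minutes, and the choice of a fixed total are assumptions made for illustration, and only the 0.5 min replacement value echoes the description. Zeros are set to a small value and the non-zero parts are shrunk by a common factor, so the daily total and the ratios between non-zero behaviors are unchanged.

```python
def multiplicative_replacement(parts, delta):
    """Replace zeros in one composition with `delta` and rescale the rest.

    Hypothetical helper for illustration; `parts` are minutes per behavior.
    """
    total = sum(parts)
    n_zeros = sum(1 for p in parts if p == 0)
    # Common shrink factor for all non-zero parts, chosen so that the
    # row total stays the same after the zeros are replaced.
    shrink = 1 - (n_zeros * delta) / total
    return [delta if p == 0 else p * shrink for p in parts]


# Hypothetical 1440-minute day: sedentary, standing, walking, running.
day = [900.0, 400.0, 140.0, 0.0]
replaced = multiplicative_replacement(day, delta=0.5)

# The total and the sedentary/standing ratio are the same before and after.
print(sum(day), sum(replaced))
print(day[0] / day[1], replaced[0] / replaced[1])
```

Because every non-zero part is multiplied by the same factor, any log-ratio between two non-zero behaviors is left untouched; only log-ratios involving the replaced part depend on the chosen replacement value.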
format Online
Article
Text
id pubmed-7542467
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-75424672020-10-08 Zero problems with compositional data of physical behaviors: a comparison of three zero replacement methods Rasmussen, Charlotte Lund Palarea-Albaladejo, Javier Johansson, Melker Staffan Crowley, Patrick Stevens, Matthew Leigh Gupta, Nidhi Karstad, Kristina Holtermann, Andreas Int J Behav Nutr Phys Act Methodology BACKGROUND: Researchers applying compositional data analysis to time-use data (e.g., time spent in physical behaviors) often face the problem of zeros, that is, recordings of zero time spent in any of the studied behaviors. Zeros hinder the application of compositional data analysis because the analysis is based on log-ratios. One way to overcome this challenge is to replace the zeros with sensible small values. The aim of this study was to compare the performance of three existing replacement methods used within physical behavior time-use epidemiology: simple replacement, multiplicative replacement, and log-ratio expectation-maximization (lrEM) algorithm. Moreover, we assessed the consequence of choosing replacement values higher than the lowest observed value for a given behavior. METHOD: Using a complete dataset based on accelerometer data from 1310 Danish adults as reference, multiple datasets were simulated across six scenarios of zeros (5–30% zeros in 5% increments). Moreover, four examples were produced based on real data, in which, 10 and 20% zeros were imposed and replaced using a replacement value of 0.5 min, 65% of the observation threshold, or an estimated value below the observation threshold. For the simulation study and the examples, the zeros were replaced using the three replacement methods and the degree of distortion introduced was assessed by comparison with the complete dataset. RESULTS: The lrEM method outperformed the other replacement methods as it had the smallest influence on the structure of relative variation of the datasets. 
Both the simple and multiplicative replacements introduced higher distortion, particularly in scenarios with more than 10% zeros; although the latter, like the lrEM, does preserve the ratios between behaviors with no zeros. The examples revealed that replacing zeros with a value higher than the observation threshold severely affected the structure of relative variation. CONCLUSIONS: Given our findings, we encourage the use of replacement methods that preserve the relative structure of physical behavior data, as achieved by the multiplicative and lrEM replacements, and to avoid simple replacement. Moreover, we do not recommend replacing zeros with values higher than the lowest observed value for a behavior. BioMed Central 2020-10-06 /pmc/articles/PMC7542467/ /pubmed/33023619 http://dx.doi.org/10.1186/s12966-020-01029-z Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Zero problems with compositional data of physical behaviors: a comparison of three zero replacement methods
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7542467/
https://www.ncbi.nlm.nih.gov/pubmed/33023619
http://dx.doi.org/10.1186/s12966-020-01029-z