Cut points and contexts

In research, policy, and practice, continuous variables are often categorized. Statisticians have generally advised against categorization for many reasons, such as loss of information and precision as well as distortion of estimated statistics. Here, a different kind of problem with categorization...

Bibliographic Details
Main author: Busch, Evan L.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8578203/
https://www.ncbi.nlm.nih.gov/pubmed/34424538
http://dx.doi.org/10.1002/cncr.33838
author Busch, Evan L.
collection PubMed
description In research, policy, and practice, continuous variables are often categorized. Statisticians have generally advised against categorization for many reasons, such as loss of information and precision as well as distortion of estimated statistics. Here, a different kind of problem with categorization is considered: the idea that, for a given continuous variable, there is a unique set of cut points that is the objectively correct or best categorization. It is shown that this is unlikely to be the case because categorized variables typically exist in webs of statistical relationships with other variables. The choice of cut points for a categorized variable can influence the values of many statistics relating that variable to others. This essay explores the substantive trade‐offs that can arise between different possible cut points to categorize a continuous variable, making it difficult to say that any particular categorization is objectively best. Limitations of different approaches to selecting cut points are discussed. Contextual trade‐offs may often be an argument against categorization. At the very least, such trade‐offs mean that research inferences, or decisions about policy or practice, that involve categorized variables should be framed and acted upon with flexibility and humility. LAY SUMMARY: In research, policy, and practice, continuous variables are often turned into categorical variables with cut points that define the boundaries between categories. This involves choices about how many categories to create and what cut‐point values to use. This commentary shows that different choices about which cut points to use can lead to different sets of trade‐offs across multiple statistical relationships between the categorized variable and other variables. These trade‐offs mean that no single categorization is objectively best or correct. This context is critical when one is deciding whether and how to categorize a continuous variable.
format Online
Article
Text
id pubmed-8578203
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-8578203 2021-12-01 Cut points and contexts Busch, Evan L. Cancer Commentaries In research, policy, and practice, continuous variables are often categorized. Statisticians have generally advised against categorization for many reasons, such as loss of information and precision as well as distortion of estimated statistics. Here, a different kind of problem with categorization is considered: the idea that, for a given continuous variable, there is a unique set of cut points that is the objectively correct or best categorization. It is shown that this is unlikely to be the case because categorized variables typically exist in webs of statistical relationships with other variables. The choice of cut points for a categorized variable can influence the values of many statistics relating that variable to others. This essay explores the substantive trade‐offs that can arise between different possible cut points to categorize a continuous variable, making it difficult to say that any particular categorization is objectively best. Limitations of different approaches to selecting cut points are discussed. Contextual trade‐offs may often be an argument against categorization. At the very least, such trade‐offs mean that research inferences, or decisions about policy or practice, that involve categorized variables should be framed and acted upon with flexibility and humility. LAY SUMMARY: In research, policy, and practice, continuous variables are often turned into categorical variables with cut points that define the boundaries between categories. This involves choices about how many categories to create and what cut‐point values to use. This commentary shows that different choices about which cut points to use can lead to different sets of trade‐offs across multiple statistical relationships between the categorized variable and other variables. These trade‐offs mean that no single categorization is objectively best or correct. This context is critical when one is deciding whether and how to categorize a continuous variable. John Wiley and Sons Inc. 2021-08-23 2021-12-01 /pmc/articles/PMC8578203/ /pubmed/34424538 http://dx.doi.org/10.1002/cncr.33838 Text en © 2021 The Authors. Cancer published by Wiley Periodicals LLC on behalf of American Cancer Society. https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc-nd/4.0/ (https://creativecommons.org/licenses/by-nc-nd/4.0/) License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.
title Cut points and contexts
topic Commentaries
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8578203/
https://www.ncbi.nlm.nih.gov/pubmed/34424538
http://dx.doi.org/10.1002/cncr.33838