
Population-level and individual-level explainers for propensity score matching in observational studies

PRECIS: The exclusion of unmatched observations in propensity score matching has implications for the generalizability of causal effects. Machine learning methods can help to identify how the study population differs from the unmatched subpopulation. BACKGROUND: There has been widespread use of prop...


Bibliographic Details
Main Authors: Ghosh, Debashis, Amini, Arya, Jones, Bernard L., Karam, Sana D.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Oncology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9630947/
https://www.ncbi.nlm.nih.gov/pubmed/36338745
http://dx.doi.org/10.3389/fonc.2022.958907
author Ghosh, Debashis
Amini, Arya
Jones, Bernard L.
Karam, Sana D.
collection PubMed
description PRECIS: The exclusion of unmatched observations in propensity score matching has implications for the generalizability of causal effects. Machine learning methods can help to identify how the study population differs from the unmatched subpopulation.
BACKGROUND: There has been widespread use of propensity scores in evaluating the effect of cancer treatments on survival, particularly in administrative databases and cancer registries. A byproduct of certain matching schemes is the exclusion of observations. Borrowing an analogy from clinical trials, one can view these exclusions as subjects that do not satisfy eligibility criteria.
METHODS: Developing identification rules for these “data-driven eligibility criteria” in observational studies on both population and individual levels helps to ascertain the population on which causal effects are being made. This article presents a machine learning method to determine the representativeness of causal effects in two different datasets from the National Cancer Database.
RESULTS: Decision trees reveal that groups with certain features have a higher probability of inclusion in the study population than older patients. In the first dataset, younger age categories had an inclusion probability of at least 0.90 in all models, while the probability for the older category ranged from 0.47 to 0.65. Most trees split once more on an even higher age at a lower node, suggesting that the oldest patients are the least likely to be matched. In the second set of data, both age and surgery status were associated with inclusion.
CONCLUSION: The methodology presented in this paper underscores the need to consider exclusions in propensity score matching procedures as well as complementing matching with other propensity score adjustments.
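The idea in the abstract (match on the propensity score, then explain who was left out) can be illustrated with a small sketch. This is not the authors' code or data: the cohort is synthetic, the propensity function, the caliper of 0.05, and the names caliper_match and best_age_split are all hypothetical, the true propensity is used in place of an estimated one, and a single age split stands in for a full decision tree.

```python
import math
import random

random.seed(0)  # fixed seed so the synthetic example is reproducible

def propensity(age):
    # Assumed (hypothetical) treatment probability: about 0.5 for younger
    # patients, falling towards 0.05 for the oldest.
    return 0.05 + 0.45 / (1.0 + math.exp((age - 65) / 4.0))

# Synthetic cohort; a real analysis would estimate the score from covariates.
cohort = []
for unit_id in range(600):
    age = random.randint(40, 89)
    score = propensity(age)
    cohort.append({"id": unit_id, "age": age, "score": score,
                   "treated": random.random() < score})

def caliper_match(treated, controls, caliper):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, controls: lists of (unit_id, score) pairs.
    Returns the set of unit ids that ended up in a matched pair.
    """
    available = dict(controls)  # control id -> propensity score
    matched = set()
    for t_id, t_score in treated:
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            matched.update((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return matched

def gini(labels):
    # Gini impurity of a non-empty list of 0/1 labels.
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_age_split(ages, included):
    """Depth-one tree: the age threshold minimising weighted Gini impurity.

    Returns (threshold, inclusion rate below it, inclusion rate at or above it).
    """
    best = None
    for thr in sorted(set(ages))[1:]:  # skip the minimum so both sides are non-empty
        left = [y for a, y in zip(ages, included) if a < thr]
        right = [y for a, y in zip(ages, included) if a >= thr]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(ages)
        if best is None or impurity < best[0]:
            best = (impurity, thr, sum(left) / len(left), sum(right) / len(right))
    return best[1:]

treated = [(u["id"], u["score"]) for u in cohort if u["treated"]]
controls = [(u["id"], u["score"]) for u in cohort if not u["treated"]]
matched_ids = caliper_match(treated, controls, caliper=0.05)

ages = [u["age"] for u in cohort]
included = [1 if u["id"] in matched_ids else 0 for u in cohort]
threshold, rate_below, rate_above = best_age_split(ages, included)

print(f"split at age {threshold}: inclusion {rate_below:.2f} below, "
      f"{rate_above:.2f} at or above")
```

In this toy setting the unmatched units are mostly older controls, so the split acts as a "data-driven eligibility criterion" on age. A deeper tree, for example from a library such as scikit-learn, would be the natural next step for data with several covariates.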
format Online
Article
Text
id pubmed-9630947
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9630947 2022-11-04 Population-level and individual-level explainers for propensity score matching in observational studies Ghosh, Debashis Amini, Arya Jones, Bernard L. Karam, Sana D. Front Oncol Oncology Frontiers Media S.A. 2022-10-20 /pmc/articles/PMC9630947/ /pubmed/36338745 http://dx.doi.org/10.3389/fonc.2022.958907 Text en Copyright © 2022 Ghosh, Amini, Jones and Karam https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Population-level and individual-level explainers for propensity score matching in observational studies
topic Oncology