Population-level and individual-level explainers for propensity score matching in observational studies
PRECIS: The exclusion of unmatched observations in propensity score matching has implications for the generalizability of causal effects. Machine learning methods can help to identify how the study population differs from the unmatched subpopulation. BACKGROUND: There has been widespread use of prop...
Main authors: | Ghosh, Debashis; Amini, Arya; Jones, Bernard L.; Karam, Sana D. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Oncology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9630947/ https://www.ncbi.nlm.nih.gov/pubmed/36338745 http://dx.doi.org/10.3389/fonc.2022.958907 |
_version_ | 1784823719293616128 |
---|---|
author | Ghosh, Debashis; Amini, Arya; Jones, Bernard L.; Karam, Sana D. |
author_facet | Ghosh, Debashis; Amini, Arya; Jones, Bernard L.; Karam, Sana D. |
author_sort | Ghosh, Debashis |
collection | PubMed |
description | PRECIS: The exclusion of unmatched observations in propensity score matching has implications for the generalizability of causal effects. Machine learning methods can help to identify how the study population differs from the unmatched subpopulation. BACKGROUND: There has been widespread use of propensity scores in evaluating the effect of cancer treatments on survival, particularly in administrative databases and cancer registries. A byproduct of certain matching schemes is the exclusion of observations. Borrowing an analogy from clinical trials, one can view these exclusions as subjects that do not satisfy eligibility criteria. METHODS: Developing identification rules for these “data-driven eligibility criteria” in observational studies on both population and individual levels helps to ascertain the population on which causal effects are being made. This article presents a machine learning method to determine the representativeness of causal effects in two different datasets from the National Cancer Database. RESULTS: Decision trees reveal that groups with certain features have a higher probability of inclusion in the study population than older patients. In the first dataset, younger age categories had an inclusion probability of at least 0.90 in all models, while the probability for the older category ranged from 0.47 to 0.65. Most trees split once more on an even higher age at a lower node, suggesting that the oldest patients are the least likely to be matched. In the second set of data, both age and surgery status were associated with inclusion. CONCLUSION: The methodology presented in this paper underscores the need to consider exclusions in propensity score matching procedures as well as complementing matching with other propensity score adjustments. |
format | Online Article Text |
id | pubmed-9630947 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9630947 2022-11-04 Population-level and individual-level explainers for propensity score matching in observational studies Ghosh, Debashis; Amini, Arya; Jones, Bernard L.; Karam, Sana D. Front Oncol Oncology Frontiers Media S.A. 2022-10-20 /pmc/articles/PMC9630947/ /pubmed/36338745 http://dx.doi.org/10.3389/fonc.2022.958907 Text en Copyright © 2022 Ghosh, Amini, Jones and Karam https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Population-level and individual-level explainers for propensity score matching in observational studies |
topic | Oncology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9630947/ https://www.ncbi.nlm.nih.gov/pubmed/36338745 http://dx.doi.org/10.3389/fonc.2022.958907 |
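
The abstract above describes matching on the propensity score and then using decision trees to explain which subjects are excluded. The following is a minimal sketch of that idea and not the authors' implementation: the data are synthetic, the true treatment probability stands in for an estimated propensity score, matching is a simple greedy 1:1 caliper scheme, and the "tree" is a single age split. All function names, the caliper of 0.05, and the age model are assumptions made for illustration.

```python
import math
import random


def greedy_caliper_match(scores, treated, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on the propensity score.

    Returns a list of booleans: True if the subject ends up in a matched pair.
    """
    included = [False] * len(scores)
    controls = [i for i, t in enumerate(treated) if not t]
    used = set()
    for i, t in enumerate(treated):
        if not t:
            continue
        # Closest control not yet used, by absolute score difference.
        best = min((j for j in controls if j not in used),
                   key=lambda j: abs(scores[j] - scores[i]), default=None)
        if best is not None and abs(scores[best] - scores[i]) <= caliper:
            used.add(best)
            included[i] = included[best] = True
    return included


def gini(labels):
    """Gini impurity of a list of booleans."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_age_split(ages, included):
    """Depth-1 decision tree: the age cutoff minimising weighted Gini impurity."""
    n = len(ages)
    best_cut, best_score = None, float("inf")
    for cut in sorted(set(ages))[1:]:
        left = [inc for a, inc in zip(ages, included) if a < cut]
        right = [inc for a, inc in zip(ages, included) if a >= cut]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_cut, best_score = cut, score
    return best_cut


# Synthetic cohort (hypothetical): treatment probability rises with age.
rng = random.Random(0)
ages = [rng.randint(40, 90) for _ in range(400)]
scores = [1.0 / (1.0 + math.exp(-(a - 70) / 6.0)) for a in ages]
treated = [rng.random() < s for s in scores]

included = greedy_caliper_match(scores, treated)
cut = best_age_split(ages, included)
younger = [inc for a, inc in zip(ages, included) if a < cut]
older = [inc for a, inc in zip(ages, included) if a >= cut]

print(f"split at age {cut}")
print(f"inclusion rate, age < {cut}: {sum(younger) / len(younger):.2f}")
print(f"inclusion rate, age >= {cut}: {sum(older) / len(older):.2f}")
```

The printed inclusion rates on either side of the cutoff play the role of the "data-driven eligibility criteria": they summarise which part of the cohort the matched analysis actually speaks for. A real analysis would estimate the propensity score from covariates and grow a full tree over several features rather than a single age split.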