A systematic approach towards missing lab data in electronic health records: A case study in non‐small cell lung cancer and multiple myeloma

Real‐world data derived from electronic health records often exhibit high levels of missingness in variables, such as laboratory results, presenting a challenge for statistical analyses. We developed a systematic workflow for gathering evidence of different missingness mechanisms and performing subs...

Bibliographic Details
Main authors: Sondhi, Arjun; Weberpals, Janick; Yerram, Prakirthi; Jiang, Chengsheng; Taylor, Michael; Samant, Meghna; Cherng, Sarah
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10508534/
https://www.ncbi.nlm.nih.gov/pubmed/37322818
http://dx.doi.org/10.1002/psp4.12998
author Sondhi, Arjun
Weberpals, Janick
Yerram, Prakirthi
Jiang, Chengsheng
Taylor, Michael
Samant, Meghna
Cherng, Sarah
author_sort Sondhi, Arjun
collection PubMed
description Real‐world data derived from electronic health records often exhibit high levels of missingness in variables, such as laboratory results, presenting a challenge for statistical analyses. We developed a systematic workflow for gathering evidence of different missingness mechanisms and performing subsequent statistical analyses. We quantify evidence for missing completely at random (MCAR) or missing at random (MAR) mechanisms using Hotelling's multivariate t‐test and random forest classifiers, respectively. We further illustrate how to apply sensitivity analyses using the not at random fully conditional specification procedure to examine changes in parameter estimates under missing not at random (MNAR) mechanisms. In simulation studies, we validated these diagnostics and compared analytic bias under different mechanisms. To demonstrate the application of this workflow, we applied it to two exemplary case studies with an advanced non‐small cell lung cancer and a multiple myeloma cohort derived from a real‐world oncology database. Here, we found strong evidence against MCAR, and some evidence of MAR, implying that imputation approaches that attempt to predict missing values by fitting a model to observed data may be suitable for use. Sensitivity analyses did not suggest meaningful departures of our analytic results under potential MNAR mechanisms; these results were also in line with results reported in clinical trials.
format Online
Article
Text
id pubmed-10508534
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-10508534 2023-09-20 CPT Pharmacometrics Syst Pharmacol Research John Wiley and Sons Inc. 2023-06-15 /pmc/articles/PMC10508534/ /pubmed/37322818 http://dx.doi.org/10.1002/psp4.12998 Text en
© 2023 Flatiron Health and The Authors. CPT: Pharmacometrics & Systems Pharmacology published by Wiley Periodicals LLC on behalf of American Society for Clinical Pharmacology and Therapeutics.
This is an open access article under the terms of the https://creativecommons.org/licenses/by-nc-nd/4.0/ License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.
title A systematic approach towards missing lab data in electronic health records: A case study in non‐small cell lung cancer and multiple myeloma
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10508534/
https://www.ncbi.nlm.nih.gov/pubmed/37322818
http://dx.doi.org/10.1002/psp4.12998