The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction

BACKGROUND: The majority of U.S. reports of COVID-19 clinical characteristics, disease course, and treatments are from single health systems or focused on one domain. Here we report the creation of the National COVID Cohort Collaborative (N3C), a centralized, harmonized, high-granularity electronic...

Bibliographic Details
Main Authors: Bennett, Tellen D., Moffitt, Richard A., Hajagos, Janos G., Amor, Benjamin, Anand, Adit, Bissell, Mark M., Bradwell, Katie Rebecca, Bremer, Carolyn, Byrd, James Brian, Denham, Alina, DeWitt, Peter E., Gabriel, Davera, Garibaldi, Brian T., Girvin, Andrew T., Guinney, Justin, Hill, Elaine L., Hong, Stephanie S., Jimenez, Hunter, Kavuluru, Ramakanth, Kostka, Kristin, Lehmann, Harold P., Levitt, Eli, Mallipattu, Sandeep K., Manna, Amin, McMurry, Julie A., Morris, Michele, Muschelli, John, Neumann, Andrew J., Palchuk, Matvey B., Pfaff, Emily R., Qian, Zhenglong, Qureshi, Nabeel, Russell, Seth, Spratt, Heidi, Walden, Anita, Williams, Andrew E., Wooldridge, Jacob T., Yoo, Yun Jae, Zhang, Xiaohan Tanner, Zhu, Richard L., Austin, Christopher P., Saltz, Joel H., Gersing, Ken R., Haendel, Melissa A., Chute, Christopher G.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7814838/
https://www.ncbi.nlm.nih.gov/pubmed/33469592
http://dx.doi.org/10.1101/2021.01.12.21249511
author Bennett, Tellen D.
Moffitt, Richard A.
Hajagos, Janos G.
Amor, Benjamin
Anand, Adit
Bissell, Mark M.
Bradwell, Katie Rebecca
Bremer, Carolyn
Byrd, James Brian
Denham, Alina
DeWitt, Peter E.
Gabriel, Davera
Garibaldi, Brian T.
Girvin, Andrew T.
Guinney, Justin
Hill, Elaine L.
Hong, Stephanie S.
Jimenez, Hunter
Kavuluru, Ramakanth
Kostka, Kristin
Lehmann, Harold P.
Levitt, Eli
Mallipattu, Sandeep K.
Manna, Amin
McMurry, Julie A.
Morris, Michele
Muschelli, John
Neumann, Andrew J.
Palchuk, Matvey B.
Pfaff, Emily R.
Qian, Zhenglong
Qureshi, Nabeel
Russell, Seth
Spratt, Heidi
Walden, Anita
Williams, Andrew E.
Wooldridge, Jacob T.
Yoo, Yun Jae
Zhang, Xiaohan Tanner
Zhu, Richard L.
Austin, Christopher P.
Saltz, Joel H.
Gersing, Ken R.
Haendel, Melissa A.
Chute, Christopher G.
author_sort Bennett, Tellen D.
collection PubMed
description BACKGROUND: The majority of U.S. reports of COVID-19 clinical characteristics, disease course, and treatments are from single health systems or focused on one domain. Here we report the creation of the National COVID Cohort Collaborative (N3C), a centralized, harmonized, high-granularity electronic health record repository that is the largest, most representative U.S. cohort of COVID-19 cases and controls to date. This multi-center dataset supports robust evidence-based development of predictive and diagnostic tools and informs critical care and policy.
METHODS AND FINDINGS: In a retrospective cohort study of 1,926,526 patients from 34 medical centers nationwide, we stratified patients using a World Health Organization COVID-19 severity scale and demographics; we then evaluated differences between groups over time using multivariable logistic regression. We established vital signs and laboratory values among COVID-19 patients with different severities, providing the foundation for predictive analytics. The cohort included 174,568 adults with severe acute respiratory syndrome associated with SARS-CoV-2 (PCR >99% or antigen <1%) as well as 1,133,848 adult patients who served as lab-negative controls. Among 32,472 hospitalized patients, mortality was 11.6% overall and decreased from 16.4% in March/April 2020 to 8.6% in September/October 2020 (p = 0.002 monthly trend). In a multivariable logistic regression model, age, male sex, liver disease, dementia, African-American and Asian race, and obesity were independently associated with higher clinical severity. To demonstrate the utility of the N3C cohort for analytics, we used machine learning (ML) to predict clinical severity and risk factors over time. Using 64 inputs available on the first hospital day, we predicted a severe clinical course (death, discharge to hospice, invasive ventilation, or extracorporeal membrane oxygenation) using random forest and XGBoost models (AUROC 0.86 and 0.87 respectively) that were stable over time. The most powerful predictors in these models are patient age and widely available vital sign and laboratory values. The established expected trajectories for many vital signs and laboratory values among patients with different clinical severities validate observations from smaller studies and provide comprehensive insight into COVID-19 characterization in U.S. patients.
CONCLUSIONS: This is the first description of an ongoing longitudinal observational study of patients seen in diverse clinical settings and geographical regions and is the largest COVID-19 cohort in the United States. Such data are the foundation for ML models that can be the basis for generalizable clinical decision support tools. The N3C Data Enclave is unique in providing transparent, reproducible, easily shared, versioned, and fully auditable data and analytic provenance for national-scale patient-level EHR data. The N3C is built for intensive ML analyses by academic, industry, and citizen scientists internationally. Many observational correlations can inform trial designs and care guidelines for this new disease.
format Online
Article
Text
id pubmed-7814838
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-7814838 2021-01-20 The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction medRxiv Article Cold Spring Harbor Laboratory 2021-01-23 /pmc/articles/PMC7814838/ /pubmed/33469592 http://dx.doi.org/10.1101/2021.01.12.21249511 Text en This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
title The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7814838/
https://www.ncbi.nlm.nih.gov/pubmed/33469592
http://dx.doi.org/10.1101/2021.01.12.21249511