The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction
BACKGROUND: The majority of U.S. reports of COVID-19 clinical characteristics, disease course, and treatments are from single health systems or focused on one domain. Here we report the creation of the National COVID Cohort Collaborative (N3C), a centralized, harmonized, high-granularity electronic...
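Main authors: | Bennett, Tellen D.; Moffitt, Richard A.; Hajagos, Janos G.; Amor, Benjamin; Anand, Adit; Bissell, Mark M.; Bradwell, Katie Rebecca; Bremer, Carolyn; Byrd, James Brian; Denham, Alina; DeWitt, Peter E.; Gabriel, Davera; Garibaldi, Brian T.; Girvin, Andrew T.; Guinney, Justin; Hill, Elaine L.; Hong, Stephanie S.; Jimenez, Hunter; Kavuluru, Ramakanth; Kostka, Kristin; Lehmann, Harold P.; Levitt, Eli; Mallipattu, Sandeep K.; Manna, Amin; McMurry, Julie A.; Morris, Michele; Muschelli, John; Neumann, Andrew J.; Palchuk, Matvey B.; Pfaff, Emily R.; Qian, Zhenglong; Qureshi, Nabeel; Russell, Seth; Spratt, Heidi; Walden, Anita; Williams, Andrew E.; Wooldridge, Jacob T.; Yoo, Yun Jae; Zhang, Xiaohan Tanner; Zhu, Richard L.; Austin, Christopher P.; Saltz, Joel H.; Gersing, Ken R.; Haendel, Melissa A.; Chute, Christopher G. |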
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory, 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7814838/ https://www.ncbi.nlm.nih.gov/pubmed/33469592 http://dx.doi.org/10.1101/2021.01.12.21249511 |
_version_ | 1783638127676489728 |
---|---|
author | Bennett, Tellen D. Moffitt, Richard A. Hajagos, Janos G. Amor, Benjamin Anand, Adit Bissell, Mark M. Bradwell, Katie Rebecca Bremer, Carolyn Byrd, James Brian Denham, Alina DeWitt, Peter E. Gabriel, Davera Garibaldi, Brian T. Girvin, Andrew T. Guinney, Justin Hill, Elaine L. Hong, Stephanie S. Jimenez, Hunter Kavuluru, Ramakanth Kostka, Kristin Lehmann, Harold P. Levitt, Eli Mallipattu, Sandeep K. Manna, Amin McMurry, Julie A. Morris, Michele Muschelli, John Neumann, Andrew J. Palchuk, Matvey B. Pfaff, Emily R. Qian, Zhenglong Qureshi, Nabeel Russell, Seth Spratt, Heidi Walden, Anita Williams, Andrew E. Wooldridge, Jacob T. Yoo, Yun Jae Zhang, Xiaohan Tanner Zhu, Richard L. Austin, Christopher P. Saltz, Joel H. Gersing, Ken R. Haendel, Melissa A. Chute, Christopher G. |
author_facet | Bennett, Tellen D. Moffitt, Richard A. Hajagos, Janos G. Amor, Benjamin Anand, Adit Bissell, Mark M. Bradwell, Katie Rebecca Bremer, Carolyn Byrd, James Brian Denham, Alina DeWitt, Peter E. Gabriel, Davera Garibaldi, Brian T. Girvin, Andrew T. Guinney, Justin Hill, Elaine L. Hong, Stephanie S. Jimenez, Hunter Kavuluru, Ramakanth Kostka, Kristin Lehmann, Harold P. Levitt, Eli Mallipattu, Sandeep K. Manna, Amin McMurry, Julie A. Morris, Michele Muschelli, John Neumann, Andrew J. Palchuk, Matvey B. Pfaff, Emily R. Qian, Zhenglong Qureshi, Nabeel Russell, Seth Spratt, Heidi Walden, Anita Williams, Andrew E. Wooldridge, Jacob T. Yoo, Yun Jae Zhang, Xiaohan Tanner Zhu, Richard L. Austin, Christopher P. Saltz, Joel H. Gersing, Ken R. Haendel, Melissa A. Chute, Christopher G. |
author_sort | Bennett, Tellen D. |
collection | PubMed |
description | BACKGROUND: The majority of U.S. reports of COVID-19 clinical characteristics, disease course, and treatments are from single health systems or focused on one domain. Here we report the creation of the National COVID Cohort Collaborative (N3C), a centralized, harmonized, high-granularity electronic health record repository that is the largest, most representative U.S. cohort of COVID-19 cases and controls to date. This multi-center dataset supports robust evidence-based development of predictive and diagnostic tools and informs critical care and policy. METHODS AND FINDINGS: In a retrospective cohort study of 1,926,526 patients from 34 medical centers nationwide, we stratified patients using a World Health Organization COVID-19 severity scale and demographics; we then evaluated differences between groups over time using multivariable logistic regression. We established vital signs and laboratory values among COVID-19 patients with different severities, providing the foundation for predictive analytics. The cohort included 174,568 adults with severe acute respiratory syndrome associated with SARS-CoV-2 (PCR >99% or antigen <1%) as well as 1,133,848 adult patients that served as lab-negative controls. Among 32,472 hospitalized patients, mortality was 11.6% overall and decreased from 16.4% in March/April 2020 to 8.6% in September/October 2020 (p = 0.002 monthly trend). In a multivariable logistic regression model, age, male sex, liver disease, dementia, African-American and Asian race, and obesity were independently associated with higher clinical severity. To demonstrate the utility of the N3C cohort for analytics, we used machine learning (ML) to predict clinical severity and risk factors over time. 
Using 64 inputs available on the first hospital day, we predicted a severe clinical course (death, discharge to hospice, invasive ventilation, or extracorporeal membrane oxygenation) using random forest and XGBoost models (AUROC 0.86 and 0.87 respectively) that were stable over time. The most powerful predictors in these models are patient age and widely available vital sign and laboratory values. The established expected trajectories for many vital signs and laboratory values among patients with different clinical severities validates observations from smaller studies, and provides comprehensive insight into COVID-19 characterization in U.S. patients. CONCLUSIONS: This is the first description of an ongoing longitudinal observational study of patients seen in diverse clinical settings and geographical regions and is the largest COVID-19 cohort in the United States. Such data are the foundation for ML models that can be the basis for generalizable clinical decision support tools. The N3C Data Enclave is unique in providing transparent, reproducible, easily shared, versioned, and fully auditable data and analytic provenance for national-scale patient-level EHR data. The N3C is built for intensive ML analyses by academic, industry, and citizen scientists internationally. Many observational correlations can inform trial designs and care guidelines for this new disease. |
format | Online Article Text |
id | pubmed-7814838 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-78148382021-01-20 The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction Bennett, Tellen D. Moffitt, Richard A. Hajagos, Janos G. Amor, Benjamin Anand, Adit Bissell, Mark M. Bradwell, Katie Rebecca Bremer, Carolyn Byrd, James Brian Denham, Alina DeWitt, Peter E. Gabriel, Davera Garibaldi, Brian T. Girvin, Andrew T. Guinney, Justin Hill, Elaine L. Hong, Stephanie S. Jimenez, Hunter Kavuluru, Ramakanth Kostka, Kristin Lehmann, Harold P. Levitt, Eli Mallipattu, Sandeep K. Manna, Amin McMurry, Julie A. Morris, Michele Muschelli, John Neumann, Andrew J. Palchuk, Matvey B. Pfaff, Emily R. Qian, Zhenglong Qureshi, Nabeel Russell, Seth Spratt, Heidi Walden, Anita Williams, Andrew E. Wooldridge, Jacob T. Yoo, Yun Jae Zhang, Xiaohan Tanner Zhu, Richard L. Austin, Christopher P. Saltz, Joel H. Gersing, Ken R. Haendel, Melissa A. Chute, Christopher G. medRxiv Article BACKGROUND: The majority of U.S. reports of COVID-19 clinical characteristics, disease course, and treatments are from single health systems or focused on one domain. Here we report the creation of the National COVID Cohort Collaborative (N3C), a centralized, harmonized, high-granularity electronic health record repository that is the largest, most representative U.S. cohort of COVID-19 cases and controls to date. This multi-center dataset supports robust evidence-based development of predictive and diagnostic tools and informs critical care and policy. METHODS AND FINDINGS: In a retrospective cohort study of 1,926,526 patients from 34 medical centers nationwide, we stratified patients using a World Health Organization COVID-19 severity scale and demographics; we then evaluated differences between groups over time using multivariable logistic regression. We established vital signs and laboratory values among COVID-19 patients with different severities, providing the foundation for predictive analytics. 
The cohort included 174,568 adults with severe acute respiratory syndrome associated with SARS-CoV-2 (PCR >99% or antigen <1%) as well as 1,133,848 adult patients that served as lab-negative controls. Among 32,472 hospitalized patients, mortality was 11.6% overall and decreased from 16.4% in March/April 2020 to 8.6% in September/October 2020 (p = 0.002 monthly trend). In a multivariable logistic regression model, age, male sex, liver disease, dementia, African-American and Asian race, and obesity were independently associated with higher clinical severity. To demonstrate the utility of the N3C cohort for analytics, we used machine learning (ML) to predict clinical severity and risk factors over time. Using 64 inputs available on the first hospital day, we predicted a severe clinical course (death, discharge to hospice, invasive ventilation, or extracorporeal membrane oxygenation) using random forest and XGBoost models (AUROC 0.86 and 0.87 respectively) that were stable over time. The most powerful predictors in these models are patient age and widely available vital sign and laboratory values. The established expected trajectories for many vital signs and laboratory values among patients with different clinical severities validates observations from smaller studies, and provides comprehensive insight into COVID-19 characterization in U.S. patients. CONCLUSIONS: This is the first description of an ongoing longitudinal observational study of patients seen in diverse clinical settings and geographical regions and is the largest COVID-19 cohort in the United States. Such data are the foundation for ML models that can be the basis for generalizable clinical decision support tools. The N3C Data Enclave is unique in providing transparent, reproducible, easily shared, versioned, and fully auditable data and analytic provenance for national-scale patient-level EHR data. The N3C is built for intensive ML analyses by academic, industry, and citizen scientists internationally. 
Many observational correlations can inform trial designs and care guidelines for this new disease. Cold Spring Harbor Laboratory 2021-01-23 /pmc/articles/PMC7814838/ /pubmed/33469592 http://dx.doi.org/10.1101/2021.01.12.21249511 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use. |
spellingShingle | Article Bennett, Tellen D. Moffitt, Richard A. Hajagos, Janos G. Amor, Benjamin Anand, Adit Bissell, Mark M. Bradwell, Katie Rebecca Bremer, Carolyn Byrd, James Brian Denham, Alina DeWitt, Peter E. Gabriel, Davera Garibaldi, Brian T. Girvin, Andrew T. Guinney, Justin Hill, Elaine L. Hong, Stephanie S. Jimenez, Hunter Kavuluru, Ramakanth Kostka, Kristin Lehmann, Harold P. Levitt, Eli Mallipattu, Sandeep K. Manna, Amin McMurry, Julie A. Morris, Michele Muschelli, John Neumann, Andrew J. Palchuk, Matvey B. Pfaff, Emily R. Qian, Zhenglong Qureshi, Nabeel Russell, Seth Spratt, Heidi Walden, Anita Williams, Andrew E. Wooldridge, Jacob T. Yoo, Yun Jae Zhang, Xiaohan Tanner Zhu, Richard L. Austin, Christopher P. Saltz, Joel H. Gersing, Ken R. Haendel, Melissa A. Chute, Christopher G. The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction |
title | The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction |
title_full | The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction |
title_fullStr | The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction |
title_full_unstemmed | The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction |
title_short | The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction |
title_sort | national covid cohort collaborative: clinical characterization and early severity prediction |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7814838/ https://www.ncbi.nlm.nih.gov/pubmed/33469592 http://dx.doi.org/10.1101/2021.01.12.21249511 |