The Mass General Brigham Biobank Portal: an i2b2-based data repository linking disparate and high-dimensional patient data to support multimodal analytics

OBJECTIVE: Integrating and harmonizing disparate patient data sources into one consolidated data portal enables researchers to conduct analysis efficiently and effectively. MATERIALS AND METHODS: We describe an implementation of Informatics for Integrating Biology and the Bedside (i2b2) to create th...

Bibliographic Details
Main Authors: Castro, Victor M, Gainer, Vivian, Wattanasin, Nich, Benoit, Barbara, Cagan, Andrew, Ghosh, Bhaswati, Goryachev, Sergey, Metta, Reeta, Park, Heekyong, Wang, David, Mendis, Michael, Rees, Martin, Herrick, Christopher, Murphy, Shawn N
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8922162/
https://www.ncbi.nlm.nih.gov/pubmed/34849976
http://dx.doi.org/10.1093/jamia/ocab264
author Castro, Victor M
Gainer, Vivian
Wattanasin, Nich
Benoit, Barbara
Cagan, Andrew
Ghosh, Bhaswati
Goryachev, Sergey
Metta, Reeta
Park, Heekyong
Wang, David
Mendis, Michael
Rees, Martin
Herrick, Christopher
Murphy, Shawn N
collection PubMed
description OBJECTIVE: Integrating and harmonizing disparate patient data sources into one consolidated data portal enables researchers to conduct analysis efficiently and effectively. MATERIALS AND METHODS: We describe an implementation of Informatics for Integrating Biology and the Bedside (i2b2) to create the Mass General Brigham (MGB) Biobank Portal data repository. The repository integrates data from primary and curated data sources and is updated weekly. The data are made readily available to investigators in a data portal where they can easily construct and export customized datasets for analysis. RESULTS: As of July 2021, there are 125 645 consented patients enrolled in the MGB Biobank. 88 527 (70.5%) have a biospecimen, 55 121 (43.9%) have completed the health information survey, 43 552 (34.7%) have genomic data and 124 760 (99.3%) have EHR data. Twenty machine learning computed phenotypes are calculated on a weekly basis. There are currently 1220 active investigators who have run 58 793 patient queries and exported 10 257 analysis files. DISCUSSION: The Biobank Portal allows noninformatics researchers to conduct study feasibility by querying across many data sources and then extract data that are most useful to them for clinical studies. While institutions require substantial informatics resources to establish and maintain integrated data repositories, they yield significant research value to a wide range of investigators. CONCLUSION: The Biobank Portal and other patient data portals that integrate complex and simple datasets enable diverse research use cases. i2b2 tools to implement these registries and make the data interoperable are open source and freely available.
format Online
Article
Text
id pubmed-8922162
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8922162 2022-03-15 J Am Med Inform Assoc Research and Applications Oxford University Press 2021-11-28 /pmc/articles/PMC8922162/ /pubmed/34849976 http://dx.doi.org/10.1093/jamia/ocab264 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of the American Medical Informatics Association. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title The Mass General Brigham Biobank Portal: an i2b2-based data repository linking disparate and high-dimensional patient data to support multimodal analytics
topic Research and Applications
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8922162/
https://www.ncbi.nlm.nih.gov/pubmed/34849976
http://dx.doi.org/10.1093/jamia/ocab264