The DataHarmonizer: a tool for faster data harmonization, validation, aggregation and analysis of pathogen genomics contextual information
Pathogen genomics is a critical tool for public health surveillance, infection control, outbreak investigations as well as research. In order to make use of pathogen genomics data, they must be interpreted using contextual data (metadata). Contextual data include sample metadata, laboratory methods,...
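Main authors: | Gill, Ivan S.; Griffiths, Emma J.; Dooley, Damion; Cameron, Rhiannon; Savić Kallesøe, Sarah; John, Nithu Sara; Sehar, Anoosha; Gosal, Gurinder; Alexander, David; Chapel, Madison; Croxen, Matthew A.; Delisle, Benjamin; Di Tullio, Rachelle; Gaston, Daniel; Duggan, Ana; Guthrie, Jennifer L.; Horsman, Mark; Joshi, Esha; Kearny, Levon; Knox, Natalie; Lau, Lynette; LeBlanc, Jason J.; Li, Vincent; Lyons, Pierre; MacKenzie, Keith; McArthur, Andrew G.; Panousis, Emily M.; Palmer, John; Prystajecky, Natalie; Smith, Kerri N.; Tanner, Jennifer; Townend, Christopher; Tyler, Andrea; Van Domselaar, Gary; Hsiao, William W. L. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Microbiology Society 2023 |
Subjects: | Bioresources |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9973856/ https://www.ncbi.nlm.nih.gov/pubmed/36748616 http://dx.doi.org/10.1099/mgen.0.000908 |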
_version_ | 1784898612008845312 |
---|---|
author | Gill, Ivan S.; Griffiths, Emma J.; Dooley, Damion; Cameron, Rhiannon; Savić Kallesøe, Sarah; John, Nithu Sara; Sehar, Anoosha; Gosal, Gurinder; Alexander, David; Chapel, Madison; Croxen, Matthew A.; Delisle, Benjamin; Di Tullio, Rachelle; Gaston, Daniel; Duggan, Ana; Guthrie, Jennifer L.; Horsman, Mark; Joshi, Esha; Kearny, Levon; Knox, Natalie; Lau, Lynette; LeBlanc, Jason J.; Li, Vincent; Lyons, Pierre; MacKenzie, Keith; McArthur, Andrew G.; Panousis, Emily M.; Palmer, John; Prystajecky, Natalie; Smith, Kerri N.; Tanner, Jennifer; Townend, Christopher; Tyler, Andrea; Van Domselaar, Gary; Hsiao, William W. L. |
author_sort | Gill, Ivan S. |
collection | PubMed |
description | Pathogen genomics is a critical tool for public health surveillance, infection control, outbreak investigations as well as research. In order to make use of pathogen genomics data, they must be interpreted using contextual data (metadata). Contextual data include sample metadata, laboratory methods, patient demographics, clinical outcomes and epidemiological information. However, the variability in how contextual information is captured by different authorities and how it is encoded in different databases poses challenges for data interpretation, integration and their use/re-use. The DataHarmonizer is a template-driven spreadsheet application for harmonizing, validating and transforming genomics contextual data into submission-ready formats for public or private repositories. The tool’s web browser-based JavaScript environment enables validation and its offline functionality and local installation increases data security. The DataHarmonizer was developed to address the data sharing needs that arose during the COVID-19 pandemic, and was used by members of the Canadian COVID Genomics Network (CanCOGeN) to harmonize SARS-CoV-2 contextual data for national surveillance and for public repository submission. In order to support coordination of international surveillance efforts, we have partnered with the Public Health Alliance for Genomic Epidemiology to also provide a template conforming to its SARS-CoV-2 contextual data specification for use worldwide. Templates are also being developed for One Health and foodborne pathogens. Overall, the DataHarmonizer tool improves the effectiveness and fidelity of contextual data capture as well as its subsequent usability. Harmonization of contextual information across authorities, platforms and systems globally improves interoperability and reusability of data for concerted public health and research initiatives to fight the current pandemic and future public health emergencies. While initially developed for the COVID-19 pandemic, its expansion to other data management applications and pathogens is already underway. |
format | Online Article Text |
id | pubmed-9973856 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Microbiology Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-9973856 2023-03-01 Microb Genom Bioresources Microbiology Society 2023-01-23 /pmc/articles/PMC9973856/ /pubmed/36748616 http://dx.doi.org/10.1099/mgen.0.000908 Text en © 2023 The Authors https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License. This article was made open access via a Publish and Read agreement between the Microbiology Society and the corresponding author’s institution. |
title | The DataHarmonizer: a tool for faster data harmonization, validation, aggregation and analysis of pathogen genomics contextual information |
topic | Bioresources |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9973856/ https://www.ncbi.nlm.nih.gov/pubmed/36748616 http://dx.doi.org/10.1099/mgen.0.000908 |