
The DataHarmonizer: a tool for faster data harmonization, validation, aggregation and analysis of pathogen genomics contextual information

Pathogen genomics is a critical tool for public health surveillance, infection control, outbreak investigations as well as research. In order to make use of pathogen genomics data, they must be interpreted using contextual data (metadata). Contextual data include sample metadata, laboratory methods,...


Bibliographic Details
Main Authors: Gill, Ivan S., Griffiths, Emma J., Dooley, Damion, Cameron, Rhiannon, Savić Kallesøe, Sarah, John, Nithu Sara, Sehar, Anoosha, Gosal, Gurinder, Alexander, David, Chapel, Madison, Croxen, Matthew A., Delisle, Benjamin, Di Tullio, Rachelle, Gaston, Daniel, Duggan, Ana, Guthrie, Jennifer L., Horsman, Mark, Joshi, Esha, Kearny, Levon, Knox, Natalie, Lau, Lynette, LeBlanc, Jason J., Li, Vincent, Lyons, Pierre, MacKenzie, Keith, McArthur, Andrew G., Panousis, Emily M., Palmer, John, Prystajecky, Natalie, Smith, Kerri N., Tanner, Jennifer, Townend, Christopher, Tyler, Andrea, Van Domselaar, Gary, Hsiao, William W. L.
Format: Online Article Text
Language: English
Published: Microbiology Society 2023
Subjects: Bioresources
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9973856/
https://www.ncbi.nlm.nih.gov/pubmed/36748616
http://dx.doi.org/10.1099/mgen.0.000908
author Gill, Ivan S.
Griffiths, Emma J.
Dooley, Damion
Cameron, Rhiannon
Savić Kallesøe, Sarah
John, Nithu Sara
Sehar, Anoosha
Gosal, Gurinder
Alexander, David
Chapel, Madison
Croxen, Matthew A.
Delisle, Benjamin
Di Tullio, Rachelle
Gaston, Daniel
Duggan, Ana
Guthrie, Jennifer L.
Horsman, Mark
Joshi, Esha
Kearny, Levon
Knox, Natalie
Lau, Lynette
LeBlanc, Jason J.
Li, Vincent
Lyons, Pierre
MacKenzie, Keith
McArthur, Andrew G.
Panousis, Emily M.
Palmer, John
Prystajecky, Natalie
Smith, Kerri N.
Tanner, Jennifer
Townend, Christopher
Tyler, Andrea
Van Domselaar, Gary
Hsiao, William W. L.
author_sort Gill, Ivan S.
collection PubMed
description Pathogen genomics is a critical tool for public health surveillance, infection control, outbreak investigations and research. To be useful, pathogen genomics data must be interpreted using contextual data (metadata). Contextual data include sample metadata, laboratory methods, patient demographics, clinical outcomes and epidemiological information. However, variability in how contextual information is captured by different authorities and encoded in different databases poses challenges for data interpretation, integration and use/re-use. The DataHarmonizer is a template-driven spreadsheet application for harmonizing, validating and transforming genomics contextual data into submission-ready formats for public or private repositories. The tool’s web browser-based JavaScript environment enables validation, and its offline functionality and local installation increase data security. The DataHarmonizer was developed to address the data sharing needs that arose during the COVID-19 pandemic, and was used by members of the Canadian COVID Genomics Network (CanCOGeN) to harmonize SARS-CoV-2 contextual data for national surveillance and for public repository submission. To support coordination of international surveillance efforts, we have partnered with the Public Health Alliance for Genomic Epidemiology to also provide a template conforming to its SARS-CoV-2 contextual data specification for use worldwide. Templates are also being developed for One Health and foodborne pathogens. Overall, the DataHarmonizer tool improves the effectiveness and fidelity of contextual data capture as well as its subsequent usability. Harmonization of contextual information across authorities, platforms and systems globally improves interoperability and reusability of data for concerted public health and research initiatives to fight the current pandemic and future public health emergencies. While the tool was initially developed for the COVID-19 pandemic, its expansion to other data management applications and pathogens is already underway.
format Online
Article
Text
id pubmed-9973856
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Microbiology Society
record_format MEDLINE/PubMed
spelling pubmed-9973856 2023-03-01 Microb Genom Bioresources Microbiology Society 2023-01-23 /pmc/articles/PMC9973856/ /pubmed/36748616 http://dx.doi.org/10.1099/mgen.0.000908 Text en © 2023 The Authors https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License. This article was made open access via a Publish and Read agreement between the Microbiology Society and the corresponding author’s institution.
title The DataHarmonizer: a tool for faster data harmonization, validation, aggregation and analysis of pathogen genomics contextual information
topic Bioresources