Standardized Metadata for Human Pathogen/Vector Genomic Sequences
High throughput sequencing has accelerated the determination of genome sequences for thousands of human infectious disease pathogens and dozens of their vectors. The scale and scope of these data are enabling genotype-phenotype association studies to identify genetic determinants of pathogen virulen...
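Main Authors: | Dugan, Vivien G., Emrich, Scott J., Giraldo-Calderón, Gloria I., Harb, Omar S., Newman, Ruchi M., Pickett, Brett E., Schriml, Lynn M., Stockwell, Timothy B., Stoeckert, Christian J., Sullivan, Dan E., Singh, Indresh, Ward, Doyle V., Yao, Alison, Zheng, Jie, Barrett, Tanya, Birren, Bruce, Brinkac, Lauren, Bruno, Vincent M., Caler, Elizabet, Chapman, Sinéad, Collins, Frank H., Cuomo, Christina A., Di Francesco, Valentina, Durkin, Scott, Eppinger, Mark, Feldgarden, Michael, Fraser, Claire, Fricke, W. Florian, Giovanni, Maria, Henn, Matthew R., Hine, Erin, Hotopp, Julie Dunning, Karsch-Mizrachi, Ilene, Kissinger, Jessica C., Lee, Eun Mi, Mathur, Punam, Mongodin, Emmanuel F., Murphy, Cheryl I., Myers, Garry, Neafsey, Daniel E., Nelson, Karen E., Nierman, William C., Puzak, Julia, Rasko, David, Roos, David S., Sadzewicz, Lisa, Silva, Joana C., Sobral, Bruno, Squires, R. Burke, Stevens, Rick L., Tallon, Luke, Tettelin, Herve, Wentworth, David, White, Owen, Will, Rebecca, Wortman, Jennifer, Zhang, Yun, Scheuermann, Richard H. |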
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2014 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4061050/ https://www.ncbi.nlm.nih.gov/pubmed/24936976 http://dx.doi.org/10.1371/journal.pone.0099979 |
_version_ | 1782321439576162304 |
---|---|
author | Dugan, Vivien G. Emrich, Scott J. Giraldo-Calderón, Gloria I. Harb, Omar S. Newman, Ruchi M. Pickett, Brett E. Schriml, Lynn M. Stockwell, Timothy B. Stoeckert, Christian J. Sullivan, Dan E. Singh, Indresh Ward, Doyle V. Yao, Alison Zheng, Jie Barrett, Tanya Birren, Bruce Brinkac, Lauren Bruno, Vincent M. Caler, Elizabet Chapman, Sinéad Collins, Frank H. Cuomo, Christina A. Di Francesco, Valentina Durkin, Scott Eppinger, Mark Feldgarden, Michael Fraser, Claire Fricke, W. Florian Giovanni, Maria Henn, Matthew R. Hine, Erin Hotopp, Julie Dunning Karsch-Mizrachi, Ilene Kissinger, Jessica C. Lee, Eun Mi Mathur, Punam Mongodin, Emmanuel F. Murphy, Cheryl I. Myers, Garry Neafsey, Daniel E. Nelson, Karen E. Nierman, William C. Puzak, Julia Rasko, David Roos, David S. Sadzewicz, Lisa Silva, Joana C. Sobral, Bruno Squires, R. Burke Stevens, Rick L. Tallon, Luke Tettelin, Herve Wentworth, David White, Owen Will, Rebecca Wortman, Jennifer Zhang, Yun Scheuermann, Richard H. |
author_sort | Dugan, Vivien G. |
collection | PubMed |
description | High throughput sequencing has accelerated the determination of genome sequences for thousands of human infectious disease pathogens and dozens of their vectors. The scale and scope of these data are enabling genotype-phenotype association studies to identify genetic determinants of pathogen virulence and drug/insecticide resistance, and phylogenetic studies to track the origin and spread of disease outbreaks. To maximize the utility of genomic sequences for these purposes, it is essential that metadata about the pathogen/vector isolate characteristics be collected and made available in organized, clear, and consistent formats. Here we report the development of the GSCID/BRC Project and Sample Application Standard, developed by representatives of the Genome Sequencing Centers for Infectious Diseases (GSCIDs), the Bioinformatics Resource Centers (BRCs) for Infectious Diseases, and the U.S. National Institute of Allergy and Infectious Diseases (NIAID), part of the National Institutes of Health (NIH), informed by interactions with numerous collaborating scientists. It includes mapping to terms from other data standards initiatives, including the Genomic Standards Consortium’s minimal information (MIxS) and NCBI’s BioSample/BioProjects checklists and the Ontology for Biomedical Investigations (OBI). The standard includes data fields about characteristics of the organism or environmental source of the specimen, spatial-temporal information about the specimen isolation event, phenotypic characteristics of the pathogen/vector isolated, and project leadership and support. By modeling metadata fields into an ontology-based semantic framework and reusing existing ontologies and minimum information checklists, the application standard can be extended to support additional project-specific data fields and integrated with other data represented with comparable standards. The use of this metadata standard by all ongoing and future GSCID sequencing projects will provide a consistent representation of these data in the BRC resources and other repositories that leverage these data, allowing investigators to identify relevant genomic sequences and perform comparative genomics analyses that are both statistically meaningful and biologically relevant. |
format | Online Article Text |
id | pubmed-4061050 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4061050 2014-06-20 Standardized Metadata for Human Pathogen/Vector Genomic Sequences PLoS One Research Article Public Library of Science 2014-06-17 /pmc/articles/PMC4061050/ /pubmed/24936976 http://dx.doi.org/10.1371/journal.pone.0099979 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration, which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. |
title | Standardized Metadata for Human Pathogen/Vector Genomic Sequences |
title_sort | standardized metadata for human pathogen/vector genomic sequences |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4061050/ https://www.ncbi.nlm.nih.gov/pubmed/24936976 http://dx.doi.org/10.1371/journal.pone.0099979 |