
Pathogen metadata platform: software for accessing and analyzing pathogen strain information

BACKGROUND: Pathogen metadata includes information about where and when a pathogen was collected and the type of environment it came from. Along with genomic nucleotide sequence data, this metadata is growing rapidly and becoming a valuable resource not only for research but for biosurveillance and...

Bibliographic Details
Main authors: Chang, Wenling E., Peterson, Matthew W., Garay, Christopher D., Korves, Tonia
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5025631/
https://www.ncbi.nlm.nih.gov/pubmed/27634291
http://dx.doi.org/10.1186/s12859-016-1231-2
author Chang, Wenling E.
Peterson, Matthew W.
Garay, Christopher D.
Korves, Tonia
collection PubMed
description BACKGROUND: Pathogen metadata includes information about where and when a pathogen was collected and the type of environment it came from. Along with genomic nucleotide sequence data, this metadata is growing rapidly and becoming a valuable resource not only for research but for biosurveillance and public health. However, current freely available tools for analyzing this data are geared towards bioinformaticians and/or do not provide summaries and visualizations needed to readily interpret results. RESULTS: We designed a platform to easily access and summarize data about pathogen samples. The software includes a PostgreSQL database that captures metadata useful for disease outbreak investigations, and scripts for downloading and parsing data from NCBI BioSample and BioProject into the database. The software provides a user interface to query metadata and obtain standardized results in an exportable, tab-delimited format. To visually summarize results, the user interface provides a 2D histogram for user-selected metadata types and mapping of geolocated entries. The software is built on the LabKey data platform, an open-source data management platform, which enables developers to add functionalities. We demonstrate the use of the software in querying for a pathogen serovar and for genome sequence identifiers. CONCLUSIONS: This software enables users to create a local database for pathogen metadata, populate it with data from NCBI, easily query the data, and obtain visual summaries. Some of the components, such as the database, are modular and can be incorporated into other data platforms. The source code is freely available for download at https://github.com/wchangmitre/bioattribution. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1231-2) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5025631
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5025631 2016-09-20 BMC Bioinformatics Software BioMed Central 2016-09-15 /pmc/articles/PMC5025631/ /pubmed/27634291 http://dx.doi.org/10.1186/s12859-016-1231-2 Text en © The MITRE Corporation. 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Pathogen metadata platform: software for accessing and analyzing pathogen strain information
topic Software