SNPpy - Database Management for SNP Data from Genome Wide Association Studies
BACKGROUND: We describe SNPpy, a hybrid script database system using the Python SQLAlchemy library coupled with the PostgreSQL database to manage genotype data from Genome-Wide Association Studies (GWAS). This system makes it possible to merge study data with HapMap data and merge across studies for...
Main authors: | Mitha, Faheem; Herodotou, Herodotos; Borisov, Nedyalko; Jiang, Chen; Yoder, Josh; Owzar, Kouros |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2011 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3198468/ https://www.ncbi.nlm.nih.gov/pubmed/22039405 http://dx.doi.org/10.1371/journal.pone.0024982 |
_version_ | 1782214431725322240 |
---|---|
author | Mitha, Faheem Herodotou, Herodotos Borisov, Nedyalko Jiang, Chen Yoder, Josh Owzar, Kouros |
author_facet | Mitha, Faheem Herodotou, Herodotos Borisov, Nedyalko Jiang, Chen Yoder, Josh Owzar, Kouros |
author_sort | Mitha, Faheem |
collection | PubMed |
description | BACKGROUND: We describe SNPpy, a hybrid script database system using the Python SQLAlchemy library coupled with the PostgreSQL database to manage genotype data from Genome-Wide Association Studies (GWAS). This system makes it possible to merge study data with HapMap data and merge across studies for meta-analyses, including data filtering based on the values of phenotype and Single-Nucleotide Polymorphism (SNP) data. SNPpy and its dependencies are open source software. RESULTS: The current version of SNPpy offers utility functions to import genotype and annotation data from two commercial platforms. We use these to import data from two GWAS studies and the HapMap Project. We then export these individual datasets to standard data format files that can be imported into statistical software for downstream analyses. CONCLUSIONS: By leveraging the power of relational databases, SNPpy offers integrated management and manipulation of genotype and phenotype data from GWAS studies. The analysis of these studies requires merging across GWAS datasets as well as patient and marker selection. To this end, SNPpy enables the user to filter the data and output the results as standardized GWAS file formats. It does low level and flexible data validation, including validation of patient data. SNPpy is a practical and extensible solution for investigators who seek to deploy central management of their GWAS data. |
format | Online Article Text |
id | pubmed-3198468 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-31984682011-10-28 SNPpy - Database Management for SNP Data from Genome Wide Association Studies Mitha, Faheem Herodotou, Herodotos Borisov, Nedyalko Jiang, Chen Yoder, Josh Owzar, Kouros PLoS One Research Article BACKGROUND: We describe SNPpy, a hybrid script database system using the Python SQLAlchemy library coupled with the PostgreSQL database to manage genotype data from Genome-Wide Association Studies (GWAS). This system makes it possible to merge study data with HapMap data and merge across studies for meta-analyses, including data filtering based on the values of phenotype and Single-Nucleotide Polymorphism (SNP) data. SNPpy and its dependencies are open source software. RESULTS: The current version of SNPpy offers utility functions to import genotype and annotation data from two commercial platforms. We use these to import data from two GWAS studies and the HapMap Project. We then export these individual datasets to standard data format files that can be imported into statistical software for downstream analyses. CONCLUSIONS: By leveraging the power of relational databases, SNPpy offers integrated management and manipulation of genotype and phenotype data from GWAS studies. The analysis of these studies requires merging across GWAS datasets as well as patient and marker selection. To this end, SNPpy enables the user to filter the data and output the results as standardized GWAS file formats. It does low level and flexible data validation, including validation of patient data. SNPpy is a practical and extensible solution for investigators who seek to deploy central management of their GWAS data. Public Library of Science 2011-10-19 /pmc/articles/PMC3198468/ /pubmed/22039405 http://dx.doi.org/10.1371/journal.pone.0024982 Text en Mitha et al. 
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Mitha, Faheem Herodotou, Herodotos Borisov, Nedyalko Jiang, Chen Yoder, Josh Owzar, Kouros SNPpy - Database Management for SNP Data from Genome Wide Association Studies |
title | SNPpy - Database Management for SNP Data from Genome Wide Association Studies |
title_full | SNPpy - Database Management for SNP Data from Genome Wide Association Studies |
title_fullStr | SNPpy - Database Management for SNP Data from Genome Wide Association Studies |
title_full_unstemmed | SNPpy - Database Management for SNP Data from Genome Wide Association Studies |
title_short | SNPpy - Database Management for SNP Data from Genome Wide Association Studies |
title_sort | snppy - database management for snp data from genome wide association studies |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3198468/ https://www.ncbi.nlm.nih.gov/pubmed/22039405 http://dx.doi.org/10.1371/journal.pone.0024982 |