The variant call format provides efficient and robust storage of GWAS summary statistics

Bibliographic Details
Main authors: Lyon, Matthew S.; Andrews, Shea J.; Elsworth, Ben; Gaunt, Tom R.; Hemani, Gibran; Marcora, Edoardo
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7805039/
https://www.ncbi.nlm.nih.gov/pubmed/33441155
http://dx.doi.org/10.1186/s13059-020-02248-0
Description
Summary: GWAS summary statistics are fundamental for a variety of research applications, yet no common storage format has been widely adopted. Existing tabular formats ambiguously or incompletely store information about genetic variants and associations, lack essential metadata, and are typically not indexed, yielding poor query performance and increasing the possibility of errors in data interpretation and post-GWAS analyses. To address these issues, we adapted the variant call format to store GWAS summary statistics (GWAS-VCF) and developed open-source tools to use this format in downstream analyses. We provide open access to over 10,000 complete GWAS summary datasets converted to this format (https://gwas.mrcieu.ac.uk). SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-020-02248-0.