Integrating 400 million variants from 80,000 human samples with extensive annotations: towards a knowledge base to analyze disease cohorts

BACKGROUND: Data from a plethora of high-throughput sequencing studies is readily available to researchers, providing genetic variants detected in a variety of healthy and disease populations. While each individual cohort helps gain insights into polymorphic and disease-associated variants, a joint...

Bibliographic Details
Main authors:	Hakenberg, Jörg; Cheng, Wei-Yi; Thomas, Philippe; Wang, Ying-Chih; Uzilov, Andrew V.; Chen, Rong
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2016
Subjects:	Database
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4706706/ https://www.ncbi.nlm.nih.gov/pubmed/26746786 http://dx.doi.org/10.1186/s12859-015-0865-9

_version_	1782409206866903040
author	Hakenberg, Jörg Cheng, Wei-Yi Thomas, Philippe Wang, Ying-Chih Uzilov, Andrew V. Chen, Rong
author_facet	Hakenberg, Jörg Cheng, Wei-Yi Thomas, Philippe Wang, Ying-Chih Uzilov, Andrew V. Chen, Rong
author_sort	Hakenberg, Jörg
collection	PubMed
description	BACKGROUND: Data from a plethora of high-throughput sequencing studies is readily available to researchers, providing genetic variants detected in a variety of healthy and disease populations. While each individual cohort helps gain insights into polymorphic and disease-associated variants, a joint perspective can be more powerful in identifying polymorphisms, rare variants, disease-associations, genetic burden, somatic variants, and disease mechanisms. DESCRIPTION: We have set up a Reference Variant Store (RVS) containing variants observed in a number of large-scale sequencing efforts, such as 1000 Genomes, ExAC, Scripps Wellderly, UK10K; various genotyping studies; and disease association databases. RVS holds extensive annotations pertaining to affected genes, functional impacts, disease associations, and population frequencies. RVS currently stores 400 million distinct variants observed in more than 80,000 human samples. CONCLUSIONS: RVS facilitates cross-study analysis to discover novel genetic risk factors, gene–disease associations, potential disease mechanisms, and actionable variants. Due to its large reference populations, RVS can also be employed for variant filtration and gene prioritization. AVAILABILITY: A web interface to public datasets and annotations in RVS is available at https://rvs.u.hpc.mssm.edu/. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0865-9) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-4706706
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-47067062016-01-10 Integrating 400 million variants from 80,000 human samples with extensive annotations: towards a knowledge base to analyze disease cohorts Hakenberg, Jörg Cheng, Wei-Yi Thomas, Philippe Wang, Ying-Chih Uzilov, Andrew V. Chen, Rong BMC Bioinformatics Database BACKGROUND: Data from a plethora of high-throughput sequencing studies is readily available to researchers, providing genetic variants detected in a variety of healthy and disease populations. While each individual cohort helps gain insights into polymorphic and disease-associated variants, a joint perspective can be more powerful in identifying polymorphisms, rare variants, disease-associations, genetic burden, somatic variants, and disease mechanisms. DESCRIPTION: We have set up a Reference Variant Store (RVS) containing variants observed in a number of large-scale sequencing efforts, such as 1000 Genomes, ExAC, Scripps Wellderly, UK10K; various genotyping studies; and disease association databases. RVS holds extensive annotations pertaining to affected genes, functional impacts, disease associations, and population frequencies. RVS currently stores 400 million distinct variants observed in more than 80,000 human samples. CONCLUSIONS: RVS facilitates cross-study analysis to discover novel genetic risk factors, gene–disease associations, potential disease mechanisms, and actionable variants. Due to its large reference populations, RVS can also be employed for variant filtration and gene prioritization. AVAILABILITY: A web interface to public datasets and annotations in RVS is available at https://rvs.u.hpc.mssm.edu/. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0865-9) contains supplementary material, which is available to authorized users. BioMed Central 2016-01-08 /pmc/articles/PMC4706706/ /pubmed/26746786 http://dx.doi.org/10.1186/s12859-015-0865-9 Text en © Hakenberg et al. 
2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle	Database Hakenberg, Jörg Cheng, Wei-Yi Thomas, Philippe Wang, Ying-Chih Uzilov, Andrew V. Chen, Rong Integrating 400 million variants from 80,000 human samples with extensive annotations: towards a knowledge base to analyze disease cohorts
title	Integrating 400 million variants from 80,000 human samples with extensive annotations: towards a knowledge base to analyze disease cohorts
topic	Database
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4706706/ https://www.ncbi.nlm.nih.gov/pubmed/26746786 http://dx.doi.org/10.1186/s12859-015-0865-9
