
Disease-associated variants in different categories of disease located in distinct regulatory elements

BACKGROUND: The invention of high-throughput sequencing technologies has led to the discovery of hundreds of thousands of genetic variants associated with thousands of human diseases. Many of these genetic variants are located outside the protein-coding regions, and as such, it is challenging to i...


Bibliographic Details
Main authors: Ma, Meng, Ru, Ying, Chuang, Ling-Shiang, Hsu, Nai-Yun, Shi, Li-Song, Hakenberg, Jörg, Cheng, Wei-Yi, Uzilov, Andrew, Ding, Wei, Glicksberg, Benjamin S, Chen, Rong
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4480828/
https://www.ncbi.nlm.nih.gov/pubmed/26110593
http://dx.doi.org/10.1186/1471-2164-16-S8-S3
author Ma, Meng
Ru, Ying
Chuang, Ling-Shiang
Hsu, Nai-Yun
Shi, Li-Song
Hakenberg, Jörg
Cheng, Wei-Yi
Uzilov, Andrew
Ding, Wei
Glicksberg, Benjamin S
Chen, Rong
author_sort Ma, Meng
collection PubMed
description BACKGROUND: The invention of high-throughput sequencing technologies has led to the discovery of hundreds of thousands of genetic variants associated with thousands of human diseases. Many of these genetic variants are located outside the protein-coding regions, and as such, it is challenging to interpret their function by traditional genetic approaches. Recent genome-wide functional genomics studies, such as FANTOM5 and ENCODE, have uncovered a large number of regulatory elements across hundreds of different tissues or cell lines in the human genome. These findings provide an opportunity to study the interaction between regulatory elements and disease-associated genetic variants. Identifying these disease-related regulatory elements will shed light on the mechanisms by which these variants regulate gene expression and ultimately result in disease formation and progression. RESULTS: In this study, we curated and categorized 27,558 Mendelian disease variants, 20,964 complex disease variants, 5,809 cancer predisposing germline variants, and 43,364 recurrent cancer somatic mutations. Comparing these against nine different types of regulatory regions from the FANTOM5 and ENCODE projects, we found that different types of disease variants show distinctive propensities for particular regulatory elements. Mendelian disease variants and recurrent cancer somatic mutations are significantly enriched in promoter regions, 22-fold and 10-fold respectively (q<0.001), compared with an allele-frequency-matched genomic background. Separate from these two categories, cancer predisposing germline variants are 27-fold enriched in histone modification regions (q<0.001), 10-fold enriched in chromatin physical interaction regions (q<0.001), and 6-fold enriched in transcription promoters (q<0.001). Furthermore, Mendelian disease variants and recurrent cancer somatic mutations share a very similar distribution across types of functional effects. We further found that regulatory regions are located within over 50% of coding exon regions. Transcription promoters, methylation regions, and transcription insulators have the highest density of disease variants, with 472, 239, and 72 disease variants per one million base pairs, respectively. CONCLUSIONS: Disease-associated variants in different disease categories are preferentially located in particular regulatory elements. These results will be useful for an overall understanding of the differences among the pathogenic mechanisms of various disease-associated variants.
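The two summary statistics quoted in the abstract, fold enrichment against a background and variant density per one million base pairs, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the abstract does not say how the allele-frequency-matched background was built or how the q-values were computed, so both are left out. The function names `fold_enrichment` and `variants_per_megabase` and all input counts are hypothetical, chosen only to show the arithmetic.

```python
def fold_enrichment(disease_in_region, disease_total,
                    background_in_region, background_total):
    """Ratio of the in-region fraction of disease variants to the
    in-region fraction of background variants.

    Assumption: enrichment is a simple ratio of proportions; the
    source does not state the exact formula used.
    """
    if disease_total <= 0 or background_total <= 0:
        raise ValueError("totals must be positive")
    if background_in_region <= 0:
        raise ValueError("background_in_region must be positive")
    disease_fraction = disease_in_region / disease_total
    background_fraction = background_in_region / background_total
    return disease_fraction / background_fraction


def variants_per_megabase(variant_count, region_length_bp):
    """Number of variants per one million base pairs of a region type."""
    if region_length_bp <= 0:
        raise ValueError("region_length_bp must be positive")
    return variant_count / region_length_bp * 1_000_000


# Hypothetical counts, not taken from the article.
example_fold = fold_enrichment(220, 10_000, 10, 10_000)
example_density = variants_per_megabase(944, 2_000_000)
print(f"fold enrichment: {example_fold:.1f}")
print(f"variants per Mb: {example_density:.1f}")
```

A ratio of proportions is only a point estimate; a significance claim such as q<0.001 would additionally need a statistical test with multiple-testing correction, which this sketch does not attempt.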
format Online
Article
Text
id pubmed-4480828
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4480828 2015-07-10 Disease-associated variants in different categories of disease located in distinct regulatory elements Ma, Meng; Ru, Ying; Chuang, Ling-Shiang; Hsu, Nai-Yun; Shi, Li-Song; Hakenberg, Jörg; Cheng, Wei-Yi; Uzilov, Andrew; Ding, Wei; Glicksberg, Benjamin S; Chen, Rong BMC Genomics Research BioMed Central 2015-06-18 /pmc/articles/PMC4480828/ /pubmed/26110593 http://dx.doi.org/10.1186/1471-2164-16-S8-S3 Text en Copyright © 2015 Ma et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Disease-associated variants in different categories of disease located in distinct regulatory elements
topic Research