Establishing and validating regulatory regions for variant annotation and expression analysis
BACKGROUND: The regulatory effect of inherited or de novo genetic variants occurring in promoters as well as in transcribed or even coding gene regions is gaining greater recognition as a contributing factor to disease processes in addition to mutations affecting protein functionality. Thousands of...
Main authors: | Kaplun, Alexander; Krull, Mathias; Lakshman, Karthick; Matys, Volker; Lewicki, Birgit; Hogan, Jennifer D. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4928138/ https://www.ncbi.nlm.nih.gov/pubmed/27357948 http://dx.doi.org/10.1186/s12864-016-2724-0 |
_version_ | 1782440385964933120 |
---|---|
author | Kaplun, Alexander Krull, Mathias Lakshman, Karthick Matys, Volker Lewicki, Birgit Hogan, Jennifer D. |
author_facet | Kaplun, Alexander Krull, Mathias Lakshman, Karthick Matys, Volker Lewicki, Birgit Hogan, Jennifer D. |
author_sort | Kaplun, Alexander |
collection | PubMed |
description | BACKGROUND: The regulatory effect of inherited or de novo genetic variants occurring in promoters as well as in transcribed or even coding gene regions is gaining greater recognition as a contributing factor to disease processes in addition to mutations affecting protein functionality. Thousands of such regulatory mutations are already recorded in HGMD, OMIM, ClinVar and other databases containing published disease-causing and disease-associated mutations. It is therefore important to properly annotate genetic variants occurring in experimentally verified and predicted transcription factor binding sites (TFBS) that could thus influence the factor binding event. Selection of the promoter sequence used is an important factor in the analysis as it directly influences the composition of the sequence available for transcription factor binding analysis. RESULTS: In this study we first establish genomic regions likely to be involved in regulation of gene expression. TRANSFAC uses a method of virtual transcription start sites (vTSS) calculation to define the best supported promoter for a gene. We have performed a comparison of the virtually calculated promoters between the best supported and secondary promoters in hg19 and hg38 reference genomes to test and validate the approach. Next we create and utilize a workflow for systematic analysis of causal disease-associated variants in TFBS using Genome Trax and TRANSFAC databases. A total of 841 and 736 experimentally verified TFBSs within best supported promoters were mapped over HGMD and ClinVar mutation sites respectively. Tens of thousands of predicted ChIP-Seq-derived TFBSs were mapped over mutations as well. We have further analyzed some of these mutations for potential gain or loss in transcription factor binding. CONCLUSIONS: We have confirmed the validity of TRANSFAC’s approach to define the best supported promoters and established a workflow of their use in annotation of regulatory genetic variants. |
format | Online Article Text |
id | pubmed-4928138 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4928138 2016-06-30 Establishing and validating regulatory regions for variant annotation and expression analysis Kaplun, Alexander Krull, Mathias Lakshman, Karthick Matys, Volker Lewicki, Birgit Hogan, Jennifer D. BMC Genomics Methodology Article BACKGROUND: The regulatory effect of inherited or de novo genetic variants occurring in promoters as well as in transcribed or even coding gene regions is gaining greater recognition as a contributing factor to disease processes in addition to mutations affecting protein functionality. Thousands of such regulatory mutations are already recorded in HGMD, OMIM, ClinVar and other databases containing published disease-causing and disease-associated mutations. It is therefore important to properly annotate genetic variants occurring in experimentally verified and predicted transcription factor binding sites (TFBS) that could thus influence the factor binding event. Selection of the promoter sequence used is an important factor in the analysis as it directly influences the composition of the sequence available for transcription factor binding analysis. RESULTS: In this study we first establish genomic regions likely to be involved in regulation of gene expression. TRANSFAC uses a method of virtual transcription start sites (vTSS) calculation to define the best supported promoter for a gene. We have performed a comparison of the virtually calculated promoters between the best supported and secondary promoters in hg19 and hg38 reference genomes to test and validate the approach. Next we create and utilize a workflow for systematic analysis of causal disease-associated variants in TFBS using Genome Trax and TRANSFAC databases. A total of 841 and 736 experimentally verified TFBSs within best supported promoters were mapped over HGMD and ClinVar mutation sites respectively. Tens of thousands of predicted ChIP-Seq-derived TFBSs were mapped over mutations as well. We have further analyzed some of these mutations for potential gain or loss in transcription factor binding. CONCLUSIONS: We have confirmed the validity of TRANSFAC’s approach to define the best supported promoters and established a workflow of their use in annotation of regulatory genetic variants. BioMed Central 2016-06-23 /pmc/articles/PMC4928138/ /pubmed/27357948 http://dx.doi.org/10.1186/s12864-016-2724-0 Text en © Kaplun et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Kaplun, Alexander Krull, Mathias Lakshman, Karthick Matys, Volker Lewicki, Birgit Hogan, Jennifer D. Establishing and validating regulatory regions for variant annotation and expression analysis |
title | Establishing and validating regulatory regions for variant annotation and expression analysis |
title_full | Establishing and validating regulatory regions for variant annotation and expression analysis |
title_fullStr | Establishing and validating regulatory regions for variant annotation and expression analysis |
title_full_unstemmed | Establishing and validating regulatory regions for variant annotation and expression analysis |
title_short | Establishing and validating regulatory regions for variant annotation and expression analysis |
title_sort | establishing and validating regulatory regions for variant annotation and expression analysis |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4928138/ https://www.ncbi.nlm.nih.gov/pubmed/27357948 http://dx.doi.org/10.1186/s12864-016-2724-0 |
work_keys_str_mv | AT kaplunalexander establishingandvalidatingregulatoryregionsforvariantannotationandexpressionanalysis AT krullmathias establishingandvalidatingregulatoryregionsforvariantannotationandexpressionanalysis AT lakshmankarthick establishingandvalidatingregulatoryregionsforvariantannotationandexpressionanalysis AT matysvolker establishingandvalidatingregulatoryregionsforvariantannotationandexpressionanalysis AT lewickibirgit establishingandvalidatingregulatoryregionsforvariantannotationandexpressionanalysis AT hoganjenniferd establishingandvalidatingregulatoryregionsforvariantannotationandexpressionanalysis |
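
The mapping of TFBSs over mutation sites mentioned in the description amounts, at its core, to an interval-overlap query. The following is a minimal sketch of that idea only, assuming hypothetical in-memory records with 1-based inclusive coordinates. It is not the Genome Trax or TRANSFAC workflow, and every name and coordinate in it (`Site`, `Variant`, `map_variants_to_sites`, `TF_A`, `var1`, and so on) is invented for illustration.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import NamedTuple


class Site(NamedTuple):
    """Hypothetical binding-site record (1-based, inclusive coordinates)."""
    chrom: str
    start: int
    end: int
    factor: str


class Variant(NamedTuple):
    """Hypothetical single-position variant record."""
    chrom: str
    pos: int
    ident: str


def map_variants_to_sites(variants, sites):
    """Return {variant id: [factor names]} for variants inside any site."""
    # Group sites by chromosome and sort each group by start coordinate.
    by_chrom = defaultdict(list)
    for site in sites:
        by_chrom[site.chrom].append(site)
    starts = {}
    for chrom, group in by_chrom.items():
        group.sort(key=lambda s: s.start)
        starts[chrom] = [s.start for s in group]

    hits = {}
    for var in variants:
        group = by_chrom.get(var.chrom, [])
        # Only sites starting at or before the variant position can contain it.
        n_candidates = bisect_right(starts.get(var.chrom, []), var.pos)
        overlapping = [s.factor for s in group[:n_candidates] if s.end >= var.pos]
        if overlapping:
            hits[var.ident] = overlapping
    return hits


# Invented example data, not taken from HGMD, ClinVar, or TRANSFAC.
sites = [
    Site("chr1", 100, 110, "TF_A"),
    Site("chr1", 105, 120, "TF_B"),
    Site("chr2", 50, 60, "TF_C"),
]
variants = [
    Variant("chr1", 108, "var1"),
    Variant("chr1", 121, "var2"),
    Variant("chr2", 50, "var3"),
]
hits = map_variants_to_sites(variants, sites)
```

Here `var1` falls inside both chr1 sites, `var2` lies one base past the last chr1 site and is not reported, and `var3` sits on the first base of the chr2 site. For genome-scale data an interval tree or a dedicated tool would likely be a better fit than this linear scan over candidate sites.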