Deconvolving sequence features that discriminate between overlapping regulatory annotations

Bibliographic Details
Main authors: Kakumanu, Akshay; Velasco, Silvia; Mazzoni, Esteban; Mahony, Shaun
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5663517/
https://www.ncbi.nlm.nih.gov/pubmed/29049320
http://dx.doi.org/10.1371/journal.pcbi.1005795
author Kakumanu, Akshay
Velasco, Silvia
Mazzoni, Esteban
Mahony, Shaun
collection PubMed
description Genomic loci with regulatory potential can be annotated with various properties. For example, genomic sites bound by a given transcription factor (TF) can be divided according to whether they are proximal or distal to known promoters. Sites can be further labeled according to the cell types and conditions in which they are active. Given such a collection of labeled sites, it is natural to ask what sequence features are associated with each annotation label. However, discovering such label-specific sequence features is often confounded by overlaps between the labels; e.g. if regulatory sites specific to a given cell type are also more likely to be promoter-proximal, it is difficult to assess whether motifs identified in that set of sites are associated with the cell type or associated with promoters. In order to meet this challenge, we developed SeqUnwinder, a principled approach to deconvolving interpretable discriminative sequence features associated with overlapping annotation labels. We demonstrate the novel analysis abilities of SeqUnwinder using three examples. Firstly, SeqUnwinder is able to unravel sequence features associated with the dynamic binding behavior of TFs during motor neuron programming from features associated with chromatin state in the initial embryonic stem cells. Secondly, we characterize distinct sequence properties of multi-condition and cell-specific TF binding sites after controlling for uneven associations with promoter proximity. Finally, we demonstrate the scalability of SeqUnwinder to discover cell-specific sequence features from over one hundred thousand genomic loci that display DNase I hypersensitivity in one or more ENCODE cell lines.
format Online
Article
Text
id pubmed-5663517
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5663517 2017-11-08 Deconvolving sequence features that discriminate between overlapping regulatory annotations Kakumanu, Akshay; Velasco, Silvia; Mazzoni, Esteban; Mahony, Shaun PLoS Comput Biol Research Article
Public Library of Science 2017-10-19 /pmc/articles/PMC5663517/ /pubmed/29049320 http://dx.doi.org/10.1371/journal.pcbi.1005795 Text en © 2017 Kakumanu et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Deconvolving sequence features that discriminate between overlapping regulatory annotations
topic Research Article