A Quantitative Profiling Tool for Diverse Genomic Data Types Reveals Potential Associations between Chromatin and Pre-mRNA Processing

High-throughput sequencing, and genome-based datasets in general, are often represented as profiles centered at reference points to study the association of protein binding and other signals to particular regulatory mechanisms. Although these profiles often provide compelling evidence of these assoc...

Bibliographic Details
Main Authors:	Kremsky, Isaac; Bellora, Nicolás; Eyras, Eduardo
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2015
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4514851/ https://www.ncbi.nlm.nih.gov/pubmed/26207626 http://dx.doi.org/10.1371/journal.pone.0132448

_version_	1782382826306404352
author	Kremsky, Isaac; Bellora, Nicolás; Eyras, Eduardo
author_facet	Kremsky, Isaac; Bellora, Nicolás; Eyras, Eduardo
author_sort	Kremsky, Isaac
collection	PubMed
description	High-throughput sequencing, and genome-based datasets in general, are often represented as profiles centered at reference points to study the association of protein binding and other signals to particular regulatory mechanisms. Although these profiles often provide compelling evidence of these associations, they do not provide a quantitative assessment of the enrichment, which makes the comparison between signals and conditions difficult. In addition, a number of biases can confound profiles, but are rarely accounted for in the tools currently available. We present a novel computational method, ProfileSeq, for the quantitative assessment of biological profiles to provide an exact, nonparametric test that specific regions of the test profile have higher or lower signal densities than a control set. The method is applicable to high-throughput sequencing data (ChIP-Seq, GRO-Seq, CLIP-Seq, etc.) and to genome-based datasets (motifs, etc.). We validate ProfileSeq by recovering and providing a quantitative assessment of several results reported before in the literature using independent datasets. We show that input signal and mappability have confounding effects on the profile results, but that normalizing the signal by input reads can eliminate these biases while preserving the biological signal. Moreover, we apply ProfileSeq to ChIP-Seq data for transcription factors, as well as for motif and CLIP-Seq data for splicing factors. In all examples considered, the profiles were robust to biases in mappability of sequencing reads. Furthermore, analyses performed with ProfileSeq reveal a number of putative relationships between transcription factor binding to DNA and splicing factor binding to pre-mRNA, adding to the growing body of evidence relating chromatin and pre-mRNA processing. ProfileSeq provides a robust way to quantify genome-wide coordinate-based signal. Software and documentation are freely available for academic use at https://bitbucket.org/regulatorygenomicsupf/profileseq/.
format	Online Article Text
id	pubmed-4514851
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-4514851 2015-07-29 Kremsky, Isaac; Bellora, Nicolás; Eyras, Eduardo PLoS One Research Article Public Library of Science 2015-07-24 /pmc/articles/PMC4514851/ /pubmed/26207626 http://dx.doi.org/10.1371/journal.pone.0132448 Text en © 2015 Kremsky et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title	A Quantitative Profiling Tool for Diverse Genomic Data Types Reveals Potential Associations between Chromatin and Pre-mRNA Processing
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4514851/ https://www.ncbi.nlm.nih.gov/pubmed/26207626 http://dx.doi.org/10.1371/journal.pone.0132448