Using an ensemble of statistical metrics to quantify large sets of plant transcription factor binding sites

Bibliographic Details
Main authors: Hosseini, Parsa; Ovcharenko, Ivan; Matthews, Benjamin F
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3639912/
https://www.ncbi.nlm.nih.gov/pubmed/23578135
http://dx.doi.org/10.1186/1746-4811-9-12
_version_ 1782476016971677696
author Hosseini, Parsa
Ovcharenko, Ivan
Matthews, Benjamin F
author_facet Hosseini, Parsa
Ovcharenko, Ivan
Matthews, Benjamin F
author_sort Hosseini, Parsa
collection PubMed
description BACKGROUND: From initial seed germination through reproduction, plants continuously reprogram their transcriptional repertoire to facilitate growth and development. This dynamic is mediated by a diverse but inextricably-linked catalog of regulatory proteins called transcription factors (TFs). Statistically quantifying TF binding site (TFBS) abundance in promoters of differentially expressed genes can be used to identify binding site patterns in promoters that are closely related to stress-response. Output from today’s transcriptomic assays necessitates statistically-oriented software to handle large promoter-sequence sets in a computationally tractable fashion. RESULTS: We present Marina, an open-source software for identifying over-represented TFBSs from amongst large sets of promoter sequences, using an ensemble of 7 statistical metrics and binding-site profiles. Through software comparison, we show that Marina can identify considerably more over-represented plant TFBSs compared to a popular software alternative. CONCLUSIONS: Marina was used to identify over-represented TFBSs in a two time-point RNA-Seq study exploring the transcriptomic interplay between soybean (Glycine max) and soybean rust (Phakopsora pachyrhizi). Marina identified numerous abundant TFBSs recognized by transcription factors that are associated with defense-response such as WRKY, HY5 and MYB2. Comparing results from Marina to that of a popular software alternative suggests that regardless of the number of promoter-sequences, Marina is able to identify significantly more over-represented TFBSs.
format Online
Article
Text
id pubmed-3639912
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3639912 2013-05-06 Using an ensemble of statistical metrics to quantify large sets of plant transcription factor binding sites Hosseini, Parsa Ovcharenko, Ivan Matthews, Benjamin F Plant Methods Software BioMed Central 2013-04-11 /pmc/articles/PMC3639912/ /pubmed/23578135 http://dx.doi.org/10.1186/1746-4811-9-12 Text en Copyright © 2013 Hosseini et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Software
Hosseini, Parsa
Ovcharenko, Ivan
Matthews, Benjamin F
Using an ensemble of statistical metrics to quantify large sets of plant transcription factor binding sites
title Using an ensemble of statistical metrics to quantify large sets of plant transcription factor binding sites
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3639912/
https://www.ncbi.nlm.nih.gov/pubmed/23578135
http://dx.doi.org/10.1186/1746-4811-9-12
work_keys_str_mv AT hosseiniparsa usinganensembleofstatisticalmetricstoquantifylargesetsofplanttranscriptionfactorbindingsites
AT ovcharenkoivan usinganensembleofstatisticalmetricstoquantifylargesetsofplanttranscriptionfactorbindingsites
AT matthewsbenjaminf usinganensembleofstatisticalmetricstoquantifylargesetsofplanttranscriptionfactorbindingsites