A Novel Approach for Transcription Factor Analysis Using SELEX with High-Throughput Sequencing (TFAST)

BACKGROUND: In previous work, we designed a modified aptamer-free SELEX-seq protocol (afSELEX-seq) for the discovery of transcription factor binding sites. Here, we present original software, TFAST, designed to analyze afSELEX-seq data, validated against our previously generated afSELEX-seq dataset...

Bibliographic Details
Main Authors: Reiss, Daniel J., Howard, Frederick M., Mobley, Harry L. T.
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3430675/
https://www.ncbi.nlm.nih.gov/pubmed/22956994
http://dx.doi.org/10.1371/journal.pone.0042761
author Reiss, Daniel J.
Howard, Frederick M.
Mobley, Harry L. T.
collection PubMed
description BACKGROUND: In previous work, we designed a modified aptamer-free SELEX-seq protocol (afSELEX-seq) for the discovery of transcription factor binding sites. Here, we present original software, TFAST, designed to analyze afSELEX-seq data, validated against our previously generated afSELEX-seq dataset and a model dataset. TFAST is designed with a simple graphical interface (Java) so that it can be installed and executed without extensive expertise in bioinformatics. TFAST completes analysis within minutes on most personal computers. METHODOLOGY: Once afSELEX-seq data are aligned to a target genome, TFAST identifies peaks and, uniquely, compares peak characteristics between cycles. TFAST generates a hierarchical report of graded peaks, their associated genomic sequences, binding site length predictions, and dummy sequences. PRINCIPAL FINDINGS: Including additional cycles of afSELEX-seq improved TFAST's ability to selectively identify peaks, leading to 7,274, 4,255, and 2,628 peaks identified in two-, three-, and four-cycle afSELEX-seq. Inter-round analysis by TFAST identified 457 peaks as the strongest candidates for true binding sites. Separating peaks by TFAST into classes of worst, second-best and best candidate peaks revealed a trend of increasing significance (e-values 4.5×10^−12, 2.9×10^−46, and 1.2×10^−73) and informational content (11.0, 11.9, and 12.5 bits over 15 bp) of discovered motifs within each respective class. TFAST also predicted a binding site length (28 bp) consistent with non-computational experimentally derived results for the transcription factor PapX (22 to 29 bp). CONCLUSIONS/SIGNIFICANCE: TFAST offers a novel and intuitive approach for determining DNA binding sites of proteins subjected to afSELEX-seq. Here, we demonstrate that TFAST, using afSELEX-seq data, rapidly and accurately predicted sequence length and motif for a putative transcription factor's binding site.
format Online
Article
Text
id pubmed-3430675
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3430675 2012-09-06 A Novel Approach for Transcription Factor Analysis Using SELEX with High-Throughput Sequencing (TFAST) Reiss, Daniel J. Howard, Frederick M. Mobley, Harry L. T. PLoS One Research Article Public Library of Science 2012-08-03 /pmc/articles/PMC3430675/ /pubmed/22956994 http://dx.doi.org/10.1371/journal.pone.0042761 Text en © 2012 Reiss et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Novel Approach for Transcription Factor Analysis Using SELEX with High-Throughput Sequencing (TFAST)
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3430675/
https://www.ncbi.nlm.nih.gov/pubmed/22956994
http://dx.doi.org/10.1371/journal.pone.0042761