A Novel Approach for Transcription Factor Analysis Using SELEX with High-Throughput Sequencing (TFAST)
BACKGROUND: In previous work, we designed a modified aptamer-free SELEX-seq protocol (afSELEX-seq) for the discovery of transcription factor binding sites. Here, we present original software, TFAST, designed to analyze afSELEX-seq data, validated against our previously generated afSELEX-seq dataset...
Main authors: | Reiss, Daniel J.; Howard, Frederick M.; Mobley, Harry L. T. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2012 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3430675/ https://www.ncbi.nlm.nih.gov/pubmed/22956994 http://dx.doi.org/10.1371/journal.pone.0042761 |
_version_ | 1782241976887803904 |
---|---|
author | Reiss, Daniel J.; Howard, Frederick M.; Mobley, Harry L. T. |
author_facet | Reiss, Daniel J.; Howard, Frederick M.; Mobley, Harry L. T. |
author_sort | Reiss, Daniel J. |
collection | PubMed |
description | BACKGROUND: In previous work, we designed a modified aptamer-free SELEX-seq protocol (afSELEX-seq) for the discovery of transcription factor binding sites. Here, we present original software, TFAST, designed to analyze afSELEX-seq data, validated against our previously generated afSELEX-seq dataset and a model dataset. TFAST is designed with a simple graphical interface (Java) so that it can be installed and executed without extensive expertise in bioinformatics. TFAST completes analysis within minutes on most personal computers. METHODOLOGY: Once afSELEX-seq data are aligned to a target genome, TFAST identifies peaks and, uniquely, compares peak characteristics between cycles. TFAST generates a hierarchical report of graded peaks, their associated genomic sequences, binding site length predictions, and dummy sequences. PRINCIPAL FINDINGS: Including additional cycles of afSELEX-seq improved TFAST's ability to selectively identify peaks, leading to 7,274, 4,255, and 2,628 peaks identified in two-, three-, and four-cycle afSELEX-seq. Inter-round analysis by TFAST identified 457 peaks as the strongest candidates for true binding sites. Separating peaks by TFAST into classes of worst, second-best and best candidate peaks revealed a trend of increasing significance (e-values 4.5×10^−12, 2.9×10^−46, and 1.2×10^−73) and informational content (11.0, 11.9, and 12.5 bits over 15 bp) of discovered motifs within each respective class. TFAST also predicted a binding site length (28 bp) consistent with non-computational experimentally derived results for the transcription factor PapX (22 to 29 bp). CONCLUSIONS/SIGNIFICANCE: TFAST offers a novel and intuitive approach for determining DNA binding sites of proteins subjected to afSELEX-seq. Here, we demonstrate that TFAST, using afSELEX-seq data, rapidly and accurately predicted sequence length and motif for a putative transcription factor's binding site. |
format | Online Article Text |
id | pubmed-3430675 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3430675 2012-09-06 A Novel Approach for Transcription Factor Analysis Using SELEX with High-Throughput Sequencing (TFAST) Reiss, Daniel J.; Howard, Frederick M.; Mobley, Harry L. T. PLoS One Research Article Public Library of Science 2012-08-03 /pmc/articles/PMC3430675/ /pubmed/22956994 http://dx.doi.org/10.1371/journal.pone.0042761 Text en © 2012 Reiss et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | A Novel Approach for Transcription Factor Analysis Using SELEX with High-Throughput Sequencing (TFAST) |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3430675/ https://www.ncbi.nlm.nih.gov/pubmed/22956994 http://dx.doi.org/10.1371/journal.pone.0042761 |
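The description field reports motif information content in bits (11.0, 11.9, and 12.5 bits over 15 bp). As a rough illustration of what such a figure measures, the sketch below sums per-position information for a set of aligned sites. The record does not state how the published values were computed, so everything here is an assumption: the function name `information_content` is hypothetical, the background is taken as uniform over A, C, G, T, and no small-sample correction is applied.

```python
import math
from collections import Counter


def information_content(sites, alphabet="ACGT"):
    """Total information content, in bits, of aligned equal-length DNA sites.

    Assumes a uniform background (2 bits maximum per position for DNA)
    and applies no small-sample correction. Illustrative only.
    """
    if not sites:
        raise ValueError("need at least one site")
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")

    max_bits = math.log2(len(alphabet))
    total = 0.0
    for pos in range(length):
        # Count how often each base occurs at this alignment column.
        counts = Counter(s[pos] for s in sites)
        entropy = 0.0
        for base in alphabet:
            p = counts[base] / len(sites)
            if p > 0:
                entropy -= p * math.log2(p)
        # Information at a column is the maximum minus the observed entropy.
        total += max_bits - entropy
    return total


if __name__ == "__main__":
    # Two sites that agree at the first base and differ at the second.
    print(information_content(["AA", "AC"]))
```

Under these assumptions a fully conserved 15 bp motif would score 30 bits, so values near 11 to 12.5 bits over 15 bp indicate a motif that is only partly conserved across its positions.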