Transcription factor motif quality assessment requires systematic comparative analysis
Transcription factor (TF) binding site prediction remains a challenge in gene regulatory research due to degeneracy and potential variability in binding sites in the genome. Dozens of algorithms designed to learn binding models (motifs) have generated many motifs available in research papers with a...
Main Authors: | Kibet, Caleb Kipkurui; Machanick, Philip |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | F1000Research 2016 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4821295/ https://www.ncbi.nlm.nih.gov/pubmed/27092243 http://dx.doi.org/10.12688/f1000research.7408.2 |
_version_ | 1782425561760530432 |
---|---|
author | Kibet, Caleb Kipkurui Machanick, Philip |
author_sort | Kibet, Caleb Kipkurui |
collection | PubMed |
description | Transcription factor (TF) binding site prediction remains a challenge in gene regulatory research due to degeneracy and potential variability in binding sites in the genome. Dozens of algorithms designed to learn binding models (motifs) have generated many motifs available in research papers with a subset making it to databases like JASPAR, UniPROBE and Transfac. The presence of many versions of motifs from the various databases for a single TF and the lack of a standardized assessment technique makes it difficult for biologists to make an appropriate choice of binding model and for algorithm developers to benchmark, test and improve on their models. In this study, we review and evaluate the approaches in use, highlight differences and demonstrate the difficulty of defining a standardized motif assessment approach. We review scoring functions, motif length, test data and the type of performance metrics used in prior studies as some of the factors that influence the outcome of a motif assessment. We show that the scoring functions and statistics used in motif assessment influence ranking of motifs in a TF-specific manner. We also show that TF binding specificity can vary by source of genomic binding data. We also demonstrate that information content of a motif is not in isolation a measure of motif quality but is influenced by TF binding behaviour. We conclude that there is a need for an easy-to-use tool that presents all available evidence for a comparative analysis. |
format | Online Article Text |
id | pubmed-4821295 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | F1000Research |
record_format | MEDLINE/PubMed |
spelling | pubmed-4821295 2016-04-17 Transcription factor motif quality assessment requires systematic comparative analysis Kibet, Caleb Kipkurui Machanick, Philip F1000Res Research Article Transcription factor (TF) binding site prediction remains a challenge in gene regulatory research due to degeneracy and potential variability in binding sites in the genome. Dozens of algorithms designed to learn binding models (motifs) have generated many motifs available in research papers with a subset making it to databases like JASPAR, UniPROBE and Transfac. The presence of many versions of motifs from the various databases for a single TF and the lack of a standardized assessment technique makes it difficult for biologists to make an appropriate choice of binding model and for algorithm developers to benchmark, test and improve on their models. In this study, we review and evaluate the approaches in use, highlight differences and demonstrate the difficulty of defining a standardized motif assessment approach. We review scoring functions, motif length, test data and the type of performance metrics used in prior studies as some of the factors that influence the outcome of a motif assessment. We show that the scoring functions and statistics used in motif assessment influence ranking of motifs in a TF-specific manner. We also show that TF binding specificity can vary by source of genomic binding data. We also demonstrate that information content of a motif is not in isolation a measure of motif quality but is influenced by TF binding behaviour. We conclude that there is a need for an easy-to-use tool that presents all available evidence for a comparative analysis.
F1000Research 2016-03-14 /pmc/articles/PMC4821295/ /pubmed/27092243 http://dx.doi.org/10.12688/f1000research.7408.2 Text en Copyright: © 2016 Kibet CK and Machanick P http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Transcription factor motif quality assessment requires systematic comparative analysis |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4821295/ https://www.ncbi.nlm.nih.gov/pubmed/27092243 http://dx.doi.org/10.12688/f1000research.7408.2 |