
Transcription factor motif quality assessment requires systematic comparative analysis

Transcription factor (TF) binding site prediction remains a challenge in gene regulatory research due to degeneracy and potential variability in binding sites in the genome. Dozens of algorithms designed to learn binding models (motifs) have generated many motifs available in research papers with a...


Bibliographic Details
Main authors: Kibet, Caleb Kipkurui; Machanick, Philip
Format: Online Article Text
Language: English
Published: F1000Research 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4821295/
https://www.ncbi.nlm.nih.gov/pubmed/27092243
http://dx.doi.org/10.12688/f1000research.7408.2
_version_ 1782425561760530432
author Kibet, Caleb Kipkurui
Machanick, Philip
collection PubMed
description Transcription factor (TF) binding site prediction remains a challenge in gene regulatory research due to degeneracy and potential variability in binding sites in the genome. Dozens of algorithms designed to learn binding models (motifs) have generated many motifs available in research papers with a subset making it to databases like JASPAR, UniPROBE and Transfac. The presence of many versions of motifs from the various databases for a single TF and the lack of a standardized assessment technique make it difficult for biologists to choose an appropriate binding model and for algorithm developers to benchmark, test and improve their models. In this study, we review and evaluate the approaches in use, highlight differences and demonstrate the difficulty of defining a standardized motif assessment approach. We review scoring functions, motif length, test data and the type of performance metrics used in prior studies as some of the factors that influence the outcome of a motif assessment. We show that the scoring functions and statistics used in motif assessment influence the ranking of motifs in a TF-specific manner, and that TF binding specificity can vary by source of genomic binding data. We also demonstrate that the information content of a motif is not, in isolation, a measure of motif quality but is influenced by TF binding behaviour. We conclude that there is a need for an easy-to-use tool that presents all available evidence for a comparative analysis.
format Online
Article
Text
id pubmed-4821295
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher F1000Research
record_format MEDLINE/PubMed
spelling pubmed-4821295 2016-04-17 Transcription factor motif quality assessment requires systematic comparative analysis Kibet, Caleb Kipkurui Machanick, Philip F1000Res Research Article
F1000Research 2016-03-14 /pmc/articles/PMC4821295/ /pubmed/27092243 http://dx.doi.org/10.12688/f1000research.7408.2 Text en Copyright: © 2016 Kibet CK and Machanick P http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Transcription factor motif quality assessment requires systematic comparative analysis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4821295/
https://www.ncbi.nlm.nih.gov/pubmed/27092243
http://dx.doi.org/10.12688/f1000research.7408.2