
Exploiting ancestral mammalian genomes for the prediction of human transcription factor binding sites

BACKGROUND: The computational prediction of Transcription Factor Binding Sites (TFBS) remains a challenge due to their short length and low information content. Comparative genomics approaches that simultaneously consider several related species and favor sites that have been conserved throughout ev...


Bibliographic Details
Main Author: Blanchette, Mathieu
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3526440/
https://www.ncbi.nlm.nih.gov/pubmed/23281809
http://dx.doi.org/10.1186/1471-2105-13-S19-S2
author Blanchette, Mathieu
collection PubMed
description BACKGROUND: The computational prediction of Transcription Factor Binding Sites (TFBS) remains a challenge due to their short length and low information content. Comparative genomics approaches that simultaneously consider several related species and favor sites that have been conserved throughout evolution improve the accuracy (specificity) of the predictions but are limited due to a phenomenon called binding site turnover, where sequence evolution causes one TFBS to replace another in the same region. In parallel to this development, an increasing number of mammalian genomes are now sequenced and it is becoming possible to infer, to a surprisingly high degree of accuracy, ancestral mammalian sequences. RESULTS: We propose a TFBS prediction approach that makes use of the availability of inferred ancestral mammalian genomes to improve its accuracy. This method aims to identify binding loci, which are regions of a few hundred base pairs that have preserved their potential to bind a given transcription factor over evolutionary time. After proposing a neutral evolutionary model of predicted TFBS counts in a DNA region of a given length, we use it to identify regions that have preserved the number of predicted TFBS they contain to an unexpected degree given their divergence. The approach is applied to human chromosome 1 and shows significant gains in accuracy as compared to both existing single-species and multi-species TFBS prediction approaches, in particular for transcription factors that are subject to high turnover rates. AVAILABILITY: The source code and predictions made by the program are available at http://www.cs.mcgill.ca/~blanchem/bindingLoci.
format Online
Article
Text
id pubmed-3526440
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3526440 2013-01-10 Exploiting ancestral mammalian genomes for the prediction of human transcription factor binding sites Blanchette, Mathieu BMC Bioinformatics Proceedings
BioMed Central 2012-12-19 /pmc/articles/PMC3526440/ /pubmed/23281809 http://dx.doi.org/10.1186/1471-2105-13-S19-S2 Text en Copyright ©2012 Blanchette; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Exploiting ancestral mammalian genomes for the prediction of human transcription factor binding sites
topic Proceedings