The twilight zone of cis element alignments
Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed ye...
Main authors: | Sebastian, Alvaro; Contreras-Moreira, Bruno |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2013 |
Subjects: | Computational Biology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3561995/ https://www.ncbi.nlm.nih.gov/pubmed/23268451 http://dx.doi.org/10.1093/nar/gks1301 |
_version_ | 1782258030137573376 |
---|---|
author | Sebastian, Alvaro Contreras-Moreira, Bruno |
author_facet | Sebastian, Alvaro Contreras-Moreira, Bruno |
author_sort | Sebastian, Alvaro |
collection | PubMed |
description | Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein–DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein–DNA interfaces is significantly more conserved than their sequence and measure how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments. |
format | Online Article Text |
id | pubmed-3561995 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-35619952013-02-01 The twilight zone of cis element alignments Sebastian, Alvaro Contreras-Moreira, Bruno Nucleic Acids Res Computational Biology Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein–DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein–DNA interfaces is significantly more conserved than their sequence and measure how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments. Oxford University Press 2013-02 2012-12-24 /pmc/articles/PMC3561995/ /pubmed/23268451 http://dx.doi.org/10.1093/nar/gks1301 Text en © The Author(s) 2012. Published by Oxford University Press. 
http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial reuse, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com. |
spellingShingle | Computational Biology Sebastian, Alvaro Contreras-Moreira, Bruno The twilight zone of cis element alignments |
title | The twilight zone of cis element alignments |
title_sort | twilight zone of cis element alignments |
topic | Computational Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3561995/ https://www.ncbi.nlm.nih.gov/pubmed/23268451 http://dx.doi.org/10.1093/nar/gks1301 |