TwinCons: Conservation score for uncovering deep sequence similarity and divergence

We have developed the program TwinCons, to detect noisy signals of deep ancestry of proteins or nucleic acids. As input, the program uses a composite alignment containing pre-defined groups, and mathematically determines a ‘cost’ of transforming one group to the other at each position of the alignme...
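
The full description in this record distinguishes conserved, variable and signature positions, where a signature is conserved within groups but differs between groups. The sketch below illustrates only that three-way classification on a toy two-group alignment. It is not the TwinCons scoring method, which the record describes as a per-position 'cost' of transforming one group into the other. The function names, the majority-residue rule and the 0.8 threshold are all assumptions made for illustration.

```python
from collections import Counter


def dominant_residue(column, threshold):
    """Return the most common non-gap residue if its frequency reaches
    `threshold`; otherwise return None (hypothetical helper)."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return None
    residue, count = Counter(residues).most_common(1)[0]
    return residue if count / len(residues) >= threshold else None


def classify_columns(group_a, group_b, threshold=0.8):
    """Label each alignment column as 'conserved', 'signature' or 'variable'.

    Simplified stand-in for a real conservation score: a column is
    - conserved: both groups are dominated by the same residue,
    - signature: each group is dominated by a different residue,
    - variable:  at least one group has no dominant residue.
    """
    length = len(group_a[0])
    if any(len(seq) != length for seq in group_a + group_b):
        raise ValueError("all sequences must have the same aligned length")

    labels = []
    for i in range(length):
        dom_a = dominant_residue([seq[i] for seq in group_a], threshold)
        dom_b = dominant_residue([seq[i] for seq in group_b], threshold)
        if dom_a is None or dom_b is None:
            labels.append("variable")
        elif dom_a == dom_b:
            labels.append("conserved")
        else:
            labels.append("signature")
    return labels


# Toy composite alignment with two pre-defined groups (made-up sequences).
group_a = ["MKLV", "MKIV", "MKAV"]
group_b = ["MRGV", "MRSV", "MRTV"]
print(classify_columns(group_a, group_b))
# → ['conserved', 'signature', 'variable', 'conserved']
```

In the toy alignment, column 2 is a signature because every sequence in the first group has K and every sequence in the second has R. A substitution-aware score would additionally weigh how similar K and R are, which this majority rule ignores.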

Bibliographic Details
Main authors: Penev, Petar I.; Alvarez-Carreño, Claudia; Smith, Eric; Petrov, Anton S.; Williams, Loren Dean
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8580257/
https://www.ncbi.nlm.nih.gov/pubmed/34714829
http://dx.doi.org/10.1371/journal.pcbi.1009541
_version_ 1784596575935266816
author Penev, Petar I.
Alvarez-Carreño, Claudia
Smith, Eric
Petrov, Anton S.
Williams, Loren Dean
author_sort Penev, Petar I.
collection PubMed
description We have developed the program TwinCons, to detect noisy signals of deep ancestry of proteins or nucleic acids. As input, the program uses a composite alignment containing pre-defined groups, and mathematically determines a ‘cost’ of transforming one group to the other at each position of the alignment. The output distinguishes conserved, variable and signature positions. A signature is conserved within groups but differs between groups. The method automatically detects continuous characteristic stretches (segments) within alignments. TwinCons provides a convenient representation of conserved, variable and signature positions as a single score, enabling the structural mapping and visualization of these characteristics. Structure is more conserved than sequence. TwinCons highlights alternative sequences of conserved structures. Using TwinCons, we detected highly similar segments between proteins from the translation and transcription systems. TwinCons detects conserved residues within regions of high functional importance for the ribosomal RNA (rRNA) and demonstrates that signatures are not confined to specific regions but are distributed across the rRNA structure. The ability to evaluate both nucleic acid and protein alignments allows TwinCons to be used in combined sequence and structural analysis of signatures and conservation in rRNA and in ribosomal proteins (rProteins). TwinCons detects a strong sequence conservation signal between bacterial and archaeal rProteins related by circular permutation. This conserved sequence is structurally colocalized with conserved rRNA, indicated by TwinCons scores of rRNA alignments of bacterial and archaeal groups. This combined analysis revealed deep co-evolution of rRNA and rProtein buried within the deepest branching points in the tree of life.
format Online
Article
Text
id pubmed-8580257
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8580257 2021-11-11 TwinCons: Conservation score for uncovering deep sequence similarity and divergence Penev, Petar I. Alvarez-Carreño, Claudia Smith, Eric Petrov, Anton S. Williams, Loren Dean PLoS Comput Biol Research Article Public Library of Science 2021-10-29 /pmc/articles/PMC8580257/ /pubmed/34714829 http://dx.doi.org/10.1371/journal.pcbi.1009541 Text en © 2021 Penev et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title TwinCons: Conservation score for uncovering deep sequence similarity and divergence
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8580257/
https://www.ncbi.nlm.nih.gov/pubmed/34714829
http://dx.doi.org/10.1371/journal.pcbi.1009541