Back-translation for discovering distant protein homologies in the presence of frameshift mutations

BACKGROUND: Frameshift mutations in protein-coding DNA sequences produce a drastic change in the resulting protein sequence, which prevents classic protein alignment methods from revealing the proteins' common origin. Moreover, when a large number of substitutions are additionally involved in t...

Bibliographic Details
Main authors: Gîrdea, Marta; Noé, Laurent; Kucherov, Gregory
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2821327/
https://www.ncbi.nlm.nih.gov/pubmed/20047662
http://dx.doi.org/10.1186/1748-7188-5-6
_version_ 1782177427027394560
author Gîrdea, Marta
Noé, Laurent
Kucherov, Gregory
author_facet Gîrdea, Marta
Noé, Laurent
Kucherov, Gregory
author_sort Gîrdea, Marta
collection PubMed
description BACKGROUND: Frameshift mutations in protein-coding DNA sequences drastically change the resulting protein sequence, which prevents classic protein alignment methods from revealing the proteins' common origin. Moreover, when the divergence also involves a large number of substitutions, homology detection becomes difficult even at the DNA level. RESULTS: We developed a novel method to infer distant homology relations between two proteins that accounts for frameshift and point mutations that may have affected the coding sequences. We designed a dynamic programming alignment algorithm over memory-efficient graph representations of the complete set of putative DNA sequences of each protein, with the goal of determining the two putative DNA sequences that have the best-scoring alignment under a scoring system designed to reflect the most probable evolutionary process. Our implementation is freely available at http://bioinfo.lifl.fr/path/. CONCLUSIONS: Our approach makes it possible to uncover evolutionary information that is not captured by traditional alignment methods, as confirmed by biologically significant examples.
format Text
id pubmed-2821327
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2821327 2010-02-15 Back-translation for discovering distant protein homologies in the presence of frameshift mutations Gîrdea, Marta Noé, Laurent Kucherov, Gregory Algorithms Mol Biol Research BioMed Central 2010-01-04 /pmc/articles/PMC2821327/ /pubmed/20047662 http://dx.doi.org/10.1186/1748-7188-5-6 Text en Copyright © 2010 Gîrdea et al; licensee BioMed Central Ltd. https://creativecommons.org/licenses/by/2.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Gîrdea, Marta
Noé, Laurent
Kucherov, Gregory
Back-translation for discovering distant protein homologies in the presence of frameshift mutations
title Back-translation for discovering distant protein homologies in the presence of frameshift mutations
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2821327/
https://www.ncbi.nlm.nih.gov/pubmed/20047662
http://dx.doi.org/10.1186/1748-7188-5-6
work_keys_str_mv AT girdeamarta backtranslationfordiscoveringdistantproteinhomologiesinthepresenceofframeshiftmutations
AT noelaurent backtranslationfordiscoveringdistantproteinhomologiesinthepresenceofframeshiftmutations
AT kucherovgregory backtranslationfordiscoveringdistantproteinhomologiesinthepresenceofframeshiftmutations
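
The abstract in this record describes back-translating each protein into its complete set of putative DNA sequences and aligning those sets. The Python sketch below is only a simplified illustration of two ideas behind that description: how many DNA sequences a short protein can back-translate to, and how a single-nucleotide deletion (a frameshift) changes the translated protein. It assumes the standard genetic code and does not implement the authors' graph-based dynamic programming algorithm or their scoring system. All function names are hypothetical.

```python
from itertools import product
from math import prod

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 0, ignoring a trailing incomplete codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def back_translate(protein: str) -> list[list[str]]:
    """For each residue, list every codon that encodes it."""
    return [
        sorted(codon for codon, aa in CODON_TABLE.items() if aa == residue)
        for residue in protein
    ]


def count_back_translations(protein: str) -> int:
    """Number of distinct DNA sequences that translate to the protein."""
    return prod(len(codons) for codons in back_translate(protein))


if __name__ == "__main__":
    dna = "ATGGCCAAAGGTTGG"
    shifted = dna[:3] + dna[4:]  # delete one nucleotide after the first codon
    print(translate(dna))        # protein from the original reading frame
    print(translate(shifted))    # protein after the frameshift
    print(count_back_translations(translate(dna)))
```

Because the number of putative DNA sequences grows as a product over residues, enumerating them explicitly quickly becomes impractical, which is consistent with the abstract's emphasis on memory-efficient graph representations.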