
Identifying Coevolving Partners from Paralogous Gene Families

Many methods have been developed to detect coevolution from aligned sequences. However, all the existing methods require a one-to-one mapping of candidate coevolving partners (nucleotides, amino acids) a priori. When two families of sequences have distinct duplication and loss histories, finding the...


Bibliographic Details
Main author: Yeang, Chen-Hsiang
Format: Text
Language: English
Published: Libertas Academica 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2614191/
https://www.ncbi.nlm.nih.gov/pubmed/19204811
author Yeang, Chen-Hsiang
collection PubMed
description Many methods have been developed to detect coevolution from aligned sequences. However, all existing methods require a one-to-one mapping of candidate coevolving partners (nucleotides, amino acids) a priori. When two families of sequences have distinct duplication and loss histories, finding the one-to-one mapping of coevolving partners can be computationally involved. We propose an algorithm to identify the coevolving partners from two families of sequences with distinct phylogenetic trees. The algorithm maps each gene tree to a reference species tree and builds, for each species tree node, a joint state of sequence composition and assignments of coevolving partners. Applying dynamic programming to the joint states identifies the optimal assignments. Time complexity is quadratic in the size of the species tree, and space complexity is exponential in the maximum number of gene tree nodes mapped to the same species tree node. Analysis of both simulated data and Pfam protein domain sequences demonstrates that the paralog coevolution algorithm picks up the coevolving partners with 60%–88% accuracy. This algorithm extends phylogeny-based coevolutionary models and makes them applicable to a wide range of problems, such as predicting protein-protein, protein-DNA and DNA-RNA interactions between two distinct families of sequences.
format Text
id pubmed-2614191
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher Libertas Academica
record_format MEDLINE/PubMed
spelling pubmed-2614191 2009-02-09 Identifying Coevolving Partners from Paralogous Gene Families Yeang, Chen-Hsiang Evol Bioinform Online Original Research Libertas Academica 2008-04-24 /pmc/articles/PMC2614191/ /pubmed/19204811 Text en Copyright © 2008 The authors. This article is published under the Creative Commons Attribution By licence. For further information go to: http://creativecommons.org/licenses/by/3.0
title Identifying Coevolving Partners from Paralogous Gene Families
topic Original Research