Linear predictive coding representation of correlated mutation for protein sequence alignment

Bibliographic Details
Main Authors: Jeong, Chan-seok, Kim, Dongsup
Format: Online Article Text
Language: English
Published: BioMed Central 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3165164/
https://www.ncbi.nlm.nih.gov/pubmed/20406500
http://dx.doi.org/10.1186/1471-2105-11-S2-S2
_version_ 1782211094961455104
author Jeong, Chan-seok
Kim, Dongsup
collection PubMed
description BACKGROUND: Although conservation and correlated mutation (CM) both carry important information reflecting different kinds of context in a multiple sequence alignment, most alignment methods use sequence profiles that represent only conservation. There is not yet a general way to represent correlated mutation and incorporate it into sequence alignment. METHODS: We develop a novel method, CM profile, that represents correlated mutation as a spectral feature derived by linear predictive coding, so that correlated mutations among different positions are represented by a fixed number of values. We combine CM profile with the conventional sequence profile to improve alignment quality. RESULTS: For distantly related protein pairs, using CM profile improves profile-profile alignment with or without predicted secondary structure. In particular, at the superfamily level, combining CM profile with sequence profile improves profile-profile alignment by 9.5%, whereas predicted secondary structure improves it by 6.0%. More significantly, using both improves profile-profile alignment by 13.9%. We also illustrate the effectiveness of CM profile by demonstrating that the resulting alignment preserves shared coevolution and contacts. CONCLUSIONS: In this work, we introduce a novel method, CM profile, which represents correlated mutation information in a paralleled form, and apply it to the protein sequence alignment problem. When combined with the conventional sequence profile, CM profile improves alignment quality significantly more than predicted secondary structure information does, which should benefit target-template alignment in protein structure prediction. Because of its generality, CM profile can be used in other bioinformatics applications in the same way as a sequence profile.
format Online
Article
Text
id pubmed-3165164
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3165164 2011-09-03 Linear predictive coding representation of correlated mutation for protein sequence alignment Jeong, Chan-seok Kim, Dongsup BMC Bioinformatics Proceedings
BioMed Central 2010-04-16 /pmc/articles/PMC3165164/ /pubmed/20406500 http://dx.doi.org/10.1186/1471-2105-11-S2-S2 Text en Copyright ©2010 Kim and Jeong; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Linear predictive coding representation of correlated mutation for protein sequence alignment
topic Proceedings
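
The METHODS part of the abstract says that linear predictive coding (LPC) turns correlated mutations among positions into a fixed number of values. The record does not say which signal the authors feed to LPC, which model order they use, or how the spectral feature is derived from the coefficients, so the following is only a generic sketch of LPC by the autocorrelation method (Levinson-Durbin recursion), not the authors' pipeline. The function name `lpc_coefficients`, the parameter `order`, and the toy input signals are all assumptions made for illustration.

```python
def lpc_coefficients(signal, order):
    """Generic LPC sketch (hypothetical helper, not from the paper).

    Returns `order` coefficients a[1..order] such that
    x[n] is approximated by sum_k a[k] * x[n - k].
    """
    x = [float(v) for v in signal]
    n = len(x)
    # Autocorrelation r[0..order] of the (implicitly windowed) signal.
    r = [sum(x[t] * x[t + k] for t in range(n - k)) for k in range(order + 1)]

    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]               # prediction error energy of the order-0 model
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate signal: leave remaining coefficients at zero
        # Reflection coefficient for model order i.
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        # Update the lower-order coefficients, then append the new one.
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


# Toy signals of different lengths yield feature vectors of the same size.
short_signal = [0.5 ** t for t in range(20)]
long_signal = [0.5 ** t for t in range(200)]
print(len(lpc_coefficients(short_signal, 4)), len(lpc_coefficients(long_signal, 4)))
```

The point of the sketch is the shape of the output: whatever the length of the input, the result has exactly `order` values, which is the property that lets a variable-length pattern be stored alongside a per-position sequence profile.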