Protein sequence alignment with family-specific amino acid similarity matrices
BACKGROUND: Alignment of amino acid sequences by means of dynamic programming is a cornerstone sequence comparison method. The quality of alignments produced by dynamic programming critically depends on the choice of the alignment scoring function. Therefore, for a specific alignment problem one nee...
Main author: | Kuznetsov, Igor B |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | Technical Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3201029/ https://www.ncbi.nlm.nih.gov/pubmed/21846354 http://dx.doi.org/10.1186/1756-0500-4-296 |
_version_ | 1782214803825098752 |
---|---|
author | Kuznetsov, Igor B |
collection | PubMed |
description | BACKGROUND: Alignment of amino acid sequences by means of dynamic programming is a cornerstone sequence comparison method. The quality of alignments produced by dynamic programming critically depends on the choice of the alignment scoring function. Therefore, for a specific alignment problem one needs a way of selecting the best performing scoring function. This work is focused on the issue of finding optimized protein family- and fold-specific scoring functions for global similarity matrix-based sequence alignment. FINDINGS: I utilize a comprehensive set of reference alignments obtained from structural superposition of homologous and analogous proteins to design a quantitative statistical framework for evaluating the performance of alignment scoring functions in global pairwise sequence alignment. This framework is applied to study how existing general-purpose amino acid similarity matrices perform on individual protein families and structural folds, and to compare them to family-specific and fold-specific matrices derived in this work. I describe an adaptive alignment procedure that automatically selects an appropriate similarity matrix and optimized gap penalties based on the properties of the sequences being aligned. CONCLUSIONS: The results of this work indicate that using family-specific similarity matrices significantly improves the quality of the alignment of homologous sequences over the traditional sequence alignment based on a single general-purpose similarity matrix. However, using fold-specific similarity matrices can only marginally improve sequence alignment of proteins that share the same structural fold but do not share a common evolutionary origin. The family-specific matrices derived in this work and the optimized gap penalties are available at http://taurus.crc.albany.edu/fsm. |
format | Online Article Text |
id | pubmed-3201029 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3201029 2011-10-26 Protein sequence alignment with family-specific amino acid similarity matrices Kuznetsov, Igor B BMC Res Notes Technical Note BioMed Central 2011-08-16 /pmc/articles/PMC3201029/ /pubmed/21846354 http://dx.doi.org/10.1186/1756-0500-4-296 Text en Copyright ©2011 Kuznetsov et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Protein sequence alignment with family-specific amino acid similarity matrices |
topic | Technical Note |
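
The abstract's central point, that the result of dynamic-programming alignment depends on the scoring function (a similarity matrix plus gap penalties), can be illustrated with a minimal global alignment sketch. This is not the author's implementation: the linear gap penalty is an assumption (the record does not state the gap model used), and `toy_score` is a hypothetical stand-in for a real similarity matrix, whether general-purpose or family-specific.

```python
def toy_score(x, y):
    """Hypothetical similarity function: +2 for identity, -1 otherwise.
    A real matrix (general-purpose or family-specific) would be plugged in here."""
    return 2 if x == y else -1


def global_align(a, b, score=toy_score, gap=-2):
    """Global (Needleman-Wunsch) alignment with a linear gap penalty.

    Returns (best score, aligned a, aligned b). The linear gap model is an
    assumption made for brevity.
    """
    n, m = len(a), len(b)
    # dp[i][j] holds the best score for aligning a[:i] with b[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # pair residues
                dp[i - 1][j] + gap,                            # gap in b
                dp[i][j - 1] + gap,                            # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return dp[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(global_align("ACD", "AD"))  # → (2, 'ACD', 'A-D')
```

Because `score` and `gap` are plain parameters, swapping in a different similarity matrix or gap penalty changes only the call, which is the kind of comparison the abstract describes.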