
Protein sequence alignment with family-specific amino acid similarity matrices

BACKGROUND: Alignment of amino acid sequences by means of dynamic programming is a cornerstone sequence comparison method. The quality of alignments produced by dynamic programming critically depends on the choice of the alignment scoring function. Therefore, for a specific alignment problem one nee...


Bibliographic Details
Main author:	Kuznetsov, Igor B
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2011
Subjects:	Technical Note
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3201029/ https://www.ncbi.nlm.nih.gov/pubmed/21846354 http://dx.doi.org/10.1186/1756-0500-4-296

author	Kuznetsov, Igor B
collection	PubMed
description	BACKGROUND: Alignment of amino acid sequences by means of dynamic programming is a cornerstone sequence comparison method. The quality of alignments produced by dynamic programming critically depends on the choice of the alignment scoring function. Therefore, for a specific alignment problem one needs a way of selecting the best performing scoring function. This work is focused on the issue of finding optimized protein family- and fold-specific scoring functions for global similarity matrix-based sequence alignment. FINDINGS: I utilize a comprehensive set of reference alignments obtained from structural superposition of homologous and analogous proteins to design a quantitative statistical framework for evaluating the performance of alignment scoring functions in global pairwise sequence alignment. This framework is applied to study how existing general-purpose amino acid similarity matrices perform on individual protein families and structural folds, and to compare them to family-specific and fold-specific matrices derived in this work. I describe an adaptive alignment procedure that automatically selects an appropriate similarity matrix and optimized gap penalties based on the properties of the sequences being aligned. CONCLUSIONS: The results of this work indicate that using family-specific similarity matrices significantly improves the quality of the alignment of homologous sequences over the traditional sequence alignment based on a single general-purpose similarity matrix. However, using fold-specific similarity matrices can only marginally improve sequence alignment of proteins that share the same structural fold but do not share a common evolutionary origin. The family-specific matrices derived in this work and the optimized gap penalties are available at http://taurus.crc.albany.edu/fsm.
format	Online Article Text
id	pubmed-3201029
institution	National Center for Biotechnology Information
language	English
publishDate	2011
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3201029 2011-10-26 Protein sequence alignment with family-specific amino acid similarity matrices Kuznetsov, Igor B BMC Res Notes Technical Note BioMed Central 2011-08-16 /pmc/articles/PMC3201029/ /pubmed/21846354 http://dx.doi.org/10.1186/1756-0500-4-296 Text en Copyright ©2011 Kuznetsov et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Protein sequence alignment with family-specific amino acid similarity matrices
topic	Technical Note
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3201029/ https://www.ncbi.nlm.nih.gov/pubmed/21846354 http://dx.doi.org/10.1186/1756-0500-4-296
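
The abstract in this record refers to global pairwise alignment by dynamic programming, scored with an amino acid similarity matrix and gap penalties. The sketch below illustrates that general technique only; it is not the author's implementation. The function names `toy_matrix` and `global_align`, the match/mismatch values, and the linear gap penalty are all assumptions made for illustration. The family-specific matrices and optimized gap penalties described in the abstract would be substituted for the toy matrix and the `gap` argument.

```python
from typing import Dict, Tuple

Matrix = Dict[Tuple[str, str], int]

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_matrix(match: int = 2, mismatch: int = -1) -> Matrix:
    """Hypothetical stand-in for a real similarity matrix (illustration only)."""
    return {(a, b): (match if a == b else mismatch)
            for a in AMINO_ACIDS for b in AMINO_ACIDS}


def global_align(s: str, t: str, matrix: Matrix, gap: int = -2):
    """Global alignment by dynamic programming with a linear gap penalty.

    Returns (score, aligned_s, aligned_t). The linear gap model is an
    assumption; the record does not state which gap model the paper uses.
    """
    n, m = len(s), len(t)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap          # prefix of s aligned to gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap          # prefix of t aligned to gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + matrix[(s[i - 1], t[j - 1])],  # pair residues
                score[i - 1][j] + gap,                               # gap in t
                score[i][j - 1] + gap,                               # gap in s
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_s, out_t = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                score[i][j] == score[i - 1][j - 1] + matrix[(s[i - 1], t[j - 1])]):
            out_s.append(s[i - 1]); out_t.append(t[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_s.append(s[i - 1]); out_t.append("-"); i -= 1
        else:
            out_s.append("-"); out_t.append(t[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_s)), "".join(reversed(out_t))


if __name__ == "__main__":
    print(global_align("ACD", "AD", toy_matrix()))
```

Passing the matrix and gap penalty as arguments keeps the alignment routine independent of the scoring function, so different matrices can be compared on the same sequence pair without changing the algorithm.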

