Iterative refinement of structure-based sequence alignments by Seed Extension

BACKGROUND: Accurate sequence alignment is required in many bioinformatics applications but, when sequence similarity is low, it is difficult to obtain accurate alignments based on sequence similarity alone. The accuracy improves when the structures are available, but current structure-based sequenc...

Bibliographic Details
Main Authors: Kim, Changhoon, Tai, Chin-Hsien, Lee, Byungkook
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2753854/
https://www.ncbi.nlm.nih.gov/pubmed/19589133
http://dx.doi.org/10.1186/1471-2105-10-210
_version_ 1782172369224204288
author Kim, Changhoon
Tai, Chin-Hsien
Lee, Byungkook
author_facet Kim, Changhoon
Tai, Chin-Hsien
Lee, Byungkook
author_sort Kim, Changhoon
collection PubMed
description BACKGROUND: Accurate sequence alignment is required in many bioinformatics applications but, when sequence similarity is low, it is difficult to obtain accurate alignments based on sequence similarity alone. The accuracy improves when the structures are available, but current structure-based sequence alignment procedures still mis-align substantial numbers of residues. In order to correct such errors, we previously explored the possibility of replacing the residue-based dynamic programming algorithm in structure alignment procedures with the Seed Extension algorithm, which does not use a gap penalty. Here, we describe a new procedure called RSE (Refinement with Seed Extension) that iteratively refines a structure-based sequence alignment. RESULTS: RSE uses SE (Seed Extension) in its core, which is an algorithm that we reported recently for obtaining a sequence alignment from two superimposed structures. The RSE procedure was evaluated by comparing the correctly aligned fractions of residues before and after the refinement of the structure-based sequence alignments produced by popular programs. CE, DaliLite, FAST, LOCK2, MATRAS, MATT, TM-align, SHEBA and VAST were included in this analysis and the NCBI's CDD root node set was used as the reference alignments. RSE improved the average accuracy of sequence alignments for all programs tested when no shift error was allowed. The amount of improvement varied depending on the program. The average improvements were small for DaliLite and MATRAS but about 5% for CE and VAST. More substantial improvements have been seen in many individual cases. The additional computation times required for the refinements were negligible compared to the times taken by the structure alignment programs. CONCLUSION: RSE is a computationally inexpensive way of improving the accuracy of a structure-based sequence alignment. 
It can be used as a standalone procedure following a regular structure-based sequence alignment or to replace the traditional iterative refinement procedures based on residue-level dynamic programming algorithm in many structure alignment programs.
format Text
id pubmed-2753854
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2753854 2009-09-29 Iterative refinement of structure-based sequence alignments by Seed Extension Kim, Changhoon Tai, Chin-Hsien Lee, Byungkook BMC Bioinformatics Research Article BioMed Central 2009-07-09 /pmc/articles/PMC2753854/ /pubmed/19589133 http://dx.doi.org/10.1186/1471-2105-10-210 Text en Copyright © 2009 Kim et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Kim, Changhoon
Tai, Chin-Hsien
Lee, Byungkook
Iterative refinement of structure-based sequence alignments by Seed Extension
title Iterative refinement of structure-based sequence alignments by Seed Extension
title_full Iterative refinement of structure-based sequence alignments by Seed Extension
title_fullStr Iterative refinement of structure-based sequence alignments by Seed Extension
title_full_unstemmed Iterative refinement of structure-based sequence alignments by Seed Extension
title_short Iterative refinement of structure-based sequence alignments by Seed Extension
title_sort iterative refinement of structure-based sequence alignments by seed extension
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2753854/
https://www.ncbi.nlm.nih.gov/pubmed/19589133
http://dx.doi.org/10.1186/1471-2105-10-210
work_keys_str_mv AT kimchanghoon iterativerefinementofstructurebasedsequencealignmentsbyseedextension
AT taichinhsien iterativerefinementofstructurebasedsequencealignmentsbyseedextension
AT leebyungkook iterativerefinementofstructurebasedsequencealignmentsbyseedextension