
CombAlign: a code for generating a one-to-many sequence alignment from a set of pairwise structure-based sequence alignments

BACKGROUND: In order to better define regions of similarity among related protein structures, it is useful to identify the residue-residue correspondences among proteins. Few codes exist for constructing a one-to-many multiple sequence alignment derived from a set of structure or sequence alignments...


Bibliographic Details
Main author: Zhou, Carol L. Ecale
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4526201/
https://www.ncbi.nlm.nih.gov/pubmed/26246852
http://dx.doi.org/10.1186/s13029-015-0039-1
author Zhou, Carol L. Ecale
collection PubMed
description BACKGROUND: In order to better define regions of similarity among related protein structures, it is useful to identify the residue-residue correspondences among proteins. Few codes exist for constructing a one-to-many multiple sequence alignment derived from a set of structure or sequence alignments, and a need was evident for creating such a tool for combining pairwise structure alignments that would allow for insertion of gaps in the reference structure. RESULTS: This report describes a new Python code, CombAlign, which takes as input a set of pairwise sequence alignments (which may be structure based) and generates a one-to-many, gapped, multiple structure- or sequence-based sequence alignment (MSSA). The use and utility of CombAlign were demonstrated by generating gapped MSSAs using sets of pairwise structure-based sequence alignments between structure models of the matrix protein (VP40) and pre-small/secreted glycoprotein (sGP) of Reston Ebolavirus and the corresponding proteins of several other filoviruses. The gapped MSSAs revealed structure-based residue-residue correspondences, which enabled identification of structurally similar versus differing regions in the Reston proteins compared to each of the other corresponding proteins. CONCLUSIONS: CombAlign is a new Python code that generates a one-to-many, gapped, multiple structure- or sequence-based sequence alignment (MSSA) given a set of pairwise sequence alignments (which may be structure based). CombAlign has utility in assisting the user in distinguishing structurally conserved versus divergent regions on a reference protein structure relative to other closely related proteins. CombAlign was developed in Python 2.6, and the source code is available for download from the GitHub code repository. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13029-015-0039-1) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4526201
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4526201 2015-08-06 CombAlign: a code for generating a one-to-many sequence alignment from a set of pairwise structure-based sequence alignments Zhou, Carol L. Ecale Source Code Biol Med Software BioMed Central 2015-08-05 /pmc/articles/PMC4526201/ /pubmed/26246852 http://dx.doi.org/10.1186/s13029-015-0039-1 Text en © Zhou. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Software
Zhou, Carol L. Ecale
CombAlign: a code for generating a one-to-many sequence alignment from a set of pairwise structure-based sequence alignments
title CombAlign: a code for generating a one-to-many sequence alignment from a set of pairwise structure-based sequence alignments
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4526201/
https://www.ncbi.nlm.nih.gov/pubmed/26246852
http://dx.doi.org/10.1186/s13029-015-0039-1
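
The abstract describes combining a set of pairwise alignments that share one reference into a single one-to-many alignment in which gaps may be inserted into the reference. The sketch below illustrates that general idea only; it is not CombAlign's source code, and the function name `merge_pairwise`, the input format (pairs of equal-length gapped strings) and the left-justified placement of insertions are all assumptions made for this example.

```python
def merge_pairwise(pairs, gap="-"):
    """Merge pairwise alignments that share a reference into one gapped alignment.

    `pairs` is a list of (reference_row, other_row) strings of equal length.
    Hypothetical helper written for illustration; not part of CombAlign.
    """
    if not pairs:
        raise ValueError("need at least one pairwise alignment")
    reference = pairs[0][0].replace(gap, "")
    n = len(reference)

    parsed = []
    for ref_row, other_row in pairs:
        if len(ref_row) != len(other_row):
            raise ValueError("rows of a pairwise alignment must have equal length")
        if ref_row.replace(gap, "") != reference:
            raise ValueError("all pairs must share the same reference sequence")
        matched = []               # symbol opposite each reference residue
        inserts = [""] * (n + 1)   # residues inserted before reference position j
        for r, o in zip(ref_row, other_row):
            if r == gap:
                if o != gap:
                    inserts[len(matched)] += o
            else:
                matched.append(o)
        parsed.append((matched, inserts))

    ref_out = []
    rows_out = [[] for _ in parsed]
    for j in range(n + 1):
        # Open enough gap columns in the reference for the longest insertion here.
        width = max(len(inserts[j]) for _, inserts in parsed)
        ref_out.append(gap * width)
        for row, (_, inserts) in zip(rows_out, parsed):
            row.append(inserts[j].ljust(width, gap))
        if j < n:
            ref_out.append(reference[j])
            for row, (matched, _) in zip(rows_out, parsed):
                row.append(matched[j])
    return ["".join(ref_out)] + ["".join(row) for row in rows_out]


# Toy input: three pairwise alignments against the made-up reference "ACDE".
example = [("AC-DE", "ACQDE"), ("ACDE", "A-DE"), ("ACD--E", "TCDGGE")]
for row in merge_pairwise(example):
    print(row)
# AC-D--E
# ACQD--E
# A--D--E
# TC-DGGE
```

Insertions from different sequences that fall at the same reference position share columns here purely for layout; sharing a column in an inserted region does not imply that those residues correspond structurally to one another.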