ProfileGrids as a new visual representation of large multiple sequence alignments: a case study of the RecA protein family

BACKGROUND: Multiple sequence alignments are a fundamental tool for the comparative analysis of proteins and nucleic acids. However, large data sets are no longer manageable for visualization and investigation using the traditional stacked sequence alignment representation. RESULTS: We introduce Pro...

Bibliographic Details
Main Authors: Roca, Alberto I, Almada, Albert E, Abajian, Aaron C
Format: Text
Language: English
Published: BioMed Central 2008
Subjects: Software
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2663765/
https://www.ncbi.nlm.nih.gov/pubmed/19102758
http://dx.doi.org/10.1186/1471-2105-9-554
author Roca, Alberto I
Almada, Albert E
Abajian, Aaron C
collection PubMed
description BACKGROUND: Multiple sequence alignments are a fundamental tool for the comparative analysis of proteins and nucleic acids. However, large data sets are no longer manageable for visualization and investigation using the traditional stacked sequence alignment representation.
RESULTS: We introduce ProfileGrids that represent a multiple sequence alignment as a matrix color-coded according to the residue frequency occurring at each column position. JProfileGrid is a Java application for computing and analyzing ProfileGrids. A dynamic interaction with the alignment information is achieved by changing the ProfileGrid color scheme, by extracting sequence subsets at selected residues of interest, and by relating alignment information to residue physical properties. Conserved family motifs can be identified by the overlay of similarity plot calculations on a ProfileGrid. Figures suitable for publication can be generated from the saved spreadsheet output of the colored matrices as well as by the export of conservation information for use in the PyMOL molecular visualization program. We demonstrate the utility of ProfileGrids on 300 bacterial homologs of the RecA family – a universally conserved protein involved in DNA recombination and repair. Careful attention was paid to curating the collected RecA sequences since ProfileGrids allow the easy identification of rare residues in an alignment. We relate the RecA alignment sequence conservation to the following three topics: the recently identified DNA binding residues, the unexplored MAW motif, and a unique Bacillus subtilis RecA homolog sequence feature.
CONCLUSION: ProfileGrids allow large protein families to be visualized more effectively than the traditional stacked sequence alignment form. This new graphical representation facilitates the determination of the sequence conservation at residue positions of interest, enables the examination of structural patterns by using residue physical properties, and permits the display of rare sequence features within the context of an entire alignment. JProfileGrid is free for non-commercial use and is available from . Furthermore, we present a curated RecA protein collection that is more diverse than previous data sets; and, therefore, this RecA ProfileGrid is a rich source of information for nanoanatomy analysis.
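The matrix described in the abstract (residue frequency at each alignment column) can be illustrated with a minimal Python sketch. This is not JProfileGrid's implementation: the function names `profile_grid` and `rare_residues`, the 20-letter alphabet plus gap symbol, the 0.25 threshold, and the four toy sequences are all assumptions made up for illustration.

```python
from collections import Counter

# Assumed alphabet: the 20 standard amino acids plus "-" for gaps.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"


def profile_grid(alignment, alphabet=ALPHABET):
    """Return {residue: [frequency per column]} for equal-length aligned sequences."""
    if not alignment:
        raise ValueError("alignment is empty")
    width = len(alignment[0])
    if any(len(seq) != width for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    n = len(alignment)
    grid = {res: [0.0] * width for res in alphabet}
    for col in range(width):
        counts = Counter(seq[col].upper() for seq in alignment)
        for res, count in counts.items():
            if res in grid:  # symbols outside the alphabet are ignored
                grid[res][col] = count / n
    return grid


def rare_residues(grid, threshold):
    """List (column, residue, frequency) where 0 < frequency <= threshold.

    Columns are 0-based; results are ordered by column, then alphabet order.
    """
    width = len(next(iter(grid.values())))
    hits = []
    for col in range(width):
        for res, freqs in grid.items():
            if 0.0 < freqs[col] <= threshold:
                hits.append((col, res, freqs[col]))
    return hits


# Toy alignment (invented; not taken from the RecA data set).
toy = ["MAWK", "MAWR", "MGWK", "M-WK"]
grid = profile_grid(toy)
rare = rare_residues(grid, threshold=0.25)
```

Each row of `grid` corresponds to one residue type and each list index to one alignment column, so the size of the matrix depends on the alignment length and alphabet rather than on the number of sequences. Color-coding the cells by frequency would be a separate rendering step that this sketch leaves out.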
format Text
id pubmed-2663765
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2663765 2009-04-02 ProfileGrids as a new visual representation of large multiple sequence alignments: a case study of the RecA protein family Roca, Alberto I Almada, Albert E Abajian, Aaron C BMC Bioinformatics Software BioMed Central 2008-12-22 /pmc/articles/PMC2663765/ /pubmed/19102758 http://dx.doi.org/10.1186/1471-2105-9-554 Text en Copyright © 2008 Roca et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title ProfileGrids as a new visual representation of large multiple sequence alignments: a case study of the RecA protein family
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2663765/
https://www.ncbi.nlm.nih.gov/pubmed/19102758
http://dx.doi.org/10.1186/1471-2105-9-554