Sequence Diversity Diagram for comparative analysis of multiple sequence alignments

BACKGROUND: The sequence logo is a graphical representation of a set of aligned sequences, commonly used to depict conservation of amino acid or nucleotide sequences. Although it effectively communicates the amount of information present at every position, this visual representation falls short when...
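As an illustration of the "amount of information present at every position" that a sequence logo conveys, the following is a minimal Python sketch of the conventional per-column information content, log2(alphabet size) minus the Shannon entropy of the column. It is not the authors' software (which the record says was written in Processing) and it does not implement the Sequence Diversity Diagram. The function name `column_information`, the gap character "-", and the toy alignment are assumptions made for this example, and the small-sample correction used by some logo tools is omitted.

```python
import math
from collections import Counter


def column_information(sequences, alphabet_size=4):
    """Per-column information content in bits: log2(alphabet_size) - entropy.

    Hypothetical helper for illustration; gap characters ("-") are ignored,
    and no small-sample correction is applied.
    """
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")

    max_bits = math.log2(alphabet_size)
    info = []
    for i in range(length):
        # Count residues in column i, skipping gaps.
        counts = Counter(seq[i] for seq in sequences if seq[i] != "-")
        total = sum(counts.values())
        if total == 0:
            # A column made only of gaps carries no information.
            info.append(0.0)
            continue
        entropy = -sum(
            (c / total) * math.log2(c / total) for c in counts.values()
        )
        info.append(max_bits - entropy)
    return info


# Toy nucleotide alignment (made-up data, not from the article).
toy_alignment = ["ACGT", "ACGA", "ACTA", "ACTT"]
bits = column_information(toy_alignment)
```

In this toy alignment the first two columns are fully conserved and reach the 2-bit maximum for a four-letter alphabet, while the last two columns are split evenly between two residues and carry 1 bit each. Two alignments with very different residue usage can produce the same values, which is one way to see why a per-position summary alone makes comparison between sets difficult.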

Bibliographic Details
Main authors: Sakai, Ryo; Aerts, Jan
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4155614/
https://www.ncbi.nlm.nih.gov/pubmed/25237396
http://dx.doi.org/10.1186/1753-6561-8-S2-S9
_version_ 1782333601225900032
author Sakai, Ryo
Aerts, Jan
author_facet Sakai, Ryo
Aerts, Jan
author_sort Sakai, Ryo
collection PubMed
description BACKGROUND: The sequence logo is a graphical representation of a set of aligned sequences, commonly used to depict conservation of amino acid or nucleotide sequences. Although it effectively communicates the amount of information present at every position, this visual representation falls short when the domain task is to compare between two or more sets of aligned sequences. We present a new visual presentation called a Sequence Diversity Diagram and validate our design choices with a case study. METHODS: Our software was developed using the open-source program called Processing. It loads multiple sequence alignment FASTA files and a configuration file, which can be modified as needed to change the visualization. RESULTS: The redesigned figure improves on the visual comparison of two or more sets, and it additionally encodes information on sequential position conservation. In our case study of the adenylate kinase lid domain, the Sequence Diversity Diagram reveals unexpected patterns and new insights, for example the identification of subgroups within the protein subfamily. Our future work will integrate this visual encoding into interactive visualization tools to support higher level data exploration tasks.
format Online
Article
Text
id pubmed-4155614
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-41556142014-09-18 Sequence Diversity Diagram for comparative analysis of multiple sequence alignments Sakai, Ryo Aerts, Jan BMC Proc Research BACKGROUND: The sequence logo is a graphical representation of a set of aligned sequences, commonly used to depict conservation of amino acid or nucleotide sequences. Although it effectively communicates the amount of information present at every position, this visual representation falls short when the domain task is to compare between two or more sets of aligned sequences. We present a new visual presentation called a Sequence Diversity Diagram and validate our design choices with a case study. METHODS: Our software was developed using the open-source program called Processing. It loads multiple sequence alignment FASTA files and a configuration file, which can be modified as needed to change the visualization. RESULTS: The redesigned figure improves on the visual comparison of two or more sets, and it additionally encodes information on sequential position conservation. In our case study of the adenylate kinase lid domain, the Sequence Diversity Diagram reveals unexpected patterns and new insights, for example the identification of subgroups within the protein subfamily. Our future work will integrate this visual encoding into interactive visualization tools to support higher level data exploration tasks. BioMed Central 2014-08-28 /pmc/articles/PMC4155614/ /pubmed/25237396 http://dx.doi.org/10.1186/1753-6561-8-S2-S9 Text en Copyright © 2014 Sakai and Aerts; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. 
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Sakai, Ryo
Aerts, Jan
Sequence Diversity Diagram for comparative analysis of multiple sequence alignments
title Sequence Diversity Diagram for comparative analysis of multiple sequence alignments
title_full Sequence Diversity Diagram for comparative analysis of multiple sequence alignments
title_fullStr Sequence Diversity Diagram for comparative analysis of multiple sequence alignments
title_full_unstemmed Sequence Diversity Diagram for comparative analysis of multiple sequence alignments
title_short Sequence Diversity Diagram for comparative analysis of multiple sequence alignments
title_sort sequence diversity diagram for comparative analysis of multiple sequence alignments
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4155614/
https://www.ncbi.nlm.nih.gov/pubmed/25237396
http://dx.doi.org/10.1186/1753-6561-8-S2-S9
work_keys_str_mv AT sakairyo sequencediversitydiagramforcomparativeanalysisofmultiplesequencealignments
AT aertsjan sequencediversitydiagramforcomparativeanalysisofmultiplesequencealignments