
A new sequence logo plot to highlight enrichment and depletion

BACKGROUND: Sequence logo plots have become a standard graphical tool for visualizing sequence motifs in DNA, RNA or protein sequences. However, standard logo plots primarily highlight enrichment of symbols, and may fail to highlight interesting depletions. Current alternatives that try to highlight...


Bibliographic Details
Main authors: Dey, Kushal K., Xie, Dongyue, Stephens, Matthew
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6288878/
https://www.ncbi.nlm.nih.gov/pubmed/30526486
http://dx.doi.org/10.1186/s12859-018-2489-3
author Dey, Kushal K.
Xie, Dongyue
Stephens, Matthew
collection PubMed
description BACKGROUND: Sequence logo plots have become a standard graphical tool for visualizing sequence motifs in DNA, RNA or protein sequences. However, standard logo plots primarily highlight enrichment of symbols, and may fail to highlight interesting depletions. Current alternatives that try to highlight depletion often produce visually cluttered logos. RESULTS: We introduce a new sequence logo plot, the EDLogo plot, that highlights both enrichment and depletion while minimizing visual clutter. We provide an easy-to-use and highly customizable R package, Logolas, to produce a range of logo plots, including EDLogo plots. This software also allows elements in the logo plot to be strings of characters, rather than a single character, extending the range of applications beyond the usual DNA, RNA or protein sequences. The software also includes new Empirical Bayes methods to stabilize estimates of enrichment and depletion, and thus better highlight the most significant patterns in data. We illustrate our methods and software on applications to transcription factor binding site motifs, protein sequence alignments and cancer mutation signature profiles. CONCLUSIONS: Our new EDLogo plots and flexible software implementation can help data analysts visualize both enrichment and depletion of characters (DNA sequence bases, amino acids, etc.) across a wide range of applications. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2489-3) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6288878
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6288878 2018-12-14 A new sequence logo plot to highlight enrichment and depletion Dey, Kushal K. Xie, Dongyue Stephens, Matthew BMC Bioinformatics Research Article
BioMed Central 2018-12-10 /pmc/articles/PMC6288878/ /pubmed/30526486 http://dx.doi.org/10.1186/s12859-018-2489-3 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A new sequence logo plot to highlight enrichment and depletion
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6288878/
https://www.ncbi.nlm.nih.gov/pubmed/30526486
http://dx.doi.org/10.1186/s12859-018-2489-3