dagLogo: An R/Bioconductor package for identifying and visualizing differential amino acid group usage in proteomics data

Sequence logos have been widely used as graphical representations of conserved nucleic acid and protein motifs. Due to the complexity of the amino acid (AA) alphabet, rich post-translational modification, and diverse subcellular localization of proteins, few versatile tools are available for effecti...

Bibliographic Details
Main authors: Ou, Jianhong; Liu, Haibo; Nirala, Niraj K.; Stukalov, Alexey; Acharya, Usha; Green, Michael R.; Zhu, Lihua Julie
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects: Research Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7647101/
https://www.ncbi.nlm.nih.gov/pubmed/33156866
http://dx.doi.org/10.1371/journal.pone.0242030
_version_ 1783606888706867200
author Ou, Jianhong
Liu, Haibo
Nirala, Niraj K.
Stukalov, Alexey
Acharya, Usha
Green, Michael R.
Zhu, Lihua Julie
collection PubMed
description Sequence logos have been widely used as graphical representations of conserved nucleic acid and protein motifs. Due to the complexity of the amino acid (AA) alphabet, rich post-translational modification, and diverse subcellular localization of proteins, few versatile tools are available for effective identification and visualization of protein motifs. In addition, various reduced AA alphabets based on physicochemical, structural, or functional properties have been valuable in the study of protein alignment, folding, structure prediction, and evolution. However, there is a lack of tools for applying reduced AA alphabets to the identification and visualization of statistically significant motifs. To fill this gap, we developed an R/Bioconductor package, dagLogo, which has several advantages over existing tools. First, dagLogo accepts input sets in various formats and provides comprehensive options for building optimal background models. Second, it implements different reduced AA alphabets to group AAs with similar properties. Furthermore, dagLogo provides statistical and visual solutions for differential AA (or AA group) usage analysis of both large and small data sets. Case studies showed that dagLogo can better identify and visualize conserved protein sequence patterns from different types of inputs and can potentially reveal biological patterns that other logo generators could miss.
format Online
Article
Text
id pubmed-7647101
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7647101 2020-11-16 PLoS One Research Article
Public Library of Science 2020-11-06 /pmc/articles/PMC7647101/ /pubmed/33156866 http://dx.doi.org/10.1371/journal.pone.0242030 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication.
title dagLogo: An R/Bioconductor package for identifying and visualizing differential amino acid group usage in proteomics data
topic Research Article