
Logo2PWM: a tool to convert sequence logo to position weight matrix


Bibliographic Details
Main authors: Gao, Zhen, Liu, Lu, Ruan, Jianhua
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5629559/
https://www.ncbi.nlm.nih.gov/pubmed/28984206
http://dx.doi.org/10.1186/s12864-017-4023-9
author Gao, Zhen
Liu, Lu
Ruan, Jianhua
collection PubMed
description BACKGROUND: The position weight matrix (PWM) and the sequence logo are the most widely used representations of transcription factor binding sites (TFBSs) in biological sequences. The sequence logo, a graphical representation of a PWM, is widely used in scientific publications and reports because it is easy to read, information-rich, and simple in format. The PWM, by contrast, is a precise and compact digital form that a variety of motif analysis software can readily use. A few tools are available to generate sequence logos from PWMs; however, no tool does the reverse. Such a tool is needed to scan for a TFBS that a publication presents only as a logo, with the PWM not provided or hard to obtain. A major difficulty in developing such a tool is dealing with the diversity of sequence logo images. RESULTS: We propose logo2PWM for reconstructing PWMs from a large variety of sequence logo images. Evaluation on over one thousand logos from three sources with different logo formats shows that the correlation between the reconstructed PWMs and the original PWMs is consistently high, with a median correlation greater than 0.97. CONCLUSION: Because of its high recognition accuracy, ease of use, and availability as both a web-based service and a stand-alone application, we believe that logo2PWM can readily benefit the study of transcription by filling the gap between sequence logo and PWM.
format Online
Article
Text
id pubmed-5629559
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5629559 2017-10-13 Logo2PWM: a tool to convert sequence logo to position weight matrix Gao, Zhen Liu, Lu Ruan, Jianhua BMC Genomics Research
BioMed Central 2017-10-03 /pmc/articles/PMC5629559/ /pubmed/28984206 http://dx.doi.org/10.1186/s12864-017-4023-9 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Logo2PWM: a tool to convert sequence logo to position weight matrix
topic Research