Logo2PWM: a tool to convert sequence logo to position weight matrix
BACKGROUND: Position weight matrices (PWMs) and sequence logos are the most widely used representations of transcription factor binding sites (TFBSs) in biological sequences. The sequence logo, a graphical representation of a PWM, has been widely used in scientific publications and reports, due to its easiness...
Main Authors: | Gao, Zhen; Liu, Lu; Ruan, Jianhua |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5629559/ https://www.ncbi.nlm.nih.gov/pubmed/28984206 http://dx.doi.org/10.1186/s12864-017-4023-9 |
_version_ | 1783269067135647744 |
---|---|
author | Gao, Zhen; Liu, Lu; Ruan, Jianhua |
author_facet | Gao, Zhen; Liu, Lu; Ruan, Jianhua |
author_sort | Gao, Zhen |
collection | PubMed |
description | BACKGROUND: Position weight matrices (PWMs) and sequence logos are the most widely used representations of transcription factor binding sites (TFBSs) in biological sequences. The sequence logo, a graphical representation of a PWM, has been widely used in scientific publications and reports because it is easy to read, rich in information, and simple in format. In contrast, a PWM is a precise and compact digital form that can be used directly by a variety of motif analysis software. A few tools are available to generate sequence logos from PWMs; however, no tool does the reverse. A tool that converts a sequence logo back to a PWM is needed to scan for a TFBS that is shown as a logo in a publication where the PWM is not provided or is hard to obtain. A major difficulty in developing such a tool is dealing with the diversity of sequence logo images. RESULTS: We propose logo2PWM for reconstructing PWMs from a large variety of sequence logo images. Evaluation on over one thousand logos from three sources with different logo formats shows that the correlation between the reconstructed PWMs and the original PWMs is consistently high, with a median correlation greater than 0.97. CONCLUSION: Because of its high recognition accuracy, ease of use, and availability as both a web-based service and a stand-alone application, we believe that logo2PWM can readily benefit the study of transcription by filling the gap between sequence logo and PWM. |
format | Online Article Text |
id | pubmed-5629559 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5629559 2017-10-13 Logo2PWM: a tool to convert sequence logo to position weight matrix Gao, Zhen; Liu, Lu; Ruan, Jianhua BMC Genomics Research BioMed Central 2017-10-03 /pmc/articles/PMC5629559/ /pubmed/28984206 http://dx.doi.org/10.1186/s12864-017-4023-9 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Gao, Zhen Liu, Lu Ruan, Jianhua Logo2PWM: a tool to convert sequence logo to position weight matrix |
title | Logo2PWM: a tool to convert sequence logo to position weight matrix |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5629559/ https://www.ncbi.nlm.nih.gov/pubmed/28984206 http://dx.doi.org/10.1186/s12864-017-4023-9 |
work_keys_str_mv | AT gaozhen logo2pwmatooltoconvertsequencelogotopositionweightmatrix AT liulu logo2pwmatooltoconvertsequencelogotopositionweightmatrix AT ruanjianhua logo2pwmatooltoconvertsequencelogotopositionweightmatrix |
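
The abstract in this record describes sequence logos as graphical representations of PWMs and reports accuracy as a correlation between reconstructed and original PWMs. The following is a minimal sketch of that relationship, assuming the standard logo convention in which each letter's height equals its probability times the column's information content (2 bits minus the Shannon entropy for DNA, with no small-sample correction). It is not the logo2PWM image-processing method, which this record does not describe; the function names `logo_heights`, `column_from_heights`, and `pearson` are hypothetical, and the letter heights are taken as given rather than measured from an image.

```python
import math

BASES = "ACGT"  # column order assumed throughout this sketch


def logo_heights(column):
    """Letter heights in bits for one PWM column of base probabilities."""
    entropy = -sum(p * math.log2(p) for p in column if p > 0)
    info = 2.0 - entropy  # information content for a 4-letter alphabet
    return [p * info for p in column]


def column_from_heights(heights):
    """Recover base probabilities from the letter heights of one logo column."""
    total = sum(heights)
    if total == 0:
        # A zero-height column carries no information: assume uniform.
        return [0.25] * len(heights)
    return [h / total for h in heights]


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Illustrative PWM (made-up values), one row per position, columns in BASES order.
original = [
    [0.70, 0.10, 0.10, 0.10],
    [0.25, 0.25, 0.25, 0.25],
    [0.00, 0.00, 1.00, 0.00],
]

# Forward step (PWM to logo heights), then the reverse step (heights to PWM).
heights = [logo_heights(col) for col in original]
reconstructed = [column_from_heights(h) for h in heights]

# Compare the two matrices entry by entry, as a single flattened vector.
flat_original = [p for col in original for p in col]
flat_reconstructed = [p for col in reconstructed for p in col]
correlation = pearson(flat_original, flat_reconstructed)
```

With exact heights the round trip recovers the original probabilities up to floating-point error, because dividing each height by the column total cancels the information-content factor. Heights read from a real logo image would carry measurement noise, which is why a correlation below 1 is a natural accuracy measure.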