Addressing the unmet need for visualizing conditional random fields in biological data
BACKGROUND: The biological world is replete with phenomena that appear to be ideally modeled and analyzed by one archetypal statistical framework - the Graphical Probabilistic Model (GPM). The structure of GPMs is a uniquely good match for biological problems that range from aligning sequences to mo...
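Main Authors: | Ray, William C; Wolock, Samuel L; Callahan, Nicholas W; Dong, Min; Li, Q Quinn; Liang, Chun; Magliery, Thomas J; Bartlett, Christopher W |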
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4227292/ https://www.ncbi.nlm.nih.gov/pubmed/25000815 http://dx.doi.org/10.1186/1471-2105-15-202 |
_version_ | 1782343777670660096 |
---|---|
author | Ray, William C Wolock, Samuel L Callahan, Nicholas W Dong, Min Li, Q Quinn Liang, Chun Magliery, Thomas J Bartlett, Christopher W |
author_sort | Ray, William C |
collection | PubMed |
description | BACKGROUND: The biological world is replete with phenomena that appear to be ideally modeled and analyzed by one archetypal statistical framework - the Graphical Probabilistic Model (GPM). The structure of GPMs is a uniquely good match for biological problems that range from aligning sequences to modeling the genome-to-phenome relationship. The fundamental questions that GPMs address involve making decisions based on a complex web of interacting factors. Unfortunately, while GPMs ideally fit many questions in biology, they are not an easy solution to apply. Building a GPM is not a simple task for an end user. Moreover, applying GPMs is also impeded by the insidious fact that the “complex web of interacting factors” inherent to a problem might be easy to define and also intractable to compute upon. DISCUSSION: We propose that the visualization sciences can contribute to many domains of the bio-sciences, by developing tools to address archetypal representation and user interaction issues in GPMs, and in particular a variety of GPM called a Conditional Random Field (CRF). CRFs bring additional power, and additional complexity, because the CRF dependency network can be conditioned on the query data. CONCLUSIONS: In this manuscript we examine the shared features of several biological problems that are amenable to modeling with CRFs, highlight the challenges that existing visualization and visual analytics paradigms induce for these data, and document an experimental solution called StickWRLD which, while leaving room for improvement, has been successfully applied in several biological research projects. Software and tutorials are available at http://www.stickwrld.org/ |
format | Online Article Text |
id | pubmed-4227292 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4227292 2014-11-12 Addressing the unmet need for visualizing conditional random fields in biological data Ray, William C Wolock, Samuel L Callahan, Nicholas W Dong, Min Li, Q Quinn Liang, Chun Magliery, Thomas J Bartlett, Christopher W BMC Bioinformatics Research Article BACKGROUND: The biological world is replete with phenomena that appear to be ideally modeled and analyzed by one archetypal statistical framework - the Graphical Probabilistic Model (GPM). The structure of GPMs is a uniquely good match for biological problems that range from aligning sequences to modeling the genome-to-phenome relationship. The fundamental questions that GPMs address involve making decisions based on a complex web of interacting factors. Unfortunately, while GPMs ideally fit many questions in biology, they are not an easy solution to apply. Building a GPM is not a simple task for an end user. Moreover, applying GPMs is also impeded by the insidious fact that the “complex web of interacting factors” inherent to a problem might be easy to define and also intractable to compute upon. DISCUSSION: We propose that the visualization sciences can contribute to many domains of the bio-sciences, by developing tools to address archetypal representation and user interaction issues in GPMs, and in particular a variety of GPM called a Conditional Random Field (CRF). CRFs bring additional power, and additional complexity, because the CRF dependency network can be conditioned on the query data. CONCLUSIONS: In this manuscript we examine the shared features of several biological problems that are amenable to modeling with CRFs, highlight the challenges that existing visualization and visual analytics paradigms induce for these data, and document an experimental solution called StickWRLD which, while leaving room for improvement, has been successfully applied in several biological research projects. Software and tutorials are available at http://www.stickwrld.org/ BioMed Central 2014-07-07 /pmc/articles/PMC4227292/ /pubmed/25000815 http://dx.doi.org/10.1186/1471-2105-15-202 Text en Copyright © 2014 Ray et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Addressing the unmet need for visualizing conditional random fields in biological data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4227292/ https://www.ncbi.nlm.nih.gov/pubmed/25000815 http://dx.doi.org/10.1186/1471-2105-15-202 |