Addressing the unmet need for visualizing conditional random fields in biological data

BACKGROUND: The biological world is replete with phenomena that appear to be ideally modeled and analyzed by one archetypal statistical framework - the Graphical Probabilistic Model (GPM). The structure of GPMs is a uniquely good match for biological problems that range from aligning sequences to mo...

Bibliographic Details
Main authors: Ray, William C, Wolock, Samuel L, Callahan, Nicholas W, Dong, Min, Li, Q Quinn, Liang, Chun, Magliery, Thomas J, Bartlett, Christopher W
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4227292/
https://www.ncbi.nlm.nih.gov/pubmed/25000815
http://dx.doi.org/10.1186/1471-2105-15-202
author Ray, William C
Wolock, Samuel L
Callahan, Nicholas W
Dong, Min
Li, Q Quinn
Liang, Chun
Magliery, Thomas J
Bartlett, Christopher W
collection PubMed
description BACKGROUND: The biological world is replete with phenomena that appear to be ideally modeled and analyzed by one archetypal statistical framework - the Graphical Probabilistic Model (GPM). The structure of GPMs is a uniquely good match for biological problems that range from aligning sequences to modeling the genome-to-phenome relationship. The fundamental questions that GPMs address involve making decisions based on a complex web of interacting factors. Unfortunately, while GPMs ideally fit many questions in biology, they are not an easy solution to apply. Building a GPM is not a simple task for an end user. Moreover, applying GPMs is also impeded by the insidious fact that the “complex web of interacting factors” inherent to a problem might be easy to define and also intractable to compute upon. DISCUSSION: We propose that the visualization sciences can contribute to many domains of the bio-sciences by developing tools to address archetypal representation and user interaction issues in GPMs, and in particular a variety of GPM called a Conditional Random Field (CRF). CRFs bring additional power, and additional complexity, because the CRF dependency network can be conditioned on the query data. CONCLUSIONS: In this manuscript we examine the shared features of several biological problems that are amenable to modeling with CRFs, highlight the challenges that existing visualization and visual analytics paradigms induce for these data, and document an experimental solution called StickWRLD which, while leaving room for improvement, has been successfully applied in several biological research projects. Software and tutorials are available at http://www.stickwrld.org/
format Online
Article
Text
id pubmed-4227292
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4227292 2014-11-12 BMC Bioinformatics Research Article BioMed Central 2014-07-07 /pmc/articles/PMC4227292/ /pubmed/25000815 http://dx.doi.org/10.1186/1471-2105-15-202 Text en Copyright © 2014 Ray et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Addressing the unmet need for visualizing conditional random fields in biological data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4227292/
https://www.ncbi.nlm.nih.gov/pubmed/25000815
http://dx.doi.org/10.1186/1471-2105-15-202