
StickWRLD as an Interactive Visual Pre-Filter for Canceromics-Centric Expression Quantitative Trait Locus Data

Bibliographic Details
Main authors: Rumpf, Robert Wolfgang; Wolock, Samuel L; Ray, William C
Format: Online Article Text
Language: English
Published: Libertas Academica 2014
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4214597/
https://www.ncbi.nlm.nih.gov/pubmed/25368511
http://dx.doi.org/10.4137/CIN.S14024
Description
Summary: As datasets increase in complexity, the time required for analysis (both computational and human domain-expert) increases. One of the significant impediments introduced by such burgeoning data is the difficulty in knowing what features to include or exclude from statistical models. Simple tables of summary statistics rarely provide an adequate picture of the patterns and details of the dataset to enable researchers to make well-informed decisions about the adequacy of the models they are constructing. We have developed a tool, StickWRLD, which allows the user to visually browse through their data, displaying all possible correlations. By allowing the user to dynamically modify the retention parameters (both P and the residual, r), StickWRLD allows the user to identify significant correlations and disregard potential correlations that do not meet those same criteria – effectively filtering through all possible correlations quickly and identifying possible relationships of interest for further analysis. In this study, we applied StickWRLD to a semi-synthetic dataset constructed from two published human datasets. In addition to detecting high-probability correlations in this dataset, we were able to quickly identify gene–SNP correlations that would have gone undetected using more traditional approaches due to issues of low penetrance.