Disentangling Multidimensional Spatio-Temporal Data into Their Common and Aberrant Responses

With the advent of high-throughput measurement techniques, scientists and engineers are starting to grapple with massive data sets and encountering challenges with how to organize, process and extract information into meaningful structures. Multidimensional spatio-temporal biological data sets such...

Bibliographic Details
Main authors: Chang, Young Hwan, Korkola, James, Amin, Dhara N., Moasser, Mark M., Carmena, Jose M., Gray, Joe W., Tomlin, Claire J.
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4406848/
https://www.ncbi.nlm.nih.gov/pubmed/25901353
http://dx.doi.org/10.1371/journal.pone.0121607
author Chang, Young Hwan
Korkola, James
Amin, Dhara N.
Moasser, Mark M.
Carmena, Jose M.
Gray, Joe W.
Tomlin, Claire J.
collection PubMed
description With the advent of high-throughput measurement techniques, scientists and engineers are starting to grapple with massive data sets and are encountering challenges in organizing, processing, and extracting information into meaningful structures. Multidimensional spatio-temporal biological data sets, such as time-series gene expression with various perturbations over different cell lines, or neural spike trains across many experimental trials, have the potential to yield insight into the dynamic behavior of the system. For this potential to be realized, we need a suitable representation to understand the data. A general question is how to organize the observed data into meaningful structures and how to find an appropriate similarity measure. A natural way of viewing these complex high-dimensional data sets is to examine and analyze the large-scale features and then to focus on the interesting details. Since the wide range of experiments and the unknown complexity of the underlying system contribute to the heterogeneity of biological data, we develop a new method by proposing an extension of Robust Principal Component Analysis (RPCA), which models common variations across multiple experiments as the low-rank component and anomalies across these experiments as the sparse component. We show that, by separating these common responses from abnormal responses, the proposed method is able to find distinct subtypes and classify data sets in a robust way without any prior knowledge. Thus, the proposed method provides a new representation of these data sets, which has the potential to help users acquire new insight from data.
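For orientation, the low-rank plus sparse split mentioned in the description can be sketched with classical RPCA (Principal Component Pursuit). This is a minimal sketch under stated assumptions, not the authors' extension: the function names (`rpca`, `soft_threshold`, `svd_threshold`), the default weight `lam = 1/sqrt(max(m, n))`, and the penalty heuristic for `mu` are taken from common practice and do not come from this record. The data matrix `M` is assumed to be a nonzero 2-D array, for example with time points as rows and experiments as columns.

```python
import numpy as np


def soft_threshold(X, tau):
    """Elementwise shrinkage: proximal operator of tau * L1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def svd_threshold(X, tau):
    """Shrink singular values by tau: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * soft_threshold(s, tau)) @ Vt


def rpca(M, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Split M into L (low rank) + S (sparse) with a fixed-penalty ADMM loop.

    Approximately solves: minimize ||L||_* + lam * ||S||_1  subject to  L + S = M.
    Assumes M is a nonzero 2-D array (hypothetical helper, not the paper's code).
    """
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))         # common default weight (assumption)
    if mu is None:
        mu = m * n / (4.0 * np.abs(M).sum())   # common penalty heuristic (assumption)

    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)                       # Lagrange multiplier for L + S = M
    norm_M = np.linalg.norm(M)                 # Frobenius norm

    for _ in range(max_iter):
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)   # update low-rank part
        S = soft_threshold(M - L + Y / mu, lam / mu)  # update sparse part
        R = M - L - S                                 # constraint residual
        Y = Y + mu * R                                # dual ascent step
        if np.linalg.norm(R) <= tol * norm_M:
            break
    return L, S
```

Under these assumptions, `L, S = rpca(M)` returns a low-rank estimate `L` that plays the role of the common response and a sparse estimate `S` that plays the role of the aberrant response. How the published method extends this to multiple experiments is not specified in this record.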
format Online
Article
Text
id pubmed-4406848
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4406848 2015-05-07 Disentangling Multidimensional Spatio-Temporal Data into Their Common and Aberrant Responses Chang, Young Hwan Korkola, James Amin, Dhara N. Moasser, Mark M. Carmena, Jose M. Gray, Joe W. Tomlin, Claire J. PLoS One Research Article
Public Library of Science 2015-04-22 /pmc/articles/PMC4406848/ /pubmed/25901353 http://dx.doi.org/10.1371/journal.pone.0121607 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration, which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose.
title Disentangling Multidimensional Spatio-Temporal Data into Their Common and Aberrant Responses
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4406848/
https://www.ncbi.nlm.nih.gov/pubmed/25901353
http://dx.doi.org/10.1371/journal.pone.0121607