Unsupervised Outlier Profile Analysis

In much of the analysis of high-throughput genomic data, “interesting” genes have been selected based on assessment of differential expression between two groups or generalizations thereof. Most of the literature focuses on changes in mean expression or the entire distribution. In this article, we e...

Bibliographic Details
Main authors: Ghosh, Debashis; Li, Song
Format: Online Article Text
Language: English
Published: Libertas Academica 2014
Subjects: Review
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4218656/
https://www.ncbi.nlm.nih.gov/pubmed/25452686
http://dx.doi.org/10.4137/CIN.S13969
_version_ 1782342453098971136
author Ghosh, Debashis
Li, Song
author_facet Ghosh, Debashis
Li, Song
author_sort Ghosh, Debashis
collection PubMed
description In much of the analysis of high-throughput genomic data, “interesting” genes have been selected based on assessment of differential expression between two groups or generalizations thereof. Most of the literature focuses on changes in mean expression or the entire distribution. In this article, we explore the use of C(α) tests, which have been applied in other genomic data settings. Their use for the outlier expression problem, in particular with continuous data, is problematic but nevertheless motivates new statistics that give an unsupervised analog to previously developed outlier profile analysis approaches. Some simulation studies are used to evaluate the proposal. A bivariate extension is described that can accommodate data from two platforms on matched samples. The proposed methods are applied to data from a prostate cancer study.
format Online
Article
Text
id pubmed-4218656
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Libertas Academica
record_format MEDLINE/PubMed
spelling pubmed-42186562014-12-01 Unsupervised Outlier Profile Analysis Ghosh, Debashis Li, Song Cancer Inform Review In much of the analysis of high-throughput genomic data, “interesting” genes have been selected based on assessment of differential expression between two groups or generalizations thereof. Most of the literature focuses on changes in mean expression or the entire distribution. In this article, we explore the use of C(α) tests, which have been applied in other genomic data settings. Their use for the outlier expression problem, in particular with continuous data, is problematic but nevertheless motivates new statistics that give an unsupervised analog to previously developed outlier profile analysis approaches. Some simulation studies are used to evaluate the proposal. A bivariate extension is described that can accommodate data from two platforms on matched samples. The proposed methods are applied to data from a prostate cancer study. Libertas Academica 2014-10-15 /pmc/articles/PMC4218656/ /pubmed/25452686 http://dx.doi.org/10.4137/CIN.S13969 Text en © 2014 the author(s), publisher and licensee Libertas Academica Ltd. This is an open-access article distributed under the terms of the Creative Commons CC-BY-NC 3.0 License.
spellingShingle Review
Ghosh, Debashis
Li, Song
Unsupervised Outlier Profile Analysis
title Unsupervised Outlier Profile Analysis
title_full Unsupervised Outlier Profile Analysis
title_fullStr Unsupervised Outlier Profile Analysis
title_full_unstemmed Unsupervised Outlier Profile Analysis
title_short Unsupervised Outlier Profile Analysis
title_sort unsupervised outlier profile analysis
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4218656/
https://www.ncbi.nlm.nih.gov/pubmed/25452686
http://dx.doi.org/10.4137/CIN.S13969
work_keys_str_mv AT ghoshdebashis unsupervisedoutlierprofileanalysis
AT lisong unsupervisedoutlierprofileanalysis