
Prediction of heterogeneous differential genes by detecting outliers to a Gaussian tight cluster

BACKGROUND: Heterogeneously and differentially expressed genes (hDEG) are a common phenomenon due to biological diversity. A hDEG is often observed in gene expression experiments (with two experimental conditions) where it is highly expressed in a few experimental samples, or in drug trial experime...


Bibliographic Details
Main authors: Yang, Zihua, Yang, Zhengrong
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3699364/
https://www.ncbi.nlm.nih.gov/pubmed/23497043
http://dx.doi.org/10.1186/1471-2105-14-81
_version_ 1782275371358486528
author Yang, Zihua
Yang, Zhengrong
collection PubMed
description BACKGROUND: Heterogeneously and differentially expressed genes (hDEG) are a common phenomenon due to biological diversity. A hDEG is often observed in gene expression experiments (with two experimental conditions) where it is highly expressed in a few experimental samples, or in drug trial experiments for cancer studies with drug resistance heterogeneity among the disease group. These highly expressed samples are called outliers. Accurate detection of outliers among hDEGs is then desirable for disease diagnosis and effective drug design. The standard approach for detecting hDEGs is to choose the appropriate subset of outliers to represent the experimental group. However, existing methods typically overlook hDEGs with very few outliers. RESULTS: We present in this paper a simple algorithm for detecting hDEGs by sequentially testing for potential outliers with respect to a tight cluster of non-outliers, among an ordered subset of the experimental samples. This avoids making any restrictive assumptions about how the outliers are distributed. We use simulated and real data to illustrate that the proposed algorithm achieves a good separation between the tight cluster of low expressions and the outliers for hDEGs. CONCLUSIONS: The proposed algorithm assesses each potential outlier in relation to the cluster of potential outliers without making explicit assumptions about the outlier distribution. Simulated examples and breast cancer data sets are used to illustrate the suitability of the proposed algorithm for identifying hDEGs with small numbers of outliers.
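The sequential test summarized under RESULTS could be sketched roughly as below. This is an illustrative reading of the abstract only, not the authors' published procedure: the function name `split_outliers`, the parameters `min_cluster` and `alpha`, and the choice of a one-sided Gaussian tail test against the running cluster mean and standard deviation are all assumptions made for the example.

```python
from statistics import NormalDist, mean, stdev


def split_outliers(values, min_cluster=5, alpha=0.001):
    """Split one gene's expression values into (tight cluster, outliers).

    Hypothetical sketch: samples are ordered, the lowest `min_cluster`
    values seed the tight cluster, and each next-larger sample is tested
    against a Gaussian fitted to the current cluster.
    """
    ordered = sorted(values)
    if len(ordered) <= min_cluster:
        return ordered, []

    cluster = ordered[:min_cluster]
    for i in range(min_cluster, len(ordered)):
        x = ordered[i]
        mu, sd = mean(cluster), stdev(cluster)
        if sd == 0:
            # Degenerate cluster: any strictly larger value is flagged.
            p_upper = 0.0 if x > mu else 1.0
        else:
            # One-sided upper-tail probability under the cluster Gaussian.
            p_upper = 1.0 - NormalDist(mu, sd).cdf(x)
        if p_upper < alpha:
            # Values are ordered, so everything from here up is an outlier.
            return cluster, ordered[i:]
        cluster.append(x)
    return cluster, []


# Made-up expression values: seven low samples and two highly expressed ones.
expression = [1.0, 1.1, 0.9, 1.05, 0.95, 1.02, 0.98, 8.0, 9.5]
cluster, outliers = split_outliers(expression)
print(outliers)  # -> [8.0, 9.5]
```

Because the test is made only against the low-expression cluster, nothing is assumed about how the outliers themselves are distributed, which is the property the abstract emphasizes. The seed size and significance threshold here are arbitrary and would need tuning on real data.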
format Online
Article
Text
id pubmed-3699364
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3699364 2013-07-03 BMC Bioinformatics Methodology Article BioMed Central 2013-03-05 /pmc/articles/PMC3699364/ /pubmed/23497043 http://dx.doi.org/10.1186/1471-2105-14-81 Text en Copyright © 2013 Yang and Yang; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Prediction of heterogeneous differential genes by detecting outliers to a Gaussian tight cluster
topic Methodology Article