Identification of differentially expressed genes by means of outlier detection

BACKGROUND: An important task in microarray data analysis is to select, from thousands of genes, a small number of informative differentially expressed (DE) genes which may be key elements for a disease. If each gene is analyzed individually, there is a large number of hypotheses to test and a multiple compar...

Bibliographic Details
Main Authors: Irigoien, Itziar, Arenas, Concepción
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6131896/
https://www.ncbi.nlm.nih.gov/pubmed/30200879
http://dx.doi.org/10.1186/s12859-018-2318-8
author Irigoien, Itziar
Arenas, Concepción
author_facet Irigoien, Itziar
Arenas, Concepción
author_sort Irigoien, Itziar
collection PubMed
description BACKGROUND: An important task in microarray data analysis is to select, from thousands of genes, a small number of informative differentially expressed (DE) genes which may be key elements for a disease. If each gene is analyzed individually, there is a large number of hypotheses to test and a multiple comparison correction method must be used. Consequently, the resulting cut-off value may be too small. Moreover, the replicability of the selected DE genes is an important issue. We present a new method, called ORdensity, to obtain a reproducible selection of DE genes. It takes into account the relation between all genes and is not a gene-by-gene approach, unlike the techniques usually applied to DE gene selection. RESULTS: The proposed method returns three measures, related to the concepts of outlier and density of false positives in a neighbourhood, which allow us to identify the DE genes with high classification accuracy. To assess the performance of ORdensity, we used simulated microarray data and four real microarray cancer data sets. The results indicated that the method correctly detects the DE genes; it is competitive with other well-accepted methods; the list of DE genes that it obtains is useful for the correct classification or diagnosis of new samples; and, in general, it is more stable than other procedures. CONCLUSIONS: ORdensity is a new method for identifying DE genes that avoids some of the shortcomings of individual gene identification and is stable when the original sample is replaced by subsamples. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2318-8) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6131896
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-61318962018-09-13 Identification of differentially expressed genes by means of outlier detection Irigoien, Itziar Arenas, Concepción BMC Bioinformatics Methodology Article
BioMed Central 2018-09-10 /pmc/articles/PMC6131896/ /pubmed/30200879 http://dx.doi.org/10.1186/s12859-018-2318-8 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Irigoien, Itziar
Arenas, Concepción
Identification of differentially expressed genes by means of outlier detection
title Identification of differentially expressed genes by means of outlier detection
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6131896/
https://www.ncbi.nlm.nih.gov/pubmed/30200879
http://dx.doi.org/10.1186/s12859-018-2318-8