Identifying Genes Relevant to Specific Biological Conditions in Time Course Microarray Experiments
Microarrays have been useful in understanding various biological processes by allowing the simultaneous study of the expression of thousands of genes. However, the analysis of microarray data is a challenging task. One of the key problems in microarray analysis is the classification of unknown expre...
Main authors: | Singh, Nitesh Kumar; Repsilber, Dirk; Liebscher, Volkmar; Taher, Leila; Fuellen, Georg |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2013 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3795718/ https://www.ncbi.nlm.nih.gov/pubmed/24146889 http://dx.doi.org/10.1371/journal.pone.0076561 |
_version_ | 1782287420703637504 |
---|---|
author | Singh, Nitesh Kumar Repsilber, Dirk Liebscher, Volkmar Taher, Leila Fuellen, Georg |
author_facet | Singh, Nitesh Kumar Repsilber, Dirk Liebscher, Volkmar Taher, Leila Fuellen, Georg |
author_sort | Singh, Nitesh Kumar |
collection | PubMed |
description | Microarrays have been useful in understanding various biological processes by allowing the simultaneous study of the expression of thousands of genes. However, the analysis of microarray data is a challenging task. One of the key problems in microarray analysis is the classification of unknown expression profiles. Specifically, the often large number of non-informative genes on the microarray adversely affects the performance and efficiency of classification algorithms. Furthermore, the skewed ratio of sample to variable poses a risk of overfitting. Thus, in this context, feature selection methods become crucial to select relevant genes and, hence, improve classification accuracy. In this study, we investigated feature selection methods based on gene expression profiles and protein interactions. We found that in our setup, the addition of protein interaction information did not contribute to any significant improvement of the classification results. Furthermore, we developed a novel feature selection method that relies exclusively on observed gene expression changes in microarray experiments, which we call “relative Signal-to-Noise ratio” (rSNR). More precisely, the rSNR ranks genes based on their specificity to an experimental condition, by comparing intrinsic variation, i.e. variation in gene expression within an experimental condition, with extrinsic variation, i.e. variation in gene expression across experimental conditions. Genes with low variation within an experimental condition of interest and high variation across experimental conditions are ranked higher, and help in improving classification accuracy. We compared different feature selection methods on two time-series microarray datasets and one static microarray dataset. We found that the rSNR performed generally better than the other methods. |
format | Online Article Text |
id | pubmed-3795718 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-37957182013-10-21 Identifying Genes Relevant to Specific Biological Conditions in Time Course Microarray Experiments Singh, Nitesh Kumar Repsilber, Dirk Liebscher, Volkmar Taher, Leila Fuellen, Georg PLoS One Research Article Microarrays have been useful in understanding various biological processes by allowing the simultaneous study of the expression of thousands of genes. However, the analysis of microarray data is a challenging task. One of the key problems in microarray analysis is the classification of unknown expression profiles. Specifically, the often large number of non-informative genes on the microarray adversely affects the performance and efficiency of classification algorithms. Furthermore, the skewed ratio of sample to variable poses a risk of overfitting. Thus, in this context, feature selection methods become crucial to select relevant genes and, hence, improve classification accuracy. In this study, we investigated feature selection methods based on gene expression profiles and protein interactions. We found that in our setup, the addition of protein interaction information did not contribute to any significant improvement of the classification results. Furthermore, we developed a novel feature selection method that relies exclusively on observed gene expression changes in microarray experiments, which we call “relative Signal-to-Noise ratio” (rSNR). More precisely, the rSNR ranks genes based on their specificity to an experimental condition, by comparing intrinsic variation, i.e. variation in gene expression within an experimental condition, with extrinsic variation, i.e. variation in gene expression across experimental conditions. Genes with low variation within an experimental condition of interest and high variation across experimental conditions are ranked higher, and help in improving classification accuracy. 
We compared different feature selection methods on two time-series microarray datasets and one static microarray dataset. We found that the rSNR performed generally better than the other methods. Public Library of Science 2013-10-11 /pmc/articles/PMC3795718/ /pubmed/24146889 http://dx.doi.org/10.1371/journal.pone.0076561 Text en © 2013 Singh et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Singh, Nitesh Kumar Repsilber, Dirk Liebscher, Volkmar Taher, Leila Fuellen, Georg Identifying Genes Relevant to Specific Biological Conditions in Time Course Microarray Experiments |
title | Identifying Genes Relevant to Specific Biological Conditions in Time Course Microarray Experiments |
title_full | Identifying Genes Relevant to Specific Biological Conditions in Time Course Microarray Experiments |
title_fullStr | Identifying Genes Relevant to Specific Biological Conditions in Time Course Microarray Experiments |
title_full_unstemmed | Identifying Genes Relevant to Specific Biological Conditions in Time Course Microarray Experiments |
title_short | Identifying Genes Relevant to Specific Biological Conditions in Time Course Microarray Experiments |
title_sort | identifying genes relevant to specific biological conditions in time course microarray experiments |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3795718/ https://www.ncbi.nlm.nih.gov/pubmed/24146889 http://dx.doi.org/10.1371/journal.pone.0076561 |
work_keys_str_mv | AT singhniteshkumar identifyinggenesrelevanttospecificbiologicalconditionsintimecoursemicroarrayexperiments AT repsilberdirk identifyinggenesrelevanttospecificbiologicalconditionsintimecoursemicroarrayexperiments AT liebschervolkmar identifyinggenesrelevanttospecificbiologicalconditionsintimecoursemicroarrayexperiments AT taherleila identifyinggenesrelevanttospecificbiologicalconditionsintimecoursemicroarrayexperiments AT fuellengeorg identifyinggenesrelevanttospecificbiologicalconditionsintimecoursemicroarrayexperiments |
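
The description field in this record says that the rSNR ranks genes higher when they show low variation within an experimental condition of interest and high variation across experimental conditions. The record does not give the formula, so the following is only a toy sketch of that ranking idea and not the authors' rSNR. The function names, the variance ratio used as a score, the `eps` term, and the example values are all assumptions made for illustration.

```python
from statistics import mean, pvariance


def toy_specificity_score(expr_by_condition, condition, eps=1e-9):
    """Illustrative score for one gene (an assumption, not the published rSNR).

    expr_by_condition maps a condition name to that gene's expression values.
    """
    # Intrinsic variation: spread of the gene within the condition of interest.
    within = pvariance(expr_by_condition[condition])
    # Extrinsic variation: spread of the per-condition mean expression.
    condition_means = [mean(values) for values in expr_by_condition.values()]
    across = pvariance(condition_means)
    # eps avoids division by zero when the within-condition variance is 0.
    return across / (within + eps)


def rank_genes(data, condition):
    """Return gene names sorted from highest to lowest toy score."""
    scores = {
        gene: toy_specificity_score(profiles, condition)
        for gene, profiles in data.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Made-up expression values, for illustration only.
data = {
    "geneA": {"treated": [5.0, 5.1, 4.9], "control": [1.0, 1.2, 0.8]},
    "geneB": {"treated": [3.0, 6.0, 1.0], "control": [3.2, 3.4, 3.3]},
}
ranking = rank_genes(data, "treated")
print(ranking)
```

In this made-up example, `geneA` is stable within `treated` and differs clearly from `control`, so it is ranked ahead of `geneB`, which is noisy within `treated` and has nearly identical condition means.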