Robust gene selection methods using weighting schemes for microarray data analysis
BACKGROUND: A common task in microarray data analysis is to identify informative genes that are differentially expressed between two different states. Owing to the high-dimensional nature of microarray data, identification of significant genes has been essential in analyzing the data. However, the p...
Main authors: | Kang, Suyeon; Song, Jongwoo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5581932/ https://www.ncbi.nlm.nih.gov/pubmed/28865426 http://dx.doi.org/10.1186/s12859-017-1810-x |
_version_ | 1783261118860361728 |
---|---|
author | Kang, Suyeon Song, Jongwoo |
author_facet | Kang, Suyeon Song, Jongwoo |
author_sort | Kang, Suyeon |
collection | PubMed |
description | BACKGROUND: A common task in microarray data analysis is to identify informative genes that are differentially expressed between two different states. Owing to the high-dimensional nature of microarray data, identification of significant genes has been essential in analyzing the data. However, the performances of many gene selection techniques are highly dependent on the experimental conditions, such as the presence of measurement error or a limited number of sample replicates. RESULTS: We have proposed new filter-based gene selection techniques, by applying a simple modification to significance analysis of microarrays (SAM). To prove the effectiveness of the proposed method, we considered a series of synthetic datasets with different noise levels and sample sizes along with two real datasets. The following findings were made. First, our proposed methods outperform conventional methods for all simulation set-ups. In particular, our methods are much better when the given data are noisy and sample size is small. They showed relatively robust performance regardless of noise level and sample size, whereas the performance of SAM became significantly worse as the noise level became high or sample size decreased. When sufficient sample replicates were available, SAM and our methods showed similar performance. Finally, our proposed methods are competitive with traditional methods in classification tasks for microarrays. CONCLUSIONS: The results of simulation study and real data analysis have demonstrated that our proposed methods are effective for detecting significant genes and classification tasks, especially when the given data are noisy or have few sample replicates. By employing weighting schemes, we can obtain robust and reliable results for microarray data analysis. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-017-1810-x) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5581932 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-55819322017-09-06 Robust gene selection methods using weighting schemes for microarray data analysis Kang, Suyeon Song, Jongwoo BMC Bioinformatics Methodology Article BACKGROUND: A common task in microarray data analysis is to identify informative genes that are differentially expressed between two different states. Owing to the high-dimensional nature of microarray data, identification of significant genes has been essential in analyzing the data. However, the performances of many gene selection techniques are highly dependent on the experimental conditions, such as the presence of measurement error or a limited number of sample replicates. RESULTS: We have proposed new filter-based gene selection techniques, by applying a simple modification to significance analysis of microarrays (SAM). To prove the effectiveness of the proposed method, we considered a series of synthetic datasets with different noise levels and sample sizes along with two real datasets. The following findings were made. First, our proposed methods outperform conventional methods for all simulation set-ups. In particular, our methods are much better when the given data are noisy and sample size is small. They showed relatively robust performance regardless of noise level and sample size, whereas the performance of SAM became significantly worse as the noise level became high or sample size decreased. When sufficient sample replicates were available, SAM and our methods showed similar performance. Finally, our proposed methods are competitive with traditional methods in classification tasks for microarrays. CONCLUSIONS: The results of simulation study and real data analysis have demonstrated that our proposed methods are effective for detecting significant genes and classification tasks, especially when the given data are noisy or have few sample replicates. By employing weighting schemes, we can obtain robust and reliable results for microarray data analysis. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-017-1810-x) contains supplementary material, which is available to authorized users. BioMed Central 2017-09-02 /pmc/articles/PMC5581932/ /pubmed/28865426 http://dx.doi.org/10.1186/s12859-017-1810-x Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Kang, Suyeon Song, Jongwoo Robust gene selection methods using weighting schemes for microarray data analysis |
title | Robust gene selection methods using weighting schemes for microarray data analysis |
title_full | Robust gene selection methods using weighting schemes for microarray data analysis |
title_fullStr | Robust gene selection methods using weighting schemes for microarray data analysis |
title_full_unstemmed | Robust gene selection methods using weighting schemes for microarray data analysis |
title_short | Robust gene selection methods using weighting schemes for microarray data analysis |
title_sort | robust gene selection methods using weighting schemes for microarray data analysis |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5581932/ https://www.ncbi.nlm.nih.gov/pubmed/28865426 http://dx.doi.org/10.1186/s12859-017-1810-x |
work_keys_str_mv | AT kangsuyeon robustgeneselectionmethodsusingweightingschemesformicroarraydataanalysis AT songjongwoo robustgeneselectionmethodsusingweightingschemesformicroarraydataanalysis |
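
The description in this record says the proposed methods modify significance analysis of microarrays (SAM), but it does not state what the weighting schemes are. As background only, the sketch below shows the standard SAM relative-difference statistic for a single gene measured in two groups, which is the baseline the record refers to. It is not the authors' method. The function names `sam_statistic` and `rank_genes` are hypothetical, and the default `s0=0.1` is an arbitrary illustrative value (in SAM the constant is chosen from the data).

```python
from math import sqrt
from statistics import mean


def sam_statistic(group_a, group_b, s0=0.0):
    """Standard SAM relative difference d = (mean_a - mean_b) / (s + s0) for one gene."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = mean(group_a), mean(group_b)
    # Pooled sum of squared deviations across both groups.
    ss = sum((x - mean_a) ** 2 for x in group_a) + sum((x - mean_b) ** 2 for x in group_b)
    # Gene-specific standard error of the difference in means.
    s = sqrt((1.0 / n_a + 1.0 / n_b) * ss / (n_a + n_b - 2))
    # s0 is a small positive constant that damps genes with tiny variance.
    return (mean_a - mean_b) / (s + s0)


def rank_genes(expr_a, expr_b, s0=0.1):
    """Return gene indices sorted by decreasing |d| (illustrative filter-style ranking).

    expr_a and expr_b are lists of per-gene replicate lists, one entry per gene.
    The default s0 is an assumption for illustration, not a recommended value.
    """
    scores = [sam_statistic(a, b, s0) for a, b in zip(expr_a, expr_b)]
    return sorted(range(len(scores)), key=lambda i: abs(scores[i]), reverse=True)
```

A filter-based selection would keep the top-ranked indices returned by `rank_genes`. Note that with `s0=0.0` a gene with zero variance in both groups causes a division by zero, which is one reason SAM adds the constant.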