Shrinkage regression-based methods for microarray missing value imputation

BACKGROUND: Missing values are common in microarray data, which usually contain more than 5% missing values, with up to 90% of genes affected. Inaccurate missing value estimation reduces the power of downstream microarray data analyses. Many types of methods have been developed to...

Bibliographic Details
Main Authors: Wang, Hsiuying; Chiu, Chia-Chun; Wu, Yi-Ching; Wu, Wei-Sheng
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4028886/
https://www.ncbi.nlm.nih.gov/pubmed/24565159
http://dx.doi.org/10.1186/1752-0509-7-S6-S11
author Wang, Hsiuying
Chiu, Chia-Chun
Wu, Yi-Ching
Wu, Wei-Sheng
collection PubMed
description BACKGROUND: Missing values are common in microarray data, which usually contain more than 5% missing values, with up to 90% of genes affected. Inaccurate missing value estimation reduces the power of downstream microarray data analyses. Many types of methods have been developed to estimate missing values. Among them, regression-based methods are very popular and have been shown to outperform the other types of methods on many test microarray datasets. RESULTS: To further improve the performance of regression-based methods, we propose shrinkage regression-based methods. Our methods take advantage of the correlation structure in microarray data and select genes similar to the target gene by Pearson correlation coefficients. In addition, our methods incorporate the least squares principle, use a shrinkage estimation approach to adjust the coefficients of the regression model, and then use the adjusted coefficients to estimate missing values. Simulation results show that the proposed methods provide more accurate missing value estimation on six test microarray datasets than the existing regression-based methods do. CONCLUSIONS: Imputation of missing values is a very important aspect of microarray data analysis because most downstream analyses require a complete dataset. Therefore, finding accurate and efficient methods for estimating missing values has become an essential issue. Since our proposed shrinkage regression-based methods can provide accurate missing value estimation, they are competitive alternatives to the existing regression-based methods.
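The pipeline summarized in the RESULTS part of the description (select similar genes by Pearson correlation, fit a least-squares regression, shrink the coefficients, predict the missing entry) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not state the shrinkage estimator, so the constant factor `shrink` below is a placeholder assumption, and the function names (`pearson`, `solve`, `impute`), the no-intercept model, and the parameter `k` are all hypothetical choices made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation information
    return sxy / sqrt(sxx * syy)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    Assumes A is square and nonsingular.
    """
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def impute(target, candidates, missing_col, k=2, shrink=0.9):
    """Estimate target[missing_col] from the k most correlated complete genes.

    `shrink` is a hypothetical constant shrinkage factor (assumption);
    the paper's actual estimator is not given in this record.
    """
    obs = [j for j in range(len(target)) if j != missing_col]
    y = [target[j] for j in obs]
    # Step 1: rank candidate genes by |Pearson correlation| with the target.
    ranked = sorted(
        candidates,
        key=lambda g: abs(pearson([g[j] for j in obs], y)),
        reverse=True,
    )
    similar = ranked[:k]
    # Step 2: least squares via the normal equations (X^T X) beta = X^T y,
    # where each similar gene is one predictor (no intercept, by assumption).
    xtx = [[sum(a[j] * b[j] for j in obs) for b in similar] for a in similar]
    xty = [sum(a[j] * target[j] for j in obs) for a in similar]
    beta = solve(xtx, xty)
    # Step 3: shrink the coefficients, then predict the missing entry.
    return sum(shrink * b * g[missing_col] for b, g in zip(beta, similar))


# Toy data: the target gene is exactly twice the first candidate gene.
genes = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 1.0, 4.0, 3.0, 6.0],
    [5.0, 1.0, 4.0, 2.0, 3.0],
]
target_gene = [2.0, 4.0, 6.0, 8.0, None]
estimate = impute(target_gene, genes, missing_col=4, k=2, shrink=0.9)
```

Setting `shrink=1.0` reduces the sketch to plain least-squares imputation, which makes it easy to compare the shrunken and unshrunken estimates on the same data.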
format Online
Article
Text
id pubmed-4028886
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4028886 2014-06-04 Shrinkage regression-based methods for microarray missing value imputation Wang, Hsiuying Chiu, Chia-Chun Wu, Yi-Ching Wu, Wei-Sheng BMC Syst Biol Research
BioMed Central 2013-12-13 /pmc/articles/PMC4028886/ /pubmed/24565159 http://dx.doi.org/10.1186/1752-0509-7-S6-S11 Text en Copyright © 2013 Wang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Shrinkage regression-based methods for microarray missing value imputation
topic Research