Robust imputation method for missing values in microarray data

BACKGROUND: When analyzing microarray gene expression data, missing values are often encountered. Most multivariate statistical methods proposed for microarray data analysis cannot be applied when the data have missing values. Numerous imputation algorithms have been proposed to estimate the missing...

Bibliographic Details
Main authors: Yoon, Dankyu; Lee, Eun-Kyung; Park, Taesung
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1892075/
https://www.ncbi.nlm.nih.gov/pubmed/17493255
http://dx.doi.org/10.1186/1471-2105-8-S2-S6
_version_ 1782133821354803200
author Yoon, Dankyu
Lee, Eun-Kyung
Park, Taesung
author_facet Yoon, Dankyu
Lee, Eun-Kyung
Park, Taesung
author_sort Yoon, Dankyu
collection PubMed
description BACKGROUND: When analyzing microarray gene expression data, missing values are often encountered. Most multivariate statistical methods proposed for microarray data analysis cannot be applied when the data have missing values. Numerous imputation algorithms have been proposed to estimate the missing values. In this study, we develop a robust least squares estimation with principal components (RLSP) method by extending the local least squares imputation (LLSimpute) method. The basic idea of our method is to employ quantile regression to estimate the missing values, using the estimated principal components of a selected set of similar genes. RESULTS: Using the normalized root mean squared error, the performance of the proposed method was evaluated and compared with other previously proposed imputation methods. The proposed RLSP method clearly outperformed the weighted k-nearest neighbors imputation (kNNimpute) method and the LLSimpute method, and showed results competitive with the Bayesian principal component analysis (BPCA) method. CONCLUSION: Adapting the principal components of the selected genes and employing the quantile regression model improved the robustness and accuracy of missing value imputation. Thus, the proposed RLSP method is, according to our empirical studies, more robust and accurate than the widely used kNNimpute and LLSimpute methods.
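The abstract names two quantities without giving their formulas: the normalized root mean squared error used for evaluation, and the quantile regression objective behind the robustness claim. The sketch below is illustrative only and is not taken from the paper. It assumes one definition of NRMSE that is common in the imputation literature (root mean squared error divided by the standard deviation of the true values) and the standard check ("pinball") loss that quantile regression minimizes. The function names `nrmse` and `check_loss` are hypothetical.

```python
import math


def nrmse(true_values, imputed_values):
    """Assumed definition: sqrt(mean squared error / variance of the true values)."""
    n = len(true_values)
    if n == 0 or n != len(imputed_values):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((t - g) ** 2 for t, g in zip(true_values, imputed_values)) / n
    mean_true = sum(true_values) / n
    var_true = sum((t - mean_true) ** 2 for t in true_values) / n
    if var_true == 0:
        raise ValueError("true values have zero variance")
    return math.sqrt(mse / var_true)


def check_loss(residual, tau=0.5):
    """Standard quantile-regression loss for one residual at quantile level tau."""
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


# Imputing every entry with the mean of the true values gives an NRMSE of 1
# under this definition, so values below 1 beat that trivial baseline.
baseline = nrmse([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])

# At tau = 0.5 the check loss grows linearly with the residual, whereas the
# squared loss grows quadratically; a single outlier therefore pulls a
# median regression fit less than it pulls a least squares fit.
linear_penalty = check_loss(10.0)
squared_penalty = 10.0 ** 2
```

With `tau=0.5` the check loss is half the absolute residual, which corresponds to median regression; the abstract does not state which quantile level the authors used.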
format Text
id pubmed-1892075
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1892075 2007-06-15 Robust imputation method for missing values in microarray data Yoon, Dankyu Lee, Eun-Kyung Park, Taesung BMC Bioinformatics Research BioMed Central 2007-05-03 /pmc/articles/PMC1892075/ /pubmed/17493255 http://dx.doi.org/10.1186/1471-2105-8-S2-S6 Text en Copyright © 2007 Yoon et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Yoon, Dankyu
Lee, Eun-Kyung
Park, Taesung
Robust imputation method for missing values in microarray data
title Robust imputation method for missing values in microarray data
title_full Robust imputation method for missing values in microarray data
title_fullStr Robust imputation method for missing values in microarray data
title_full_unstemmed Robust imputation method for missing values in microarray data
title_short Robust imputation method for missing values in microarray data
title_sort robust imputation method for missing values in microarray data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1892075/
https://www.ncbi.nlm.nih.gov/pubmed/17493255
http://dx.doi.org/10.1186/1471-2105-8-S2-S6
work_keys_str_mv AT yoondankyu robustimputationmethodformissingvaluesinmicroarraydata
AT leeeunkyung robustimputationmethodformissingvaluesinmicroarraydata
AT parktaesung robustimputationmethodformissingvaluesinmicroarraydata