Dimension reduction with gene expression data using targeted variable importance measurement
BACKGROUND: When a large number of candidate variables are present, a dimension reduction procedure is usually conducted to reduce the variable space before the subsequent analysis is carried out. The goal of dimension reduction is to find a list of candidate genes with a more operable length ideall...
Main Authors: | Wang, Hui; van der Laan, Mark J |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3166941/ https://www.ncbi.nlm.nih.gov/pubmed/21849016 http://dx.doi.org/10.1186/1471-2105-12-312 |
_version_ | 1782211210841686016 |
---|---|
author | Wang, Hui van der Laan, Mark J |
author_facet | Wang, Hui van der Laan, Mark J |
author_sort | Wang, Hui |
collection | PubMed |
description | BACKGROUND: When a large number of candidate variables are present, a dimension reduction procedure is usually conducted to reduce the variable space before the subsequent analysis is carried out. The goal of dimension reduction is to find a list of candidate genes with a more operable length ideally including all the relevant genes. Leaving many uninformative genes in the analysis can lead to biased estimates and reduced power. Therefore, dimension reduction is often considered a necessary predecessor of the analysis because it can not only reduce the cost of handling numerous variables, but also has the potential to improve the performance of the downstream analysis algorithms. RESULTS: We propose a TMLE-VIM dimension reduction procedure based on the variable importance measurement (VIM) in the framework of targeted maximum likelihood estimation (TMLE). TMLE is an extension of maximum likelihood estimation targeting the parameter of interest. TMLE-VIM is a two-stage procedure. The first stage resorts to a machine learning algorithm, and the second stage improves the first stage estimation with respect to the parameter of interest. CONCLUSIONS: We demonstrate with simulations and data analyses that our approach not only enjoys the prediction power of machine learning algorithms, but also accounts for the correlation structures among variables and therefore produces better variable rankings. When utilized in dimension reduction, TMLE-VIM can help to obtain the shortest possible list with the most truly associated variables. |
format | Online Article Text |
id | pubmed-3166941 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-31669412011-09-06 Dimension reduction with gene expression data using targeted variable importance measurement Wang, Hui van der Laan, Mark J BMC Bioinformatics Methodology Article BACKGROUND: When a large number of candidate variables are present, a dimension reduction procedure is usually conducted to reduce the variable space before the subsequent analysis is carried out. The goal of dimension reduction is to find a list of candidate genes with a more operable length ideally including all the relevant genes. Leaving many uninformative genes in the analysis can lead to biased estimates and reduced power. Therefore, dimension reduction is often considered a necessary predecessor of the analysis because it can not only reduce the cost of handling numerous variables, but also has the potential to improve the performance of the downstream analysis algorithms. RESULTS: We propose a TMLE-VIM dimension reduction procedure based on the variable importance measurement (VIM) in the framework of targeted maximum likelihood estimation (TMLE). TMLE is an extension of maximum likelihood estimation targeting the parameter of interest. TMLE-VIM is a two-stage procedure. The first stage resorts to a machine learning algorithm, and the second stage improves the first stage estimation with respect to the parameter of interest. CONCLUSIONS: We demonstrate with simulations and data analyses that our approach not only enjoys the prediction power of machine learning algorithms, but also accounts for the correlation structures among variables and therefore produces better variable rankings. When utilized in dimension reduction, TMLE-VIM can help to obtain the shortest possible list with the most truly associated variables. BioMed Central 2011-07-29 /pmc/articles/PMC3166941/ /pubmed/21849016 http://dx.doi.org/10.1186/1471-2105-12-312 Text en Copyright ©2011 Wang and van der Laan; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Wang, Hui van der Laan, Mark J Dimension reduction with gene expression data using targeted variable importance measurement |
title | Dimension reduction with gene expression data using targeted variable importance measurement |
title_full | Dimension reduction with gene expression data using targeted variable importance measurement |
title_fullStr | Dimension reduction with gene expression data using targeted variable importance measurement |
title_full_unstemmed | Dimension reduction with gene expression data using targeted variable importance measurement |
title_short | Dimension reduction with gene expression data using targeted variable importance measurement |
title_sort | dimension reduction with gene expression data using targeted variable importance measurement |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3166941/ https://www.ncbi.nlm.nih.gov/pubmed/21849016 http://dx.doi.org/10.1186/1471-2105-12-312 |
work_keys_str_mv | AT wanghui dimensionreductionwithgeneexpressiondatausingtargetedvariableimportancemeasurement AT vanderlaanmarkj dimensionreductionwithgeneexpressiondatausingtargetedvariableimportancemeasurement |
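
The abstract in this record describes TMLE-VIM as a two-stage procedure: an initial machine-learning fit, followed by a targeting step that improves the estimate of the parameter of interest. The Python sketch below illustrates that general idea only and is not the authors' implementation. It assumes a semiparametric linear model E[Y | A, W] = beta * A + m(W), where A is the gene being scored and W holds the other variables; that model, the function names (`ols_fit_predict`, `tmle_vim_sketch`, `simulate`), the use of least squares as a stand-in for the machine-learning stage, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def ols_fit_predict(X, y):
    """Least-squares fit with an intercept; returns fitted values and coefficients.

    Used here as a simple stand-in for a machine learning algorithm (assumption).
    """
    X1 = np.column_stack([np.ones(X.shape[0]), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return X1 @ coef, coef


def tmle_vim_sketch(y, a, W):
    """Return (initial, targeted) estimates of beta for one candidate variable.

    y: outcome, shape (n,); a: variable being scored, shape (n,);
    W: remaining variables, shape (n, p). Hypothetical helper.
    """
    # Stage 1: initial outcome fit. It is deliberately crude (Y on A only),
    # so it ignores the correlation between A and W.
    q0, coef_q = ols_fit_predict(a.reshape(-1, 1), y)
    beta_initial = coef_q[1]

    # Nuisance fit: g(W) approximates E[A | W].
    g_hat, _ = ols_fit_predict(W, a)
    h = a - g_hat  # "clever covariate": the part of A not explained by W

    # Stage 2: targeting step. Regress the stage-1 residual on h without an
    # intercept and shift the initial estimate by the fitted coefficient.
    eps = np.dot(h, y - q0) / np.dot(h, h)
    return beta_initial, beta_initial + eps


def simulate(n=2000, seed=0):
    """Simulated data with true beta = 1.0 and A correlated with W[:, 0]."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n, 2))
    a = 0.8 * W[:, 0] + rng.normal(size=n)
    y = 1.0 * a + 2.0 * W[:, 0] + rng.normal(size=n)
    return y, a, W


if __name__ == "__main__":
    y, a, W = simulate()
    beta_initial, beta_targeted = tmle_vim_sketch(y, a, W)
    print("initial:", beta_initial, "targeted:", beta_targeted)
```

In this simulation the initial estimate is biased away from the true value of 1.0 because A and W are correlated, and the targeting step pulls it back. Repeating the calculation with each gene in turn as A would give one importance estimate per gene, which could then be ranked; how the published method builds its ranking is not specified in this record.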