
Very Important Pool (VIP) genes – an application for microarray-based molecular signatures


Bibliographic Details
Main Authors:	Su, Zhenqiang; Hong, Huixiao; Fang, Hong; Shi, Leming; Perkins, Roger; Tong, Weida
Format:	Text
Language:	English
Published:	BioMed Central 2008
Subjects:	Proceedings
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2537560/ https://www.ncbi.nlm.nih.gov/pubmed/18793473 http://dx.doi.org/10.1186/1471-2105-9-S9-S9

author	Su, Zhenqiang Hong, Huixiao Fang, Hong Shi, Leming Perkins, Roger Tong, Weida
collection	PubMed
description	BACKGROUND: Advances in DNA microarray technology portend that molecular signatures derived from microarrays will eventually be used in clinical environments and personalized medicine. Derivation of biomarkers is a large step beyond hypothesis generation and imposes considerably more stringency for accuracy in identifying informative gene subsets to differentiate phenotypes. The inherent nature of microarray data, with far fewer samples and replicates than genes, requires identifying informative genes prior to classifier construction. However, improving the ability to identify differentiating genes remains a challenge in bioinformatics.
RESULTS: A new hybrid gene selection approach was investigated and tested with nine publicly available microarray datasets. The new method identifies a Very Important Pool (VIP) of genes from the broad patterns of gene expression data. The method uses a bagging sampling principle, where the re-sampled arrays are used to identify the most informative genes. Frequency of selection is used in a repetitive process to identify the VIP genes. The putative informative genes are selected using two methods, t-statistic and discriminatory analysis. In the t-statistic method, the informative genes are identified based on p-values. In the discriminatory analysis, disjoint Principal Component Analyses (PCAs) are conducted for each class of samples, and genes with high discrimination power (DP) are identified. The VIP gene selection approach was compared with the p-value ranking approach. The genes identified by the VIP method but not by the p-value ranking approach are also related to the disease investigated. More importantly, these genes are part of the pathways derived from the common genes shared by both the VIP and p-value ranking methods. Moreover, the binary classifiers built from these genes are statistically equivalent to those built from the top 50 p-value ranked genes in distinguishing different types of samples.
CONCLUSION: The VIP gene selection approach could identify additional subsets of informative genes that would not always be selected by the p-value ranking method. These genes are likely to be additional true positives since they are part of pathways identified by the p-value ranking method and are expected to be related to the relevant biology. Therefore, these additional genes derived from the VIP method potentially provide valuable biological insights.
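
The bagging and frequency-of-selection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it ranks genes by the absolute Welch t-statistic rather than by p-values, omits the disjoint-PCA discriminatory analysis entirely, and the names `vip_genes`, `n_rounds`, `top_k` and `min_freq`, as well as the toy data and thresholds, are hypothetical choices made only for illustration.

```python
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples (unequal variances allowed)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    # Degenerate resamples with zero variance in both groups score 0.
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def vip_genes(expr, labels, n_rounds=200, top_k=1, min_freq=0.8, seed=0):
    """Return genes that rank in the top_k in at least min_freq of the rounds.

    expr:   dict mapping gene name -> list of expression values, one per array
    labels: list of 0/1 class labels, one per array (at least 2 per class)
    """
    rng = random.Random(seed)
    idx0 = [i for i, y in enumerate(labels) if y == 0]
    idx1 = [i for i, y in enumerate(labels) if y == 1]
    counts = {gene: 0 for gene in expr}
    for _ in range(n_rounds):
        # Bagging step: resample arrays with replacement within each class.
        s0 = [rng.choice(idx0) for _ in idx0]
        s1 = [rng.choice(idx1) for _ in idx1]
        scores = {
            gene: abs(welch_t([v[i] for i in s0], [v[i] for i in s1]))
            for gene, v in expr.items()
        }
        # Count how often each gene lands among the top_k scores.
        for gene in sorted(scores, key=scores.get, reverse=True)[:top_k]:
            counts[gene] += 1
    return sorted(g for g, c in counts.items() if c / n_rounds >= min_freq)


# Toy data (made up): one clearly differential gene and one noisy gene.
expr = {
    "g_up": [1.0, 1.2, 0.9, 1.1, 1.0, 9.8, 10.1, 10.0, 9.9, 10.2],
    "g_noise": [5.0, 3.0, 6.0, 4.0, 5.5, 4.5, 5.2, 3.8, 6.1, 4.9],
}
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(vip_genes(expr, labels, n_rounds=100, top_k=1, min_freq=0.8, seed=1))
```

Counting selection frequency across resampled arrays, instead of ranking once on the full dataset, is what lets a gene enter the pool even if a single-pass ranking would place it just outside the cutoff.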
format	Text
id	pubmed-2537560
institution	National Center for Biotechnology Information
language	English
publishDate	2008
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2537560 2008-09-17 BMC Bioinformatics Proceedings BioMed Central 2008-08-12 /pmc/articles/PMC2537560/ /pubmed/18793473 http://dx.doi.org/10.1186/1471-2105-9-S9-S9 Text en Copyright © 2008 Su et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Very Important Pool (VIP) genes – an application for microarray-based molecular signatures
topic	Proceedings
