Feature Selection for Longitudinal Data by Using Sign Averages to Summarize Gene Expression Values over Time

With the rapid evolution of high-throughput technologies, time series/longitudinal high-throughput experiments have become possible and affordable. However, the development of statistical methods dealing with gene expression profiles across time points has not kept up with the explosion of such data...

Bibliographic Details
Main Authors:	Tian, Suyan, Wang, Chi
Format:	Online Article Text
Language:	English
Published:	Hindawi 2019
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6444255/ https://www.ncbi.nlm.nih.gov/pubmed/31016185 http://dx.doi.org/10.1155/2019/1724898

_version_	1783407998087987200
author	Tian, Suyan Wang, Chi
author_sort	Tian, Suyan
collection	PubMed
description	With the rapid evolution of high-throughput technologies, time series/longitudinal high-throughput experiments have become possible and affordable. However, the development of statistical methods dealing with gene expression profiles across time points has not kept up with the explosion of such data. The feature selection process is of critical importance for longitudinal microarray data. In this study, we proposed aggregating a gene's expression values across time into a single value using the sign average method, thereby degrading a longitudinal feature selection process into a classic one. Regularized logistic regression models with pseudogenes (i.e., the sign average of genes across time as predictors) were then optimized by either the coordinate descent method or the threshold gradient descent regularization method. By applying the proposed methods to simulated data and a traumatic injury dataset, we have demonstrated that the proposed methods, especially for the combination of sign average and threshold gradient descent regularization, outperform other competitive algorithms. To conclude, the proposed methods are highly recommended for studies with the objective of carrying out feature selection for longitudinal gene expression data.
format	Online Article Text
id	pubmed-6444255
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Hindawi
record_format	MEDLINE/PubMed
spelling	pubmed-6444255 2019-04-23 Feature Selection for Longitudinal Data by Using Sign Averages to Summarize Gene Expression Values over Time Tian, Suyan Wang, Chi Biomed Res Int Research Article With the rapid evolution of high-throughput technologies, time series/longitudinal high-throughput experiments have become possible and affordable. However, the development of statistical methods dealing with gene expression profiles across time points has not kept up with the explosion of such data. The feature selection process is of critical importance for longitudinal microarray data. In this study, we proposed aggregating a gene's expression values across time into a single value using the sign average method, thereby degrading a longitudinal feature selection process into a classic one. Regularized logistic regression models with pseudogenes (i.e., the sign average of genes across time as predictors) were then optimized by either the coordinate descent method or the threshold gradient descent regularization method. By applying the proposed methods to simulated data and a traumatic injury dataset, we have demonstrated that the proposed methods, especially for the combination of sign average and threshold gradient descent regularization, outperform other competitive algorithms. To conclude, the proposed methods are highly recommended for studies with the objective of carrying out feature selection for longitudinal gene expression data. Hindawi 2019-03-19 /pmc/articles/PMC6444255/ /pubmed/31016185 http://dx.doi.org/10.1155/2019/1724898 Text en Copyright © 2019 Suyan Tian and Chi Wang. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Tian, Suyan Wang, Chi Feature Selection for Longitudinal Data by Using Sign Averages to Summarize Gene Expression Values over Time
title	Feature Selection for Longitudinal Data by Using Sign Averages to Summarize Gene Expression Values over Time
title_sort	feature selection for longitudinal data by using sign averages to summarize gene expression values over time
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6444255/ https://www.ncbi.nlm.nih.gov/pubmed/31016185 http://dx.doi.org/10.1155/2019/1724898
work_keys_str_mv	AT tiansuyan featureselectionforlongitudinaldatabyusingsignaveragestosummarizegeneexpressionvaluesovertime AT wangchi featureselectionforlongitudinaldatabyusingsignaveragestosummarizegeneexpressionvaluesovertime
