Feature Selection for Longitudinal Data by Using Sign Averages to Summarize Gene Expression Values over Time

With the rapid evolution of high-throughput technologies, time series/longitudinal high-throughput experiments have become possible and affordable. However, the development of statistical methods dealing with gene expression profiles across time points has not kept up with the explosion of such data...

Bibliographic Details
Main Authors: Tian, Suyan; Wang, Chi
Format: Online Article Text
Language: English
Published: Hindawi 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6444255/
https://www.ncbi.nlm.nih.gov/pubmed/31016185
http://dx.doi.org/10.1155/2019/1724898
_version_ 1783407998087987200
author Tian, Suyan
Wang, Chi
author_facet Tian, Suyan
Wang, Chi
author_sort Tian, Suyan
collection PubMed
description With the rapid evolution of high-throughput technologies, time series/longitudinal high-throughput experiments have become possible and affordable. However, the development of statistical methods for gene expression profiles across time points has not kept pace with the explosion of such data. Feature selection is critically important for longitudinal microarray data. In this study, we propose aggregating a gene's expression values across time into a single value using the sign average method, thereby reducing longitudinal feature selection to a classical feature selection problem. Regularized logistic regression models with pseudogenes (i.e., the sign averages of genes across time) as predictors were then optimized by either the coordinate descent method or the threshold gradient descent regularization method. By applying the proposed methods to simulated data and a traumatic injury dataset, we demonstrate that they, especially the combination of sign average and threshold gradient descent regularization, outperform competing algorithms. In conclusion, the proposed methods are highly recommended for studies that aim to carry out feature selection on longitudinal gene expression data.
format Online
Article
Text
id pubmed-6444255
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-6444255 2019-04-23 Feature Selection for Longitudinal Data by Using Sign Averages to Summarize Gene Expression Values over Time Tian, Suyan Wang, Chi Biomed Res Int Research Article Hindawi 2019-03-19 /pmc/articles/PMC6444255/ /pubmed/31016185 http://dx.doi.org/10.1155/2019/1724898 Text en Copyright © 2019 Suyan Tian and Chi Wang. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Tian, Suyan
Wang, Chi
Feature Selection for Longitudinal Data by Using Sign Averages to Summarize Gene Expression Values over Time
title Feature Selection for Longitudinal Data by Using Sign Averages to Summarize Gene Expression Values over Time
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6444255/
https://www.ncbi.nlm.nih.gov/pubmed/31016185
http://dx.doi.org/10.1155/2019/1724898
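
The aggregation step described in the abstract (collapsing each gene's expression values across time into one "pseudogene" value per subject) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record does not define the sign average, so the sketch assumes that each time point receives a sign of +1 or -1 from the direction of the difference in class means, and that the pseudogene value is the mean of the signed expression values. The function names `time_point_signs` and `sign_average` and the toy data are hypothetical.

```python
from statistics import mean


def time_point_signs(expr, labels):
    """Assumed rule: the sign at time t is +1 if the class-1 mean is at least
    the class-0 mean at that time point, otherwise -1.

    expr[i][t] is one gene's expression for subject i at time t;
    labels[i] is the subject's class (0 or 1).
    """
    n_times = len(expr[0])
    signs = []
    for t in range(n_times):
        class1 = [row[t] for row, y in zip(expr, labels) if y == 1]
        class0 = [row[t] for row, y in zip(expr, labels) if y == 0]
        signs.append(1 if mean(class1) - mean(class0) >= 0 else -1)
    return signs


def sign_average(expr, signs):
    """Collapse each subject's time series into a single value:
    the mean over time of sign[t] * expr[i][t]."""
    return [mean(s * x for s, x in zip(signs, row)) for row in expr]


# Toy data (hypothetical): 4 subjects, 3 time points, one gene.
expr = [
    [1.0, 2.0, 5.0],
    [2.0, 1.0, 6.0],
    [3.0, 0.0, 1.0],
    [4.0, 1.0, 2.0],
]
labels = [0, 0, 1, 1]

signs = time_point_signs(expr, labels)
pseudo = sign_average(expr, signs)
print(signs)
print(pseudo)
```

Repeating this for every gene would give one column of pseudogene values per gene. That matrix could then be passed to any regularized logistic regression solver, which is the sense in which the abstract reduces the longitudinal problem to a classical one.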