Feature Selection for Longitudinal Data by Using Sign Averages to Summarize Gene Expression Values over Time
With the rapid evolution of high-throughput technologies, time series/longitudinal high-throughput experiments have become possible and affordable. However, the development of statistical methods dealing with gene expression profiles across time points has not kept up with the explosion of such data...
Main Authors: | Tian, Suyan; Wang, Chi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2019 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6444255/ https://www.ncbi.nlm.nih.gov/pubmed/31016185 http://dx.doi.org/10.1155/2019/1724898 |
_version_ | 1783407998087987200 |
---|---|
author | Tian, Suyan; Wang, Chi |
collection | PubMed |
description | With the rapid evolution of high-throughput technologies, time series/longitudinal high-throughput experiments have become possible and affordable. However, the development of statistical methods for gene expression profiles measured across time points has not kept pace with the explosion of such data. Feature selection is of critical importance for longitudinal microarray data. In this study, we propose aggregating a gene's expression values across time into a single value using the sign average method, thereby reducing a longitudinal feature selection problem to a classic one. Regularized logistic regression models with pseudogenes (i.e., the sign averages of genes across time) as predictors were then optimized by either the coordinate descent method or the threshold gradient descent regularization method. By applying the proposed methods to simulated data and a traumatic injury dataset, we demonstrate that the proposed methods, especially the combination of sign average and threshold gradient descent regularization, outperform other competing algorithms. In conclusion, the proposed methods are recommended for studies whose objective is feature selection for longitudinal gene expression data. |
format | Online Article Text |
id | pubmed-6444255 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-6444255 2019-04-23 Tian, Suyan; Wang, Chi. Biomed Res Int. Research Article. Hindawi 2019-03-19 /pmc/articles/PMC6444255/ /pubmed/31016185 http://dx.doi.org/10.1155/2019/1724898 Text en Copyright © 2019 Suyan Tian and Chi Wang. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Feature Selection for Longitudinal Data by Using Sign Averages to Summarize Gene Expression Values over Time |
topic | Research Article |
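
The description field in this record outlines a two-step idea: collapse each gene's time course into one "pseudogene" value by sign averaging, then run a regularized logistic regression on those values. The record does not give the formulas, so the sketch below is only one plausible reading, not the authors' implementation. It assumes that the sign for each gene and time point is the direction of the difference in class means, and it uses a simplified threshold gradient descent update (only coefficients whose gradient is within a fraction `tau` of the largest one are moved). The names `sign_average`, `tgdr_logistic`, `tau`, `step`, and `n_iter` are hypothetical, and the toy data are invented for illustration.

```python
import math


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def sign_average(X, y):
    """Collapse X[sample][gene][time] into Z[sample][gene].

    Assumption (not stated in the record): the sign for each
    (gene, time) pair is the sign of mean(class 1) - mean(class 0).
    """
    n, p, T = len(X), len(X[0]), len(X[0][0])
    pos = [i for i in range(n) if y[i] == 1]
    neg = [i for i in range(n) if y[i] == 0]
    signs = [[sign(sum(X[i][g][t] for i in pos) / len(pos)
                   - sum(X[i][g][t] for i in neg) / len(neg))
              for t in range(T)] for g in range(p)]
    # Pseudogene value: signed mean of the gene's values over time.
    Z = [[sum(signs[g][t] * X[i][g][t] for t in range(T)) / T
          for g in range(p)] for i in range(n)]
    return Z, signs


def tgdr_logistic(Z, y, tau=0.9, step=0.05, n_iter=200):
    """Simplified threshold gradient descent for logistic loss.

    No intercept is fitted; coefficients start at zero and only those
    with |gradient| >= tau * max|gradient| are updated at each step.
    """
    n, p = len(Z), len(Z[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p  # negative gradient of the logistic loss
        for i in range(n):
            eta = sum(beta[j] * Z[i][j] for j in range(p))
            resid = y[i] - 1.0 / (1.0 + math.exp(-eta))
            for j in range(p):
                grad[j] += resid * Z[i][j]
        gmax = max(abs(v) for v in grad)
        if gmax == 0.0:
            break
        for j in range(p):
            if abs(grad[j]) >= tau * gmax:
                beta[j] += step * grad[j]
    return beta


# Toy data: 4 samples, 2 genes, 3 time points (invented values).
# Gene 0 is up in class 1 at the first two time points and down at the third;
# gene 1 has identical class means at every time point.
X = [
    [[3.0, 3.0, 0.0], [1.0, 0.0, 1.0]],
    [[3.0, 3.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 0.0, 3.0], [1.0, 0.0, 1.0]],
    [[0.0, 0.0, 3.0], [0.0, 1.0, 0.0]],
]
y = [1, 1, 0, 0]

Z, signs = sign_average(X, y)
beta = tgdr_logistic(Z, y)
selected = [g for g, b in enumerate(beta) if b != 0.0]
print(signs, Z, selected)
```

In this toy case the sign flip at the third time point keeps gene 0's signal from cancelling out, which a plain average over time would partly lose; gene 1 collapses to zero and is never selected. A production analysis would more likely pass `Z` to an established penalized regression routine instead of this hand-written loop.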