
Sequential autoencoders for feature engineering and pretraining in major depressive disorder risk prediction

Bibliographic Details
Main Authors: Jones, Barrett W; Taylor, Warren D; Walsh, Colin G
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10561992/
https://www.ncbi.nlm.nih.gov/pubmed/37818308
http://dx.doi.org/10.1093/jamiaopen/ooad086
_version_ 1785118032173989888
author Jones, Barrett W
Taylor, Warren D
Walsh, Colin G
author_facet Jones, Barrett W
Taylor, Warren D
Walsh, Colin G
author_sort Jones, Barrett W
collection PubMed
description OBJECTIVES: We evaluated autoencoders as a feature engineering and pretraining technique to improve major depressive disorder (MDD) prognostic risk prediction. Autoencoders can represent temporal feature relationships not captured by aggregate features. The predictive performance of autoencoders of multiple sequential structures was evaluated as feature engineering and pretraining strategies on an array of prediction tasks and compared to a restricted Boltzmann machine (RBM) and random forests as benchmarks. MATERIALS AND METHODS: We studied MDD patients from Vanderbilt University Medical Center. Autoencoder models with Attention and long short-term memory (LSTM) layers were trained to create latent representations of the input data. Predictive performance was evaluated temporally by fitting random forest models to predict future outcomes with engineered features as input and by using autoencoder weights to initialize neural network layers. We evaluated area under the precision-recall curve (AUPRC) trends and variation over the study population's treatment course. RESULTS: The pretrained LSTM model improved predictive performance over pretrained Attention models and benchmarks in 3 of 4 outcomes, including self-harm/suicide attempt (AUPRCs, LSTM pretrained = 0.012, Attention pretrained = 0.010, RBM = 0.009, random forest = 0.005). The use of autoencoders for feature engineering had varied results, with benchmarks outperforming LSTM and Attention encodings on the self-harm/suicide attempt outcome (AUPRCs, LSTM encodings = 0.003, Attention encodings = 0.004, RBM = 0.009, random forest = 0.005). DISCUSSION: The improvement in prediction resulting from pretraining has the potential to increase the clinical impact of MDD risk models. We did not find evidence that temporal feature encodings added to predictive performance in the study population, which suggests that predictive information retained by model weights may be lost during encoding. The pretrained LSTM model's predictive performance is shown to be clinically useful and improves on state-of-the-art predictors in the MDD phenotype, warranting consideration in future related studies. CONCLUSION: LSTM models with pretrained autoencoder weights outperformed the benchmark and a pretrained Attention model. Future researchers developing risk models in MDD may benefit from using LSTM autoencoder pretrained weights.
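The two strategies the abstract contrasts can be illustrated with a minimal NumPy sketch. This is not the authors' code: the study used LSTM and Attention autoencoders on temporal clinical data, while here a simple linear autoencoder stands in for brevity, and all variable names and the toy data are hypothetical. The point is the distinction between (1) feature engineering, where the encoder's latent output feeds a downstream model such as a random forest, and (2) pretraining, where the learned encoder weights initialize a predictive network's layers.

```python
import numpy as np

# Toy data standing in for per-patient features (assumption: 200 patients,
# 10 features); the real study used sequential EHR data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))

# Linear autoencoder: encode 10 features into a 3-dimensional latent space,
# then decode back; train by gradient descent on reconstruction error.
W_enc = rng.normal(scale=0.1, size=(10, 3))
W_dec = rng.normal(scale=0.1, size=(3, 10))
initial_error = float(np.mean((X @ W_enc @ W_dec - X) ** 2))

lr = 0.01
for _ in range(500):
    Z = X @ W_enc                      # latent representation
    err = Z @ W_dec - X                # reconstruction residual
    grad_dec = Z.T @ err / len(X)      # gradient w.r.t. decoder weights
    grad_enc = X.T @ (err @ W_dec.T) / len(X)  # gradient w.r.t. encoder weights
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

recon_error = float(np.mean((X @ W_enc @ W_dec - X) ** 2))

# Strategy 1 (feature engineering): the latent encodings become inputs
# to a separate downstream classifier (the paper used random forests).
Z_features = X @ W_enc

# Strategy 2 (pretraining): the trained encoder weights initialize the
# first layer of a predictive network, which is then fine-tuned on the
# outcome labels instead of starting from random weights.
W_init = W_enc.copy()
```

The paper's finding maps onto this split: initializing a predictor with `W_init` (pretraining) helped, while passing `Z_features` to a random forest (feature engineering) could discard predictive information during encoding.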
format Online
Article
Text
id pubmed-10561992
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-105619922023-10-10 Sequential autoencoders for feature engineering and pretraining in major depressive disorder risk prediction Jones, Barrett W Taylor, Warren D Walsh, Colin G JAMIA Open Research and Applications Oxford University Press 2023-10-09 /pmc/articles/PMC10561992/ /pubmed/37818308 http://dx.doi.org/10.1093/jamiaopen/ooad086 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of the American Medical Informatics Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research and Applications
Jones, Barrett W
Taylor, Warren D
Walsh, Colin G
Sequential autoencoders for feature engineering and pretraining in major depressive disorder risk prediction
title Sequential autoencoders for feature engineering and pretraining in major depressive disorder risk prediction
title_full Sequential autoencoders for feature engineering and pretraining in major depressive disorder risk prediction
title_fullStr Sequential autoencoders for feature engineering and pretraining in major depressive disorder risk prediction
title_full_unstemmed Sequential autoencoders for feature engineering and pretraining in major depressive disorder risk prediction
title_short Sequential autoencoders for feature engineering and pretraining in major depressive disorder risk prediction
title_sort sequential autoencoders for feature engineering and pretraining in major depressive disorder risk prediction
topic Research and Applications
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10561992/
https://www.ncbi.nlm.nih.gov/pubmed/37818308
http://dx.doi.org/10.1093/jamiaopen/ooad086
work_keys_str_mv AT jonesbarrettw sequentialautoencodersforfeatureengineeringandpretraininginmajordepressivedisorderriskprediction
AT taylorwarrend sequentialautoencodersforfeatureengineeringandpretraininginmajordepressivedisorderriskprediction
AT walshcoling sequentialautoencodersforfeatureengineeringandpretraininginmajordepressivedisorderriskprediction