
Learning a confidence score and the latent space of a new supervised autoencoder for diagnosis and prognosis in clinical metabolomic studies


Bibliographic Details
Main authors: Chardin, David; Gille, Cyprien; Pourcher, Thierry; Humbert, Olivier; Barlaud, Michel
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9434875/
https://www.ncbi.nlm.nih.gov/pubmed/36050631
http://dx.doi.org/10.1186/s12859-022-04900-x
Description
Summary: BACKGROUND: Presently, there is a wide variety of classification methods and deep neural network approaches in bioinformatics. Deep neural networks have proven their effectiveness for classification tasks and have outperformed classical methods, but they suffer from a lack of interpretability. Therefore, these innovative methods are not appropriate for decision support systems in healthcare. Indeed, to allow clinicians to make informed and well-thought-out decisions, the algorithm should provide the main pieces of information used to compute the predicted diagnosis and/or prognosis, as well as a confidence score for this prediction. METHODS: Herein, we used a new supervised autoencoder (SAE) approach for classification of clinical metabolomic data. This new method provides a confidence score for each prediction, thanks to a softmax classifier, together with a meaningful latent space visualization. It also includes a new efficient feature selection method with a structured constraint, which allows for biologically interpretable results. RESULTS: Experimental results on three metabolomics datasets of clinical samples illustrate the effectiveness of our SAE and its confidence score. The supervised autoencoder provides an accurate localization of the patients in the latent space and an efficient confidence score. Experiments show that the SAE outperforms classical methods (PLS-DA, Random Forests, SVM, and neural networks (NN)). Furthermore, the metabolites selected by the SAE were found to be biologically relevant. CONCLUSION: In this paper, we describe a new efficient SAE method to support diagnostic or prognostic evaluation based on metabolomics analyses.
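
The summary states that the confidence score comes from a softmax classifier attached to a supervised autoencoder, but it does not give formulas. The sketch below is only an illustration under stated assumptions. It takes the confidence to be the largest softmax probability. It also takes the training objective to be a cross-entropy term plus a weighted reconstruction term, with mean squared error as a stand-in for whatever reconstruction loss the authors use. The names predict_with_confidence, sae_loss, and lam are hypothetical and do not come from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_confidence(logits):
    """Return (class index, confidence).

    Assumption: confidence is the top softmax probability.
    """
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs[label]


def sae_loss(x, x_hat, logits, target, lam=1.0):
    """Hypothetical supervised-autoencoder objective.

    Cross-entropy on the class scores plus lam times the mean squared
    reconstruction error between input x and decoder output x_hat.
    MSE is an assumed stand-in, not the loss reported by the authors.
    """
    probs = softmax(logits)
    cross_entropy = -math.log(probs[target])
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    return cross_entropy + lam * reconstruction


# Example: two-class scores for one sample (e.g. disease vs. control).
label, confidence = predict_with_confidence([2.0, 0.5])
print(label, round(confidence, 3))
```

A prediction whose confidence is close to 1/K for K classes is effectively a coin flip, which is the kind of case a clinician would want flagged rather than reported as a firm diagnosis.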