
Evaluating Autoencoder-Based Featurization and Supervised Learning for Protein Decoy Selection

Rapid growth in molecular structure data is renewing interest in featurizing structure. Featurizations that retain information on biological activity are particularly sought for protein molecules, where decades of research have shown that indeed structure encodes function. Research on featurization...


Bibliographic Details
Main authors: Alam, Fardina Fathmiul, Rahman, Taseef, Shehu, Amarda
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7179114/
https://www.ncbi.nlm.nih.gov/pubmed/32143444
http://dx.doi.org/10.3390/molecules25051146
_version_ 1783525599328862208
author Alam, Fardina Fathmiul
Rahman, Taseef
Shehu, Amarda
author_sort Alam, Fardina Fathmiul
collection PubMed
description Rapid growth in molecular structure data is renewing interest in featurizing structure. Featurizations that retain information on biological activity are particularly sought for protein molecules, where decades of research have shown that indeed structure encodes function. Research on featurization of protein structure is active, but here we assess the promise of autoencoders. Motivated by rapid progress in neural network research, we investigate and evaluate autoencoders on yielding linear and nonlinear featurizations of protein tertiary structures. An additional reason we focus on autoencoders as the engine to obtain featurizations is the versatility of their architectures and the ease with which changes to architecture yield linear versus nonlinear features. While open-source neural network libraries, such as Keras, which we employ here, greatly facilitate constructing, training, and evaluating autoencoder architectures and conducting model search, autoencoders have not yet gained popularity in the structure biology community. Here we demonstrate their utility in a practical context. Employing autoencoder-based featurizations, we address the classic problem of decoy selection in protein structure prediction. Utilizing off-the-shelf supervised learning methods, we demonstrate that the featurizations are indeed meaningful and allow detecting active tertiary structures, thus opening the way for further avenues of research.
format Online
Article
Text
id pubmed-7179114
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7179114 2020-04-28 Evaluating Autoencoder-Based Featurization and Supervised Learning for Protein Decoy Selection Alam, Fardina Fathmiul Rahman, Taseef Shehu, Amarda Molecules Article MDPI 2020-03-04 /pmc/articles/PMC7179114/ /pubmed/32143444 http://dx.doi.org/10.3390/molecules25051146 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Evaluating Autoencoder-Based Featurization and Supervised Learning for Protein Decoy Selection
title_sort evaluating autoencoder-based featurization and supervised learning for protein decoy selection
topic Article