Evaluating Autoencoder-Based Featurization and Supervised Learning for Protein Decoy Selection
Rapid growth in molecular structure data is renewing interest in featurizing structure. Featurizations that retain information on biological activity are particularly sought for protein molecules, where decades of research have shown that indeed structure encodes function. Research on featurization...
Main authors: | Alam, Fardina Fathmiul; Rahman, Taseef; Shehu, Amarda |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2020 |
Subjects: | Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7179114/ https://www.ncbi.nlm.nih.gov/pubmed/32143444 http://dx.doi.org/10.3390/molecules25051146 |
_version_ | 1783525599328862208 |
---|---|
author | Alam, Fardina Fathmiul; Rahman, Taseef; Shehu, Amarda |
author_sort | Alam, Fardina Fathmiul |
collection | PubMed |
description | Rapid growth in molecular structure data is renewing interest in featurizing structure. Featurizations that retain information on biological activity are particularly sought for protein molecules, where decades of research have shown that structure indeed encodes function. Research on featurization of protein structure is active, but here we assess the promise of autoencoders. Motivated by rapid progress in neural network research, we investigate and evaluate how well autoencoders yield linear and nonlinear featurizations of protein tertiary structures. An additional reason we focus on autoencoders as the engine to obtain featurizations is the versatility of their architectures and the ease with which changes to architecture yield linear versus nonlinear features. While open-source neural network libraries, such as Keras, which we employ here, greatly facilitate constructing, training, and evaluating autoencoder architectures and conducting model search, autoencoders have not yet gained popularity in the structural biology community. Here we demonstrate their utility in a practical context. Employing autoencoder-based featurizations, we address the classic problem of decoy selection in protein structure prediction. Utilizing off-the-shelf supervised learning methods, we demonstrate that the featurizations are indeed meaningful and allow detecting active tertiary structures, thus opening the way for further avenues of research. |
format | Online Article Text |
id | pubmed-7179114 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-7179114 2020-04-28 Evaluating Autoencoder-Based Featurization and Supervised Learning for Protein Decoy Selection Alam, Fardina Fathmiul; Rahman, Taseef; Shehu, Amarda Molecules Article MDPI 2020-03-04 /pmc/articles/PMC7179114/ /pubmed/32143444 http://dx.doi.org/10.3390/molecules25051146 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
title | Evaluating Autoencoder-Based Featurization and Supervised Learning for Protein Decoy Selection |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7179114/ https://www.ncbi.nlm.nih.gov/pubmed/32143444 http://dx.doi.org/10.3390/molecules25051146 |