
Evaluating uncertainty-based active learning for accelerating the generalization of molecular property prediction


Bibliographic Details
Main authors: Yin, Tianzhixi; Panapitiya, Gihan; Coda, Elizabeth D.; Saldanha, Emily G.
Format: Online Article Text
Language: English
Published: Springer International Publishing 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10633997/
https://www.ncbi.nlm.nih.gov/pubmed/37941055
http://dx.doi.org/10.1186/s13321-023-00753-5
Description
Summary: Deep learning models have proven to be a powerful tool for the prediction of molecular properties for applications including drug design and the development of energy storage materials. However, in order to learn accurate and robust structure–property mappings, these models require large amounts of data, which can be a challenge to collect given the time- and resource-intensive nature of experimental material characterization efforts. Additionally, such models fail to generalize to new types of molecular structures that were not included in the model training data. The acceleration of material development through uncertainty-guided experimental design has the promise to significantly reduce the data requirements and enable faster generalization to new types of materials. To evaluate the potential of such approaches for electrolyte design applications, we perform a comprehensive evaluation of existing uncertainty quantification methods on the prediction of two relevant molecular properties: aqueous solubility and redox potential. We develop novel evaluation methods to probe the utility of the uncertainty estimates for both in-domain and out-of-domain data sets. Finally, we leverage selected uncertainty estimation methods for active learning to evaluate their capacity to support experimental design.