A community-powered search of machine learning strategy space to find NMR property prediction models

The rise of machine learning (ML) has created an explosion in the potential strategies for using data to make scientific predictions. For physical scientists wishing to apply ML strategies to a particular domain, it can be difficult to assess in advance what strategy to adopt within a vast space of...

Bibliographic Details
Main authors: Bratholm, Lars A., Gerrard, Will, Anderson, Brandon, Bai, Shaojie, Choi, Sunghwan, Dang, Lam, Hanchar, Pavel, Howard, Addison, Kim, Sanghoon, Kolter, Zico, Kondor, Risi, Kornbluth, Mordechai, Lee, Youhan, Lee, Youngsoo, Mailoa, Jonathan P., Nguyen, Thanh Tu, Popovic, Milos, Rakocevic, Goran, Reade, Walter, Song, Wonho, Stojanovic, Luka, Thiede, Erik H., Tijanic, Nebojsa, Torrubia, Andres, Willmott, Devin, Butts, Craig P., Glowacki, David R.
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8291653/
https://www.ncbi.nlm.nih.gov/pubmed/34283864
http://dx.doi.org/10.1371/journal.pone.0253612
author Bratholm, Lars A.
Gerrard, Will
Anderson, Brandon
Bai, Shaojie
Choi, Sunghwan
Dang, Lam
Hanchar, Pavel
Howard, Addison
Kim, Sanghoon
Kolter, Zico
Kondor, Risi
Kornbluth, Mordechai
Lee, Youhan
Lee, Youngsoo
Mailoa, Jonathan P.
Nguyen, Thanh Tu
Popovic, Milos
Rakocevic, Goran
Reade, Walter
Song, Wonho
Stojanovic, Luka
Thiede, Erik H.
Tijanic, Nebojsa
Torrubia, Andres
Willmott, Devin
Butts, Craig P.
Glowacki, David R.
author_sort Bratholm, Lars A.
collection PubMed
description The rise of machine learning (ML) has created an explosion in the potential strategies for using data to make scientific predictions. For physical scientists wishing to apply ML strategies to a particular domain, it can be difficult to assess in advance what strategy to adopt within a vast space of possibilities. Here we outline the results of an online community-powered effort to swarm search the space of ML strategies and develop algorithms for predicting atomic-pairwise nuclear magnetic resonance (NMR) properties in molecules. Using an open-source dataset, we worked with Kaggle to design and host a 3-month competition which received 47,800 ML model predictions from 2,700 teams in 84 countries. Within 3 weeks, the Kaggle community produced models with comparable accuracy to our best previously published ‘in-house’ efforts. A meta-ensemble model constructed as a linear combination of the top predictions has a prediction accuracy which exceeds that of any individual model, 7-19x better than our previous state-of-the-art. The results highlight the potential of transformer architectures for predicting quantum mechanical (QM) molecular properties.
format Online
Article
Text
id pubmed-8291653
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8291653 2021-07-31 PLoS One Research Article
Public Library of Science 2021-07-20 /pmc/articles/PMC8291653/ /pubmed/34283864 http://dx.doi.org/10.1371/journal.pone.0253612 Text en © 2021 Bratholm et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A community-powered search of machine learning strategy space to find NMR property prediction models
topic Research Article