A community-powered search of machine learning strategy space to find NMR property prediction models
The rise of machine learning (ML) has created an explosion in the potential strategies for using data to make scientific predictions. For physical scientists wishing to apply ML strategies to a particular domain, it can be difficult to assess in advance what strategy to adopt within a vast space of...
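Main authors: | Bratholm, Lars A.; Gerrard, Will; Anderson, Brandon; Bai, Shaojie; Choi, Sunghwan; Dang, Lam; Hanchar, Pavel; Howard, Addison; Kim, Sanghoon; Kolter, Zico; Kondor, Risi; Kornbluth, Mordechai; Lee, Youhan; Lee, Youngsoo; Mailoa, Jonathan P.; Nguyen, Thanh Tu; Popovic, Milos; Rakocevic, Goran; Reade, Walter; Song, Wonho; Stojanovic, Luka; Thiede, Erik H.; Tijanic, Nebojsa; Torrubia, Andres; Willmott, Devin; Butts, Craig P.; Glowacki, David R. |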
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8291653/ https://www.ncbi.nlm.nih.gov/pubmed/34283864 http://dx.doi.org/10.1371/journal.pone.0253612 |
_version_ | 1783724683389042688 |
---|---|
author | Bratholm, Lars A. Gerrard, Will Anderson, Brandon Bai, Shaojie Choi, Sunghwan Dang, Lam Hanchar, Pavel Howard, Addison Kim, Sanghoon Kolter, Zico Kondor, Risi Kornbluth, Mordechai Lee, Youhan Lee, Youngsoo Mailoa, Jonathan P. Nguyen, Thanh Tu Popovic, Milos Rakocevic, Goran Reade, Walter Song, Wonho Stojanovic, Luka Thiede, Erik H. Tijanic, Nebojsa Torrubia, Andres Willmott, Devin Butts, Craig P. Glowacki, David R. |
author_sort | Bratholm, Lars A. |
collection | PubMed |
description | The rise of machine learning (ML) has created an explosion in the potential strategies for using data to make scientific predictions. For physical scientists wishing to apply ML strategies to a particular domain, it can be difficult to assess in advance what strategy to adopt within a vast space of possibilities. Here we outline the results of an online community-powered effort to swarm search the space of ML strategies and develop algorithms for predicting atomic-pairwise nuclear magnetic resonance (NMR) properties in molecules. Using an open-source dataset, we worked with Kaggle to design and host a 3-month competition which received 47,800 ML model predictions from 2,700 teams in 84 countries. Within 3 weeks, the Kaggle community produced models with comparable accuracy to our best previously published ‘in-house’ efforts. A meta-ensemble model constructed as a linear combination of the top predictions has a prediction accuracy which exceeds that of any individual model, 7-19x better than our previous state-of-the-art. The results highlight the potential of transformer architectures for predicting quantum mechanical (QM) molecular properties. |
format | Online Article Text |
id | pubmed-8291653 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-8291653 2021-07-31 PLoS One Research Article Public Library of Science 2021-07-20 /pmc/articles/PMC8291653/ /pubmed/34283864 http://dx.doi.org/10.1371/journal.pone.0253612 Text en © 2021 Bratholm et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | A community-powered search of machine learning strategy space to find NMR property prediction models |
topic | Research Article |
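
The description above mentions a meta-ensemble built as a linear combination of the top predictions. The sketch below illustrates that general idea only; it is not the authors' procedure, which this record does not describe. It assumes the combination weights are fitted by ordinary least squares on a held-out split, and it uses synthetic noisy predictions in place of real competition submissions. The function names `fit_linear_ensemble`, `ensemble_predict`, and `mae` are hypothetical.

```python
import numpy as np


def fit_linear_ensemble(predictions, targets):
    """Fit least-squares weights for a linear combination of models.

    predictions: array of shape (n_samples, n_models)
    targets:     array of shape (n_samples,)
    Assumption: plain least squares, no intercept, no constraints.
    """
    weights, *_ = np.linalg.lstsq(predictions, targets, rcond=None)
    return weights


def ensemble_predict(predictions, weights):
    """Combine per-model predictions with the fitted weights."""
    return predictions @ weights


def mae(pred, target):
    """Mean absolute error between predictions and targets."""
    return float(np.mean(np.abs(pred - target)))


# Synthetic stand-in data: three "models" that each see the true value
# plus independent noise of a different size (not real NMR data).
rng = np.random.default_rng(0)
truth = rng.normal(size=2000)
models = np.column_stack(
    [truth + rng.normal(scale=s, size=truth.size) for s in (0.3, 0.4, 0.5)]
)

# Fit the weights on one half and evaluate on the other half.
train, test = slice(0, 1000), slice(1000, 2000)
weights = fit_linear_ensemble(models[train], truth[train])
blend = ensemble_predict(models[test], weights)

individual_mae = [mae(models[test][:, j], truth[test]) for j in range(models.shape[1])]
blend_mae = mae(blend, truth[test])

print("weights:", weights)
print("individual MAE:", individual_mae)
print("ensemble MAE:", blend_mae)
```

When the individual models make partly independent errors, as in this toy setup, the weighted blend tends to have a lower error than any single model, which is the behaviour the description reports for the meta-ensemble.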