Cargando…
Faster sorting algorithms discovered using deep reinforcement learning
Fundamental algorithms such as sorting or hashing are used trillions of times on any given day(1). As demand for computation grows, it has become critical for these algorithms to be as performant as possible. Whereas remarkable progress has been achieved in the past(2), making further improvements o...
Autores principales: | , , , , , , , , , , , , , , , , , , , , , , , , , , , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Nature Publishing Group UK
2023
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10247365/ https://www.ncbi.nlm.nih.gov/pubmed/37286649 http://dx.doi.org/10.1038/s41586-023-06004-9 |
_version_ | 1785055184835051520 |
---|---|
author | Mankowitz, Daniel J. Michi, Andrea Zhernov, Anton Gelmi, Marco Selvi, Marco Paduraru, Cosmin Leurent, Edouard Iqbal, Shariq Lespiau, Jean-Baptiste Ahern, Alex Köppe, Thomas Millikin, Kevin Gaffney, Stephen Elster, Sophie Broshear, Jackson Gamble, Chris Milan, Kieran Tung, Robert Hwang, Minjae Cemgil, Taylan Barekatain, Mohammadamin Li, Yujia Mandhane, Amol Hubert, Thomas Schrittwieser, Julian Hassabis, Demis Kohli, Pushmeet Riedmiller, Martin Vinyals, Oriol Silver, David |
author_facet | Mankowitz, Daniel J. Michi, Andrea Zhernov, Anton Gelmi, Marco Selvi, Marco Paduraru, Cosmin Leurent, Edouard Iqbal, Shariq Lespiau, Jean-Baptiste Ahern, Alex Köppe, Thomas Millikin, Kevin Gaffney, Stephen Elster, Sophie Broshear, Jackson Gamble, Chris Milan, Kieran Tung, Robert Hwang, Minjae Cemgil, Taylan Barekatain, Mohammadamin Li, Yujia Mandhane, Amol Hubert, Thomas Schrittwieser, Julian Hassabis, Demis Kohli, Pushmeet Riedmiller, Martin Vinyals, Oriol Silver, David |
author_sort | Mankowitz, Daniel J. |
collection | PubMed |
description | Fundamental algorithms such as sorting or hashing are used trillions of times on any given day(1). As demand for computation grows, it has become critical for these algorithms to be as performant as possible. Whereas remarkable progress has been achieved in the past(2), making further improvements on the efficiency of these routines has proved challenging for both human scientists and computational approaches. Here we show how artificial intelligence can go beyond the current state of the art by discovering hitherto unknown routines. To realize this, we formulated the task of finding a better sorting routine as a single-player game. We then trained a new deep reinforcement learning agent, AlphaDev, to play this game. AlphaDev discovered small sorting algorithms from scratch that outperformed previously known human benchmarks. These algorithms have been integrated into the LLVM standard C++ sort library(3). This change to this part of the sort library represents the replacement of a component with an algorithm that has been automatically discovered using reinforcement learning. We also present results in extra domains, showcasing the generality of the approach. |
format | Online Article Text |
id | pubmed-10247365 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-102473652023-06-09 Faster sorting algorithms discovered using deep reinforcement learning Mankowitz, Daniel J. Michi, Andrea Zhernov, Anton Gelmi, Marco Selvi, Marco Paduraru, Cosmin Leurent, Edouard Iqbal, Shariq Lespiau, Jean-Baptiste Ahern, Alex Köppe, Thomas Millikin, Kevin Gaffney, Stephen Elster, Sophie Broshear, Jackson Gamble, Chris Milan, Kieran Tung, Robert Hwang, Minjae Cemgil, Taylan Barekatain, Mohammadamin Li, Yujia Mandhane, Amol Hubert, Thomas Schrittwieser, Julian Hassabis, Demis Kohli, Pushmeet Riedmiller, Martin Vinyals, Oriol Silver, David Nature Article Fundamental algorithms such as sorting or hashing are used trillions of times on any given day(1). As demand for computation grows, it has become critical for these algorithms to be as performant as possible. Whereas remarkable progress has been achieved in the past(2), making further improvements on the efficiency of these routines has proved challenging for both human scientists and computational approaches. Here we show how artificial intelligence can go beyond the current state of the art by discovering hitherto unknown routines. To realize this, we formulated the task of finding a better sorting routine as a single-player game. We then trained a new deep reinforcement learning agent, AlphaDev, to play this game. AlphaDev discovered small sorting algorithms from scratch that outperformed previously known human benchmarks. These algorithms have been integrated into the LLVM standard C++ sort library(3). This change to this part of the sort library represents the replacement of a component with an algorithm that has been automatically discovered using reinforcement learning. We also present results in extra domains, showcasing the generality of the approach. Nature Publishing Group UK 2023-06-07 2023 /pmc/articles/PMC10247365/ /pubmed/37286649 http://dx.doi.org/10.1038/s41586-023-06004-9 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) . |
spellingShingle | Article Mankowitz, Daniel J. Michi, Andrea Zhernov, Anton Gelmi, Marco Selvi, Marco Paduraru, Cosmin Leurent, Edouard Iqbal, Shariq Lespiau, Jean-Baptiste Ahern, Alex Köppe, Thomas Millikin, Kevin Gaffney, Stephen Elster, Sophie Broshear, Jackson Gamble, Chris Milan, Kieran Tung, Robert Hwang, Minjae Cemgil, Taylan Barekatain, Mohammadamin Li, Yujia Mandhane, Amol Hubert, Thomas Schrittwieser, Julian Hassabis, Demis Kohli, Pushmeet Riedmiller, Martin Vinyals, Oriol Silver, David Faster sorting algorithms discovered using deep reinforcement learning |
title | Faster sorting algorithms discovered using deep reinforcement learning |
title_full | Faster sorting algorithms discovered using deep reinforcement learning |
title_fullStr | Faster sorting algorithms discovered using deep reinforcement learning |
title_full_unstemmed | Faster sorting algorithms discovered using deep reinforcement learning |
title_short | Faster sorting algorithms discovered using deep reinforcement learning |
title_sort | faster sorting algorithms discovered using deep reinforcement learning |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10247365/ https://www.ncbi.nlm.nih.gov/pubmed/37286649 http://dx.doi.org/10.1038/s41586-023-06004-9 |
work_keys_str_mv | AT mankowitzdanielj fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT michiandrea fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT zhernovanton fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT gelmimarco fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT selvimarco fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT padurarucosmin fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT leurentedouard fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT iqbalshariq fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT lespiaujeanbaptiste fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT ahernalex fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT koppethomas fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT millikinkevin fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT gaffneystephen fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT elstersophie fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT broshearjackson fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT gamblechris fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT milankieran fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT tungrobert fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT hwangminjae fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT cemgiltaylan fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT barekatainmohammadamin fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT liyujia fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT mandhaneamol fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT hubertthomas fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT schrittwieserjulian fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT hassabisdemis fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT kohlipushmeet fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT riedmillermartin fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT vinyalsoriol fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT silverdavid fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning |