Cargando…

Faster sorting algorithms discovered using deep reinforcement learning

Fundamental algorithms such as sorting or hashing are used trillions of times on any given day(1). As demand for computation grows, it has become critical for these algorithms to be as performant as possible. Whereas remarkable progress has been achieved in the past(2), making further improvements o...

Descripción completa

Detalles Bibliográficos
Autores principales: Mankowitz, Daniel J., Michi, Andrea, Zhernov, Anton, Gelmi, Marco, Selvi, Marco, Paduraru, Cosmin, Leurent, Edouard, Iqbal, Shariq, Lespiau, Jean-Baptiste, Ahern, Alex, Köppe, Thomas, Millikin, Kevin, Gaffney, Stephen, Elster, Sophie, Broshear, Jackson, Gamble, Chris, Milan, Kieran, Tung, Robert, Hwang, Minjae, Cemgil, Taylan, Barekatain, Mohammadamin, Li, Yujia, Mandhane, Amol, Hubert, Thomas, Schrittwieser, Julian, Hassabis, Demis, Kohli, Pushmeet, Riedmiller, Martin, Vinyals, Oriol, Silver, David
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Nature Publishing Group UK 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10247365/
https://www.ncbi.nlm.nih.gov/pubmed/37286649
http://dx.doi.org/10.1038/s41586-023-06004-9
_version_ 1785055184835051520
author Mankowitz, Daniel J.
Michi, Andrea
Zhernov, Anton
Gelmi, Marco
Selvi, Marco
Paduraru, Cosmin
Leurent, Edouard
Iqbal, Shariq
Lespiau, Jean-Baptiste
Ahern, Alex
Köppe, Thomas
Millikin, Kevin
Gaffney, Stephen
Elster, Sophie
Broshear, Jackson
Gamble, Chris
Milan, Kieran
Tung, Robert
Hwang, Minjae
Cemgil, Taylan
Barekatain, Mohammadamin
Li, Yujia
Mandhane, Amol
Hubert, Thomas
Schrittwieser, Julian
Hassabis, Demis
Kohli, Pushmeet
Riedmiller, Martin
Vinyals, Oriol
Silver, David
author_facet Mankowitz, Daniel J.
Michi, Andrea
Zhernov, Anton
Gelmi, Marco
Selvi, Marco
Paduraru, Cosmin
Leurent, Edouard
Iqbal, Shariq
Lespiau, Jean-Baptiste
Ahern, Alex
Köppe, Thomas
Millikin, Kevin
Gaffney, Stephen
Elster, Sophie
Broshear, Jackson
Gamble, Chris
Milan, Kieran
Tung, Robert
Hwang, Minjae
Cemgil, Taylan
Barekatain, Mohammadamin
Li, Yujia
Mandhane, Amol
Hubert, Thomas
Schrittwieser, Julian
Hassabis, Demis
Kohli, Pushmeet
Riedmiller, Martin
Vinyals, Oriol
Silver, David
author_sort Mankowitz, Daniel J.
collection PubMed
description Fundamental algorithms such as sorting or hashing are used trillions of times on any given day(1). As demand for computation grows, it has become critical for these algorithms to be as performant as possible. Whereas remarkable progress has been achieved in the past(2), making further improvements on the efficiency of these routines has proved challenging for both human scientists and computational approaches. Here we show how artificial intelligence can go beyond the current state of the art by discovering hitherto unknown routines. To realize this, we formulated the task of finding a better sorting routine as a single-player game. We then trained a new deep reinforcement learning agent, AlphaDev, to play this game. AlphaDev discovered small sorting algorithms from scratch that outperformed previously known human benchmarks. These algorithms have been integrated into the LLVM standard C++ sort library(3). This change to this part of the sort library represents the replacement of a component with an algorithm that has been automatically discovered using reinforcement learning. We also present results in extra domains, showcasing the generality of the approach.
format Online
Article
Text
id pubmed-10247365
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-102473652023-06-09 Faster sorting algorithms discovered using deep reinforcement learning Mankowitz, Daniel J. Michi, Andrea Zhernov, Anton Gelmi, Marco Selvi, Marco Paduraru, Cosmin Leurent, Edouard Iqbal, Shariq Lespiau, Jean-Baptiste Ahern, Alex Köppe, Thomas Millikin, Kevin Gaffney, Stephen Elster, Sophie Broshear, Jackson Gamble, Chris Milan, Kieran Tung, Robert Hwang, Minjae Cemgil, Taylan Barekatain, Mohammadamin Li, Yujia Mandhane, Amol Hubert, Thomas Schrittwieser, Julian Hassabis, Demis Kohli, Pushmeet Riedmiller, Martin Vinyals, Oriol Silver, David Nature Article Fundamental algorithms such as sorting or hashing are used trillions of times on any given day(1). As demand for computation grows, it has become critical for these algorithms to be as performant as possible. Whereas remarkable progress has been achieved in the past(2), making further improvements on the efficiency of these routines has proved challenging for both human scientists and computational approaches. Here we show how artificial intelligence can go beyond the current state of the art by discovering hitherto unknown routines. To realize this, we formulated the task of finding a better sorting routine as a single-player game. We then trained a new deep reinforcement learning agent, AlphaDev, to play this game. AlphaDev discovered small sorting algorithms from scratch that outperformed previously known human benchmarks. These algorithms have been integrated into the LLVM standard C++ sort library(3). This change to this part of the sort library represents the replacement of a component with an algorithm that has been automatically discovered using reinforcement learning. We also present results in extra domains, showcasing the generality of the approach. Nature Publishing Group UK 2023-06-07 2023 /pmc/articles/PMC10247365/ /pubmed/37286649 http://dx.doi.org/10.1038/s41586-023-06004-9 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) .
spellingShingle Article
Mankowitz, Daniel J.
Michi, Andrea
Zhernov, Anton
Gelmi, Marco
Selvi, Marco
Paduraru, Cosmin
Leurent, Edouard
Iqbal, Shariq
Lespiau, Jean-Baptiste
Ahern, Alex
Köppe, Thomas
Millikin, Kevin
Gaffney, Stephen
Elster, Sophie
Broshear, Jackson
Gamble, Chris
Milan, Kieran
Tung, Robert
Hwang, Minjae
Cemgil, Taylan
Barekatain, Mohammadamin
Li, Yujia
Mandhane, Amol
Hubert, Thomas
Schrittwieser, Julian
Hassabis, Demis
Kohli, Pushmeet
Riedmiller, Martin
Vinyals, Oriol
Silver, David
Faster sorting algorithms discovered using deep reinforcement learning
title Faster sorting algorithms discovered using deep reinforcement learning
title_full Faster sorting algorithms discovered using deep reinforcement learning
title_fullStr Faster sorting algorithms discovered using deep reinforcement learning
title_full_unstemmed Faster sorting algorithms discovered using deep reinforcement learning
title_short Faster sorting algorithms discovered using deep reinforcement learning
title_sort faster sorting algorithms discovered using deep reinforcement learning
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10247365/
https://www.ncbi.nlm.nih.gov/pubmed/37286649
http://dx.doi.org/10.1038/s41586-023-06004-9
work_keys_str_mv AT mankowitzdanielj fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT michiandrea fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT zhernovanton fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT gelmimarco fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT selvimarco fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT padurarucosmin fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT leurentedouard fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT iqbalshariq fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT lespiaujeanbaptiste fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT ahernalex fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT koppethomas fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT millikinkevin fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT gaffneystephen fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT elstersophie fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT broshearjackson fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT gamblechris fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT milankieran fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT tungrobert fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT hwangminjae fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT cemgiltaylan fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT barekatainmohammadamin fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT liyujia fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT mandhaneamol fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT hubertthomas fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT schrittwieserjulian fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT hassabisdemis fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT kohlipushmeet fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT riedmillermartin fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT vinyalsoriol fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning
AT silverdavid fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning