Cargando…

Faster sorting algorithms discovered using deep reinforcement learning

Fundamental algorithms such as sorting or hashing are used trillions of times on any given day(1). As demand for computation grows, it has become critical for these algorithms to be as performant as possible. Whereas remarkable progress has been achieved in the past(2), making further improvements o...

Descripción completa

Detalles Bibliográficos
Autores principales:	Mankowitz, Daniel J., Michi, Andrea, Zhernov, Anton, Gelmi, Marco, Selvi, Marco, Paduraru, Cosmin, Leurent, Edouard, Iqbal, Shariq, Lespiau, Jean-Baptiste, Ahern, Alex, Köppe, Thomas, Millikin, Kevin, Gaffney, Stephen, Elster, Sophie, Broshear, Jackson, Gamble, Chris, Milan, Kieran, Tung, Robert, Hwang, Minjae, Cemgil, Taylan, Barekatain, Mohammadamin, Li, Yujia, Mandhane, Amol, Hubert, Thomas, Schrittwieser, Julian, Hassabis, Demis, Kohli, Pushmeet, Riedmiller, Martin, Vinyals, Oriol, Silver, David
Formato:	Online Artículo Texto
Lenguaje:	English
Publicado:	Nature Publishing Group UK 2023
Materias:	Article
Acceso en línea:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10247365/ https://www.ncbi.nlm.nih.gov/pubmed/37286649 http://dx.doi.org/10.1038/s41586-023-06004-9

_version_	1785055184835051520
author	Mankowitz, Daniel J. Michi, Andrea Zhernov, Anton Gelmi, Marco Selvi, Marco Paduraru, Cosmin Leurent, Edouard Iqbal, Shariq Lespiau, Jean-Baptiste Ahern, Alex Köppe, Thomas Millikin, Kevin Gaffney, Stephen Elster, Sophie Broshear, Jackson Gamble, Chris Milan, Kieran Tung, Robert Hwang, Minjae Cemgil, Taylan Barekatain, Mohammadamin Li, Yujia Mandhane, Amol Hubert, Thomas Schrittwieser, Julian Hassabis, Demis Kohli, Pushmeet Riedmiller, Martin Vinyals, Oriol Silver, David
author_facet	Mankowitz, Daniel J. Michi, Andrea Zhernov, Anton Gelmi, Marco Selvi, Marco Paduraru, Cosmin Leurent, Edouard Iqbal, Shariq Lespiau, Jean-Baptiste Ahern, Alex Köppe, Thomas Millikin, Kevin Gaffney, Stephen Elster, Sophie Broshear, Jackson Gamble, Chris Milan, Kieran Tung, Robert Hwang, Minjae Cemgil, Taylan Barekatain, Mohammadamin Li, Yujia Mandhane, Amol Hubert, Thomas Schrittwieser, Julian Hassabis, Demis Kohli, Pushmeet Riedmiller, Martin Vinyals, Oriol Silver, David
author_sort	Mankowitz, Daniel J.
collection	PubMed
description	Fundamental algorithms such as sorting or hashing are used trillions of times on any given day(1). As demand for computation grows, it has become critical for these algorithms to be as performant as possible. Whereas remarkable progress has been achieved in the past(2), making further improvements on the efficiency of these routines has proved challenging for both human scientists and computational approaches. Here we show how artificial intelligence can go beyond the current state of the art by discovering hitherto unknown routines. To realize this, we formulated the task of finding a better sorting routine as a single-player game. We then trained a new deep reinforcement learning agent, AlphaDev, to play this game. AlphaDev discovered small sorting algorithms from scratch that outperformed previously known human benchmarks. These algorithms have been integrated into the LLVM standard C++ sort library(3). This change to this part of the sort library represents the replacement of a component with an algorithm that has been automatically discovered using reinforcement learning. We also present results in extra domains, showcasing the generality of the approach.
format	Online Article Text
id	pubmed-10247365
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Nature Publishing Group UK
record_format	MEDLINE/PubMed
spelling	pubmed-102473652023-06-09 Faster sorting algorithms discovered using deep reinforcement learning Mankowitz, Daniel J. Michi, Andrea Zhernov, Anton Gelmi, Marco Selvi, Marco Paduraru, Cosmin Leurent, Edouard Iqbal, Shariq Lespiau, Jean-Baptiste Ahern, Alex Köppe, Thomas Millikin, Kevin Gaffney, Stephen Elster, Sophie Broshear, Jackson Gamble, Chris Milan, Kieran Tung, Robert Hwang, Minjae Cemgil, Taylan Barekatain, Mohammadamin Li, Yujia Mandhane, Amol Hubert, Thomas Schrittwieser, Julian Hassabis, Demis Kohli, Pushmeet Riedmiller, Martin Vinyals, Oriol Silver, David Nature Article Fundamental algorithms such as sorting or hashing are used trillions of times on any given day(1). As demand for computation grows, it has become critical for these algorithms to be as performant as possible. Whereas remarkable progress has been achieved in the past(2), making further improvements on the efficiency of these routines has proved challenging for both human scientists and computational approaches. Here we show how artificial intelligence can go beyond the current state of the art by discovering hitherto unknown routines. To realize this, we formulated the task of finding a better sorting routine as a single-player game. We then trained a new deep reinforcement learning agent, AlphaDev, to play this game. AlphaDev discovered small sorting algorithms from scratch that outperformed previously known human benchmarks. These algorithms have been integrated into the LLVM standard C++ sort library(3). This change to this part of the sort library represents the replacement of a component with an algorithm that has been automatically discovered using reinforcement learning. We also present results in extra domains, showcasing the generality of the approach. Nature Publishing Group UK 2023-06-07 2023 /pmc/articles/PMC10247365/ /pubmed/37286649 http://dx.doi.org/10.1038/s41586-023-06004-9 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) .
spellingShingle	Article Mankowitz, Daniel J. Michi, Andrea Zhernov, Anton Gelmi, Marco Selvi, Marco Paduraru, Cosmin Leurent, Edouard Iqbal, Shariq Lespiau, Jean-Baptiste Ahern, Alex Köppe, Thomas Millikin, Kevin Gaffney, Stephen Elster, Sophie Broshear, Jackson Gamble, Chris Milan, Kieran Tung, Robert Hwang, Minjae Cemgil, Taylan Barekatain, Mohammadamin Li, Yujia Mandhane, Amol Hubert, Thomas Schrittwieser, Julian Hassabis, Demis Kohli, Pushmeet Riedmiller, Martin Vinyals, Oriol Silver, David Faster sorting algorithms discovered using deep reinforcement learning
title	Faster sorting algorithms discovered using deep reinforcement learning
title_full	Faster sorting algorithms discovered using deep reinforcement learning
title_fullStr	Faster sorting algorithms discovered using deep reinforcement learning
title_full_unstemmed	Faster sorting algorithms discovered using deep reinforcement learning
title_short	Faster sorting algorithms discovered using deep reinforcement learning
title_sort	faster sorting algorithms discovered using deep reinforcement learning
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10247365/ https://www.ncbi.nlm.nih.gov/pubmed/37286649 http://dx.doi.org/10.1038/s41586-023-06004-9
work_keys_str_mv	AT mankowitzdanielj fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT michiandrea fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT zhernovanton fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT gelmimarco fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT selvimarco fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT padurarucosmin fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT leurentedouard fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT iqbalshariq fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT lespiaujeanbaptiste fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT ahernalex fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT koppethomas fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT millikinkevin fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT gaffneystephen fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT elstersophie fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT broshearjackson fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT gamblechris fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT milankieran fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT tungrobert fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT hwangminjae fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT cemgiltaylan fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT barekatainmohammadamin fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT liyujia fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT mandhaneamol fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT hubertthomas fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT schrittwieserjulian fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT hassabisdemis fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT kohlipushmeet fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT riedmillermartin fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT vinyalsoriol fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning AT silverdavid fastersortingalgorithmsdiscoveredusingdeepreinforcementlearning

Faster sorting algorithms discovered using deep reinforcement learning

Ejemplares similares