
Branes with Brains: Exploring String Vacua with Deep Reinforcement Learning

We propose deep reinforcement learning as a model-free method for exploring the landscape of string vacua. As a concrete application, we utilize an artificial intelligence agent known as an asynchronous advantage actor-critic to explore type IIA compactifications with intersecting D6-branes. As different string background configurations are explored by changing D6-brane configurations, the agent receives rewards and punishments related to string consistency conditions and proximity to Standard Model vacua. These are in turn utilized to update the agent's policy and value neural networks to improve its behavior. By reinforcement learning, the agent's performance in both tasks is significantly improved, and for some tasks it finds a factor of $ \mathcal{O}(200) $ more solutions than a random walker. In one case, we demonstrate that the agent learns a human-derived strategy for finding consistent string models. In another case, where no human-derived strategy exists, the agent learns a genuinely new strategy that achieves the same goal twice as efficiently per unit time. Our results demonstrate that the agent learns to solve various string theory consistency conditions simultaneously, which are phrased in terms of non-linear, coupled Diophantine equations.


Bibliographic Details
Main Authors: Halverson, James, Nelson, Brent, Ruehle, Fabian
Language: eng
Published: 2019
Subjects: hep-th; Particle Physics - Theory
Online Access: https://dx.doi.org/10.1007/JHEP06(2019)003
http://cds.cern.ch/record/2671898
_version_ 1780962423927209984
author Halverson, James
Nelson, Brent
Ruehle, Fabian
author_facet Halverson, James
Nelson, Brent
Ruehle, Fabian
author_sort Halverson, James
collection CERN
description We propose deep reinforcement learning as a model-free method for exploring the landscape of string vacua. As a concrete application, we utilize an artificial intelligence agent known as an asynchronous advantage actor-critic to explore type IIA compactifications with intersecting D6-branes. As different string background configurations are explored by changing D6-brane configurations, the agent receives rewards and punishments related to string consistency conditions and proximity to Standard Model vacua. These are in turn utilized to update the agent’s policy and value neural networks to improve its behavior. By reinforcement learning, the agent’s performance in both tasks is significantly improved, and for some tasks it finds a factor of $ \mathcal{O}(200) $ more solutions than a random walker. In one case, we demonstrate that the agent learns a human-derived strategy for finding consistent string models. In another case, where no human-derived strategy exists, the agent learns a genuinely new strategy that achieves the same goal twice as efficiently per unit time. Our results demonstrate that the agent learns to solve various string theory consistency conditions simultaneously, which are phrased in terms of non-linear, coupled Diophantine equations.
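To make the method described in the abstract concrete, the following is a minimal, single-worker advantage actor-critic sketch in PyTorch, not the authors' asynchronous (A3C) implementation. Everything specific here is an illustrative assumption: the state vector and action set are toy stand-ins for D6-brane configuration data, the network sizes are arbitrary, and the reward uses a made-up Diophantine target in place of the actual string consistency conditions and Standard-Model-proximity scoring described above.

```python
# Illustrative sketch only (not the paper's code): a single-worker advantage
# actor-critic in PyTorch. The state is a toy integer-valued vector standing in
# for D6-brane configuration data; each action increments or decrements one entry;
# the reward penalizes violation of a made-up Diophantine constraint, mimicking
# the reward/punishment shaping for consistency conditions described in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM = 6               # assumed length of the toy configuration vector
N_ACTIONS = 2 * STATE_DIM   # +1 or -1 on each entry

class ActorCritic(nn.Module):
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU())
        self.policy = nn.Linear(64, N_ACTIONS)  # action logits (policy head)
        self.value = nn.Linear(64, 1)           # state-value estimate (value head)

    def forward(self, s):
        h = self.trunk(s)
        return self.policy(h), self.value(h)

def reward(state):
    # Toy stand-in for a consistency condition: full reward when the configuration
    # satisfies sum(n_i^2) == 12 (an invented target), mild punishment otherwise.
    violation = abs(int((state ** 2).sum().item()) - 12)
    return 1.0 if violation == 0 else -0.1 * violation

def step(state, action):
    # Change one entry of the configuration by +/-1.
    idx, sign = action % STATE_DIM, 1 if action < STATE_DIM else -1
    new_state = state.clone()
    new_state[idx] += sign
    return new_state

net = ActorCritic()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
gamma = 0.99

for episode in range(200):
    state = torch.zeros(STATE_DIM)
    log_probs, values, rewards = [], [], []
    for t in range(20):  # short episodes for the sketch
        logits, v = net(state)
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()
        next_state = step(state, action.item())
        log_probs.append(dist.log_prob(action))
        values.append(v.squeeze())
        rewards.append(reward(next_state))
        state = next_state

    # Discounted returns and advantages, then a combined policy/value update.
    returns, R = [], 0.0
    for r in reversed(rewards):
        R = r + gamma * R
        returns.insert(0, R)
    returns = torch.tensor(returns)
    values = torch.stack(values)
    advantages = returns - values.detach()
    policy_loss = -(torch.stack(log_probs) * advantages).mean()
    value_loss = F.mse_loss(values, returns)
    opt.zero_grad()
    (policy_loss + 0.5 * value_loss).backward()
    opt.step()
```

In the setting of the paper, the state would encode an intersecting D6-brane configuration and the reward would reflect the string consistency conditions and proximity to Standard Model vacua; the toy constraint above only illustrates how satisfying a Diophantine condition can be turned into a reward signal for the policy and value networks.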
id cern-2671898
institution European Organization for Nuclear Research
language eng
publishDate 2019
record_format invenio
spelling cern-2671898 2023-10-04T07:44:19Z doi:10.1007/JHEP06(2019)003 http://cds.cern.ch/record/2671898 eng Halverson, James; Nelson, Brent; Ruehle, Fabian. Branes with Brains: Exploring String Vacua with Deep Reinforcement Learning. hep-th; Particle Physics - Theory. We propose deep reinforcement learning as a model-free method for exploring the landscape of string vacua. As a concrete application, we utilize an artificial intelligence agent known as an asynchronous advantage actor-critic to explore type IIA compactifications with intersecting D6-branes. As different string background configurations are explored by changing D6-brane configurations, the agent receives rewards and punishments related to string consistency conditions and proximity to Standard Model vacua. These are in turn utilized to update the agent's policy and value neural networks to improve its behavior. By reinforcement learning, the agent's performance in both tasks is significantly improved, and for some tasks it finds a factor of $ \mathcal{O}(200) $ more solutions than a random walker. In one case, we demonstrate that the agent learns a human-derived strategy for finding consistent string models. In another case, where no human-derived strategy exists, the agent learns a genuinely new strategy that achieves the same goal twice as efficiently per unit time. Our results demonstrate that the agent learns to solve various string theory consistency conditions simultaneously, which are phrased in terms of non-linear, coupled Diophantine equations. arXiv:1903.11616 oai:cds.cern.ch:2671898 2019-03-27
spellingShingle hep-th
Particle Physics - Theory
Halverson, James
Nelson, Brent
Ruehle, Fabian
Branes with Brains: Exploring String Vacua with Deep Reinforcement Learning
title Branes with Brains: Exploring String Vacua with Deep Reinforcement Learning
title_full Branes with Brains: Exploring String Vacua with Deep Reinforcement Learning
title_fullStr Branes with Brains: Exploring String Vacua with Deep Reinforcement Learning
title_full_unstemmed Branes with Brains: Exploring String Vacua with Deep Reinforcement Learning
title_short Branes with Brains: Exploring String Vacua with Deep Reinforcement Learning
title_sort branes with brains: exploring string vacua with deep reinforcement learning
topic hep-th
Particle Physics - Theory
url https://dx.doi.org/10.1007/JHEP06(2019)003
http://cds.cern.ch/record/2671898
work_keys_str_mv AT halversonjames braneswithbrainsexploringstringvacuawithdeepreinforcementlearning
AT nelsonbrent braneswithbrainsexploringstringvacuawithdeepreinforcementlearning
AT ruehlefabian braneswithbrainsexploringstringvacuawithdeepreinforcementlearning