Branes with Brains: Exploring String Vacua with Deep Reinforcement Learning
We propose deep reinforcement learning as a model-free method for exploring the landscape of string vacua. As a concrete application, we utilize an artificial intelligence agent known as an asynchronous advantage actor-critic to explore type IIA compactifications with intersecting D6-branes. As diff...
Main authors: | Halverson, James, Nelson, Brent, Ruehle, Fabian |
---|---|
Language: | eng |
Published: | 2019 |
Subjects: | hep-th, Particle Physics - Theory |
Online access: | https://dx.doi.org/10.1007/JHEP06(2019)003 http://cds.cern.ch/record/2671898 |
_version_ | 1780962423927209984 |
---|---|
author | Halverson, James Nelson, Brent Ruehle, Fabian |
author_facet | Halverson, James Nelson, Brent Ruehle, Fabian |
author_sort | Halverson, James |
collection | CERN |
description | We propose deep reinforcement learning as a model-free method for exploring the landscape of string vacua. As a concrete application, we utilize an artificial intelligence agent known as an asynchronous advantage actor-critic to explore type IIA compactifications with intersecting D6-branes. As different string background configurations are explored by changing D6-brane configurations, the agent receives rewards and punishments related to string consistency conditions and proximity to Standard Model vacua. These are in turn utilized to update the agent’s policy and value neural networks to improve its behavior. By reinforcement learning, the agent’s performance in both tasks is significantly improved, and for some tasks it finds a factor of $ \mathcal{O}(200) $ more solutions than a random walker. In one case, we demonstrate that the agent learns a human-derived strategy for finding consistent string models. In another case, where no human-derived strategy exists, the agent learns a genuinely new strategy that achieves the same goal twice as efficiently per unit time. Our results demonstrate that the agent learns to solve various string theory consistency conditions simultaneously, which are phrased in terms of non-linear, coupled Diophantine equations. |
id | cern-2671898 |
institution | Organización Europea para la Investigación Nuclear |
language | eng |
publishDate | 2019 |
record_format | invenio |
spelling | cern-2671898 2023-10-04T07:44:19Z doi:10.1007/JHEP06(2019)003 http://cds.cern.ch/record/2671898 eng Halverson, James Nelson, Brent Ruehle, Fabian Branes with Brains: Exploring String Vacua with Deep Reinforcement Learning hep-th Particle Physics - Theory arXiv:1903.11616 oai:cds.cern.ch:2671898 2019-03-27 |
spellingShingle | hep-th Particle Physics - Theory Halverson, James Nelson, Brent Ruehle, Fabian Branes with Brains: Exploring String Vacua with Deep Reinforcement Learning |
title | Branes with Brains: Exploring String Vacua with Deep Reinforcement Learning |
title_full | Branes with Brains: Exploring String Vacua with Deep Reinforcement Learning |
title_fullStr | Branes with Brains: Exploring String Vacua with Deep Reinforcement Learning |
title_full_unstemmed | Branes with Brains: Exploring String Vacua with Deep Reinforcement Learning |
title_short | Branes with Brains: Exploring String Vacua with Deep Reinforcement Learning |
title_sort | branes with brains: exploring string vacua with deep reinforcement learning |
topic | hep-th Particle Physics - Theory |
url | https://dx.doi.org/10.1007/JHEP06(2019)003 http://cds.cern.ch/record/2671898 |
work_keys_str_mv | AT halversonjames braneswithbrainsexploringstringvacuawithdeepreinforcementlearning AT nelsonbrent braneswithbrainsexploringstringvacuawithdeepreinforcementlearning AT ruehlefabian braneswithbrainsexploringstringvacuawithdeepreinforcementlearning |