Cargando…

Dynamic Grammar Pruning for Program Size Reduction in Symbolic Regression

Grammar is a key input in grammar-based genetic programming. Grammar design not only influences performance, but also program size. However, grammar design and the choice of productions often require expert input as no automatic approach exists. This research work discusses our approach to automatic...

Descripción completa

Detalles Bibliográficos
Autores principales: Ali, Muhammad Sarmad, Kshirsagar, Meghana, Naredo, Enrique, Ryan, Conor
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Springer Nature Singapore 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10192180/
https://www.ncbi.nlm.nih.gov/pubmed/37214587
http://dx.doi.org/10.1007/s42979-023-01840-y
_version_ 1785043575061348352
author Ali, Muhammad Sarmad
Kshirsagar, Meghana
Naredo, Enrique
Ryan, Conor
author_facet Ali, Muhammad Sarmad
Kshirsagar, Meghana
Naredo, Enrique
Ryan, Conor
author_sort Ali, Muhammad Sarmad
collection PubMed
description Grammar is a key input in grammar-based genetic programming. Grammar design not only influences performance, but also program size. However, grammar design and the choice of productions often require expert input as no automatic approach exists. This research work discusses our approach to automatically reduce a bloated grammar. By utilizing a simple Production Ranking mechanism, we identify productions which are less useful and dynamically prune those to channel evolutionary search towards better (smaller) solutions. Our objective in this work was program size reduction without compromising generalization performance. We tested our approach on 13 standard symbolic regression datasets with Grammatical Evolution. Using a grammar embodying a well-defined function set as a baseline, we compare effective genome length and test performance with our approach. Dynamic grammar pruning achieved significantly better genome lengths for all datasets, while significantly improving generalization performance on three datasets, although it worsened in five datasets. When we utilized linear scaling during the production ranking stages (the first 20 generations) the results dramatically improved. Not only were the programs smaller in all datasets, but generalization scores were also significantly better than the baseline in 6 out of 13 datasets, and comparable in the rest. When the baseline was also linearly scaled as well, the program size was still smaller with the Production Ranking approach, while generalization scores dropped in only three datasets without any significant compromise in the rest.
format Online
Article
Text
id pubmed-10192180
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Springer Nature Singapore
record_format MEDLINE/PubMed
spelling pubmed-101921802023-05-19 Dynamic Grammar Pruning for Program Size Reduction in Symbolic Regression Ali, Muhammad Sarmad Kshirsagar, Meghana Naredo, Enrique Ryan, Conor SN Comput Sci Original Research Grammar is a key input in grammar-based genetic programming. Grammar design not only influences performance, but also program size. However, grammar design and the choice of productions often require expert input as no automatic approach exists. This research work discusses our approach to automatically reduce a bloated grammar. By utilizing a simple Production Ranking mechanism, we identify productions which are less useful and dynamically prune those to channel evolutionary search towards better (smaller) solutions. Our objective in this work was program size reduction without compromising generalization performance. We tested our approach on 13 standard symbolic regression datasets with Grammatical Evolution. Using a grammar embodying a well-defined function set as a baseline, we compare effective genome length and test performance with our approach. Dynamic grammar pruning achieved significantly better genome lengths for all datasets, while significantly improving generalization performance on three datasets, although it worsened in five datasets. When we utilized linear scaling during the production ranking stages (the first 20 generations) the results dramatically improved. Not only were the programs smaller in all datasets, but generalization scores were also significantly better than the baseline in 6 out of 13 datasets, and comparable in the rest. When the baseline was also linearly scaled as well, the program size was still smaller with the Production Ranking approach, while generalization scores dropped in only three datasets without any significant compromise in the rest. Springer Nature Singapore 2023-05-17 2023 /pmc/articles/PMC10192180/ /pubmed/37214587 http://dx.doi.org/10.1007/s42979-023-01840-y Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) .
spellingShingle Original Research
Ali, Muhammad Sarmad
Kshirsagar, Meghana
Naredo, Enrique
Ryan, Conor
Dynamic Grammar Pruning for Program Size Reduction in Symbolic Regression
title Dynamic Grammar Pruning for Program Size Reduction in Symbolic Regression
title_full Dynamic Grammar Pruning for Program Size Reduction in Symbolic Regression
title_fullStr Dynamic Grammar Pruning for Program Size Reduction in Symbolic Regression
title_full_unstemmed Dynamic Grammar Pruning for Program Size Reduction in Symbolic Regression
title_short Dynamic Grammar Pruning for Program Size Reduction in Symbolic Regression
title_sort dynamic grammar pruning for program size reduction in symbolic regression
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10192180/
https://www.ncbi.nlm.nih.gov/pubmed/37214587
http://dx.doi.org/10.1007/s42979-023-01840-y
work_keys_str_mv AT alimuhammadsarmad dynamicgrammarpruningforprogramsizereductioninsymbolicregression
AT kshirsagarmeghana dynamicgrammarpruningforprogramsizereductioninsymbolicregression
AT naredoenrique dynamicgrammarpruningforprogramsizereductioninsymbolicregression
AT ryanconor dynamicgrammarpruningforprogramsizereductioninsymbolicregression