Cargando…

A Reinforcement Learning Approach to Robust Scheduling of Permutation Flow Shop

The permutation flow shop scheduling problem (PFSP) stands as a classic conundrum within the realm of combinatorial optimization, serving as a prevalent organizational structure in authentic production settings. Given that conventional scheduling approaches fall short of effectively addressing the i...

Descripción completa

Detalles Bibliográficos
Autores principales: Zhou, Tao, Luo, Liang, Ji, Shengchen, He, Yuanxin
Formato: Online Artículo Texto
Lenguaje:English
Publicado: MDPI 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10604314/
https://www.ncbi.nlm.nih.gov/pubmed/37887609
http://dx.doi.org/10.3390/biomimetics8060478
Descripción
Sumario:The permutation flow shop scheduling problem (PFSP) stands as a classic conundrum within the realm of combinatorial optimization, serving as a prevalent organizational structure in authentic production settings. Given that conventional scheduling approaches fall short of effectively addressing the intricate and ever-shifting production landscape of PFSP, this study proposes an end-to-end deep reinforcement learning methodology with the objective of minimizing the maximum completion time. To tackle PFSP, we initially model it as a Markov decision process, delineating pertinent states, actions, and reward functions. A notably innovative facet of our approach involves leveraging disjunctive graphs to represent PFSP state information. To glean the intrinsic topological data embedded within the disjunctive graph’s underpinning, we architect a policy network based on a graph isomorphism network, subsequently trained through proximal policy optimization. Our devised methodology is compared with six baseline methods on randomly generated instances and the Taillard benchmark, respectively. The experimental results unequivocally underscore the superiority of our proposed approach in terms of [Formula: see text] and computation time. Notably, the [Formula: see text] can save up to 183.2 h in randomly generated instances and 188.4 h in the Taillard benchmark. The calculation time can be reduced by up to 18.70 s for randomly generated instances and up to 18.16 s for the Taillard benchmark.