Large Language Model-Assisted Deep Reinforcement Learning from Human Feedback for Job Shop Scheduling

Bibliographic Details
Published in: Machines vol. 13, no. 5 (2025), p. 361
Main Author: Zeng Yuhang
Other Authors: Lou, Ping; Hu, Jianmin; Fan Chuannian; Liu, Quan; Hu, Jiwei
Published: MDPI AG
Subjects:
Online Access: Citation/Abstract
Full Text + Graphics
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3212071253
003 UK-CbPIL
022 |a 2075-1702 
024 7 |a 10.3390/machines13050361  |2 doi 
035 |a 3212071253 
045 2 |b d20250101  |b d20251231 
084 |a 231531  |2 nlm 
100 1 |a Zeng Yuhang  |u School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China; zengyuhang@whut.edu.cn (Y.Z.); louping@whut.edu.cn (P.L.); 305838@whut.edu.cn (C.F.); quanliu@whut.edu.cn (Q.L.); hujiwei@whut.edu.cn (J.H.) 
245 1 |a Large Language Model-Assisted Deep Reinforcement Learning from Human Feedback for Job Shop Scheduling 
260 |b MDPI AG  |c 2025 
513 |a Journal Article 
520 3 |a The job shop scheduling problem (JSSP) is a classical NP-hard combinatorial optimization challenge that plays a crucial role in manufacturing systems. Deep reinforcement learning has shown great potential in solving this problem. However, it still has challenges in reward function design and state feature representation, which makes it suffer from slow policy convergence and low learning efficiency in complex production environments. Therefore, a human feedback-based large language model-assisted deep reinforcement learning (HFLLMDRL) framework is proposed to solve this problem, in which few-shot prompt engineering by human feedback is utilized to assist in designing instructive reward functions and guiding policy convergence. Additionally, a self-adaptation symbolic visualization Kolmogorov–Arnold Network (KAN) is integrated as the policy network in DRL to enhance state feature representation, thereby improving learning efficiency. Experimental results demonstrate that the proposed framework significantly boosts both learning performance and policy convergence, presenting a novel approach to the JSSP. 
653 |a Language 
653 |a Integer programming 
653 |a Deep learning 
653 |a Combinatorial analysis 
653 |a Feedback 
653 |a Optimization 
653 |a Job shops 
653 |a Prompt engineering 
653 |a Automation 
653 |a Manufacturing 
653 |a Machine learning 
653 |a Performance evaluation 
653 |a Heuristic 
653 |a Generative artificial intelligence 
653 |a Representations 
653 |a Natural language 
653 |a Mathematical programming 
653 |a Scheduling 
653 |a Convergence 
653 |a Large language models 
653 |a Decision making 
653 |a Preferences 
653 |a Design 
653 |a Cost analysis 
653 |a Literature reviews 
653 |a Algorithms 
653 |a Job shop scheduling 
700 1 |a Lou, Ping  |u School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China; zengyuhang@whut.edu.cn (Y.Z.); louping@whut.edu.cn (P.L.); 305838@whut.edu.cn (C.F.); quanliu@whut.edu.cn (Q.L.); hujiwei@whut.edu.cn (J.H.) 
700 1 |a Hu, Jianmin  |u School of Information Engineering, Hubei University of Economics, Wuhan 430205, China 
700 1 |a Fan Chuannian  |u School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China; zengyuhang@whut.edu.cn (Y.Z.); louping@whut.edu.cn (P.L.); 305838@whut.edu.cn (C.F.); quanliu@whut.edu.cn (Q.L.); hujiwei@whut.edu.cn (J.H.) 
700 1 |a Liu, Quan  |u School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China; zengyuhang@whut.edu.cn (Y.Z.); louping@whut.edu.cn (P.L.); 305838@whut.edu.cn (C.F.); quanliu@whut.edu.cn (Q.L.); hujiwei@whut.edu.cn (J.H.) 
700 1 |a Hu, Jiwei  |u School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China; zengyuhang@whut.edu.cn (Y.Z.); louping@whut.edu.cn (P.L.); 305838@whut.edu.cn (C.F.); quanliu@whut.edu.cn (Q.L.); hujiwei@whut.edu.cn (J.H.) 
773 0 |t Machines  |g vol. 13, no. 5 (2025), p. 361 
786 0 |d ProQuest  |t Engineering Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3212071253/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text + Graphics  |u https://www.proquest.com/docview/3212071253/fulltextwithgraphics/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3212071253/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch