Large Language Model-Assisted Deep Reinforcement Learning from Human Feedback for Job Shop Scheduling

Bibliographic Details
Published in:	Machines vol. 13, no. 5 (2025), p. 361
Main Author:	Zeng Yuhang
Other Authors:	Lou, Ping; Hu, Jianmin; Fan Chuannian; Liu, Quan; Hu, Jiwei
Published:	MDPI AG
Subjects:	Language; Integer programming; Deep learning; Combinatorial analysis; Feedback; Optimization; Job shops; Prompt engineering; Automation; Manufacturing; Machine learning; Performance evaluation; Heuristic; Generative artificial intelligence; Representations; Natural language; Mathematical programming; Scheduling; Convergence; Large language models; Decision making; Preferences; Design; Cost analysis; Literature reviews; Algorithms; Job shop scheduling
Online Access:	Citation/Abstract; Full Text + Graphics; Full Text - PDF

MARC


LEADER	00000nab a2200000uu 4500
001	3212071253
003	UK-CbPIL
022			\|a 2075-1702
024	7		\|a 10.3390/machines13050361 \|2 doi
035			\|a 3212071253
045	2		\|b d20250101 \|b d20251231
084			\|a 231531 \|2 nlm
100	1		\|a Zeng Yuhang \|u School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China; zengyuhang@whut.edu.cn (Y.Z.); louping@whut.edu.cn (P.L.); 305838@whut.edu.cn (C.F.); quanliu@whut.edu.cn (Q.L.); hujiwei@whut.edu.cn (J.H.)
245	1		\|a Large Language Model-Assisted Deep Reinforcement Learning from Human Feedback for Job Shop Scheduling
260			\|b MDPI AG \|c 2025
513			\|a Journal Article
520	3		\|a The job shop scheduling problem (JSSP) is a classical NP-hard combinatorial optimization challenge that plays a crucial role in manufacturing systems. Deep reinforcement learning has shown great potential in solving this problem. However, it still has challenges in reward function design and state feature representation, which makes it suffer from slow policy convergence and low learning efficiency in complex production environments. Therefore, a human feedback-based large language model-assisted deep reinforcement learning (HFLLMDRL) framework is proposed to solve this problem, in which few-shot prompt engineering by human feedback is utilized to assist in designing instructive reward functions and guiding policy convergence. Additionally, a self-adaptation symbolic visualization Kolmogorov–Arnold Network (KAN) is integrated as the policy network in DRL to enhance state feature representation, thereby improving learning efficiency. Experimental results demonstrate that the proposed framework significantly boosts both learning performance and policy convergence, presenting a novel approach to the JSSP.
653			\|a Language
653			\|a Integer programming
653			\|a Deep learning
653			\|a Combinatorial analysis
653			\|a Feedback
653			\|a Optimization
653			\|a Job shops
653			\|a Prompt engineering
653			\|a Automation
653			\|a Manufacturing
653			\|a Machine learning
653			\|a Performance evaluation
653			\|a Heuristic
653			\|a Generative artificial intelligence
653			\|a Representations
653			\|a Natural language
653			\|a Mathematical programming
653			\|a Scheduling
653			\|a Convergence
653			\|a Large language models
653			\|a Decision making
653			\|a Preferences
653			\|a Design
653			\|a Cost analysis
653			\|a Literature reviews
653			\|a Algorithms
653			\|a Job shop scheduling
700	1		\|a Lou, Ping \|u School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China; zengyuhang@whut.edu.cn (Y.Z.); louping@whut.edu.cn (P.L.); 305838@whut.edu.cn (C.F.); quanliu@whut.edu.cn (Q.L.); hujiwei@whut.edu.cn (J.H.)
700	1		\|a Hu, Jianmin \|u School of Information Engineering, Hubei University of Economics, Wuhan 430205, China
700	1		\|a Fan Chuannian \|u School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China; zengyuhang@whut.edu.cn (Y.Z.); louping@whut.edu.cn (P.L.); 305838@whut.edu.cn (C.F.); quanliu@whut.edu.cn (Q.L.); hujiwei@whut.edu.cn (J.H.)
700	1		\|a Liu, Quan \|u School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China; zengyuhang@whut.edu.cn (Y.Z.); louping@whut.edu.cn (P.L.); 305838@whut.edu.cn (C.F.); quanliu@whut.edu.cn (Q.L.); hujiwei@whut.edu.cn (J.H.)
700	1		\|a Hu, Jiwei \|u School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China; zengyuhang@whut.edu.cn (Y.Z.); louping@whut.edu.cn (P.L.); 305838@whut.edu.cn (C.F.); quanliu@whut.edu.cn (Q.L.); hujiwei@whut.edu.cn (J.H.)
773	0		\|t Machines \|g vol. 13, no. 5 (2025), p. 361
786	0		\|d ProQuest \|t Engineering Database
856	4	1	\|3 Citation/Abstract \|u https://www.proquest.com/docview/3212071253/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch
856	4	0	\|3 Full Text + Graphics \|u https://www.proquest.com/docview/3212071253/fulltextwithgraphics/embedded/L8HZQI7Z43R0LA5T?source=fedsrch
856	4	0	\|3 Full Text - PDF \|u https://www.proquest.com/docview/3212071253/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch