Energy-Efficient Train Control Based on Energy Consumption Estimation Model and Deep Reinforcement Learning

Bibliographic Details
Published in:	Electronics vol. 14, no. 24 (2025), p. 4939-4962
Main Author:	Liu, Jia
Other Authors:	Wang, Yuemiao; Liu, Yirong; Li, Xiaoyu; Chen, Fuwang; Lu, Shaofeng
Published:	MDPI AG
Subjects:	China; Mathematical programming; Subways; Dynamic programming; Accuracy; Deep learning; Trajectory optimization; Artificial intelligence; Mathematical models; Optimization techniques; Carbon; Back propagation networks; Random noise; Algorithms; Linear programming; Machine learning; Energy consumption; Optimization algorithms; Energy conservation; Parameter estimation; Run time (computers)

Description
Abstract:	An energy-efficient train control (EETC) strategy must satisfy safety, punctuality, and energy-saving requirements during train operation, and it faces increasingly strict demands for online use and adaptability. To meet these requirements while reducing dependence on an accurate mathematical model of train operation, this paper proposes a train speed trajectory optimization method that combines data-driven energy consumption estimation with deep reinforcement learning. First, the unit basic resistance coefficient, a key parameter in train operation, is identified by regression on real subway operation data. Then, based on the identified model, experimental energy consumption data for train operation are generated, and Gaussian noise is added to simulate real-world sensor measurement errors and environmental uncertainties. An energy consumption estimation model based on a backpropagation (BP) neural network is constructed and trained. Finally, this estimation model serves as a component of the environment for the Deep Deterministic Policy Gradient (DDPG) algorithm, and the action adjustment mechanism and reward are designed using expert experience to train the policy network. Experimental results show that the proposed method reduces energy consumption by approximately 4.4% compared with actual manual operation data. It also deviates by less than 0.3% from the theoretical optimal baseline obtained by dynamic programming, indicating that it closely approximates the global optimum. In addition, during online use the proposed algorithm adapts to changes in train mass, initially set running time, and mid-journey running time while maintaining convergence performance and an energy-saving trajectory.
ISSN:	2079-9292
DOI:	10.3390/electronics14244939
Source:	Advanced Technologies & Aerospace Database
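
The first two steps named in the abstract (regression of the basic resistance model, then data generation with Gaussian noise) can be illustrated with a minimal sketch. It assumes the common quadratic (Davis-type) form w(v) = a + b*v + c*v^2 for unit basic resistance, which this record does not confirm as the authors' model. The function name `fit_basic_resistance`, the coefficient values, the speed range, and the noise level are all hypothetical, and synthetic data stands in for the real subway measurements.

```python
import numpy as np

def fit_basic_resistance(v, w):
    """Least-squares fit of w = a + b*v + c*v**2 (assumed Davis-type form)."""
    # Design matrix with columns [1, v, v^2].
    A = np.column_stack([np.ones_like(v), v, v**2])
    coeffs, *_ = np.linalg.lstsq(A, w, rcond=None)
    return coeffs  # array([a, b, c])

# Hypothetical "true" coefficients, used only to generate synthetic data.
true_coeffs = np.array([2.0, 0.01, 0.0005])

# Speeds in km/h (assumed range) and the noise-free resistance curve.
v = np.linspace(0.0, 80.0, 200)
w_clean = true_coeffs[0] + true_coeffs[1] * v + true_coeffs[2] * v**2

# Gaussian noise stands in for sensor error and environmental uncertainty.
rng = np.random.default_rng(0)
w_noisy = w_clean + rng.normal(loc=0.0, scale=0.05, size=v.shape)

estimated = fit_basic_resistance(v, w_noisy)
print("estimated a, b, c:", estimated)
```

With noise-free input the fit recovers the generating coefficients to numerical precision. With noise, the estimates scatter around them by an amount that depends on the noise scale and the number of samples.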