Advancing rail mobility: Generating robust train paths with deep reinforcement learning

School of Engineering | Master's thesis
Date
2024-08-19
Department
Major/Subject
Sustainable Urban Mobility Transitions
Mcode
ENG3085
Degree programme
Master’s programme in Urban Mobility
Language
en
Pages
74+7
Series
Abstract
The search for methods that generate robust train paths in a timetable has been a focus of academic research and has potential for use in industry. A foremost requirement is that train paths in the timetable are inherently robust and are not affected by minor delays. During the tactical planning stages, it is difficult to predict future delays and thus to incorporate the required supplement times, which are the additional times added to provide a buffer when a delay is experienced in operation. In practice, creating a timetable is still a manual process, with only minor use of microscopic-level tools for eliminating conflicts. In research, many optimization techniques have been proposed and tested to generate robust train paths and to optimize the overall timetable. In recent years, the use of reinforcement learning (RL) techniques to schedule or reschedule trains in a timetable has increased because of their ability to find feasible results through heuristics. This opens a fascinating area of research: testing and evaluating the performance of RL models and paving a path towards implementation in real-world scenarios. In this thesis, a deep reinforcement learning model is developed to test and evaluate the performance of different deep reinforcement learning (DRL) agents. Initially, a reinforcement learning environment is developed based on the railway timetabling problem. The selection of parameters, constraints and function provides deeper insight into the significance of these elements for the final performance of the agent. In this environment, two DRL agents, Soft Actor-Critic (SAC) and Trust Region Policy Optimization (TRPO), are trained under several constraint conditions and their performance is evaluated. The better-performing agent is then tested in the environment to find feasible and optimal train paths. The evaluation is performed under varying conditions of delay and disturbance.
Two experiments are conducted in this thesis on a case study area that is part of the Swedish rail network, from Sala (Sl) to Västerås Central station (Vc). The results of the evaluation, based on simulating the agent exploiting its learned policy in the environment, show that the method can generate feasible train paths in a timetable under delay conditions.
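The role of supplement times described in the abstract can be illustrated with a minimal sketch. It is not the thesis's model: the function name `propagate_delay`, the one-for-one absorption rule, and the numbers are assumptions chosen for illustration only. It shows how a buffer on each section of a train path reduces an initial delay, assuming no new disturbance occurs along the way.

```python
def propagate_delay(initial_delay_s, supplements_s):
    """Return the delay remaining after each section of a train path.

    Illustrative assumption: the supplement time of a section absorbs
    delay one-for-one, and no new disturbance is added along the way.
    """
    remaining = initial_delay_s
    history = []
    for supplement in supplements_s:
        # A delay cannot become negative: an on-time train waits for its slot.
        remaining = max(0, remaining - supplement)
        history.append(remaining)
    return history


# Hypothetical numbers: a 120 s delay over three sections with 30, 45 and 60 s of supplement.
print(propagate_delay(120, [30, 45, 60]))  # [90, 45, 0]
```

In this simplified view, a path is robust to a given delay when the remaining delay reaches zero before the end of the path. Choosing how much supplement to place on each section is the kind of decision an RL agent could be trained to make.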
Description
Supervisor
Roncoli, Claudio
Thesis advisor
Lindbergh, Jakob
Högdahl, Johan
Keywords
railways, reinforcement learning, timetable optimization, train path generation, robust timetabling