Hamilton-Jacobi Reachability Estimation in Reinforcement Learning

Bibliographic Details
Published in: ProQuest Dissertations and Theses (2024)
Main Author: Ganai, Milan
Published: ProQuest Dissertations & Theses
Subjects: Computer science; Computer engineering; Robotics
Online Access: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3071384115
003 UK-CbPIL
020 |a 9798383057117 
035 |a 3071384115 
045 2 |b d20240101  |b d20241231 
084 |a 66569  |2 nlm 
100 1 |a Ganai, Milan 
245 1 |a Hamilton-Jacobi Reachability Estimation in Reinforcement Learning 
260 |b ProQuest Dissertations & Theses  |c 2024 
513 |a Dissertation/Thesis 
520 3 |a Recent literature has proposed approaches that learn control policies with high performance while maintaining safety guarantees. Synthesizing Hamilton-Jacobi (HJ) reachable sets has become an effective tool for verifying safety and supervising the training of reinforcement learning-based control policies for complex, high-dimensional systems. Previously, HJ reachability was limited to verifying low-dimensional dynamical systems – this is because the computational complexity of the dynamic programming approach it relied on grows exponentially with the number of system states. To address this limitation, in recent years, there have been methods that compute the reachability value function simultaneously with learning control policies to scale HJ reachability analysis while still maintaining a reliable estimate of the true reachable set. These HJ reachability approximations are used to improve the safety, and even reward performance, of reinforcement learning (RL) based control policies and can solve challenging tasks such as those with dynamic obstacles and/or with lidar-based or vision-based observations. We first introduce the framework for HJ reachability estimation in reinforcement learning. Then, we review the recent developments in the field of HJ reachability estimation research for reliability in high-dimensional systems. Subsequently, we present a new framework called Reachability Estimation for Safe Policy Optimization that employs HJ reachability estimation for stochastic safety-constrained reinforcement learning and provide safety guarantees and optimal convergence analysis. 
653 |a Computer science 
653 |a Computer engineering 
653 |a Robotics 
773 0 |t ProQuest Dissertations and Theses  |g (2024) 
786 0 |d ProQuest  |t ProQuest Dissertations & Theses Global 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3071384115/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3071384115/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch