RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation

Saved in:
Bibliographic Details
Published in: arXiv.org (Feb 22, 2024), p. n/a
Main Author: Chen, Junting
Other Authors: Yao, Mu; Yu, Qiaojun; Wei, Tianming; Wu, Silang; Yuan, Zhecheng; Liang, Zhixuan; Yang, Chao; Zhang, Kaipeng; Shao, Wenqi; Yu, Qiao; Xu, Huazhe; Ding, Mingyu; Luo, Ping
Published:
Cornell University Library, arXiv.org
Subjects:
Online Access: Citation/Abstract
Full text outside of ProQuest

MARC

LEADER 00000nab a2200000uu 4500
001 2931003411
003 UK-CbPIL
022 |a 2331-8422 
035 |a 2931003411 
045 0 |b d20240222 
100 1 |a Chen, Junting 
245 1 |a RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation 
260 |b Cornell University Library, arXiv.org  |c Feb 22, 2024 
513 |a Working Paper 
520 3 |a Embodied AI has witnessed rapid progress in high-level task planning and code generation for open-world robot manipulation. However, previous studies have put much effort into the general common-sense reasoning and task-planning capabilities of large-scale language or multi-modal models, and relatively little into ensuring that generated code is deployable on real robots or into the other fundamental components of autonomous robot systems, including robot perception, motion planning, and control. To bridge this "ideal-to-real" gap, this paper presents RoboScript, a platform providing 1) a deployable robot manipulation pipeline powered by code generation; and 2) a code generation benchmark for robot manipulation tasks posed in free-form natural language. The RoboScript platform addresses this gap through a unified interface to both simulation and real robots, built on abstractions from the Robot Operating System (ROS), and ensures syntax compliance and simulation validation with Gazebo. We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms, and multiple grippers. Additionally, our benchmark assesses reasoning about physical space and constraints, highlighting the differences between GPT-3.5, GPT-4, and Gemini in handling complex physical interactions. Finally, we present a thorough evaluation of the whole system, exploring how each module in the pipeline (code generation, perception, and motion planning), and even object geometric properties, impact the overall performance of the system. 
653 |a Simulation 
653 |a Task planning (robotics) 
653 |a Robot dynamics 
653 |a Robot arms 
653 |a Reasoning 
653 |a Robots 
653 |a Perception 
653 |a Grippers 
653 |a Multiple robots 
653 |a Artificial intelligence 
653 |a Natural language processing 
653 |a Free form 
653 |a Motion planning 
653 |a Benchmarks 
700 1 |a Yao, Mu 
700 1 |a Yu, Qiaojun 
700 1 |a Wei, Tianming 
700 1 |a Wu, Silang 
700 1 |a Yuan, Zhecheng 
700 1 |a Liang, Zhixuan 
700 1 |a Yang, Chao 
700 1 |a Zhang, Kaipeng 
700 1 |a Shao, Wenqi 
700 1 |a Yu, Qiao 
700 1 |a Xu, Huazhe 
700 1 |a Ding, Mingyu 
700 1 |a Luo, Ping 
773 0 |t arXiv.org  |g (Feb 22, 2024), p. n/a 
786 0 |d ProQuest  |t Engineering Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/2931003411/abstract/embedded/WAQKWGCDE3OCLOOD?source=fedsrch 
856 4 0 |3 Full text outside of ProQuest  |u http://arxiv.org/abs/2402.14623
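
Illustration: the abstract (field 520) describes a pipeline in which generated code drives a ROS-based arm that is validated in Gazebo before deployment. Below is a minimal, hypothetical Python sketch of the kind of snippet such a pipeline might emit for a simple pick step, using the standard moveit_commander API. The planning-group names "manipulator" and "gripper", the named target "close", and all pose values are assumptions for illustration, not taken from the paper.

    import sys
    import rospy
    import moveit_commander
    from geometry_msgs.msg import Pose

    # Initialize the MoveIt commander and a ROS node.
    moveit_commander.roscpp_initialize(sys.argv)
    rospy.init_node("codegen_pick_demo", anonymous=True)

    # Planning groups; "manipulator" / "gripper" are assumed SRDF group names.
    arm = moveit_commander.MoveGroupCommander("manipulator")
    gripper = moveit_commander.MoveGroupCommander("gripper")

    # Hypothetical pre-grasp pose above the target object (values illustrative).
    pre_grasp = Pose()
    pre_grasp.position.x = 0.4
    pre_grasp.position.y = 0.0
    pre_grasp.position.z = 0.25
    pre_grasp.orientation.w = 1.0

    arm.set_pose_target(pre_grasp)
    arm.go(wait=True)          # plan and execute, in simulation or on hardware
    arm.stop()
    arm.clear_pose_targets()

    # Close the gripper; "close" is an assumed named target from the SRDF.
    gripper.set_named_target("close")
    gripper.go(wait=True)

    moveit_commander.roscpp_shutdown()

Because the same MoveIt interface is exposed in Gazebo and on hardware, a snippet like this can be validated in simulation before being run on a Franka or UR5 arm, which is the deployability property the abstract emphasizes.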