RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation

Saved in:
Bibliographic Details
Published in: arXiv.org (Feb 22, 2024), p. n/a
Main Author: Chen, Junting
Other Authors: Yao, Mu; Yu, Qiaojun; Wei, Tianming; Wu, Silang; Yuan, Zhecheng; Liang, Zhixuan; Yang, Chao; Zhang, Kaipeng; Shao, Wenqi; Yu, Qiao; Xu, Huazhe; Ding, Mingyu; Luo, Ping
Published:
Cornell University Library, arXiv.org
Subjects:
Online Access: Citation/Abstract
Full text outside of ProQuest

MARC

LEADER 00000nab a2200000uu 4500
001 2931003411
003 UK-CbPIL
022 |a 2331-8422 
035 |a 2931003411 
045 0 |b d20240222 
100 1 |a Chen, Junting 
245 1 |a RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation 
260 |b Cornell University Library, arXiv.org  |c Feb 22, 2024 
513 |a Working Paper 
520 3 |a Embodied AI has witnessed rapid progress in high-level task planning and code generation for open-world robot manipulation. However, previous studies have put much effort into the general common-sense reasoning and task-planning capabilities of large-scale language or multi-modal models, and relatively little into ensuring that generated code is deployable on real robots or into the other fundamental components of autonomous robot systems, including robot perception, motion planning, and control. To bridge this "ideal-to-real" gap, this paper presents RoboScript, a platform providing 1) a deployable robot manipulation pipeline powered by code generation; and 2) a code generation benchmark for robot manipulation tasks posed in free-form natural language. The RoboScript platform addresses this gap through a unified interface to both simulation and real robots, built on abstractions from the Robot Operating System (ROS), and ensures syntax compliance and simulation validation with Gazebo. We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms, and multiple grippers. Additionally, our benchmark assesses reasoning about physical space and constraints, highlighting the differences between GPT-3.5, GPT-4, and Gemini in handling complex physical interactions. Finally, we present a thorough evaluation of the whole system, exploring how each module in the pipeline (code generation, perception, and motion planning), and even object geometric properties, impact the overall performance of the system. 
653 |a Simulation 
653 |a Task planning (robotics) 
653 |a Robot dynamics 
653 |a Robot arms 
653 |a Reasoning 
653 |a Robots 
653 |a Perception 
653 |a Grippers 
653 |a Multiple robots 
653 |a Artificial intelligence 
653 |a Natural language processing 
653 |a Free form 
653 |a Motion planning 
653 |a Benchmarks 
700 1 |a Yao, Mu 
700 1 |a Yu, Qiaojun 
700 1 |a Wei, Tianming 
700 1 |a Wu, Silang 
700 1 |a Yuan, Zhecheng 
700 1 |a Liang, Zhixuan 
700 1 |a Yang, Chao 
700 1 |a Zhang, Kaipeng 
700 1 |a Shao, Wenqi 
700 1 |a Yu, Qiao 
700 1 |a Xu, Huazhe 
700 1 |a Ding, Mingyu 
700 1 |a Luo, Ping 
773 0 |t arXiv.org  |g (Feb 22, 2024), p. n/a 
786 0 |d ProQuest  |t Engineering Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/2931003411/abstract/embedded/WAQKWGCDE3OCLOOD?source=fedsrch 
856 4 0 |3 Full text outside of ProQuest  |u http://arxiv.org/abs/2402.14623
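
Illustration: the abstract (field 520) describes a pipeline in which generated code drives a ROS-based arm that is validated in Gazebo before deployment. Below is a minimal, hypothetical Python sketch of the kind of snippet such a pipeline might emit for a simple pick step, using the standard moveit_commander API. The planning-group names "manipulator" and "gripper", the named target "close", and all pose values are assumptions for illustration, not taken from the paper.

    import sys
    import rospy
    import moveit_commander
    from geometry_msgs.msg import Pose

    # Initialize the MoveIt commander and a ROS node.
    moveit_commander.roscpp_initialize(sys.argv)
    rospy.init_node("codegen_pick_demo", anonymous=True)

    # Planning groups; "manipulator" / "gripper" are assumed SRDF group names.
    arm = moveit_commander.MoveGroupCommander("manipulator")
    gripper = moveit_commander.MoveGroupCommander("gripper")

    # Hypothetical pre-grasp pose above the target object (values illustrative).
    pre_grasp = Pose()
    pre_grasp.position.x = 0.4
    pre_grasp.position.y = 0.0
    pre_grasp.position.z = 0.25
    pre_grasp.orientation.w = 1.0

    arm.set_pose_target(pre_grasp)
    arm.go(wait=True)          # plan and execute, in simulation or on hardware
    arm.stop()
    arm.clear_pose_targets()

    # Close the gripper; "close" is an assumed named target from the SRDF.
    gripper.set_named_target("close")
    gripper.go(wait=True)

    moveit_commander.roscpp_shutdown()

Because the same MoveIt interface is exposed in Gazebo and on hardware, a snippet like this can be validated in simulation before being run on a Franka or UR5 arm, which is the deployability property the abstract emphasizes.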