A Paragraph is All It Takes: Rich Robot Behaviors from Interacting, Trusted LLMs

Bibliographic details
Published in: arXiv.org (Dec 24, 2024), p. n/a
Main author: OpenMind
Other authors: Zhong, Shaohong; Zhou, Adam; Chen, Boyuan; Luo, Homin; Liphardt, Jan
Published: Cornell University Library, arXiv.org
Subjects:
Online access: Citation/Abstract
Full text outside of ProQuest

MARC

LEADER 00000nab a2200000uu 4500
001 3149109035
003 UK-CbPIL
022 |a 2331-8422 
035 |a 3149109035 
045 0 |b d20241224 
100 1 |a OpenMind 
245 1 |a A Paragraph is All It Takes: Rich Robot Behaviors from Interacting, Trusted LLMs 
260 |b Cornell University Library, arXiv.org  |c Dec 24, 2024 
513 |a Working Paper 
520 3 |a Large Language Models (LLMs) are compact representations of all public knowledge of our physical environment and animal and human behaviors. The application of LLMs to robotics may offer a path to highly capable robots that perform well across most human tasks with limited or even zero tuning. Aside from increasingly sophisticated reasoning and task planning, networks of (suitably designed) LLMs offer ease of upgrading capabilities and allow humans to directly observe the robot's thinking. Here we explore the advantages, limitations, and particularities of using LLMs to control physical robots. The basic system consists of four LLMs communicating via a human language data bus implemented via web sockets and ROS2 message passing. Surprisingly, rich robot behaviors and good performance across different tasks could be achieved despite the robot's data fusion cycle running at only 1Hz and the central data bus running at the extremely limited rates of the human brain, of around 40 bits/s. The use of natural language for inter-LLM communication allowed the robot's reasoning and decision making to be directly observed by humans and made it trivial to bias the system's behavior with sets of rules written in plain English. These rules were immutably written into Ethereum, a global, public, and censorship resistant Turing-complete computer. We suggest that by using natural language as the data bus among interacting AIs, and immutable public ledgers to store behavior constraints, it is possible to build robots that combine unexpectedly rich performance, upgradability, and durable alignment with humans. 
653 |a Robotics 
653 |a Behavior 
653 |a Task planning (robotics) 
653 |a Message passing 
653 |a Large language models 
653 |a Communication 
653 |a Reasoning 
653 |a Human performance 
653 |a Robots 
653 |a Data buses 
653 |a Data integration 
653 |a Knowledge representation 
653 |a Human behavior 
653 |a Robot control 
653 |a Natural language 
700 1 |a Zhong, Shaohong 
700 1 |a Zhou, Adam 
700 1 |a Chen, Boyuan 
700 1 |a Luo, Homin 
700 1 |a Liphardt, Jan 
773 0 |t arXiv.org  |g (Dec 24, 2024), p. n/a 
786 0 |d ProQuest  |t Engineering Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3149109035/abstract/embedded/ITVB7CEANHELVZIZ?source=fedsrch 
856 4 0 |3 Full text outside of ProQuest  |u http://arxiv.org/abs/2412.18588
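
The abstract in field 520 describes modules that exchange plain-English messages over a shared data bus, with behavior rules also written in plain English. The following is a minimal sketch of that idea only, not the authors' implementation: the class names, the two stub modules, and the example rule are all hypothetical, the LLMs are replaced by simple functions, and the web socket, ROS2, and Ethereum layers named in the abstract are left out (a read-only tuple stands in for the immutably stored rules).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Message:
    sender: str  # name of the module that posted the message
    text: str    # plain-English content, readable by a human as-is


@dataclass
class LanguageBus:
    """Hypothetical in-process stand-in for a natural-language data bus."""
    rules: Tuple[str, ...]  # read-only stand-in for rules kept on a ledger
    transcript: List[Message] = field(default_factory=list)
    handlers: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def register(self, name: str, handler: Callable[[str], str]) -> None:
        """Attach a module; handler maps a prompt string to a reply string."""
        self.handlers[name] = handler

    def prompt(self) -> str:
        """Compose the text every module sees: the rules, then the bus history."""
        lines = ["Rules:"] + [f"- {rule}" for rule in self.rules]
        lines += ["Bus:"] + [f"{m.sender}: {m.text}" for m in self.transcript]
        return "\n".join(lines)

    def step(self) -> None:
        """One cycle: each module, in registration order, reads and posts once."""
        for name, handler in self.handlers.items():
            reply = handler(self.prompt())
            self.transcript.append(Message(name, reply))


# Stub modules standing in for LLM calls (assumed roles, for illustration only).
def perception(prompt: str) -> str:
    return "I see a person waving."


def planner(prompt: str) -> str:
    return "Wave back." if "person waving" in prompt else "Stand still."


bus = LanguageBus(rules=("Never move toward stairs.",))
bus.register("perception", perception)
bus.register("planner", planner)
bus.step()

# The transcript is ordinary text, so a human can read the exchange directly.
for message in bus.transcript:
    print(f"{message.sender}: {message.text}")
```

Because both the rules and the messages are ordinary strings, changing the system's behavior in this sketch amounts to editing the `rules` tuple, and inspecting its reasoning amounts to reading `bus.transcript`.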