NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals

Bibliographic Details
Published in:	arXiv.org (Dec 8, 2024), p. n/a
Main Author:	Fiotto-Kaufman, Jaden
Other Authors:	Loftus, Alexander R; Todd, Eric; Brinkmann, Jannik; Pal, Koyena; Troitskii, Dmitrii; Ripa, Michael; Belfki, Adam; Rager, Can; Juang, Caden; Mueller, Aaron; Marks, Samuel; Arnab Sen Sharma; Lucchetti, Francesca; Prakash, Nikhil; Brodley, Carla; Guha, Arjun; Bell, Jonathan; Wallace, Byron C; Bau, David
Published:	Cornell University Library, arXiv.org
Subjects:	Application programming interface; Python; Source code; Neural networks
Online Access:	Citation/Abstract; Full text outside of ProQuest

MARC


LEADER	00000nab a2200000uu 4500
001	3083764303
003	UK-CbPIL
022			\|a 2331-8422
035			\|a 3083764303
045	0		\|b d20241208
100	1		\|a Fiotto-Kaufman, Jaden
245	1		\|a NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals
260			\|b Cornell University Library, arXiv.org \|c Dec 8, 2024
513			\|a Working Paper
520	3		\|a We introduce NNsight and NDIF, technologies that work in tandem to enable scientific study of very large neural networks. NNsight is an open-source system that extends PyTorch to introduce deferred remote execution. NDIF is a scalable inference service that executes NNsight requests, allowing users to share GPU resources and pretrained models. These technologies are enabled by the intervention graph, an architecture developed to decouple experiment design from model runtime. Together, this framework provides transparent and efficient access to the internals of deep neural networks such as very large language models (LLMs) without imposing the cost or complexity of hosting customized models individually. We conduct a quantitative survey of the machine learning literature that reveals a growing gap in the study of the internals of large-scale AI. We demonstrate the design and use of our framework to address this gap by enabling a range of research methods on huge models. Finally, we conduct benchmarks to compare performance with previous approaches. Code, documentation, and materials are available at https://nnsight.net/.
653			\|a Application programming interface
653			\|a Python
653			\|a Source code
653			\|a Neural networks
700	1		\|a Loftus, Alexander R
700	1		\|a Todd, Eric
700	1		\|a Brinkmann, Jannik
700	1		\|a Pal, Koyena
700	1		\|a Troitskii, Dmitrii
700	1		\|a Ripa, Michael
700	1		\|a Belfki, Adam
700	1		\|a Rager, Can
700	1		\|a Juang, Caden
700	1		\|a Mueller, Aaron
700	1		\|a Marks, Samuel
700	1		\|a Arnab Sen Sharma
700	1		\|a Lucchetti, Francesca
700	1		\|a Prakash, Nikhil
700	1		\|a Brodley, Carla
700	1		\|a Guha, Arjun
700	1		\|a Bell, Jonathan
700	1		\|a Wallace, Byron C
700	1		\|a Bau, David
773	0		\|t arXiv.org \|g (Dec 8, 2024), p. n/a
786	0		\|d ProQuest \|t Engineering Database
856	4	1	\|3 Citation/Abstract \|u https://www.proquest.com/docview/3083764303/abstract/embedded/ZKJTFFSVAI7CB62C?source=fedsrch
856	4	0	\|3 Full text outside of ProQuest \|u http://arxiv.org/abs/2407.14561