NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals
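| Published in: | arXiv.org (Dec 8, 2024), p. n/a |
|---|---|
| Main author: | Fiotto-Kaufman, Jaden |
| Other authors: | Loftus, Alexander R; Todd, Eric; Brinkmann, Jannik; Pal, Koyena; Troitskii, Dmitrii; Ripa, Michael; Belfki, Adam; Rager, Can; Juang, Caden; Mueller, Aaron; Marks, Samuel; Arnab Sen Sharma; Lucchetti, Francesca; Prakash, Nikhil; Brodley, Carla; Guha, Arjun; Bell, Jonathan; Wallace, Byron C; Bau, David |
| Published: | Cornell University Library, arXiv.org |
| Subjects: | Application programming interface; Python; Source code; Neural networks |
| Online access: | Citation/Abstract; Full text outside of ProQuest |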
MARC
| Tag | Ind1 | Ind2 | Content |
|---|---|---|---|
| LEADER | | | 00000nab a2200000uu 4500 |
| 001 | | | 3083764303 |
| 003 | | | UK-CbPIL |
| 022 | | | $a 2331-8422 |
| 035 | | | $a 3083764303 |
| 045 | 0 | | $b d20241208 |
| 100 | 1 | | $a Fiotto-Kaufman, Jaden |
| 245 | 1 | | $a NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals |
| 260 | | | $b Cornell University Library, arXiv.org $c Dec 8, 2024 |
| 513 | | | $a Working Paper |
| 520 | 3 | | $a We introduce NNsight and NDIF, technologies that work in tandem to enable scientific study of very large neural networks. NNsight is an open-source system that extends PyTorch to introduce deferred remote execution. NDIF is a scalable inference service that executes NNsight requests, allowing users to share GPU resources and pretrained models. These technologies are enabled by the intervention graph, an architecture developed to decouple experiment design from model runtime. Together, this framework provides transparent and efficient access to the internals of deep neural networks such as very large language models (LLMs) without imposing the cost or complexity of hosting customized models individually. We conduct a quantitative survey of the machine learning literature that reveals a growing gap in the study of the internals of large-scale AI. We demonstrate the design and use of our framework to address this gap by enabling a range of research methods on huge models. Finally, we conduct benchmarks to compare performance with previous approaches. Code, documentation, and materials are available at https://nnsight.net/. |
| 653 | | | $a Application programming interface |
| 653 | | | $a Python |
| 653 | | | $a Source code |
| 653 | | | $a Neural networks |
| 700 | 1 | | $a Loftus, Alexander R |
| 700 | 1 | | $a Todd, Eric |
| 700 | 1 | | $a Brinkmann, Jannik |
| 700 | 1 | | $a Pal, Koyena |
| 700 | 1 | | $a Troitskii, Dmitrii |
| 700 | 1 | | $a Ripa, Michael |
| 700 | 1 | | $a Belfki, Adam |
| 700 | 1 | | $a Rager, Can |
| 700 | 1 | | $a Juang, Caden |
| 700 | 1 | | $a Mueller, Aaron |
| 700 | 1 | | $a Marks, Samuel |
| 700 | 1 | | $a Arnab Sen Sharma |
| 700 | 1 | | $a Lucchetti, Francesca |
| 700 | 1 | | $a Prakash, Nikhil |
| 700 | 1 | | $a Brodley, Carla |
| 700 | 1 | | $a Guha, Arjun |
| 700 | 1 | | $a Bell, Jonathan |
| 700 | 1 | | $a Wallace, Byron C |
| 700 | 1 | | $a Bau, David |
| 773 | 0 | | $t arXiv.org $g (Dec 8, 2024), p. n/a |
| 786 | 0 | | $d ProQuest $t Engineering Database |
| 856 | 4 | 1 | $3 Citation/Abstract $u https://www.proquest.com/docview/3083764303/abstract/embedded/ZKJTFFSVAI7CB62C?source=fedsrch |
| 856 | 4 | 0 | $3 Full text outside of ProQuest $u http://arxiv.org/abs/2407.14561 |
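
The abstract in field 520 describes an intervention graph that decouples experiment design from model runtime. The toy sketch below illustrates that general idea only, in plain Python. It is not NNsight's API: the `InterventionGraph` class, the `execute` function, and the two-layer `toy_model` are all hypothetical names invented for this illustration. The graph merely records which layer outputs to edit or save; a separate runtime step, standing in for a local or remote executor, applies those recorded operations while the model runs.

```python
from typing import Callable, Dict, List, Tuple

Layer = Callable[[float], float]


class InterventionGraph:
    """Hypothetical toy class: records interventions, runs nothing itself."""

    def __init__(self) -> None:
        self.edits: Dict[str, List[Callable[[float], float]]] = {}
        self.saves: List[str] = []

    def edit(self, layer: str, fn: Callable[[float], float]) -> None:
        # Defer a transformation of the named layer's output.
        self.edits.setdefault(layer, []).append(fn)

    def save(self, layer: str) -> None:
        # Defer a request to return the named layer's output.
        self.saves.append(layer)


def execute(
    model: List[Tuple[str, Layer]], graph: InterventionGraph, x: float
) -> Dict[str, float]:
    """Stand-in runtime: run the model and apply the recorded graph."""
    saved: Dict[str, float] = {}
    for name, layer in model:
        x = layer(x)
        for fn in graph.edits.get(name, []):
            x = fn(x)  # apply deferred edits to this layer's output
        if name in graph.saves:
            saved[name] = x
    return saved


# A two-layer stand-in for a network (assumption for illustration).
toy_model: List[Tuple[str, Layer]] = [
    ("layer0", lambda v: v + 1.0),
    ("layer1", lambda v: v * 3.0),
]

# Experiment design: no model code runs while the graph is built.
graph = InterventionGraph()
graph.edit("layer0", lambda v: v * 2.0)  # scale layer0's output
graph.save("layer0")
graph.save("layer1")

# Model runtime: the executor owns the model and applies the graph.
result = execute(toy_model, graph, x=2.0)
print(result)
```

Because the graph is plain data plus callables, the same recorded experiment could in principle be handed to a different executor without changing how it was written, which is the separation the abstract attributes to the intervention graph.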