SCITUNE: Aligning Large Language Models with Scientific Multimodal Instructions

Published: arXiv.org (Jul 3, 2023), p. n/a
Author: Horawalavithana, Sameera
Other Authors: Munikoti, Sai; Stewart, Ian; Kvinge, Henry
Publication Info: Cornell University Library, arXiv.org
Subjects: Tuning; Large language models; Coders; Human performance
Online Access: Citation/Abstract
Full text outside of ProQuest

MARC

LEADER 00000nab a2200000uu 4500
001 2832891468
003 UK-CbPIL
022 |a 2331-8422 
035 |a 2832891468 
045 0 |b d20230703 
100 1 |a Horawalavithana, Sameera 
245 1 |a SCITUNE: Aligning Large Language Models with Scientific Multimodal Instructions 
260 |b Cornell University Library, arXiv.org  |c Jul 3, 2023 
513 |a Working Paper 
520 3 |a Instruction finetuning is a popular paradigm for aligning large language models (LLMs) with human intent. Despite its popularity, this idea is less explored as a way of aligning LLMs and existing foundation models with scientific disciplines, concepts, and goals. In this work, we present SciTune, a tuning framework that improves the ability of LLMs to follow scientific multimodal instructions. To test our methodology, we use a human-generated scientific instruction tuning dataset and train a large multimodal model, LLaMA-SciTune, which connects a vision encoder and an LLM for science-focused visual and language understanding. In comparison to models finetuned with machine-generated data only, LLaMA-SciTune surpasses human performance on average and in many sub-categories on the ScienceQA benchmark. 
653 |a Tuning 
653 |a Large language models 
653 |a Coders 
653 |a Human performance 
700 1 |a Munikoti, Sai 
700 1 |a Stewart, Ian 
700 1 |a Kvinge, Henry 
773 0 |t arXiv.org  |g (Jul 3, 2023), p. n/a 
786 0 |d ProQuest  |t Engineering Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/2832891468/abstract/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch 
856 4 0 |3 Full text outside of ProQuest  |u http://arxiv.org/abs/2307.01139
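
Note: Field 520 above states that LLaMA-SciTune connects a vision encoder to an LLM for science-focused visual and language understanding. This record does not describe the architecture itself; the sketch below is a minimal, assumed LLaVA-style connector (a single linear projection from vision features into the LLM embedding space), written in plain PyTorch for illustration only. The dimensions, module names, and single-layer design are assumptions, not the authors' implementation.

# Illustrative sketch only: an assumed LLaVA-style bridge between a frozen
# vision encoder and a causal LLM, of the general kind the abstract mentions.
import torch
import torch.nn as nn

class VisionToLLMBridge(nn.Module):
    """Project vision-encoder features into the LLM embedding space and
    prepend them to the text token embeddings (assumed design)."""

    def __init__(self, vision_dim=1024, llm_dim=4096):
        super().__init__()
        # A single linear projection; the actual paper may use a different connector.
        self.proj = nn.Linear(vision_dim, llm_dim)

    def forward(self, vision_feats, text_embeds):
        # vision_feats: (batch, n_patches, vision_dim) from a vision encoder
        # text_embeds:  (batch, n_tokens, llm_dim) from the LLM's embedding table
        visual_tokens = self.proj(vision_feats)
        # The concatenated sequence would then be fed to the LLM during
        # multimodal instruction tuning.
        return torch.cat([visual_tokens, text_embeds], dim=1)

if __name__ == "__main__":
    bridge = VisionToLLMBridge()
    vision_feats = torch.randn(1, 256, 1024)   # dummy image features
    text_embeds = torch.randn(1, 32, 4096)     # dummy text embeddings
    print(bridge(vision_feats, text_embeds).shape)  # (1, 288, 4096)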