SciDaSynth: Interactive Structured Data Extraction From Scientific Literature With Large Language Model

Bibliographic Details
Published in: Campbell Systematic Reviews vol. 21, no. 4 (Dec 1, 2025)
Main Author: Wang, Xingbo
Other Authors: Huey, Samantha L., Sheng, Rui, Mehta, Saurabh, Wang, Fei
Published:
John Wiley & Sons, Inc.
Online Access: Citation/Abstract
Full Text
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3268125478
003 UK-CbPIL
022 |a 1891-1803 
024 7 |a 10.1002/cl2.70073  |2 doi 
035 |a 3268125478 
045 0 |b d20251201 
084 |a 265680  |2 nlm 
100 1 |a Wang, Xingbo  |u Present Address: Bosch Research North America & Bosch Center for Artificial Intelligence (BCAI), Sunnyvale, California, USA 
245 1 |a SciDaSynth: Interactive Structured Data Extraction From Scientific Literature With Large Language Model 
260 |b John Wiley & Sons, Inc.  |c Dec 1, 2025 
513 |a Journal Article 
520 3 |a ABSTRACT The explosion of scientific literature has made the efficient and accurate extraction of structured data a critical component for advancing scientific knowledge and supporting evidence‐based decision‐making. However, existing tools often struggle to extract and structure multimodal, varied, and inconsistent information across documents into standardized formats. We introduce SciDaSynth, a novel interactive system powered by large language models that automatically generates structured data tables according to users' queries by integrating information from diverse sources, including text, tables, and figures. Furthermore, SciDaSynth supports efficient table data validation and refinement, featuring multi‐faceted visual summaries and semantic grouping capabilities to resolve cross‐document data inconsistencies. A within‐subjects study with nutrition and NLP researchers demonstrates SciDaSynth's effectiveness in producing high‐quality structured data more efficiently than baseline methods. We discuss design implications for human–AI collaborative systems supporting data extraction tasks. 
610 4 |a World Health Organization 
653 |a Accuracy 
653 |a Extraction 
653 |a Models 
653 |a Novels 
653 |a Nutrition 
653 |a Adaptation 
653 |a Data quality 
653 |a Documents 
653 |a COVID-19 
653 |a Natural language 
653 |a Semantics 
653 |a Materials science 
653 |a Decision making 
653 |a Literary criticism 
653 |a Scientific knowledge 
653 |a Flexibility 
653 |a Medical research 
653 |a Researcher subject relations 
653 |a Data 
653 |a Language modeling 
653 |a Large language models 
653 |a Tables (Data) 
653 |a Prompting 
653 |a Researchers 
653 |a Meta Analysis 
653 |a Comprehension 
653 |a Information Needs 
653 |a Reference Materials 
653 |a Natural Language Processing 
653 |a Scientific Research 
653 |a Information Seeking 
653 |a Artificial Intelligence 
653 |a Evidence Based Practice 
653 |a Science Materials 
653 |a Language Processing 
653 |a Algorithms 
700 1 |a Huey, Samantha L.  |u Cornell Joan Klein Jacobs Center for Precision Nutrition and Health, Cornell University, Ithaca, New York, USA 
700 1 |a Sheng, Rui  |u Hong Kong University of Science and Technology, Hong Kong, Hong Kong 
700 1 |a Mehta, Saurabh  |u Weill Cornell Medicine, Cornell University, New York, New York, USA 
700 1 |a Wang, Fei  |u Weill Cornell Medicine, Cornell University, New York, New York, USA 
773 0 |t Campbell Systematic Reviews  |g vol. 21, no. 4 (Dec 1, 2025) 
786 0 |d ProQuest  |t Sociology Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3268125478/abstract/embedded/75I98GEZK8WCJMPQ?source=fedsrch 
856 4 0 |3 Full Text  |u https://www.proquest.com/docview/3268125478/fulltext/embedded/75I98GEZK8WCJMPQ?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3268125478/fulltextPDF/embedded/75I98GEZK8WCJMPQ?source=fedsrch