From concept to manufacturing: evaluating vision-language models for engineering design

Bibliographic Details
Published in: The Artificial Intelligence Review vol. 58, no. 9 (Sep 2025), p. 288
Main Author: Picard, Cyril
Other Authors: Edwards, Kristen M.; Doris, Anna C.; Man, Brandon; Giannone, Giorgio; Alam, Md Ferdous; Ahmed, Faez
Published:
Springer Nature B.V.
Subjects:
Online Access: Citation/Abstract
Full Text
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3226019174
003 UK-CbPIL
022 |a 0269-2821 
022 |a 1573-7462 
024 7 |a 10.1007/s10462-025-11290-y  |2 doi 
035 |a 3226019174 
045 2 |b d20250901  |b d20250930 
084 |a 68693  |2 nlm 
100 1 |a Picard, Cyril  |u Massachusetts Institute of Technology, Department of Mechanical Engineering, Cambridge, USA (GRID:grid.116068.8) (ISNI:0000 0001 2341 2786) 
245 1 |a From concept to manufacturing: evaluating vision-language models for engineering design 
260 |b Springer Nature B.V.  |c Sep 2025 
513 |a Journal Article 
520 3 |a Engineering design is undergoing a transformative shift with the advent of AI, marking a new era in how we approach product, system, and service planning. Large language models have demonstrated impressive capabilities in enabling this shift. Yet, with text as their only input modality, they cannot leverage the large body of visual artifacts that engineers have used for centuries and are accustomed to. This gap is addressed with the release of multimodal vision-language models (VLMs), such as GPT-4V, enabling AI to impact many more types of tasks. Our work presents a comprehensive evaluation of VLMs across a spectrum of engineering design tasks, categorized into four main areas: Conceptual Design, System-Level and Detailed Design, Manufacturing and Inspection, and Engineering Education Tasks. Specifically in this paper, we assess the capabilities of two VLMs, GPT-4V and LLaVA 1.6 34B, in design tasks such as sketch similarity analysis, CAD generation, topology optimization, manufacturability assessment, and engineering textbook problems. Through this structured evaluation, we not only explore VLMs’ proficiency in handling complex design challenges but also identify their limitations in complex engineering design applications. Our research establishes a foundation for future assessments of vision language models. It also contributes a set of benchmark testing datasets, with more than 1000 queries, for ongoing advancements and applications in this field. 
610 4 |a OpenAI 
653 |a Language 
653 |a Vision 
653 |a Datasets 
653 |a Engineering drawings 
653 |a Large language models 
653 |a Design engineering 
653 |a Engineering education 
653 |a Optimization 
653 |a Medical research 
653 |a Benchmarks 
653 |a Natural language processing 
653 |a Manufacturability 
653 |a Manufacturing 
653 |a Automation 
653 |a Conceptual design 
653 |a Topology optimization 
653 |a Cognition & reasoning 
653 |a Skills 
653 |a Artifacts 
653 |a Models 
653 |a Engineering 
653 |a Competence 
653 |a Research design 
653 |a Tasks 
653 |a Language planning 
653 |a Language shift 
653 |a Language modeling 
653 |a Research applications 
653 |a Educational systems 
653 |a Evaluation 
700 1 |a Edwards, Kristen M.  |u Massachusetts Institute of Technology, Department of Mechanical Engineering, Cambridge, USA (GRID:grid.116068.8) (ISNI:0000 0001 2341 2786) 
700 1 |a Doris, Anna C.  |u Massachusetts Institute of Technology, Department of Mechanical Engineering, Cambridge, USA (GRID:grid.116068.8) (ISNI:0000 0001 2341 2786) 
700 1 |a Man, Brandon  |u Massachusetts Institute of Technology, Department of Mechanical Engineering, Cambridge, USA (GRID:grid.116068.8) (ISNI:0000 0001 2341 2786) 
700 1 |a Giannone, Giorgio  |u Massachusetts Institute of Technology, Department of Mechanical Engineering, Cambridge, USA (GRID:grid.116068.8) (ISNI:0000 0001 2341 2786); Technical University of Denmark, Department of Applied Mathematics and Computer Science, Lyngby, Denmark (GRID:grid.5170.3) (ISNI:0000 0001 2181 8870) 
700 1 |a Alam, Md Ferdous  |u Massachusetts Institute of Technology, Department of Mechanical Engineering, Cambridge, USA (GRID:grid.116068.8) (ISNI:0000 0001 2341 2786) 
700 1 |a Ahmed, Faez  |u Massachusetts Institute of Technology, Department of Mechanical Engineering, Cambridge, USA (GRID:grid.116068.8) (ISNI:0000 0001 2341 2786) 
773 0 |t The Artificial Intelligence Review  |g vol. 58, no. 9 (Sep 2025), p. 288 
786 0 |d ProQuest  |t ABI/INFORM Global 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3226019174/abstract/embedded/6A8EOT78XXH2IG52?source=fedsrch 
856 4 0 |3 Full Text  |u https://www.proquest.com/docview/3226019174/fulltext/embedded/6A8EOT78XXH2IG52?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3226019174/fulltextPDF/embedded/6A8EOT78XXH2IG52?source=fedsrch