A New Large Language Model for Attribute Extraction in E-Commerce Product Categorization

Bibliographic Details
Published in: Electronics vol. 14, no. 10 (2025), p. 1930
Main Author: Serhan, Çiftlikçi Mehmet
Other Authors: Çakmak Yusuf, Kalaycı, Tolga Ahmet, Abut Fatih, Akay, Mehmet Fatih, Kızıldağ Mehmet
Published:
MDPI AG
Subjects:
Links: Citation/Abstract
Full Text + Graphics
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3211937582
003 UK-CbPIL
022 |a 2079-9292 
024 7 |a 10.3390/electronics14101930  |2 doi 
035 |a 3211937582 
045 2 |b d20250101  |b d20251231 
084 |a 231458  |2 nlm 
100 1 |a Serhan, Çiftlikçi Mehmet  |u Department of Data Science, Trendyol, Istanbul 34485, Turkey 
245 1 |a A New Large Language Model for Attribute Extraction in E-Commerce Product Categorization 
260 |b MDPI AG  |c 2025 
513 |a Journal Article 
520 3 |a In the rapidly evolving field of e-commerce, precise and efficient attribute extraction from product descriptions is crucial for enhancing search functionality, improving customer experience, and streamlining the listing process for sellers. This study proposes a large language model (LLM)-based approach for automated attribute extraction on Trendyol’s e-commerce platform. For comparison purposes, a deep learning (DL) model is also developed, leveraging a transformer-based architecture to efficiently identify explicit attributes. In contrast, the LLM, built on the Mistral architecture, demonstrates superior contextual understanding, enabling the extraction of both explicit and implicit attributes from unstructured text. The models are evaluated on an extensive dataset derived from Trendyol’s Turkish-language product catalog, using performance metrics such as precision, recall, and F1-score. Results indicate that the proposed LLM outperforms the DL model across most metrics, demonstrating superiority not only in direct single-model comparisons but also in average performance across all evaluated categories. This advantage is particularly evident in handling complex linguistic structures and diverse product descriptions. The system has been integrated into Trendyol’s platform with a scalable backend infrastructure, employing Kubernetes and Nvidia Triton Inference Server for efficient bulk processing and real-time attribute suggestions during the product listing process. This study not only advances attribute extraction for Turkish-language e-commerce but also provides a scalable and efficient NLP-based solution applicable to large-scale marketplaces. The findings offer critical insights into the trade-offs between accuracy and computational efficiency in large-scale multilingual NLP applications, contributing to the broader field of automated product classification and information retrieval in e-commerce ecosystems. 
653 |a Language 
653 |a Descriptions 
653 |a Accuracy 
653 |a Performance measurement 
653 |a Datasets 
653 |a Performance evaluation 
653 |a Large language models 
653 |a Information retrieval 
653 |a Data collection 
653 |a Product lines 
653 |a Natural language processing 
653 |a Multilingualism 
653 |a Linguistics 
653 |a Unstructured data 
653 |a Electronic commerce 
653 |a Batch processing 
653 |a Automation 
653 |a Machine learning 
653 |a Real time 
653 |a Semantics 
653 |a Comparative analysis 
700 1 |a Çakmak Yusuf  |u Department of Data Science, Trendyol, Istanbul 34485, Turkey 
700 1 |a Kalaycı, Tolga Ahmet  |u Department of Data Science, Trendyol, Istanbul 34485, Turkey 
700 1 |a Abut Fatih  |u Department of Computer Engineering, Faculty of Engineering, Çukurova University, Adana 01250, Turkey 
700 1 |a Akay, Mehmet Fatih  |u Department of Computer Engineering, Faculty of Engineering, Çukurova University, Adana 01250, Turkey 
700 1 |a Kızıldağ Mehmet  |u BADEM Bilgi Sistemleri Danışmanlık Sağlık Hizm. Tic. Ltd. Şti, Adana 01250, Turkey 
773 0 |t Electronics  |g vol. 14, no. 10 (2025), p. 1930 
786 0 |d ProQuest  |t Advanced Technologies & Aerospace Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3211937582/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text + Graphics  |u https://www.proquest.com/docview/3211937582/fulltextwithgraphics/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3211937582/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch