A Content Based E-Commerce Dataset Recommendation System Using BERT and Named Entity Recognition

Bibliographic Details
Published in: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Conference Proceedings (2025), p. 704-711
Main Author: Oduba, Ayomide E
Other Authors: Ezeife, C I; Nasir, Mahreen
Published: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Online Access: Citation/Abstract
Description
Conference Title: 2025 IEEE/ACIS 29th International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD)
Conference Start Date: 2025 June 25
Conference End Date: 2025 June 27
Conference Location: Busan, Korea, Republic of
Abstract: Existing dataset recommendation (rec) systems, including ZhangRec23, WangRec22, and GDS19, face several limitations: a lack of focus on e-commerce datasets, an inability to address complex queries, and reliance on inconsistent metadata (e.g., the data structure of the domain of the products being recommended). As a result, they return incomplete or mismatched results for complex query searches such as "impact of seasonal sales on customer reviews for electronics". These traditional dataset rec systems rely on simple keyword matching, so they fail to interpret the context-sensitive queries that researchers often need and cannot capture dynamic trends in the e-commerce domain. This highlights the need for an advanced dataset rec system that improves metadata quality and integrates semantic understanding to recommend precise and relevant e-commerce datasets to researchers. This paper proposes an E-commerce Datasets Mining Rec System (EDMRec), an adaptation of the ZhangRec23 approach. EDMRec combines content-based filtering, advanced metadata processing, and a machine learning approach in a three-layered structure involving (i) Data Collection, (ii) Data Processing, and (iii) Query Processing. It uses Named Entity Recognition (NER) to complete metadata and combines TF-IDF with Bidirectional Encoder Representations from Transformers (BERT) embeddings to capture both keyword relevance and semantic context, enhancing recommendation precision for complex queries. Experimental results show that EDMRec improves precision, recall, and F1 score by 15% over existing systems, consistently providing contextually accurate recommendations across 4,373 metadata entries from sources such as Kaggle and Google Dataset Search, making it well-suited for supporting data-driven insights in e-commerce.
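Illustrative note: the abstract describes combining TF-IDF keyword relevance with BERT embeddings for semantic context, but it does not say how the two signals are merged. The sketch below shows one common way to do it, a weighted sum of two cosine similarities, and should not be read as the authors' method. The function names, the weight alpha, the sample dataset descriptions, and the three-dimensional toy vectors standing in for real BERT embeddings are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would normalize far more.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF dict per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF so that no term receives a zero or negative weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = [
        {t: f * idf[t] for t, f in Counter(toks).items()} for toks in tokenized
    ]
    return vectors, idf


def sparse_cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def dense_cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def hybrid_rank(query, docs, query_emb, doc_embs, alpha=0.5):
    """Rank docs by alpha * keyword score + (1 - alpha) * semantic score.

    query_emb and doc_embs are assumed to be precomputed sentence
    embeddings (e.g. from a BERT encoder); alpha is a hypothetical weight.
    """
    doc_vecs, idf = tfidf_vectors(docs)
    # Query terms unseen in the corpus carry no keyword evidence.
    q_vec = {t: f * idf[t] for t, f in Counter(tokenize(query)).items() if t in idf}
    scored = []
    for i, (d_vec, d_emb) in enumerate(zip(doc_vecs, doc_embs)):
        keyword = sparse_cosine(q_vec, d_vec)
        semantic = dense_cosine(query_emb, d_emb)
        scored.append((alpha * keyword + (1 - alpha) * semantic, i))
    return sorted(scored, reverse=True)


# Hypothetical dataset descriptions and toy embeddings (not real BERT output).
docs = [
    "electronics customer reviews with seasonal sales flags",
    "grocery store transactions",
    "fashion product images catalog",
]
doc_embs = [[0.8, 0.2, 0.1], [0.1, 0.9, 0.0], [0.2, 0.1, 0.9]]
query = "seasonal sales electronics reviews"
query_emb = [0.9, 0.1, 0.0]

ranked = hybrid_rank(query, docs, query_emb, doc_embs)
for score, idx in ranked:
    print(f"{score:.3f}  {docs[idx]}")
```

With alpha = 1.0 the ranking reduces to pure keyword matching, which is the behavior the abstract criticizes in earlier systems; lowering alpha lets the embedding similarity surface datasets whose descriptions share meaning but not vocabulary with the query.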
DOI:10.1109/SNPD65828.2025.11253422
Source: Science Database