Multilabel Classification of Bilingual Patents Using OneVsRestClassifier: A Semiautomated Approach

Bibliographic Details
Published in: International Journal of Advanced Computer Science and Applications vol. 16, no. 1 (2025)
Main Author: PDF
Published: Science and Information (SAI) Organization Limited
Subjects:
Online Access: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3168740464
003 UK-CbPIL
022 |a 2158-107X 
022 |a 2156-5570 
024 7 |a 10.14569/IJACSA.2025.01601106  |2 doi 
035 |a 3168740464 
045 2 |b d20250101  |b d20251231 
100 1 |a PDF 
245 1 |a Multilabel Classification of Bilingual Patents Using OneVsRestClassifier: A Semiautomated Approach 
260 |b Science and Information (SAI) Organization Limited  |c 2025 
513 |a Journal Article 
520 3 |a In response to the increasing complexity and volume of patent applications, this research introduces a semiautomated system to streamline the literature review process for Indonesian patent data. The proposed system employs a synthesis of multilabel classification techniques based on natural language processing (NLP) algorithms. This methodology focuses on developing an iterative and modular system, with each step visualised in detailed flowcharts. The system design incorporates data collection and preprocessing, multilabel classification model development, model optimisation, query and prediction, and results presentation modules. Experimental results demonstrate the promising potential of the multilabel classification model, achieving a micro F1 score of 0.6723 and a macro F1 score of 0.6009. The OneVsRestClassifier model with LinearSVC as the base classifier shows reasonably good performance in handling a bilingual dataset comprising 15,097 patent documents. The optimal model configuration uses TfidfVectorizer with 20,000 features, including bigrams, and an optimal C parameter of 0.1 for LinearSVC. Performance analysis reveals variations across IPC classes, indicating areas for further improvement. The discussion highlights the implications of the proposed system for researchers, patent examiners and industry professionals by facilitating efficient searches within patent databases. This study acknowledges the potential of semiautomated systems to enhance the efficiency of patent analysis while emphasising the need for further research to address identified challenges, such as class imbalance and performance variations across patent categories. This research paves the way for further developments in the field of automated patent classification, aiming to improve efficiency and accuracy in international patent systems while recognising the crucial role of human experts in the patent classification process. 
651 4 |a Indonesia 
653 |a Parameter identification 
653 |a Configuration management 
653 |a Classification 
653 |a Modular systems 
653 |a Patent applications 
653 |a Systems design 
653 |a Algorithms 
653 |a Industrial development 
653 |a Design optimization 
653 |a Natural language processing 
653 |a Data collection 
653 |a Literature reviews 
653 |a Language 
653 |a Accuracy 
653 |a Datasets 
653 |a Computer science 
653 |a Automation 
653 |a Research & development--R&D 
653 |a Strategic planning 
653 |a Efficiency 
653 |a Innovations 
653 |a Machine learning 
653 |a Artificial intelligence 
653 |a Intellectual property 
653 |a Computer engineering 
653 |a Databases 
653 |a Human error 
653 |a Bilingualism 
773 0 |t International Journal of Advanced Computer Science and Applications  |g vol. 16, no. 1 (2025) 
786 0 |d ProQuest  |t Advanced Technologies & Aerospace Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3168740464/abstract/embedded/6A8EOT78XXH2IG52?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3168740464/fulltextPDF/embedded/6A8EOT78XXH2IG52?source=fedsrch
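
The configuration named in the abstract (field 520) can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' code: the toy texts and IPC-style labels are hypothetical, and reading "20,000 features, including bigrams" as max_features=20000 with ngram_range=(1, 2) is an assumption. Only the classifier choice (OneVsRestClassifier over LinearSVC), the feature count, and C=0.1 come from the abstract.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

# Hypothetical toy corpus mixing English and Indonesian snippets;
# it stands in for the bilingual patent data and is not from the paper.
texts = [
    "a method for treating water using a membrane filter",
    "metode pengolahan air dengan filter membran",
    "a battery cell with a lithium electrode",
    "sel baterai dengan elektroda litium",
    "a membrane electrode for a lithium battery",
    "filter air dengan elektroda",
]
# Hypothetical IPC-style label sets, one list per document.
labels = [
    ["C02F"],
    ["C02F"],
    ["H01M"],
    ["H01M"],
    ["H01M", "B01D"],
    ["C02F", "B01D"],
]

# Turn the label lists into a binary indicator matrix (one column per class).
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

# TF-IDF over unigrams and bigrams, capped at 20,000 features (assumed
# interpretation), followed by one LinearSVC per label with C=0.1.
model = make_pipeline(
    TfidfVectorizer(max_features=20000, ngram_range=(1, 2)),
    OneVsRestClassifier(LinearSVC(C=0.1)),
)
model.fit(texts, Y)

# Predictions come back as an indicator matrix with the same shape as Y.
pred = model.predict(texts)

# Micro F1 pools all label decisions; macro F1 averages per-class F1 scores,
# so rare classes weigh as much as frequent ones.
micro = f1_score(Y, pred, average="micro", zero_division=0)
macro = f1_score(Y, pred, average="macro", zero_division=0)
print(f"micro F1 = {micro:.4f}, macro F1 = {macro:.4f}")
```

The scores printed here are computed on the toy training texts and say nothing about the reported results (micro F1 0.6723, macro F1 0.6009). A gap between micro and macro F1 like the one reported is consistent with the class imbalance the abstract mentions, since macro averaging gives poorly performing rare classes equal weight.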