Multilabel Classification of Bilingual Patents Using OneVsRestClassifier: A Semiautomated Approach

Guardado en:
Bibliografiske detaljer
Udgivet i:International Journal of Advanced Computer Science and Applications vol. 16, no. 1 (2025)
Hovedforfatter: PDF
Udgivet:
Science and Information (SAI) Organization Limited
Fag:
Online adgang:Citation/Abstract
Full Text - PDF
Tags: Tilføj Tag
Ingen Tags, Vær først til at tagge denne postø!
Beskrivelse
Resumen:In response to the increasing complexity and volume of patent applications, this research introduces a semiautomated system to streamline the literature review process for Indonesian patent data. The proposed system employs a synthesis of multilabel classification techniques based on natural language processing (NLP) algorithms. This methodology focuses on developing an iterative and modular system, with each step visualised in detailed flowcharts. The system design incorporates data collection and preprocessing, multilabel classification model development, model optimisation, query and prediction, and results presentation modules. Experimental results demonstrate the promising potential of the multilabel classification model, achieving a micro F1 score of 0.6723 and a macro F1 score of 0.6009. The OneVsRestClassifier model with LinearSVC as the base classifier shows reasonably good performance in handling a bilingual dataset comprising 15,097 patent documents. The optimal model configuration uses TfidfVectorizer with 20,000 features, including bigrams, and an optimal C parameter of 0.1 for LinearSVC. Performance analysis reveals variations across IPC classes, indicating areas for further improvement. The discussion highlights the implications of the proposed system for researchers, patent examiners and industry professionals by facilitating efficient searches within patent databases. This study acknowledges the potential of semiautomated systems to enhance the efficiency of patent analysis while emphasising the need for further research to address identified challenges, such as class imbalance and performance variations across patent categories. This research paves the way for further developments in the field of automated patent classification, aiming to improve efficiency and accuracy in international patent systems while recognising the crucial role of human experts in the patent classification process.
ISSN:2158-107X
2156-5570
DOI:10.14569/IJACSA.2025.01601106
Fuente:Advanced Technologies & Aerospace Database