Machine Learning-Based Approach for Classifying the Source Code Using Programming Keywords

Bibliographic Details
Published in: IUP Journal of Information Technology, vol. 18, no. 1 (Mar 2022), p. 7
Main author: Ifham, Mohamed
Other authors: Kumara, B T G S; Banujan, Kuhaneswaran
Published:
IUP Publications
Subjects:
Online access: Citation/Abstract
Full Text
Full Text - PDF
Description
Abstract: The implementation phase is one of the most critical periods in software development. Developers either write new source code or reuse functionality from existing source code, according to the requirements of the system. Most developers spend more time searching and navigating existing source code than writing new code, so an efficient method for quickly locating source code functionality is essential. Topic modeling of source code is an approach used to extract topics from source code. Many topic modeling approaches have been implemented using statistical techniques, which have several drawbacks: their results rely on informal code elements such as identifier names and comments. Our novel approach uses a machine-learning algorithm to address these issues, so that the identified functionality depends only on the algorithm or the syntax of the source code. Three Java functionalities (prime number, Fibonacci number, and selection sort) were evaluated in this study. A Java parser library is used to derive the source code elements, and an algorithm builds a count matrix of the source code features. The dataset was then fed to three models: Artificial Neural Network (ANN), Random Forest (RF), and an Ensemble Approach. The Ensemble Approach achieved 96.7% accuracy, surpassing both ANN and RF.
ISSN:0973-2896
Source: Advanced Technologies & Aerospace Database
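
The count-matrix step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the study uses a Java parser library, whereas this sketch substitutes a simple regular-expression tokenizer. The keyword list, the function names `keyword_counts` and `count_matrix`, and the two sample snippets are all assumptions made for illustration. Comments and string literals are stripped before counting, so the counts do not depend on informal elements.

```python
import re
from collections import Counter

# Assumption: a small illustrative subset of Java keywords. The abstract
# does not list the features actually used in the study.
KEYWORDS = ["for", "while", "if", "else", "return", "int", "boolean", "break"]

# Block comments, line comments, and double-quoted string literals.
# A regex stand-in for a real Java parser; char literals are not handled.
_NOISE = re.compile(r'/\*.*?\*/|//[^\n]*|"(?:\\.|[^"\\])*"', re.DOTALL)
_WORD = re.compile(r"[A-Za-z_]\w*")


def keyword_counts(source):
    """Return one row of the count matrix: occurrences of each keyword."""
    code = _NOISE.sub(" ", source)          # drop comments and strings
    counts = Counter(_WORD.findall(code))   # count every identifier-like token
    return [counts[k] for k in KEYWORDS]


def count_matrix(sources):
    """Build the count matrix: one row per source file, one column per keyword."""
    return [keyword_counts(s) for s in sources]


# Hypothetical sample inputs, written for this sketch.
PRIME = """
// check if n is prime, for real
boolean isPrime(int n) {
    if (n < 2) return false;
    for (int i = 2; i * i <= n; i++) {
        if (n % i == 0) return false;
    }
    return true;
}
"""

FIB = """
int fib(int n) {
    int a = 0, b = 1;
    while (n-- > 0) {
        int t = a + b; /* next term */
        a = b;
        b = t;
    }
    return a;
}
"""

if __name__ == "__main__":
    for row in count_matrix([PRIME, FIB]):
        print(dict(zip(KEYWORDS, row)))
```

Each row of such a matrix, paired with a functionality label, could then serve as a training example for classifiers like those named in the abstract. That step is omitted here because the abstract gives no model details.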