Machine Learning-Based Approach for Classifying the Source Code Using Programming Keywords

Bibliographic Details
Published in: IUP Journal of Information Technology vol. 18, no. 1 (Mar 2022), p. 7
Main Author: Ifham, Mohamed
Other Authors: Kumara, B T G S, Banujan, Kuhaneswaran
Published: IUP Publications
Online Access: Citation/Abstract
Full Text
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 2664914158
003 UK-CbPIL
022 |a 0973-2896 
035 |a 2664914158 
045 2 |b d20220301  |b d20220331 
084 |a 210444  |2 nlm 
100 1 |a Ifham, Mohamed 
245 1 |a Machine Learning-Based Approach for Classifying the Source Code Using Programming Keywords 
260 |b IUP Publications  |c Mar 2022 
513 |a Journal Article 
520 3 |a The implementation phase is one of the most critical periods in software development. Developers either write new source code or reuse functionality from existing source code, depending on the requirements of the system. Most developers spend more time searching and navigating existing source code than writing it, so an efficient method for finding source code functionality quickly is essential. Topic modeling is one approach used to extract topics from source code. Many topic modeling approaches have been implemented using statistical techniques, which have several drawbacks: their results rely on informal code elements such as identifier names and comments. Our novel approach uses a machine-learning algorithm to address these issues, so that the predicted functionality depends only on the algorithm or the syntax of the source code. Three Java functionalities (prime number, Fibonacci number, and selection sort) were evaluated in this study. A Java parser library is used to derive the source code elements, and an algorithm builds a count matrix of the source code features. The dataset was then fed to three models: an Artificial Neural Network (ANN), a Random Forest (RF), and an Ensemble Approach. The Ensemble Approach achieved 96.7% accuracy, surpassing both ANN and RF. 
653 |a Prime numbers 
653 |a Machine learning 
653 |a Software 
653 |a Fibonacci numbers 
653 |a Java 
653 |a Programming languages 
653 |a Accuracy 
653 |a Source code 
653 |a Building codes 
653 |a Syntax 
653 |a Data mining 
653 |a Artificial intelligence 
653 |a Modelling 
653 |a Artificial neural networks 
653 |a Neural networks 
653 |a Classification 
653 |a Learning theory 
653 |a Algorithms 
653 |a Libraries 
653 |a Keywords 
653 |a Software development 
653 |a Semantics 
700 1 |a Kumara, B T G S 
700 1 |a Banujan, Kuhaneswaran 
773 0 |t IUP Journal of Information Technology  |g vol. 18, no. 1 (Mar 2022), p. 7 
786 0 |d ProQuest  |t Advanced Technologies & Aerospace Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/2664914158/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text  |u https://www.proquest.com/docview/2664914158/fulltext/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/2664914158/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch
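
Illustrative sketch (not part of the record)

The abstract describes building a count matrix of source code features from programming keywords, without relying on comments or identifier names. The Python sketch below shows one way such a matrix could be computed for Java snippets. It is a minimal illustration under stated assumptions, not the authors' implementation: the keyword list `KEYWORDS`, the regex-based tokenization (the paper uses a Java parser library instead), and the function names `keyword_counts` and `count_matrix` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately small vocabulary. The abstract does not list
# the features the authors actually used.
KEYWORDS = ["for", "while", "if", "else", "return", "int", "boolean", "new"]

# Matches Java line comments, block comments, string literals, and char
# literals so they can be blanked out before counting.
NON_CODE = re.compile(
    r"//[^\n]*"                # line comment
    r"|/\*.*?\*/"              # block comment
    r'|"(?:\\.|[^"\\])*"'      # string literal
    r"|'(?:\\.|[^'\\])'",      # char literal
    re.DOTALL,
)

# Identifier-like tokens; keywords are a subset of these.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def keyword_counts(java_source):
    """Return one row of the count matrix: occurrences of each keyword."""
    code_only = NON_CODE.sub(" ", java_source)
    counts = Counter(t for t in TOKEN.findall(code_only) if t in KEYWORDS)
    return [counts[k] for k in KEYWORDS]


def count_matrix(sources):
    """Return one row per source snippet, one column per keyword."""
    return [keyword_counts(s) for s in sources]
```

Each row of the resulting matrix could then be paired with a functionality label (for example prime number, Fibonacci number, or selection sort) and passed to a classifier. Because comments and literals are blanked out and only keywords are counted, renaming variables or rewriting comments leaves a row unchanged, which matches the abstract's stated goal of depending only on the syntax of the code.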