Nov 22, 2022
DOI:
Published in: International Arab Conference on Information Technology (ACIT'2022)
Publisher: IEEE
Processing programming languages is very similar to processing natural languages, especially for high-level languages such as Python, Java, C#, C, and C++. Natural language processing, one of the most important branches of artificial intelligence, can therefore be applied to detecting, recognizing, and classifying scripts written in different programming languages. Programming language classification can be treated as a classical machine learning problem. This research presents a model that uses the Multinomial Naïve Bayes (MNB) algorithm to identify the programming language of a source code file given as input. A set of labeled source code files is used to train the model, after which it can automatically classify a new script into one of the trained categories. This work is particularly relevant to multi-language editors such as Visual Studio Code and Notepad++, where a user can paste source code and have the editor recognize its programming language automatically.
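To make the approach concrete, the following is a minimal sketch of Multinomial Naïve Bayes classification over source code tokens, written in plain Python. It is not the authors' implementation: the abstract does not describe the tokenizer, the feature set, or the training corpus, so the regular-expression tokenizer, the six toy training snippets, the class labels, and the smoothing value `alpha=1.0` are all illustrative assumptions.

```python
import math
import re
from collections import Counter, defaultdict

# Assumed tokenizer: identifiers/keywords, or a single punctuation character.
# Numeric literals and whitespace are dropped.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|[^\sA-Za-z0-9_]")


def tokenize(source):
    """Split source code into word tokens and single punctuation symbols."""
    return TOKEN_RE.findall(source)


class MultinomialNB:
    """Multinomial Naive Bayes over token counts with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                        # smoothing constant (assumed)
        self.class_docs = Counter()               # label -> number of training files
        self.token_counts = defaultdict(Counter)  # label -> token -> count
        self.vocab = set()                        # all tokens seen in training

    def fit(self, samples):
        """Accumulate counts from (source, label) pairs."""
        for source, label in samples:
            tokens = tokenize(source)
            self.class_docs[label] += 1
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, source):
        """Return the label with the highest log posterior."""
        total_docs = sum(self.class_docs.values())
        vocab_size = len(self.vocab)
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(source) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:                       # sum of log likelihoods
                score += math.log((counts[t] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training set (hypothetical; a real model needs many files per language).
TRAIN = [
    ("def add(a, b):\n    return a + b\nprint(add(1, 2))", "Python"),
    ("import sys\nfor line in sys.stdin:\n    print(line.strip())", "Python"),
    ("public class Main {\n    public static void main(String[] args) {\n"
     "        System.out.println(\"hi\");\n    }\n}", "Java"),
    ("import java.util.List;\npublic class Box {\n    private int size;\n"
     "    public int getSize() { return size; }\n}", "Java"),
    ("#include <stdio.h>\nint main(void) {\n    printf(\"hi\\n\");\n"
     "    return 0;\n}", "C"),
    ("#include <stdlib.h>\nint *make(int n) {\n"
     "    return malloc(n * sizeof(int));\n}", "C"),
]

model = MultinomialNB().fit(TRAIN)
print(model.predict("def square(x):\n    return x * x"))
```

The scores are summed in log space so that long files do not underflow, and the smoothing constant keeps a token that is absent from one language's training files from zeroing out that language entirely. An editor integration would call `predict` on the pasted text and select the matching syntax mode.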