Software categorization using low-level distributional features


In New Trends in Intelligent Software Methodologies, Tools and Techniques: Proceedings of the 16th International Conference (SoMeT_17), Frontiers in Artificial Intelligence and Applications (Vol. 297, pp. 88-98). IOS Press (2017).



In recent years, there has been growing interest in applying deep learning techniques to the automatic generation of software. Achieving this ambitious objective requires reaching a number of smaller research goals, one of which is the automatic categorization of software, which is used in numerous software intelligence tasks. We present an approach to this problem that uses a set of low-level features derived from lexical analysis of software code. We compare different feature sets for categorizing software and apply different supervised machine learning algorithms to the classification task. The representation allows us to identify the most relevant libraries used for each class, and we use the best-performing classifier to do so. We evaluate our approach by applying it to categorize popular Python projects from GitHub.
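As a rough illustration of what a low-level lexical feature might look like, the sketch below counts the identifiers in a Python source string using the standard-library `tokenize` module. The abstract does not specify the actual feature sets, so the function name `lexical_features`, the choice of identifier counts, and the example snippet are all assumptions made for illustration, not the authors' method.

```python
import io
import keyword
import tokenize
from collections import Counter


def lexical_features(source: str) -> Counter:
    """Count non-keyword identifiers in Python source (hypothetical feature set)."""
    counts = Counter()
    # generate_tokens expects a readline callable that yields lines of text
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Keep identifiers only; skip reserved words such as 'import' and 'as'
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            counts[tok.string] += 1
    return counts


# Illustrative input, not taken from the paper's dataset
example = "import numpy as np\nx = np.zeros(3)\n"
features = lexical_features(example)
```

Counts like these could be assembled into one vector per project and passed to any supervised classifier; library names such as `numpy` then appear as individual features, which is one plausible way a representation could expose the libraries most associated with each class.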
