A-Index: Semantic-based Anomaly Index for Source Code

E. Akimova, A. Bersenev, A. Cheshkov, A. Deikov, K. Kobylkin, A. Konygin, I. Mezentsev, V. Misilov

2022

Abstract

The software development community has long relied on handcrafted code quality metrics. Despite their widespread use, these metrics have a number of known shortcomings: for example, they do not take into account project-specific coding conventions or the wisdom of the crowd. To address these issues, we propose a novel semantic-based approach to calculating an anomaly index for source code. This index, called A-INDEX, is the output of a model trained in unsupervised mode on a source code corpus. The larger the index value, the more atypical the code fragment. To test A-INDEX, we use it to find anomalous code fragments in Python repositories. We also apply the index to a variant of the source code defect prediction problem. Using the BugsInPy and PyTraceBugs datasets, we investigate how A-INDEX changes when a bug is fixed. The experiments show that in 63% of cases the index decreases when the bug is fixed. If one keeps only those code fragments for which the index changes significantly, the index decreases in 71% of cases.
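
The evaluation described above can be illustrated with a minimal sketch, assuming each bug is represented by a pair of index values computed before and after the fix. The function name `fraction_decreased`, the `min_change` threshold, and the sample values are hypothetical and are not taken from the paper; in particular, the paper's own criterion for a "significant" change may differ from the absolute-difference filter used here.

```python
from typing import Optional, Sequence, Tuple


def fraction_decreased(
    pairs: Sequence[Tuple[float, float]],
    min_change: float = 0.0,
) -> Optional[float]:
    """Share of bug fixes after which the anomaly index went down.

    pairs      -- (index_before_fix, index_after_fix) for each code fragment
    min_change -- assumed filter: keep only fragments whose index changed
                  by at least this absolute amount
    Returns None when no fragment passes the filter.
    """
    kept = [(before, after) for before, after in pairs
            if abs(before - after) >= min_change]
    if not kept:
        return None
    decreased = sum(1 for before, after in kept if after < before)
    return decreased / len(kept)


# Made-up index values, for illustration only (not results from the paper).
example_pairs = [(0.9, 0.4), (0.5, 0.55), (0.3, 0.1), (0.6, 0.62)]

share_all = fraction_decreased(example_pairs)
share_significant = fraction_decreased(example_pairs, min_change=0.1)
```

With `min_change=0.0` every fragment is counted, which corresponds to the first figure reported in the abstract; a positive threshold corresponds to the second, filtered figure.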


Paper Citation


in Harvard Style

Akimova E., Bersenev A., Cheshkov A., Deikov A., Kobylkin K., Konygin A., Mezentsev I. and Misilov V. (2022). A-Index: Semantic-based Anomaly Index for Source Code. In Proceedings of the 17th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE, ISBN 978-989-758-568-5, pages 259-266. DOI: 10.5220/0010984600003176


in Bibtex Style

@conference{enase22,
author={E. Akimova and A. Bersenev and A. Cheshkov and A. Deikov and K. Kobylkin and A. Konygin and I. Mezentsev and V. Misilov},
title={A-Index: Semantic-based Anomaly Index for Source Code},
booktitle={Proceedings of the 17th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE},
year={2022},
pages={259-266},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010984600003176},
isbn={978-989-758-568-5},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 17th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE
TI - A-Index: Semantic-based Anomaly Index for Source Code
SN - 978-989-758-568-5
AU - Akimova E.
AU - Bersenev A.
AU - Cheshkov A.
AU - Deikov A.
AU - Kobylkin K.
AU - Konygin A.
AU - Mezentsev I.
AU - Misilov V.
PY - 2022
SP - 259
EP - 266
DO - 10.5220/0010984600003176