Code Smell Detection Using Machine Learning in Static Analysis
DOI:
https://doi.org/10.63345/
Keywords:
code smell detection; static analysis; software metrics; machine learning; class imbalance; cross-project generalization; explainability
Abstract
Code smells—recurring design or implementation symptoms that indicate deeper problems—degrade maintainability, increase fault-proneness, and inflate long-term costs. Traditional smell detection relies on heuristics woven into static analysis rules or on expert judgment, both of which struggle to generalize across projects and languages. This manuscript presents a machine-learning (ML) approach that uses features derived from static analysis—object-oriented metrics, control-flow measures, dependency signals, and lightweight lexical cues—to detect prominent smells such as God Class, Long Method, Feature Envy, Data Class, and Shotgun Surgery. We design a reproducible pipeline covering dataset construction, feature extraction, imbalance handling, model training, evaluation, and statistical testing. A simulation study emulates multi-project conditions and cross-version drift to approximate realistic industrial scenarios. Four supervised learners—Logistic Regression, Random Forest, SVM (RBF), and XGBoost—are compared under stratified cross-validation and cross-project holdout. Performance is reported using F1-score (primary), with secondary examinations of calibration, error structure, and explainability (via model-agnostic feature attribution).
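To make the cross-project holdout and the F1 metric concrete, the following is a minimal sketch and not the manuscript's pipeline. It assumes a hypothetical record layout (`project`, `loc`, `label`), a toy Long Method dataset invented for illustration, and a deliberately trivial one-metric threshold rule standing in for the four learners. The point it shows is only that every test project is entirely unseen during fitting.

```python
from collections import defaultdict

def f1_score(y_true, y_pred):
    """F1 for the positive (smelly) class; 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return 0.0 if tp == 0 else 2 * tp / (2 * tp + fp + fn)

def leave_one_project_out(samples):
    """Yield (held_out, train, test) so that no project appears on both sides."""
    by_project = defaultdict(list)
    for s in samples:
        by_project[s["project"]].append(s)
    for held_out in sorted(by_project):
        train = [s for p, rows in by_project.items() if p != held_out for s in rows]
        yield held_out, train, by_project[held_out]

def fit_threshold(train):
    """Stand-in learner: midpoint between the class means of one metric.
    Assumes the training split contains both classes."""
    pos = [s["loc"] for s in train if s["label"] == 1]
    neg = [s["loc"] for s in train if s["label"] == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

# Hypothetical data: method length (loc) and a Long Method label per project.
samples = [
    {"project": "A", "loc": 120, "label": 1},
    {"project": "A", "loc": 15, "label": 0},
    {"project": "A", "loc": 30, "label": 0},
    {"project": "B", "loc": 200, "label": 1},
    {"project": "B", "loc": 22, "label": 0},
    {"project": "C", "loc": 95, "label": 1},
    {"project": "C", "loc": 40, "label": 0},
    {"project": "C", "loc": 10, "label": 0},
]

results = {}
for held_out, train, test in leave_one_project_out(samples):
    threshold = fit_threshold(train)
    y_true = [s["label"] for s in test]
    y_pred = [1 if s["loc"] > threshold else 0 for s in test]
    results[held_out] = f1_score(y_true, y_pred)

print(results)
```

In a real study the threshold rule would be replaced by the trained learner, and the per-project F1 values would feed the statistical comparison rather than being read individually.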
Results show tree-based ensembles (Random Forest and XGBoost) consistently outperform linear and kernel baselines, particularly for class-imbalance-sensitive smells (e.g., Shotgun Surgery). Statistical analysis using non-parametric tests indicates significant differences among learners, and ablation suggests that combining structure-aware metrics with succinct lexical signals yields the best trade-off between accuracy and interpretability. We conclude with practical guidance for toolsmiths and teams: use ensemble ML as an assistive layer on top of static analysis, expose explainable rankings rather than hard flags, calibrate thresholds by smell type, and validate models cross-project to avoid overfitting to local coding styles.
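The guidance to calibrate thresholds by smell type can be illustrated with a short sketch. It assumes a hypothetical held-out validation set of model scores and labels per smell (the numbers are invented for illustration, not results from the study) and picks, for each smell, the score cutoff that maximizes F1. This is one simple calibration criterion among several; the manuscript does not specify which one it uses.

```python
def f1_at(threshold, scores, labels):
    """F1 of the positive class when flagging every score >= threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return 0.0 if tp == 0 else 2 * tp / (2 * tp + fp + fn)

def calibrate_threshold(scores, labels):
    """Try each observed score as a cutoff and keep the one with the best F1.
    Ties resolve to the lowest cutoff because candidates are scanned in
    ascending order and max() keeps the first maximum."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at(t, scores, labels))

# Hypothetical validation scores (model probabilities) and ground-truth labels.
validation = {
    "God Class": ([0.9, 0.8, 0.4, 0.3, 0.2], [1, 1, 0, 0, 0]),
    "Shotgun Surgery": ([0.6, 0.35, 0.3, 0.1, 0.05, 0.02], [1, 1, 0, 0, 0, 0]),
}

thresholds = {
    smell: calibrate_threshold(scores, labels)
    for smell, (scores, labels) in validation.items()
}
print(thresholds)
```

On this toy data the rarer smell ends up with a lower cutoff than God Class, which is the kind of per-smell difference a single global threshold would hide.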
License
Copyright (c) 2026 The journal retains copyright of all published articles, ensuring that authors have control over their work while allowing wide dissemination.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Articles are published under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0), allowing others to distribute, remix, adapt, and build upon the work for non-commercial purposes while crediting the original author.
