PREDICTING SOFTWARE BUG SEVERITY FROM DEVELOPER REPORTS VIA HYBRID NLP AND GRAPH-BASED LEARNING

Ibrahim Qasim

Authors

Ibrahim Qasim, Department of Software Engineering, Institute of Intelligent Software Analytics, Lahore, Pakistan

Keywords:

Software Bug Severity Prediction, Natural Language Processing, Graph-Based Learning, Software Defect Triage, Explainable Machine Learning

Abstract

Software bug severity prediction is an important task in software maintenance because it helps development teams prioritize critical defects, allocate resources efficiently, and reduce delays in release cycles. Traditional severity classification relies mainly on manual triage or simple text-based machine learning models, which often fail to capture the contextual relationships among bug reports, developers, modules, and historical defect patterns. This paper presents a hybrid approach that predicts bug severity from developer reports by combining natural language processing with graph-based learning. The proposed framework first extracts semantic information from bug descriptions, summaries, and developer comments using transformer-based text representations. These textual embeddings are then integrated with graph-based features that represent relationships among reports, affected components, reporter activity, duplicate links, and dependency structures. The hybrid model is evaluated on a structured bug-report dataset containing multiple severity classes, including low, medium, high, and critical bugs. Experimental results show that the combined NLP and graph-learning model outperforms standalone text-based baselines, achieving higher macro-F1, higher recall for severe bug classes, and better robustness under imbalanced severity distributions. The results also indicate that graph connectivity, historical report similarity, and component-level defect patterns provide useful signals for distinguishing critical bugs from ordinary reports. Explainability analysis further highlights severity-indicating terms such as “crash,” “security,” “memory,” “failure,” and “data loss,” supporting the interpretability of the proposed approach.
Overall, the study demonstrates that integrating semantic text understanding with structural software repository knowledge can improve automated bug severity prediction and support more reliable software maintenance decision-making.
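
The abstract reports macro-F1 as a headline metric under imbalanced severity distributions. As a minimal illustrative sketch (not the authors' evaluation code), the Python below computes macro-F1 as the unweighted mean of per-class F1 scores and contrasts it with accuracy. The function names, the label list, and the toy labels are assumptions made for illustration only; they are not data or results from the study.

```python
# Illustrative sketch only: names and toy labels are assumptions, not from the paper.
SEVERITIES = ["low", "medium", "high", "critical"]


def per_class_f1(y_true, y_pred, label):
    """F1 for one class, computed as 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    denom = 2 * tp + fp + fn
    # Convention used here: a class with no true or predicted instances scores 0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, labels=SEVERITIES):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    return sum(per_class_f1(y_true, y_pred, label) for label in labels) / len(labels)


def accuracy(y_true, y_pred):
    """Fraction of reports whose predicted severity matches the true severity."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


if __name__ == "__main__":
    # Toy, made-up labels: an imbalanced set dominated by "low" reports.
    y_true = ["low"] * 6 + ["medium"] * 2 + ["high"] + ["critical"]
    # A degenerate classifier that labels every report "low".
    y_pred = ["low"] * 10
    print("accuracy:", accuracy(y_true, y_pred))
    print("macro-F1:", macro_f1(y_true, y_pred))
```

On these toy labels the always-"low" classifier reaches 0.6 accuracy but only 0.1875 macro-F1, because it scores zero on the medium, high, and critical classes. This is why macro-F1 and per-class recall are more informative than accuracy when severe bugs are rare.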