COBRA: Counterfactual oversampling framework for imbalanced structured data classification
Recently, cyberattacks have grown more frequent and sophisticated, placing sustained pressure on digital infrastructure.
Network intrusion detection remains a first line of defense, and Deep Learning (DL) enables large-scale
identification of malicious activity in network flows. A persistent challenge, however, is severe class imbalance:
real traffic is dominated by benign flows, while rare but consequential attacks are underrepresented. DL models
trained on such data typically favor the majority class and systematically overlook the very attacks of interest.
A common remedy is data rebalancing through oversampling, yet its effects on what models learn and which
features they depend on remain underexplored. In this work, we propose COBRA ( COunterfactual BAlancing for
Rare-class ) method, where we examine oversampling through counterfactual explanations: we synthesize realistic
?what-if? variants that minimally transform network flows into particular attack classes, use them to rebalance
the training set, and analyze how feature relevance changes once these counterfactuals are introduced using
eXplainable AI (XAI). Our study focuses on deep neural networks (DNNs) for multi-class intrusion detection,
evaluating both detection effectiveness and explanation stability. We conduct a systematic evaluation on two
widely used benchmarks, NSL-KDD and CICIDS17. Furthermore, we assess the robustness of these models under
adversarial attacks.