Support Vector Machines (SVMs) are commonly used for binary classification tasks, but their performance often degrades on imbalanced datasets. The classifier tends to be biased towards the majority class, leading to suboptimal decision boundaries. This research investigates the impact of hyperparameter optimization on the performance of SVMs applied to imbalanced datasets, with a focus on visualizing decision boundaries. The main contribution is a dynamic class weighting strategy combined with an automated hyperparameter tuning approach that adapts to class imbalance. The hyperparameters considered include the class weights, which are optimized using grid search and cross-validation. Additionally, decision boundary visualization is used to qualitatively assess the impact of different hyperparameter settings on the classifier's performance. Experiments are conducted on synthetic datasets with varying levels of class imbalance, and the results are evaluated in terms of classification accuracy, precision, recall, and the area under the ROC curve (AUC). The results show that using dynamic class weights and optimizing hyperparameters significantly improves classifier performance, with a 15% increase in recall and a 10% improvement in AUC compared to standard SVM approaches.
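The workflow summarized above (class-weighted SVM, grid search over the class weight, cross-validation) can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made for illustration only: a linear SVM trained by full-batch subgradient descent on a class-weighted hinge loss stands in for a kernel SVM, the candidate weights are scaled by the majority-to-minority ratio as one possible reading of "dynamic" weighting, F1 is used as the selection score, and all function names (make_imbalanced, imbalance_ratio, train_weighted_svm, cross_val_f1, grid_search) are hypothetical.

```python
import random

def make_imbalanced(n_major=180, n_minor=20, seed=0):
    """Synthetic 2-D data: majority class (-1) near (0, 0), minority (+1) near (1.5, 1.5)."""
    rng = random.Random(seed)
    data = [((rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)), -1) for _ in range(n_major)]
    data += [((rng.gauss(1.5, 1.0), rng.gauss(1.5, 1.0)), 1) for _ in range(n_minor)]
    return data

def imbalance_ratio(data):
    """Majority-to-minority count ratio (assumed basis for the candidate weights)."""
    n_pos = sum(1 for _, y in data if y == 1)
    return (len(data) - n_pos) / n_pos

def train_weighted_svm(data, pos_weight, lam=0.01, epochs=200, lr=0.05):
    """Subgradient descent on lam/2*|w|^2 + mean(c_y * hinge), with c_+ = pos_weight, c_- = 1."""
    w, b, n = [0.0, 0.0], 0.0, len(data)
    for _ in range(epochs):
        gw, gb = [lam * w[0], lam * w[1]], 0.0
        for x, y in data:
            if y * (w[0] * x[0] + w[1] * x[1] + b) < 1.0:  # margin violated
                c = (pos_weight if y == 1 else 1.0) / n
                gw[0] -= c * y * x[0]
                gw[1] -= c * y * x[1]
                gb -= c * y
        w = [w[0] - lr * gw[0], w[1] - lr * gw[1]]
        b -= lr * gb
    return w, b

def predict(model, x):
    w, b = model
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1

def recall_precision(model, data):
    """Recall and precision for the minority (+1) class."""
    tp = sum(1 for x, y in data if y == 1 and predict(model, x) == 1)
    fn = sum(1 for x, y in data if y == 1 and predict(model, x) == -1)
    fp = sum(1 for x, y in data if y == -1 and predict(model, x) == 1)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision

def f1_score(recall, precision):
    return 2 * recall * precision / (recall + precision) if recall + precision else 0.0

def cross_val_f1(data, pos_weight, k=5):
    """Mean F1 over k stratified folds (each fold keeps the class ratio)."""
    pos = [d for d in data if d[1] == 1]
    neg = [d for d in data if d[1] == -1]
    folds = [pos[i::k] + neg[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [d for j, fold in enumerate(folds) if j != i for d in fold]
        model = train_weighted_svm(train, pos_weight)
        scores.append(f1_score(*recall_precision(model, folds[i])))
    return sum(scores) / k

def grid_search(data, grid):
    """Return the minority-class weight with the best cross-validated F1."""
    return max(grid, key=lambda pw: cross_val_f1(data, pw))

if __name__ == "__main__":
    data = make_imbalanced()
    ratio = imbalance_ratio(data)
    grid = [1.0, ratio / 2, ratio, 2 * ratio]  # candidates scaled by the imbalance
    best = grid_search(data, grid)
    for pw in (1.0, best):
        r, p = recall_precision(train_weighted_svm(data, pw), data)
        print(f"pos_weight={pw:.1f} recall={r:.2f} precision={p:.2f}")
```

The learned pair (w, b) defines the decision boundary w[0]*x0 + w[1]*x1 + b = 0, so plotting that line for each candidate weight gives a simple form of the decision boundary comparison described above. With a library such as scikit-learn, the same idea would typically be expressed through the class_weight parameter of SVC and a GridSearchCV over C, gamma, and class_weight.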
CRediT Author Statement
The authors confirm contribution to the paper as follows:
Conceptualization: WWK, LB;
Methodology: LB;
Software: LB;
Data Curation: WWK;
Writing - Original Draft Preparation: WWK, LB;
Visualization: LB;
Supervision: LB;
Validation: WWK, LB;
Writing - Reviewing and Editing: WWK, LB;
All authors reviewed the results and approved the final version of the manuscript.
Acknowledgements
The authors thank the reviewers for the time and effort they devoted to reviewing the manuscript.
Funding
No funding was received for conducting this research.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Availability of data and materials
No data are available for this study.
Author information
Contributions
All authors contributed equally to the paper, and all authors have read and agreed to the published version of the manuscript.
Open Access This article is licensed under a Creative Commons Attribution NoDerivs license, which is a more restrictive license. It allows you to redistribute the material commercially or non-commercially, but you cannot make any changes whatsoever to the original, i.e. no derivatives of the original work. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-nd/4.0/
Cite this article
Wong W K and Leena B, "Optimizing SVM Hyper parameters for Imbalanced Datasets using Decision Boundary Visualization", Journal of Future Networks and Communications, vol. 1, no. 1, pp. 051-059, January 2025. doi: XXXX/XXXX/JFNC202501006.