Comparison of Support Vector Machine, Random Forest, and C4.5 Algorithms for Customer Loss Prediction

Authors

  • Bima Maulana Universitas Islam Negeri Sultan Syarif Kasim Riau, Indonesia
  • Dany Febrian Universitas Islam Negeri Sultan Syarif Kasim Riau, Indonesia
  • Irgie Rachmat Fachrezi Universitas Islam Negeri Sultan Syarif Kasim Riau, Indonesia
  • Muhammad Ferdi Zeen International University of Africa, Khartoum

DOI:

https://doi.org/10.57152/ijatis.v2i1.1102

Keywords:

Customer Loss Prediction, C4.5, Random Forest, Support Vector Machine

Abstract

Customer loss has been widely studied using algorithms such as Bayesian networks, decision trees, random forests, support vector machines, and neural networks. Support Vector Machine (SVM), Random Forest, and the C4.5 decision tree are algorithms used for prediction, and each has its own advantages. Random Forest, for example, combines the predictions of many decision trees, which tends to reduce overfitting. This research compares the C4.5, SVM, and Random Forest algorithms for customer loss prediction. The results show that Random Forest achieves the highest accuracy, 87.02%, while the Decision Tree (C4.5) achieves the lowest, 78.52%. Overall, the classification accuracy of Random Forest for customer loss prediction is 4 to 9 percentage points higher than that of the Support Vector Machine and Decision Tree methods.
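
The idea that combining many decision trees can improve on any single tree can be illustrated with a minimal sketch. This is not the authors' implementation: the labels and the three sets of tree predictions below are hypothetical toy values, and the helper names `accuracy` and `majority_vote` are introduced only for illustration. A real Random Forest also trains each tree on a resampled subset of the data and features, which this sketch leaves out.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_vote(predictions):
    """Combine per-model predictions by taking the most common label per sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Hypothetical labels: 1 = customer lost, 0 = customer retained.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical outputs of three decision trees, each wrong on different samples.
tree_a = [1, 0, 0, 1, 0, 0, 1, 1]  # wrong at positions 2 and 7
tree_b = [0, 0, 1, 1, 0, 1, 1, 0]  # wrong at positions 0 and 5
tree_c = [1, 1, 1, 0, 0, 0, 1, 0]  # wrong at positions 1 and 3

# The ensemble prediction is the per-sample majority of the three trees.
forest = majority_vote([tree_a, tree_b, tree_c])

for name, pred in [("tree_a", tree_a), ("tree_b", tree_b),
                   ("tree_c", tree_c), ("forest", forest)]:
    print(f"{name}: accuracy = {accuracy(y_true, pred):.2f}")
```

In this toy case each tree is correct on 6 of 8 samples, while the vote is correct on all 8 because the trees' errors do not overlap. Trees in practice make correlated errors, so the gain is smaller than the sketch suggests.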

Published

2025-02-28