Comparison of Supervised Learning Algorithms for Cancer Prediction

Authors

  • Intan Adha Maharani, Universitas Islam Negeri Sultan Syarif Kasim Riau, Indonesia
  • Rifda Dwi Setiani, Universitas Islam Negeri Sultan Syarif Kasim Riau, Indonesia
  • Raudhatul Khairiyah, Al-Azhar University, Egypt
  • Elfani Mardhatillah, Al-Azhar University, Egypt

Keywords:

Cancer Prediction, Decision Tree, Machine Learning, Naive Bayes, Support Vector Machine

Abstract

This study applies Machine Learning algorithms to cancer prediction on a classification dataset. Five supervised algorithms are compared: K-Nearest Neighbor (KNN), Naive Bayes Classifier, Decision Tree, Random Forest, and Support Vector Machine (SVM). The goal is to evaluate the performance of each algorithm and identify the method that achieves the highest accuracy in cancer classification. Performance was measured with accuracy, precision, recall, and F1-Score, and the results varied across algorithms. Random Forest and SVM performed best, achieving the highest accuracy of the five, while Naive Bayes tended to show lower predictive performance. These results underline the importance of selecting an appropriate algorithm when applying Machine Learning to medical tasks such as cancer prediction. The identified methods are expected to support clinical decision-making and improve the accuracy of early cancer diagnosis.
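The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes a binary task in which label 1 marks the positive (cancer) class; the `evaluate` helper, the `Metrics` container, and the label lists are hypothetical and are not taken from the study or its dataset.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred, positive=1):
    """Compute binary classification metrics from two equal-length label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Metrics(accuracy, precision, recall, f1)


# Hypothetical labels for illustration only (assumption, not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
m = evaluate(y_true, y_pred)
print(m)
```

Reporting precision and recall next to accuracy matters in a screening setting because a missed cancer case (false negative) and a false alarm (false positive) carry different costs, and accuracy alone does not separate them.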

Published

2025-09-04