Comparison of Logistic Regression, Random Forest and Adaboost Algorithms for Diabetes Mellitus Classification

Authors

  • Alfi Syahri, Universitas Islam Negeri Sultan Syarif Kasim Riau, Indonesia
  • Umi Fariha, Universitas Islam Negeri Sultan Syarif Kasim Riau, Indonesia
  • Rival Afandi, Universitas Islam Negeri Sultan Syarif Kasim Riau, Indonesia
  • Intan Nurliyana, MARA University, Malaysia

DOI:

https://doi.org/10.57152/ijatis.v1i1.1116

Keywords:

Adaboost, Classification, Diabetes Mellitus, Logistic Regression, Random Forest

Abstract

Diabetes mellitus is a chronic disease that affects how the body regulates blood sugar (glucose). High blood sugar levels can lead to complications including heart problems, eye disorders, nerve damage, and kidney and blood vessel disorders. Early detection of diabetes is therefore important, and data mining technology can support it. Data mining offers various classification models that can be used to detect diabetes, including logistic regression, random forest, and AdaBoost. This study compares the three algorithms to determine which is most appropriate for classifying diabetes. In the results obtained, the random forest algorithm performed best of the three in classifying diabetes mellitus.
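
A comparison of this kind could be set up roughly as in the sketch below. It is not the authors' code: the use of scikit-learn, the synthetic stand-in data from make_classification, the 5-fold cross-validation, accuracy as the metric, and all hyperparameters are assumptions chosen for illustration, and the helper name compare_classifiers is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_classifiers(X, y, folds=5, seed=0):
    """Return the mean cross-validated accuracy of each of the three models."""
    models = {
        # Features are standardized first so the logistic regression solver converges.
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "adaboost": AdaBoostClassifier(n_estimators=50, random_state=seed),
    }
    return {
        name: float(cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean())
        for name, model in models.items()
    }


# Synthetic stand-in data (assumption): 300 samples, 8 features, binary label.
# A real study would load a diabetes dataset here instead.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, random_state=0
)

scores = compare_classifiers(X, y)
# Print the models from best to worst mean accuracy.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Which model ranks first depends on the dataset, the preprocessing, and the metric, so the ranking printed for synthetic data says nothing about the result reported in this paper.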

Published

2024-05-26