Implementation of K-Nearest Neighbors, Naïve Bayes Classifier, Support Vector Machine and Decision Tree Algorithms for Obesity Risk Prediction
DOI:
https://doi.org/10.57152/predatecs.v2i1.1110
Keywords:
Classification, Decision Tree, K-Nearest Neighbors, Naïve Bayes Classifier, Obesity, Support Vector Machine
Abstract
Obesity is an abnormal or excessive build-up of fat that can harm health and results from an imbalance between calories consumed and calories burned. Many ailments, such as diabetes, heart disease, cancer, osteoarthritis, chronic renal disease, stroke, hypertension, and other potentially fatal conditions, are linked to obesity. Several studies have therefore applied information technology to diagnosing and treating obesity. Because a wealth of data on obesity is available, data mining techniques such as the K-Nearest Neighbors (K-NN) algorithm, Naïve Bayes Classifier, Support Vector Machine (SVM), and Decision Tree can be used to classify it. This study uses an obesity dataset obtained from Kaggle, comprising 2111 records and 17 attributes, to compare the four algorithms. On this dataset, the Decision Tree algorithm outperforms the other three. It achieved an accuracy of 84.98%, followed by K-NN at 83.55%, Naïve Bayes at 77.48%, and SVM, the lowest in this study, at 77.32%.
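The comparison above rests on classification accuracy, the share of evaluated records whose predicted class matches the true class. The following is a minimal sketch of that metric and of the resulting ranking, not the study's own code: the `accuracy` helper and the toy labels are hypothetical illustrations, and only the four percentages come from the abstract.

```python
def accuracy(y_true, y_pred):
    """Fraction of records whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical toy labels for illustration only (not taken from the dataset).
toy_true = ["normal", "obese", "overweight", "obese", "normal"]
toy_pred = ["normal", "obese", "normal", "obese", "normal"]
toy_accuracy = accuracy(toy_true, toy_pred)  # 4 of the 5 labels match

# Accuracy values in percent, as reported in the abstract.
reported = {
    "Decision Tree": 84.98,
    "K-NN": 83.55,
    "Naïve Bayes": 77.48,
    "SVM": 77.32,
}

# Rank the algorithms from highest to lowest reported accuracy.
ranking = sorted(reported.items(), key=lambda item: item[1], reverse=True)
best_name, best_acc = ranking[0]

for rank, (name, acc) in enumerate(ranking, start=1):
    gap = best_acc - acc  # difference in percentage points to the best model
    print(f"{rank}. {name}: {acc:.2f}% ({gap:.2f} points behind {best_name})")
```

Read this way, the reported figures fall into two groups: Decision Tree and K-NN are within about 1.4 percentage points of each other, while Naïve Bayes and SVM trail the best result by roughly 7.5 points or more.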