Implementation of Machine Learning Algorithm for Heart Attack Disease Prediction
Keywords:
Decision Tree, Heart Attack Prediction, K-Nearest Neighbors, Random Forest, Support Vector Machine
Abstract
Heart attack is one of the leading causes of death worldwide, making early detection critical to reducing mortality. However, manual prediction is often inaccurate because of the complexity of medical data. To address this issue, this study evaluates five machine learning algorithms for predicting heart attack risk: K-Nearest Neighbors (KNN), Decision Tree, Naïve Bayes, Random Forest, and Support Vector Machine (SVM). The dataset, obtained from Kaggle, was preprocessed and divided into training and testing sets using 70:30 and 80:20 ratios. Algorithm performance was assessed using accuracy, precision, recall, and F1-score. Decision Tree and Random Forest achieved the best performance, with accuracy of up to 97.98%, while KNN recorded the lowest accuracy, at around 61.36%. The study demonstrates the comparative effectiveness of these algorithms on the same dataset, contributing to the growing body of research on AI in healthcare, and highlights their potential clinical utility. In particular, Decision Tree and Random Forest can support the development of AI-based clinical decision support systems to assist healthcare professionals in early diagnosis and risk management.
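The evaluation protocol summarized in the abstract (hold-out splits at 70:30 and 80:20, scored by accuracy, precision, recall, and F1-score) can be illustrated with a minimal sketch in plain Python. This is not the study's code: the synthetic two-feature data, the simple k-nearest-neighbors stand-in classifier, and every function name below are assumptions made for illustration, and the numbers it produces have no relation to the reported results.

```python
import random


def make_toy_data(n=200, seed=1):
    """Hypothetical synthetic data: two Gaussian features whose mean depends on the label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        features = (rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0))
        rows.append((features, label))
    return rows


def train_test_split(rows, test_fraction, seed=0):
    """Shuffle a copy of rows and return (train, test) for the given test fraction."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, x, k=3):
    """Majority vote among the k training rows closest to x (squared Euclidean distance)."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x))
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for the positive class, from confusion counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


data = make_toy_data()
results = {}
for test_fraction in (0.30, 0.20):  # the 70:30 and 80:20 splits
    train, test = train_test_split(data, test_fraction)
    y_true = [label for _, label in test]
    y_pred = [knn_predict(train, x) for x, _ in test]
    results[test_fraction] = binary_metrics(y_true, y_pred)
    print(f"test fraction {test_fraction:.2f}: {results[test_fraction]}")
```

Comparing several algorithms under this protocol would amount to swapping `knn_predict` for each classifier while keeping the same splits and the same metric function, so that differences in the scores reflect the model rather than the data partition.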