Comparative Analysis of Machine Learning Models for Chronic Disease Indicator Classification Using U.S. Chronic Disease Indicators Dataset

Gregorius Airlangga

doi:10.57152/malcom.v4i3.1403

Authors

Gregorius Airlangga, Atma Jaya Catholic University of Indonesia

Keywords:

Chronic Disease Classification, Chronic Disease Indicators, Gradient Boosting, Machine Learning, Support Vector Machine

Abstract

The prevalence of chronic diseases poses significant challenges to public health systems worldwide. This study evaluates four machine learning models (Gradient Boosting Classifier, Support Vector Machine (SVM), Logistic Regression, and Random Forest) on the task of classifying chronic disease indicators in the U.S. Chronic Disease Indicators (CDI) dataset. Each model was assessed using accuracy, precision, recall, and F1 score, together with its classification report and confusion matrix. The Gradient Boosting Classifier outperformed the other models, with an accuracy of 64.36%, precision of 63.72%, recall of 64.36%, and F1 score of 63.88%. SVM and Random Forest showed moderate performance, and Logistic Regression served as the baseline for comparison. The results indicate that, among the models tested, the Gradient Boosting Classifier handles the complexities of the CDI dataset best, suggesting its potential for improving chronic disease prediction and management. Future research should focus on refining these models, addressing class imbalance, and incorporating domain knowledge to improve interpretability and applicability in real-world settings.
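The reported recall (64.36%) is identical to the reported accuracy, which is exactly what support-weighted averaging of per-class recall always yields. The abstract does not state which averaging scheme was used, so the following is only a minimal sketch under that assumption. It shows how accuracy and weighted precision, recall, and F1 can be derived from a multi-class confusion matrix. The function name `weighted_metrics` and the 3-class matrix are hypothetical illustrations, not code or data from the study.

```python
from typing import Dict, List


def weighted_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Compute accuracy and support-weighted precision, recall, and F1.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j.
    """
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n_classes))

    precision = recall = f1 = 0.0
    for k in range(n_classes):
        support = sum(cm[k])                                  # true samples of class k
        predicted = sum(cm[i][k] for i in range(n_classes))   # samples predicted as k
        p_k = cm[k][k] / predicted if predicted else 0.0
        r_k = cm[k][k] / support if support else 0.0
        f_k = 2 * p_k * r_k / (p_k + r_k) if (p_k + r_k) else 0.0

        weight = support / total                              # class share of the data
        precision += weight * p_k
        recall += weight * r_k
        f1 += weight * f_k

    return {
        "accuracy": correct / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical 3-class confusion matrix (illustrative only, not study data).
example_cm = [
    [50, 10, 5],
    [8, 30, 12],
    [4, 6, 25],
]
scores = weighted_metrics(example_cm)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Because each class's recall is weighted by its share of the data, the weighted recall reduces to the number of correct predictions divided by the total, which is the accuracy. Weighted precision and F1 carry no such identity, so they can differ from accuracy, as they do in the reported figures.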
