Klasifikasi Data Pariwisata Berkelanjutan Menggunakan Decision Tree dengan Equal Width dan Logaritma Binning
Decision Tree Classification Using Equal Width and Logarithmic Binning for Sustainable Tourism Data
DOI: https://doi.org/10.57152/malcom.v6i1.2520
Keywords: Decision Tree, Equal Width Binning, Logarithmic Binning, Data Preprocessing
Abstract
Managing tourism recommendation data requires accurate predictive modeling to classify recommendation targets based on tourist behavior. However, numeric variables such as Rekomendasi Score often have a skewed distribution, which can degrade the performance of machine learning algorithms such as Decision Tree. This study compares the effectiveness of two discretization-based preprocessing techniques, Equal Width Binning (EWB) and Logarithmic Binning (LB), in improving classification model performance. The methodology covers several data preprocessing stages, including missing value handling and temporal feature extraction from travel date data. The data are then processed under two different binning scenarios before a Decision Tree model is trained. Results are evaluated using Accuracy, Precision, Recall, and F1-Score. The comparison shows that Equal Width Binning yields an accuracy of 82%, while Logarithmic Binning yields an accuracy of 90%. Discretization through logarithmic binning can reduce tree depth and prevent overfitting, producing a model that is more robust in predicting tourism recommendation targets.
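The difference between the two binning scenarios can be illustrated with a minimal sketch. This is not the authors' implementation: the sample scores, the number of bins, the function names, and the use of log(1 + x) to tolerate zero values are all assumptions made for illustration, and the Decision Tree training step is left out.

```python
import math


def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins intervals of identical width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so everything falls into a single bin.
        return [0 for _ in values]
    # The maximum lands exactly on the last edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def log_bins(values, n_bins):
    """Assign each value to bins that have equal width on a logarithmic scale."""
    # Assumption: log1p (log(1 + v)) is used so that zero scores are allowed.
    # Values must be non-negative.
    logged = [math.log1p(v) for v in values]
    return equal_width_bins(logged, n_bins)


# Hypothetical right-skewed scores, not taken from the study's dataset.
scores = [1, 2, 3, 5, 8, 13, 40, 100, 1000]
ewb = equal_width_bins(scores, 4)
lb = log_bins(scores, 4)
print("equal width:", ewb)
print("logarithmic:", lb)
```

On a skewed sample like this one, equal-width bins place almost every value in the first interval, while logarithmic bins spread the values across more intervals, which gives a tree more distinct categories to split on.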
Copyright (c) 2025 Fahri Alviansyah, Mukti Adi Azhari, Attar Raihan Nazhif, Suharto Suharto, Denny Saryanto, Kusrini Kusrini

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Copyright © by Author; Published by Institut Riset dan Publikasi Indonesia (IRPI)
This Indonesian Journal of Machine Learning and Computer Science is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.