Klasifikasi Data Pariwisata Berkelanjutan Menggunakan Decision Tree dengan Equal Width dan Logaritma Binning

Decision Tree Classification Using Equal Width and Logarithmic Binning for Sustainable Tourism Data

Authors

  • Fahri Alviansyah, Universitas Amikom Yogyakarta
  • Mukti Adi Azhari, Universitas Amikom Yogyakarta
  • Attar Raihan Nazhif, Universitas Amikom Yogyakarta
  • Suharto Suharto, Universitas Amikom Yogyakarta
  • Denny Saryanto, Universitas Amikom Yogyakarta
  • Kusrini Kusrini, Universitas Amikom Yogyakarta

DOI:

https://doi.org/10.57152/malcom.v6i1.2520

Keywords:

Decision Tree, Equal Width Binning, Logarithmic Binning, Data Preprocessing

Abstract

Managing tourism recommendation data requires accurate predictive modeling to classify recommendation targets based on tourist behavior. However, numeric variables such as Rekomendasi Score often have an uneven (skewed) distribution, which can degrade the performance of machine learning algorithms such as Decision Tree. This study compares the effectiveness of two discretization-based preprocessing techniques, Equal Width Binning (EWB) and Logarithmic Binning (LB), in improving the performance of a classification model. The methodology comprises several data preprocessing stages, including handling missing values and extracting temporal features from travel-date data. The data are then processed under the two binning scenarios before being used to train a Decision Tree. The resulting models are evaluated using Accuracy, Precision, Recall, and F1-Score. The comparison shows that Equal Width Binning yields an accuracy of 82%, while Logarithmic Binning yields an accuracy of 90%. Discretization through logarithmic binning reduces tree depth and prevents overfitting, producing a more robust model for predicting tourism recommendation targets.
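To illustrate the difference between the two discretization schemes named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the number of bins, and the sample scores are assumptions made for illustration, and the abstract does not state which bin count or library the study used. Logarithmic binning is shown here as equal-width binning applied to the natural logarithm of the values, which is one common way to define it.

```python
import math

def equal_width_bin(values, n_bins):
    """Assign each value to one of n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so everything goes into the first bin.
        return [0 for _ in values]
    # The maximum value lands on the last edge; clamp it into the final bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def logarithmic_bin(values, n_bins):
    """Assign bins that have equal width on a log scale (values must be > 0)."""
    if min(values) <= 0:
        raise ValueError("logarithmic binning requires strictly positive values")
    return equal_width_bin([math.log(v) for v in values], n_bins)

# Hypothetical right-skewed scores (not taken from the paper's dataset).
scores = [1, 2, 3, 4, 6, 8, 12, 20, 45, 100]

print(equal_width_bin(scores, 4))   # [0, 0, 0, 0, 0, 0, 0, 0, 1, 3]
print(logarithmic_bin(scores, 4))   # [0, 0, 0, 1, 1, 1, 2, 2, 3, 3]
```

On this skewed sample, equal-width binning places eight of the ten values in the first bin and leaves one bin empty, while the logarithmic bins spread the values more evenly. This is the kind of effect that could explain why a tree trained on log-binned features behaves differently, although the sketch makes no claim about the accuracy figures reported above.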

Published

2026-02-01

How to Cite

Alviansyah, F., Azhari, M. A., Nazhif, A. R., Suharto, S., Saryanto, D., & Kusrini, K. (2026). Klasifikasi Data Pariwisata Berkelanjutan Menggunakan Decision Tree dengan Equal Width dan Logaritma Binning: Decision Tree Classification Using Equal Width and Logarithmic Binning for Sustainable Tourism Data. MALCOM: Indonesian Journal of Machine Learning and Computer Science, 6(1), 354-362. https://doi.org/10.57152/malcom.v6i1.2520