Determining the Final Project Topic Based on the Courses Taken by Using Machine Learning Techniques
DOI: https://doi.org/10.57152/malcom.v3i2.904
Keywords: Machine Learning, Random Oversampling, Random Undersampling, Tugas Akhir
Abstract
A thesis (Tugas Akhir, TA) is a scientific paper that addresses a specific problem, and students must complete one in order to graduate. Students often have difficulty choosing a TA topic to research. To address this, this research determines TA topics using Machine Learning (ML) techniques based on the elective courses that students have taken. Elective courses are one form of academic data that can inform the choice of TA topic. The ML algorithms used are KNN, NBC, ANN, SVM, C4.5, Random Forest (RF), and Logistic Regression (LR). The dataset used in this research is imbalanced, so it is balanced using the Random Oversampling (ROS) and Random Undersampling (RUS) methods. The experimental results show that datasets balanced using ROS produce much higher ML performance but tend to overfit because of duplicated data in the dataset. If the dataset is not balanced at all, ML performance is very low. Therefore, for imbalanced data, it is recommended to use the RUS method to balance the data. The highest accuracy results for algorithms balanced using ROS are ANN = 69.7%, RF = 66.7%, SVM = 57.6%, LR = 57.6%, NBC = 42.4%, C4.5 = 42.4%, and KNN = 33.3%.
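The two balancing methods named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `random_oversample` and `random_undersample`, the toy elective-course rows, and the topic labels are all hypothetical, and only the Python standard library is assumed.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sampling with replacement: the added rows are exact duplicates.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sampling without replacement: rows are discarded, none are added.
        for row in rng.sample(group, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: each row marks which of three electives a student
# took (1 = taken), and each label is a made-up thesis topic.
rows = [(1, 0, 1), (1, 1, 0), (0, 1, 1), (1, 0, 0), (0, 0, 1), (1, 1, 1)]
labels = ["data", "data", "data", "data", "network", "software"]

ros_rows, ros_labels = random_oversample(rows, labels)
rus_rows, rus_labels = random_undersample(rows, labels)
print(Counter(ros_labels))
print(Counter(rus_labels))
```

On this toy data, ROS brings every class up to four rows by repeating existing rows, which is the duplication the abstract links to overfitting, while RUS cuts every class down to one row and so discards most of the majority class.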