Implementation of Random Forest and SMOTE for Predicting Elementary School Dropout Risk Toward Indonesia Emas 2045
DOI: https://doi.org/10.62951/bridge.v3i2.408
Keywords: Elementary School, Random Forest, SMOTE, Temporal Data
Abstract
This study implements the Random Forest algorithm combined with the Synthetic Minority Over-sampling Technique (SMOTE) to predict elementary school dropout risk in Indonesia, in support of the Indonesia Emas 2045 vision. Previous studies used artificial intelligence for dropout interventions but did not integrate a temporal dimension into the data analysis. To address this gap, a classification model based on temporal data was developed using Indonesian Ministry of Education data from 2021 to 2023, incorporating lag features, delta calculations, and rolling statistics. Two models were implemented: the model with SMOTE achieved 99% accuracy with perfect recall for high-risk regions, while the model without SMOTE reached 100% accuracy. Temporal features were identified as crucial predictors, reflecting the external fluctuations and annual changes that affect dropout decisions. This approach enables educational institutions to allocate resources more efficiently by prioritizing operational assistance for high-risk schools. The model's capacity to identify high-risk regions with 100% recall represents a strategic investment in strengthening the sustainability of Indonesia's human resources. Because the analysis relies on provincial aggregate data, future research should add individual-level variables and validate the model at the district or school scale.
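As an illustration of the three kinds of temporal features named in the abstract (lag features, delta calculations, and rolling statistics), the following is a minimal sketch using pandas. The province labels, dropout values, column names, and the `add_temporal_features` helper are all hypothetical and chosen for illustration; the study's actual feature definitions and window sizes are not given in the abstract.

```python
import pandas as pd

# Hypothetical toy data: yearly dropout rate (%) per province, 2021-2023.
# Province labels and values are illustrative, not taken from the study.
data = pd.DataFrame({
    "province": ["A", "A", "A", "B", "B", "B"],
    "year": [2021, 2022, 2023, 2021, 2022, 2023],
    "dropout_rate": [0.10, 0.14, 0.12, 0.30, 0.25, 0.35],
})


def add_temporal_features(df, value_col="dropout_rate", window=2):
    """Add lag, delta, and rolling-mean columns computed within each province."""
    out = df.sort_values(["province", "year"]).reset_index(drop=True)
    grouped = out.groupby("province")[value_col]
    # Lag: the previous year's value for the same province (NaN for the first year).
    out["lag_1"] = grouped.shift(1)
    # Delta: year-over-year change relative to the lagged value.
    out["delta_1"] = out[value_col] - out["lag_1"]
    # Rolling mean over the current year and up to (window - 1) preceding years.
    out["roll_mean"] = grouped.transform(
        lambda s: s.rolling(window, min_periods=1).mean()
    )
    return out


features = add_temporal_features(data)
print(features)
```

Grouping by province before shifting keeps one region's history from leaking into another's. If the risk label is derived from the current year's dropout rate, a rolling statistic that includes the current year would leak the target, so in that setting the window would need to cover past years only.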
License
Copyright (c) 2025 Bridge : Jurnal Publikasi Sistem Informasi dan Telekomunikasi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.