An Ensemble Learning Approach for Student Performance Analysis of a Higher Educational Institute using a SHAP-Based Feature Selection and Optuna Optimization
Volume 11, Issue 2, Page No 1–11, 2026
Adv. Sci. Technol. Eng. Syst. J. 11(2), 1–11 (2026)
DOI: 10.25046/aj110201
Keywords: Student performance, Higher education, Ensemble learning, AI-powered tools, Explainable AI, SHAP feature selection, Optuna optimization
Forecasting and assessing student performance allows educators to pinpoint deficiencies and promote grade improvement. Understanding how individual features contribute to a prediction is essential for model interpretability and for informed decision-making in academic institutions. Explainable artificial intelligence comprises methods designed to provide transparent and accessible rationales for the decisions made by artificial intelligence and machine learning algorithms. This paper introduces an interpretable gradient boosting approach for predicting student performance that combines SHapley Additive exPlanations (SHAP)-based feature selection with cost-sensitive decision thresholds. The proposed methodology includes hybrid resampling with SMOTE-Tomek, Optuna hyperparameter optimization with stratified cross-validation, and a SHAP-guided feature selection strategy. The approach is tested on a dataset from a higher educational institute in the Middle East, comprising data from student information, learning management, and video interaction systems, to analyze and evaluate student performance. The results show improvement in both the macro F1-score and the fail-class recall, with an accuracy of 94%, a weighted/macro F1-score of 0.9399, and a fail-class recall of 0.9619. The suggested method facilitates trade-offs among prediction accuracy, interpretability, and fairness, bridging the divide between high-performing machine learning models and practical educational applications, and thereby aiding the formulation of data-driven policies and the customization of learning experiences.
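The idea of a cost-sensitive decision threshold mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the cost values (a missed failing student weighted five times a false alarm), the threshold grid, and the toy probabilities are all assumptions chosen for illustration, and the abstract does not state the costs or thresholds actually used. The sketch scans candidate thresholds on the predicted fail-class probability and keeps the one with the lowest total misclassification cost.

```python
from typing import Optional, Sequence, Tuple


def total_cost(y_true: Sequence[int], p_fail: Sequence[float],
               threshold: float, fn_cost: float, fp_cost: float) -> float:
    """Sum misclassification costs when 'fail' is predicted for p_fail >= threshold.

    Labels: 1 = student failed, 0 = student passed (an assumed encoding).
    """
    cost = 0.0
    for y, p in zip(y_true, p_fail):
        predicted_fail = p >= threshold
        if y == 1 and not predicted_fail:
            cost += fn_cost  # at-risk student missed (false negative)
        elif y == 0 and predicted_fail:
            cost += fp_cost  # passing student flagged (false positive)
    return cost


def fail_recall(y_true: Sequence[int], p_fail: Sequence[float],
                threshold: float) -> float:
    """Fraction of truly failing students that are flagged at this threshold."""
    fail_probs = [p for y, p in zip(y_true, p_fail) if y == 1]
    if not fail_probs:
        return 0.0
    return sum(p >= threshold for p in fail_probs) / len(fail_probs)


def best_threshold(y_true: Sequence[int], p_fail: Sequence[float],
                   fn_cost: float = 5.0, fp_cost: float = 1.0,
                   grid: Optional[Sequence[float]] = None) -> Tuple[float, float]:
    """Return (threshold, cost) with the lowest total cost on the grid.

    Ties are resolved in favour of the first (lowest) threshold scanned.
    """
    candidates = grid if grid is not None else [i / 100 for i in range(1, 100)]
    scored = ((t, total_cost(y_true, p_fail, t, fn_cost, fp_cost))
              for t in candidates)
    return min(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Toy validation data (hypothetical, not from the paper's dataset).
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    p_fail = [0.9, 0.6, 0.35, 0.4, 0.2, 0.1, 0.55, 0.8]
    for fn_cost in (1.0, 5.0):
        t, c = best_threshold(y_true, p_fail, fn_cost=fn_cost, fp_cost=1.0)
        print(fn_cost, t, c, fail_recall(y_true, p_fail, t))
```

On this toy data, raising the assumed false-negative cost moves the selected threshold lower, which trades a few extra false alarms for higher fail-class recall. In practice the threshold would be chosen on held-out validation folds rather than on the training data.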