Predicting grade repetition among school-level students using Machine Learning: a systematic review
DOI:
Keywords: Prediction methods, Grade repetition, Machine learning, School dropout, Education
Abstract
The main objective of this research is to determine the state of the art on predicting grade repetition among school-level students using Machine Learning. The results focus on studies of the most efficient Machine Learning algorithms and tools for predicting student grade repetition. To this end, a systematic literature review (SLR) was conducted on Machine Learning for predicting grade repetition among school students, covering the years 2017-2021. The search strategy identified 47,490 articles from digital libraries such as ACM Digital Library, ERIC, Google Scholar, IEEE Xplore, Microsoft Academic, Science Direct, and Taylor & Francis Online, of which 90 were selected as suitable for the review. The conclusions address the categories of variables most often applied in predicting grade repetition, the metrics used to evaluate the prediction results, the most productive authors in the field, and the most cited articles, whose discussions and conclusions are characterized by their objectivity and polarity in research on predicting grade repetition among school students using Machine Learning.
Copyright 2023 Javier Gamboa-Cruzado, Cinthya Y. Alvarez-Cuellar, Shirley Martinez-Medina, Josue Edison Turpo Chaparro, Aníbal Sifuentes Damián, María Rodríguez Kong

This work is licensed under a Creative Commons Attribution 4.0 International license.