Factors Influencing Secondary School Student’s Performance Through Variable Decision Tree Data Mining Technique
International Journal of Data Science and Analysis
Volume 6, Issue 5, October 2020, Pages: 120-129
Received: Jan. 17, 2020; Accepted: Sep. 10, 2020; Published: Sep. 25, 2020
Yousaf Ali Khan, School of Statistics, Jiangxi University of Finance and Economics, Nanchang, China; Department of Mathematics and Statistics, Hazara University Mansehra, Mansehra, Pakistan
Schools are considered the backbone of long-term economic progress, and no country can develop without raising its level of education. Although the educational level of the Portuguese population has improved markedly over the last decade, Portugal still sits at the tail end of Europe in the statistics because of its high rates of student failure. The problem is most serious in the core classes of Mathematics and Portuguese. Meanwhile, the field of data mining (DM), which aims to extract high-level knowledge from raw data, offers automated tools that can be a valuable resource for the education domain. This paper aims to improve the performance of secondary school students in Portugal using a two-variable decision tree, a data mining approach well suited to classification, prediction and exploring factors according to their importance. The results show that high prediction accuracy can be achieved when the first and/or second period school grades are available. Although student success is strongly influenced by the father's job, the evaluation clearly shows that other factors (such as study time, mother's occupation, the desire to pursue higher education, paid classes and travel time between home and school) also have a great impact on student performance in secondary education in Portugal. As a direct result of this study, policies that focus on these factors and are grounded in such studies could raise the quality of education at the secondary level across the country, bringing it closer to the level of education elsewhere in Europe.
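To illustrate how a decision tree can rank candidate factors by importance, the following is a minimal sketch that scores categorical features by information gain, one common splitting criterion. It is not the procedure used in the paper: the field names (`prev_grade`, `paid_classes`, `result`) and the four toy records are invented purely for illustration and do not come from the study's data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, feature, target):
    """Entropy reduction obtained by splitting the rows on one categorical feature."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    # Weighted average entropy of the groups produced by the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def rank_features(rows, features, target):
    """Order features from most to least informative about the target."""
    return sorted(
        features,
        key=lambda f: information_gain(rows, f, target),
        reverse=True,
    )


# Hypothetical toy records (assumption: not taken from the study's data set).
STUDENTS = [
    {"prev_grade": "high", "paid_classes": "yes", "result": "pass"},
    {"prev_grade": "high", "paid_classes": "no", "result": "pass"},
    {"prev_grade": "low", "paid_classes": "yes", "result": "fail"},
    {"prev_grade": "low", "paid_classes": "no", "result": "fail"},
]

if __name__ == "__main__":
    for name in rank_features(STUDENTS, ["prev_grade", "paid_classes"], "result"):
        print(name, information_gain(STUDENTS, name, "result"))
```

In this toy data the earlier grade separates the outcomes perfectly while paid classes carry no information, so a tree would split on the earlier grade first. Real data would be far noisier, and a different criterion (for example the Gini index) could produce a different ordering.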
Data Mining in Education, Secondary School, Decision Tree, Performance, Classification, Europe’s
To cite this article
Yousaf Ali Khan, Factors Influencing Secondary School Student’s Performance Through Variable Decision Tree Data Mining Technique, International Journal of Data Science and Analysis. Vol. 6, No. 5, 2020, pp. 120-129. doi: 10.11648/j.ijdsa.20200605.11
Copyright © 2020 Authors retain the copyright of this article.
This article is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.