Using Data Mining Algorithms for Thalassemia Risk Prediction
International Journal of Biomedical Science and Engineering
Volume 7, Issue 2, June 2019, Pages: 33-44
Received: Aug. 7, 2019; Accepted: Aug. 23, 2019; Published: Sep. 6, 2019
Ngozi Chidozie Egejuru, Department of Computer Science, Hallmark University, Ijebu-Itele, Nigeria
Sekoni Olayinka Olusanya, Department of Computer Science, Tai Solarin University of Education, Ijebu Ode, Nigeria
Adanze Onyenonachi Asinobi, Department of Pediatrics, College of Medicine, University of Ibadan, Ibadan, Nigeria
Omotayo Joseph Adeyemi, Department of Computer Science, Tai Solarin University of Education, Ijebu Ode, Nigeria
Victor Oluwatimilehin Adebayo, Department of Computer Science and Engineering, Obafemi Awolowo University, Ile-Ife, Nigeria
Peter Adebayo Idowu, Department of Computer Science and Engineering, Obafemi Awolowo University, Ile-Ife, Nigeria
This study predicts the risk of thalassemia in all age groups based on identified risk factors for thalassemia. Knowledge about the risk factors was elicited through structured interviews with experienced medical personnel, and questionnaires were used to collect empirical medical data on the identified parameters. Supervised machine learning algorithms were used to formulate the predictive model for the risk of thalassemia using the parameters identified and the data collected. The predictive model was simulated using the Waikato Environment for Knowledge Analysis (WEKA). The simulated model was validated using historical data collected from the hospitals, describing the parameters and the risk of thalassemia. Data were collected from 51 patients. The demographic variables identified were gender, age, marital status, ethnicity and social class, while the clinical variables were family history, spleen enlargement, diabetes, urine colour changes and parent carriers. The distribution of risk was 43% no cases, 10% low cases, 16% moderate cases and 31% high cases. The study concluded that using the multi-layer perceptron for the prediction of thalassemia will improve the decision-making process within the healthcare service concerning thalassemia.
Thalassemia, Anaemia, Predictive Model, Naïve Bayes, Classifier, Multilayer Perceptron
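The abstract reports the risk distribution only as rounded percentages of the 51 patients. As a rough consistency check, the short sketch below converts those percentages to whole-number patient counts. The counts are an inference, not figures stated in the abstract, and the sketch assumes each percentage was rounded to the nearest whole number. The names `implied_counts`, `counts` and `recomputed` are illustrative only.

```python
# Figures taken from the abstract: 51 patients and one rounded
# percentage per risk class.
TOTAL_PATIENTS = 51
REPORTED_PERCENT = {"no": 43, "low": 10, "moderate": 16, "high": 31}


def implied_counts(total, percents):
    """Return the whole-number count per class closest to each percentage.

    Assumption: each reported percentage is a count divided by `total`,
    rounded to the nearest whole percent.
    """
    return {label: round(total * pct / 100) for label, pct in percents.items()}


# Inferred (not reported) number of patients in each risk class.
counts = implied_counts(TOTAL_PATIENTS, REPORTED_PERCENT)

# Convert the inferred counts back to whole percentages for comparison
# with the reported values.
recomputed = {
    label: round(100 * n / TOTAL_PATIENTS) for label, n in counts.items()
}

if __name__ == "__main__":
    for label in REPORTED_PERCENT:
        print(f"{label:>8}: {counts[label]:2d} patients, {recomputed[label]}%")
    print("total patients:", sum(counts.values()))
```

Under that rounding assumption the inferred counts sum to the stated sample size of 51, and the smallest class holds only a handful of patients, which is worth keeping in mind when reading per-class results from any classifier trained on these data.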
To cite this article
Ngozi Chidozie Egejuru, Sekoni Olayinka Olusanya, Adanze Onyenonachi Asinobi, Omotayo Joseph Adeyemi, Victor Oluwatimilehin Adebayo, Peter Adebayo Idowu, Using Data Mining Algorithms for Thalassemia Risk Prediction, International Journal of Biomedical Science and Engineering. Vol. 7, No. 2, 2019, pp. 33-44. doi: 10.11648/j.ijbse.20190702.12
Copyright © 2019 Authors retain the copyright of this article.
This article is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.