Digital Language Mining Platform for Nigerian Languages (DLMP)
International Journal on Data Science and Technology
Volume 5, Issue 1, March 2019, Pages: 1-7
Received: Mar. 1, 2019; Accepted: Apr. 8, 2019; Published: May 15, 2019
Emejulu Augustine Obiajulu, Department of Communication and Translation Studies, National Institute for Nigerian Languages, Aba, Nigeria
Okpala Izunna Udebuana, Department of Communication and Translation Studies, National Institute for Nigerian Languages, Aba, Nigeria
Nwakanma Ifeanyi Cosmas, Department of Information Management Technology, Federal University of Technology, Owerri, Nigeria
Effective communication occurs when the sender and receiver both understand and synchronize the flow of information. The utility of language extends beyond human-to-human interaction and also includes the use of syntactically formed programming languages to interact with digital systems. Nigeria has an estimated 450 or more languages, which makes it cumbersome to harmonize them and place them all in a single large repository for data mining. The goal of this paper is to establish the importance of Information Technology in galvanizing Nigerian languages and mining scientific data from them. The purpose of applying Information and Communication Technology (ICT) is to codify the process of extracting the underlying meanings in a language and of processing its idioms, proverbs and quaint statements, with a view to bringing out the creativity behind them. The authors explore the developmental stages and techniques of applying an artificial intelligence system that scans a given indigenous linguistic system to bring out the hidden facts therein. It is recommended that stakeholders in the ‘digital humanities’ adopt such mining platforms, which help in achieving greater insight into the diverse cultures and languages and, in turn, promote an easier learning experience for indigenous languages.
Artificial Intelligence, Language Mining, Nigeria, Communication
To cite this article
Emejulu Augustine Obiajulu, Okpala Izunna Udebuana, Nwakanma Ifeanyi Cosmas, Digital Language Mining Platform for Nigerian Languages (DLMP), International Journal on Data Science and Technology. Vol. 5, No. 1, 2019, pp. 1-7. doi: 10.11648/j.ijdst.20190501.11
Copyright © 2019 Authors retain the copyright of this article.
This article is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.