Isolated Swahili Words Recognition using Sphinx4
Shadrack K. Kimutai¹, Edna Milgo², David Gichoya³
¹Shadrack Kipchirchir Kimutai, IT Department, School of Information Science, Moi University, Eldoret, Kenya.
²Edna Milgo, IT Department, School of Information Science, Moi University, Eldoret, Kenya.
³David Gichoya, IT Department, School of Information Science, Moi University, Eldoret, Kenya.
Manuscript received on December 11, 2013. | Revised Manuscript received on December 15, 2013. | Manuscript published on December 25, 2013. | PP:63-66 | Volume-2 Issue-2, December 2013. | Retrieval Number: B0589122213/2013©BEIESP
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Speech recognition is one of the frontiers of Human-Computer Interaction. A number of tools for speech recognition are currently available. One such tool is Sphinx4 from Carnegie Mellon University (CMU). It has a recognition engine based on the discrete Hidden Markov Model (dHMM) and a modular structure that makes it adaptable to a diverse set of requirements. However, most work undertaken with this tool has focused on established languages such as English and French. Although Swahili is a major spoken language in Africa, a literature search indicates that little research has been undertaken on developing a speech recognition tool for it. In this paper, we propose an approach to building a Swahili speech recognizer using Sphinx4, to demonstrate its adaptability to the recognition of spoken Swahili words. To realize this, we examined the Swahili language structure and sound synthesis processes. A 40-word Swahili acoustic model was then built, based on the observed language and sound structures, using CMU SphinxTrain and associated tools. The resulting acoustic model was then tested using Sphinx4.
Keywords: Sphinx4, Swahili Language, Speech Recognition, Hidden Markov Model.