dc.description.abstract |
The pronunciation of words and phrases in any language involves careful manipulation of linguistic features. Factors such as age, motivation, accent, phonetics, stress and intonation sometimes cause a problem of inappropriate or incorrect pronunciation of words from non-native languages. Pronunciation of words using different phonological rules has a tendency of changing the meaning of those words. This study presents the development of an automatic pronunciation assistant system for under-resourced languages of Limpopo Province, namely, Sepedi, Xitsonga, Tshivenda and isiNdebele.
The aim of the proposed system is to help non-native speakers to learn appropriate and correct pronunciation of words/phrases in these under-resourced languages. The system is composed of a language identification module on the front-end side and a speech synthesis module on the back-end side. A support vector machine was compared to the baseline multinomial naive Bayes to build the language identification module. The language identification phase performs supervised multiclass text classification to predict a person’s first language based on input text before the speech synthesis phase continues with pronunciation issues using the identified language. The speech synthesis on the back-end phase is composed of four baseline text-to-speech synthesis systems in selected target languages. These text-to-speech synthesis systems were based on the hidden Markov model method of development. Subjective listening tests were conducted to evaluate the performance of the quality of the synthesised speech using a mean opinion score test. The mean opinion score test obtained good performance results on all targeted languages for naturalness, pronunciation, pleasantness, understandability, intelligibility, overall quality of the system and user acceptance. The developed system has been implemented on a “real-live” production web-server for performance evaluation and stability testing using live data. |
en_US |