dc.contributor.advisor |
Manamela, M. J. D. |
|
dc.contributor.advisor |
Modipa, T. I. |
|
dc.contributor.author |
Malatji, Promise Tshepiso |
|
dc.date.accessioned |
2019-11-27T05:35:41Z |
|
dc.date.available |
2019-11-27T05:35:41Z |
|
dc.date.issued |
2019 |
|
dc.identifier.uri |
http://hdl.handle.net/10386/2917 |
|
dc.description |
Thesis (M. Sc. (Computer Science))--University of Limpopo, 2019 |
en_US |
dc.description.abstract |
A text-to-speech (TTS) synthesis system is a software system that receives text as input and produces speech as output. A TTS synthesis system can be used by native and non-native speakers of the target language for, among other things, language learning and reading text aloud for people living with disabilities, e.g., physical or visual impairments. Most people relate easily to a second language when it is spoken by a non-native speaker with whom they share a native language. Most online English TTS synthesis systems are developed using native speakers of English. This research study focuses on developing accented English synthetic voices as spoken by non-native speakers in the Limpopo province of South Africa. The Modular Architecture for Research on speech sYnthesis (MARY) TTS engine is used to develop the synthetic voices, and the Hidden Markov Model (HMM) method is used to train them. The training speech corpus is developed by recording six speakers reading a secondary training text corpus. The quality of the developed synthetic voices is measured in terms of their intelligibility, similarity and naturalness using a listening test. The results are classified by evaluators’ occupation and gender, and are also reported overall. The subjective listening test indicates that the developed synthetic voices have a high level of acceptance in terms of similarity and intelligibility. Speech analysis software is used to compare the recorded synthesised speech with the human recordings. There is no significant difference in voice pitch between the speakers and the synthetic voices, except for one synthetic voice. |
en_US |
dc.format.extent |
xv, 107 leaves |
en_US |
dc.language.iso |
en |
en_US |
dc.publisher |
University of Limpopo |
en_US |
dc.relation.requires |
PDF |
en_US |
dc.subject |
Text-to-speech synthesis system |
en_US |
dc.subject |
Language learning |
en_US |
dc.subject.lcsh |
Text data mining |
en_US |
dc.subject.lcsh |
Data compression (computer science) |
en_US |
dc.subject.lcsh |
Informal language learning |
en_US |
dc.title |
The development of accented English synthetic voices |
en_US |
dc.type |
Thesis |
en_US |