dc.contributor.advisor Manamela, M.J.D
dc.contributor.author Manaileng, Mabu Johannes
dc.contributor.other Velempini, M.
dc.date.accessioned 2017-01-25T07:30:49Z
dc.date.available 2017-01-25T07:30:49Z
dc.date.issued 2015
dc.description Thesis (M.Sc. (Computer Science)) -- University of Limpopo, 2015 en_US
dc.description.abstract This study investigates the potential of using graphemes, instead of phonemes, as acoustic sub-word units for monolingual and cross-lingual speech recognition for some of the under-resourced languages of the Limpopo Province, namely IsiNdebele, Sepedi and Tshivenda. The performance of a grapheme-based recognition system is compared to that of a phoneme-based recognition system. For each selected under-resourced language, an automatic speech recognition (ASR) system based on hidden Markov models (HMMs) was developed using both graphemes and phonemes as acoustic sub-word units. The ASR framework used models emission distributions by 16 Gaussian Mixture Models (GMMs) with 2 mixture increments. A third-order n-gram language model was used in all experiments. Identical speech datasets were used for each experiment per language. The LWAZI speech corpora and the National Centre for Human Language Technologies (NCHLT) speech corpora were used for training and testing the tied-state context-dependent acoustic models. The performance of all systems was evaluated at the word level using word error rate (WER). The results of our study show that grapheme-based continuous speech recognition, which copes with the problem of low-quality or unavailable pronunciation dictionaries, is comparable to phoneme-based recognition for the selected under-resourced languages in both the monolingual and cross-lingual speech recognition tasks. The study demonstrates that context-dependent grapheme-based sub-word units can be reliable for small and medium-large vocabulary speech recognition tasks for these languages. en_US
dc.description.sponsorship Telkom SA en_US
dc.format.extent xv, 105 leaves en_US
dc.language.iso en en_US
dc.publisher University of Limpopo en_US
dc.relation.requires PDF en_US
dc.subject Grapheme-based en_US
dc.subject Speech recognition en_US
dc.subject Under-resourced languages en_US
dc.subject.lcsh Automatic speech recognition. en_US
dc.subject.lcsh Speech perception -- South Africa -- Limpopo en_US
dc.title Grapheme-based continuous speech recognition for some of the under-resourced languages of Limpopo Province en_US
dc.type Thesis en_US