The automatic recognition of emotions in speech

dc.contributor.advisor: Manamela, M. J. D.
dc.contributor.author: Manamela, Phuti, John
dc.contributor.other: Modipa, T. I.
dc.date.accessioned: 2021-06-18T07:39:12Z
dc.date.available: 2021-06-18T07:39:12Z
dc.date.issued: 2020
dc.description: Thesis (M.Sc. (Computer Science)) -- University of Limpopo, 2020 [en_US]
dc.description.abstract: Speech emotion recognition (SER) refers to a technology that enables machines to detect and recognise human emotions from spoken phrases. In the literature, numerous attempts have been made to develop systems that can recognise human emotions from the voice; however, not much work has been done in the context of South African indigenous languages. The aim of this study was to develop an SER system that can classify and recognise six basic human emotions (i.e., sadness, fear, anger, disgust, happiness, and neutral) from speech spoken in the Sepedi language (one of South Africa’s official languages). One of the major challenges encountered in this study was the lack of a proper corpus of emotional speech. Therefore, three different Sepedi emotional speech corpora consisting of acted speech data were developed. These include a Recorded-Sepedi corpus collected from recruited native speakers (9 participants), a TV broadcast corpus collected from professional Sepedi actors, and an Extended-Sepedi corpus, which is a combination of the Recorded-Sepedi and TV broadcast emotional speech corpora. Features were extracted from the speech corpora and a data file was constructed. This file was used to train four machine learning (ML) algorithms (i.e., SVM, KNN, MLP and Auto-WEKA) using 10-fold cross-validation. Three experiments were then performed on the developed speech corpora and the performance of the algorithms was compared. The best results were achieved when Auto-WEKA was applied in all the experiments. Good results might have been expected for the TV broadcast speech corpus, since it was collected from professional actors; however, the results showed otherwise. From the findings of this study, one can conclude that there are no precise or exact techniques for the development of SER systems; it is a matter of experimenting and finding the best technique for the study at hand.
The study has also highlighted the scarcity of SER resources for South African indigenous languages. The quality of the dataset plays a vital role in the performance of SER systems. [en_US]
dc.description.sponsorship: National Research Foundation (NRF) and Telkom Center of Excellence (CoE) [en_US]
dc.format.extent: xii, 76 leaves [en_US]
dc.identifier.uri: http://hdl.handle.net/10386/3347
dc.language.iso: en [en_US]
dc.relation.requires: PDF [en_US]
dc.subject: Speech emotion recognition [en_US]
dc.subject: Machine learning [en_US]
dc.subject: Feature extraction [en_US]
dc.subject: Classification [en_US]
dc.subject: Emotional speech database [en_US]
dc.subject.lcsh: Automatic speech recognition [en_US]
dc.subject.lcsh: Machine learning [en_US]
dc.title: The automatic recognition of emotions in speech [en_US]
dc.type: Thesis [en_US]

Files

Original bundle

Name: manamela_pj_2020.pdf
Size: 2.08 MB
Format: Adobe Portable Document Format
Description: Thesis

License bundle

Name: license.txt
Size: 1.61 KB
Format: Item-specific license agreed upon to submission