Multi-Modal Emotion Recognition Feature Extraction and Data Fusion Methods Evaluation
Sanjeeva Rao Sanku¹, B. Sandhya²
¹Sanjeeva Rao Sanku, Department of Computer Science and Engineering, University College of Engineering, Osmania University, Hyderabad (Telangana), India.
²Prof. B. Sandhya, Department of Computer Science and Engineering, MVSR Engineering College, Hyderabad (Telangana), India.
Manuscript received on 10 August 2024 | Revised Manuscript received on 20 August 2024 | Manuscript Accepted on 15 September 2024 | Manuscript published on 30 September 2024 | PP: 18-27 | Volume-13 Issue-10, September 2024 | Retrieval Number: 100.1/ijitee.J996813100924 | DOI: 10.35940/ijitee.J9968.13100924
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Emotion detection is an important research area because many fields can benefit from it, including healthcare, intelligent customer service, and education. Compared with unimodal approaches, multimodal emotion recognition (MER) integrates several modalities, such as text, facial expressions, and voice, to provide better accuracy and robustness. This article gives an overview of MER from its earlier work to the present day, focusing on its relevance, difficulties, and approaches. We examine several datasets, including IEMOCAP and MELD, and compare their features and shortcomings. The literature review covers recent developments in deep learning approaches, particularly the early, late, and hybrid fusion strategies. Identified shortcomings include data redundancy, complicated feature extraction, and real-time detection. Our suggested technique improves emotion recognition accuracy by combining deep-learning feature extraction with a hybrid fusion approach. This study is intended to guide future investigations toward overcoming existing limitations and advancing the field of MER. The main body of this work examines various data fusion strategies, reviews new methodologies in multimodal emotion recognition, and identifies open problems and research needs.
Keywords: Multimodal Emotion Recognition (MER), Speech Analysis, Facial Expression Recognition, MELD, Hybrid Fusion.
Scope of the Article: Computer Science and Applications