EMOTION RECOGNITION FROM SPEECH USING DEEP LEARNING TECHNIQUES

Ms. Ritu Vijay Bhalerao; Ms. Kaveri Santosh Ahire

📄 Abstract

Emotions are a natural component of human language and contribute significantly to communication. They are expressed through tone, pitch, and rhythm, which let speakers convey feelings beyond the literal words. Speech Emotion Recognition (SER) is the task of automatically recognizing emotions such as happiness, sadness, anger, fear, or neutrality from voice signals. Earlier approaches paired hand-crafted features with classical classifiers, but their accuracy was limited. With the development of deep learning, models such as Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks have performed better by learning features end-to-end from audio. These models can capture both the spectral patterns of sound and the temporal structure of speech. SER has numerous real-world applications in areas such as virtual assistants, medicine, education, and customer care. Nonetheless, challenges remain, including background noise, speaker variability, and overlapping emotions. This research applies deep learning models to features such as MFCCs, Chroma, and Spectral Contrast to improve the accuracy and reliability of emotion detection from speech.
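As an illustration of one of the features named above, the following is a minimal sketch of a chroma vector computed for a single audio frame. It is not the authors' implementation: the function name `chroma_vector`, the Hann window, the naive DFT, the 27.5 Hz lower cutoff, and the synthetic 440 Hz test tone are all assumptions chosen to keep the example self-contained. A real pipeline would normally rely on an audio library and process many overlapping frames.

```python
import cmath
import math

# Pitch-class names, indexed so that C = 0 and A = 9 (MIDI note number mod 12).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chroma_vector(frame, sample_rate, min_freq=27.5):
    """Return a 12-element chroma vector for one frame (hypothetical helper).

    Spectral energy from a naive DFT is summed into the pitch class of the
    nearest equal-tempered note, then scaled so the largest entry is 1.0.
    """
    n = len(frame)
    # A Hann window reduces spectral leakage at the frame edges.
    windowed = [
        x * (0.5 - 0.5 * math.cos(2 * math.pi * t / (n - 1)))
        for t, x in enumerate(frame)
    ]
    chroma = [0.0] * 12
    for k in range(1, n // 2):
        freq = k * sample_rate / n
        if freq < min_freq:
            continue  # skip bins below the assumed lowest pitch of interest
        # Naive DFT coefficient for bin k (O(n) per bin, fine for a sketch).
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(windowed)
        )
        # Map the bin frequency to the nearest MIDI note, then to a pitch class.
        midi = 69 + 12 * math.log2(freq / 440.0)
        chroma[round(midi) % 12] += abs(coeff) ** 2
    peak = max(chroma)
    return [c / peak for c in chroma] if peak > 0 else chroma


# Synthetic input (an assumption for illustration): a pure 440 Hz tone.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440.0 * t / sample_rate) for t in range(1000)]
chroma = chroma_vector(frame, sample_rate)
dominant = PITCH_CLASSES[chroma.index(max(chroma))]
print(dominant)
```

For the synthetic tone, the energy concentrates in the pitch class of A, which is the kind of harmonic summary a chroma feature contributes alongside MFCCs and Spectral Contrast.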

🏷️ Keywords

Speech Emotion Recognition (SER); Deep Learning; Convolutional Neural Network (CNN); Long Short-Term Memory (LSTM); Feature Extraction; Mel-Frequency Cepstral Coefficients (MFCCs); Chroma Features; Spectral Contrast; Human–Computer Interaction (HCI); Emotion Classification

📚 How to Cite:

Ms. Ritu Vijay Bhalerao, Ms. Kaveri Santosh Ahire, EMOTION RECOGNITION FROM SPEECH USING DEEP LEARNING TECHNIQUES, Volume 11, Issue 10, October 2025, EPRA International Journal of Multidisciplinary Research (IJMR).