Audio Emotion Recognition System

This project implements a real-time audio emotion recognition system using machine learning. It can analyze speech and predict the emotional state of the speaker.

Features

Real-time audio emotion recognition
Support for uploaded audio files
Visualization of audio features
Support for multiple emotions: neutral, calm, happy, sad, angry, fearful, disgust, surprised

Dataset

This project uses the RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song) dataset. You can download it from https://zenodo.org/record/1188976. This project uses the Audio_Speech_Actors_01-24.zip archive in particular.

File naming convention

Each of the 7356 RAVDESS files has a unique filename. The filename consists of a 7-part numerical identifier (e.g., 02-01-06-01-02-01-12.mp4). These identifiers define the stimulus characteristics:

Filename identifiers

Modality (01 = full-AV, 02 = video-only, 03 = audio-only).
Vocal channel (01 = speech, 02 = song).
Emotion (01 = neutral, 02 = calm, 03 = happy, 04 = sad, 05 = angry, 06 = fearful, 07 = disgust, 08 = surprised).
Emotional intensity (01 = normal, 02 = strong). NOTE: There is no strong intensity for the 'neutral' emotion.
Statement (01 = "Kids are talking by the door", 02 = "Dogs are sitting by the door").
Repetition (01 = 1st repetition, 02 = 2nd repetition).
Actor (01 to 24. Odd numbered actors are male, even numbered actors are female).

Filename example: 02-01-06-01-02-01-12.mp4

Video-only (02)
Speech (01)
Fearful (06)
Normal intensity (01)
Statement "dogs" (02)
1st repetition (01)
12th actor (12), female, as the actor ID number is even
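
The identifier scheme above can be decoded programmatically. The following is a minimal sketch, not code taken from this repository: the function name parse_ravdess_filename and the dictionary names are illustrative assumptions, and the lookup tables simply restate the identifier list above.

```python
from pathlib import Path

# Lookup tables restating the identifier list above.
MODALITY = {"01": "full-AV", "02": "video-only", "03": "audio-only"}
VOCAL_CHANNEL = {"01": "speech", "02": "song"}
EMOTION = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}
INTENSITY = {"01": "normal", "02": "strong"}
STATEMENT = {
    "01": "Kids are talking by the door",
    "02": "Dogs are sitting by the door",
}


def parse_ravdess_filename(filename):
    """Decode a 7-part RAVDESS filename into a dict of readable fields.

    Hypothetical helper for illustration; raises ValueError if the name
    does not have exactly seven dash-separated parts.
    """
    parts = Path(filename).stem.split("-")
    if len(parts) != 7:
        raise ValueError(f"expected 7 identifier parts, got {len(parts)}: {filename}")
    modality, channel, emotion, intensity, statement, repetition, actor = parts
    actor_id = int(actor)
    return {
        "modality": MODALITY[modality],
        "vocal_channel": VOCAL_CHANNEL[channel],
        "emotion": EMOTION[emotion],
        "intensity": INTENSITY[intensity],
        "statement": STATEMENT[statement],
        "repetition": int(repetition),
        "actor": actor_id,
        # Odd-numbered actors are male, even-numbered actors are female.
        "actor_sex": "male" if actor_id % 2 == 1 else "female",
    }
```

Only the filename stem is inspected, so the same helper works for any file extension. The "emotion" field is the one a classifier would typically use as its training label.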

Setup Instructions

Create a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Download the RAVDESS dataset and place it in the data folder as a directory named RAVDESS
Run the application:

streamlit run app.py
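
The dataset placement step above could also be scripted. This is a minimal sketch using only the Python standard library, assuming Audio_Speech_Actors_01-24.zip has already been downloaded manually and that it contains .wav files (possibly inside subfolders). The helper name extract_ravdess and its default target path are illustrative and not part of this project's code.

```python
import zipfile
from pathlib import Path


def extract_ravdess(zip_path, target_dir="data/RAVDESS"):
    """Unpack a downloaded RAVDESS archive and list the extracted .wav files.

    Hypothetical helper for illustration. The target directory is created
    if it does not exist yet.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    # Search recursively so files in nested folders are found as well.
    return sorted(target.rglob("*.wav"))
```

Calling extract_ravdess("Audio_Speech_Actors_01-24.zip") from the project root would place the files under data/RAVDESS and return their paths.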

Project Structure

├── app.py                                     # Main Streamlit application
├── audio_emotion_recognition.ipynb            # Model training and prediction code
├── audio_processor.py                         # Audio processing utilities
├── data/RAVDESS                               # Dataset directory
├── models/                                    # Saved model files
└── requirements.txt                           # Project dependencies

Usage

Launch the application using streamlit run app.py
Choose between real-time recording or file upload
For real-time analysis, click "Start Recording" and speak
For file upload, select an audio file
View the emotion prediction and audio visualization

Requirements

Python 3.8+
See requirements.txt for all dependencies
