In this project, we tackle music classification: identifying the genre or style of a musical piece. We approach the task in two complementary ways, working both with the raw audio signal itself and with spectrogram representations of the tracks, which lets us leverage the strength of Convolutional Neural Networks (CNNs) as image classifiers.

We present a comprehensive music classification system built on several CNN architectures implemented in PyTorch, including ResNet34 and ResNet18 (each with pretrained and with random weight initialization) and our own custom ResNet34 with added regularization techniques for improved performance. We also propose a Simple CNN as a baseline for comparison. Both raw audio signals and spectrogram representations serve as inputs to the CNNs, with the goal of accurately classifying music into eight predefined genres.

The focus of our work is threefold: (1) a custom preprocessing class that prepares the data before it is fed to the model, (2) a comparative analysis of state-of-the-art models on the FMA dataset, and (3) the SoundFusion architecture that we propose, which can learn from different audio representations. The report summarizes our results and showcases the effectiveness and limitations of CNNs in music classification tasks.
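As a concrete illustration of the spectrogram pipeline, here is a minimal sketch (not the repo's actual code): it converts a raw waveform into a log-mel spectrogram and classifies it with a pretrained ResNet18 whose final layer is replaced by an eight-genre head. The class name `SpectrogramPreprocessor` and all parameter values (sample rate, `n_mels`, clip length) are illustrative assumptions, not the settings used in the report.

```python
# Minimal sketch, assuming torchaudio/torchvision are installed; parameter
# values are illustrative, not the settings from the report.
import torch
import torch.nn as nn
import torchaudio
import torchvision

class SpectrogramPreprocessor(nn.Module):
    """Hypothetical stand-in for the repo's custom preprocessing class."""
    def __init__(self, sample_rate=22050, n_fft=2048, hop_length=512, n_mels=128):
        super().__init__()
        self.mel = torchaudio.transforms.MelSpectrogram(
            sample_rate=sample_rate, n_fft=n_fft,
            hop_length=hop_length, n_mels=n_mels)
        self.to_db = torchaudio.transforms.AmplitudeToDB()

    def forward(self, waveform):
        # (batch, samples) -> (batch, 1, n_mels, time)
        spec = self.to_db(self.mel(waveform)).unsqueeze(1)
        # ResNet expects 3 input channels, so tile the single channel.
        return spec.repeat(1, 3, 1, 1)

# Pretrained ResNet18 with its classifier replaced by an 8-genre head.
model = torchvision.models.resnet18(
    weights=torchvision.models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 8)

preprocess = SpectrogramPreprocessor()
waveform = torch.randn(4, 22050 * 5)  # dummy batch of 5-second clips
logits = model(preprocess(waveform))  # shape: (4, 8)
```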
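The report describes SoundFusion in detail; as a rough illustration of the underlying idea of learning from different audio representations, the sketch below shows a hypothetical two-branch network that encodes the raw waveform with 1D convolutions and the spectrogram with 2D convolutions, then fuses the two feature vectors for the final genre prediction. The layer sizes and the fusion-by-concatenation strategy are assumptions made for this sketch, not the actual SoundFusion design.

```python
# Hypothetical illustration of representation fusion; NOT the actual
# SoundFusion architecture (see the report for that).
import torch
import torch.nn as nn

class FusionSketch(nn.Module):
    def __init__(self, num_genres=8):
        super().__init__()
        # Branch 1: 1D convolutions over the raw waveform.
        self.wave_branch = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=64, stride=8), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=32, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten())  # -> (batch, 32)
        # Branch 2: 2D convolutions over the spectrogram "image".
        self.spec_branch = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())  # -> (batch, 32)
        # Fuse by concatenation, then classify into the eight genres.
        self.head = nn.Linear(32 + 32, num_genres)

    def forward(self, waveform, spectrogram):
        feats = torch.cat([self.wave_branch(waveform),
                           self.spec_branch(spectrogram)], dim=1)
        return self.head(feats)

model = FusionSketch()
wave = torch.randn(4, 1, 22050 * 5)  # dummy 5-second raw clips
spec = torch.randn(4, 1, 128, 216)   # dummy log-mel spectrograms
logits = model(wave, spec)           # shape: (4, 8)
```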
Report: https://github.com/Melikakmm/CNN-for-sound-classification/blob/main/summary%20of%20our%20work/Soundfusion.pdf
Slides: https://github.com/Melikakmm/CNN-for-sound-classification/blob/main/summary%20of%20our%20work/DLNN%20SoundFusion.pdf
Detailed results and figures are available in the report above. Additional results can also be found in this spreadsheet: https://docs.google.com/spreadsheets/d/1sCCcPoR8EBBya6jRyCnfTytkSTWD2QRgWzJlIAxwu5s/edit?usp=sharing
