Contribution

Contribution cancelled - Perceptually Motivated Audio Preprocessing for Soundscape Mood Identification

Authors

* Presenting author

Day / Time: 18.08.2021, 12:00-12:20

Room: Lehar 1

Type: Regular talk

Session: Soundscape

Article ID:

DOI: https://doi.org/

Abstract:

Soundscape mood classification is the task of assigning semantic emotion labels to sound excerpts using audio signal processing and classification methods. In this work, we aim to design three classification models, each of which uses a specific type of preprocessing: 1) frame-level handcrafted audio feature extraction; 2) time-frequency transformations, e.g., spectrogram, mel-spectrogram, or modulation power/phase spectrum; 3) raw audio waveforms as the training data. These approaches are implemented on the Emo-Soundscapes dataset, which is annotated with respect to emotional content. We then compare the performance and efficiency of these models to gain insight into the amount of training data, the role of the prior knowledge incorporated into the training process, the interaction between the preprocessing and classification steps, and the final results achieved.
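
The three input representations named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, frame length, hop size, sample rate, and the choice of RMS energy and zero-crossing rate as handcrafted features are all assumptions made for illustration, and a synthetic tone stands in for an Emo-Soundscapes excerpt.

```python
import numpy as np


def frame_signal(x, frame_len=1024, hop=512):
    """Slice a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def handcrafted_features(frames):
    """Two simple frame-level descriptors (illustrative choice):
    RMS energy and zero-crossing rate."""
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    return np.stack([rms, zcr], axis=1)


def log_spectrogram(frames):
    """Log-magnitude spectrogram computed from Hann-windowed frames."""
    window = np.hanning(frames.shape[1])
    magnitude = np.abs(np.fft.rfft(frames * window, axis=1))
    return np.log1p(magnitude)


# Synthetic 1-second, 440 Hz tone (assumed stand-in for a real excerpt).
sr = 16000  # assumed sample rate in Hz
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440.0 * t)

frames = frame_signal(x)
features = handcrafted_features(frames)  # 1) shape: (n_frames, 2)
spec = log_spectrogram(frames)           # 2) shape: (n_frames, frame_len // 2 + 1)
raw = x                                  # 3) waveform passed to the model unchanged
```

Each representation would feed a separate classifier. The amount of prior knowledge built into the input decreases from the handcrafted features to the raw waveform, which is the trade-off the comparison in the abstract is concerned with.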