Environment Sound Classification Using a Two-Stream CNN Based on Decision-Level Fusion.

April 30, 2019 Deep Learning, Environmental Science

Researchers

Jingyu Wang, Ke Zhang, Kurosh Madani, Yu Su

Journal

Sensors (Basel, Switzerland)

Modalities

Models

CNN

Abstract

With the popularity of deep learning-based models in various categorization problems and their proven robustness compared to conventional methods, a growing number of researchers have applied such methods to environment sound classification (ESC) tasks in recent years. However, the performance of existing models that train deep neural networks for ESC on auditory features such as the log-mel spectrogram (LM) and mel frequency cepstral coefficients (MFCC), or on the raw waveform, is unsatisfactory. In this paper, we first propose two combined features to give a more comprehensive representation of environment sounds. Then, a four-layer convolutional neural network (CNN) is presented to improve the performance of ESC with the proposed aggregated features. Finally, the CNNs trained with different features are fused using the Dempster-Shafer evidence theory to compose the TSCNN-DS model. The experimental results indicate that our combined features with the four-layer CNN are appropriate for environment sound taxonomic problems and dramatically outperform other conventional methods. The proposed TSCNN-DS model achieves a classification accuracy of 97.2%, which is the highest taxonomic accuracy on the UrbanSound8K dataset compared to existing models.
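The decision-level fusion step mentioned in the abstract can be illustrated with Dempster's rule of combination. The following is a minimal sketch, not the authors' implementation: it assumes each CNN stream's class-probability output is treated as a mass function that assigns belief only to single classes, in which case the rule reduces to a normalized element-wise product. The function name `dempster_combine` and the example score vectors are hypothetical and chosen for illustration only.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Assumption: both mass functions put belief only on singleton
    classes (e.g. softmax outputs of two CNN streams), so the only
    non-conflicting pairs are those where both streams pick the
    same class.
    """
    if len(m1) != len(m2):
        raise ValueError("mass functions must cover the same classes")

    # Product of masses where the two sources agree on the class.
    agreement = [a * b for a, b in zip(m1, m2)]

    # Sum of agreement equals 1 - K, where K is the conflict mass.
    total = sum(agreement)
    if total == 0.0:
        raise ValueError("sources are in total conflict")

    # Normalize by 1 - K so the fused masses sum to one.
    return [x / total for x in agreement]


# Hypothetical per-class scores from two streams for a 3-class problem.
stream_a = [0.6, 0.3, 0.1]
stream_b = [0.5, 0.1, 0.4]

fused = dempster_combine(stream_a, stream_b)
predicted = max(range(len(fused)), key=fused.__getitem__)
print(fused, predicted)
```

Under this singleton-only assumption, classes that both streams support are reinforced while classes supported by only one stream are suppressed. The abstract does not state how the paper builds its mass functions, so this may differ from the published method.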
