3 directional Inception-ResUNet: Deep spatial feature learning for multichannel singing voice separation with distortion.

Abstract

Singing voice separation on robots faces the problem of interpreting ambiguous auditory signals. The acoustic signal that a humanoid robot perceives through its onboard microphones is a mixture of singing voice, music, and noise, affected by distortion, attenuation, and reverberation. In this paper, we use the 3D Inception-ResUNet structure within a U-shaped encoding and decoding network to make better use of the spatial and spectral information in the spectrogram. The model is trained with multiple objectives: a magnitude consistency loss, a phase consistency loss, and a magnitude correlation consistency loss. We recorded the singing voice and accompaniment derived from the MIR-1K dataset with NAO robots and synthesized a 10-channel dataset for training the model. The experimental results show that the proposed model, trained with multiple objectives, reaches an average NSDR of 11.55 dB on the test dataset, outperforming the comparison model.

Copyright: © 2024 Wang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
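For readers unfamiliar with the NSDR figure quoted in the abstract, the following is a minimal sketch of how a normalized signal-to-distortion ratio is commonly computed: the SDR of the separated estimate minus the SDR of the unprocessed mixture, both measured against the clean reference. It assumes a simplified energy-ratio SDR rather than the full BSS Eval decomposition that separation papers typically use, and the function names `sdr` and `nsdr` and the toy signals are hypothetical illustrations, not taken from the paper.

```python
import numpy as np


def sdr(reference, estimate, eps=1e-12):
    """Simplified SDR in dB: reference energy over residual energy.

    Assumption: a plain energy ratio, not the BSS Eval definition.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    residual = reference - estimate
    return 10.0 * np.log10(
        (np.sum(reference ** 2) + eps) / (np.sum(residual ** 2) + eps)
    )


def nsdr(reference, estimate, mixture):
    """Normalized SDR: improvement of the estimate over the raw mixture."""
    return sdr(reference, estimate) - sdr(reference, mixture)


# Toy single-channel signals (hypothetical, for illustration only).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 16000, endpoint=False)
voice = np.sin(2.0 * np.pi * 220.0 * t)           # stand-in for the clean singing voice
accompaniment = 0.5 * rng.standard_normal(t.size)  # stand-in for music and noise
mixture = voice + accompaniment

# A pretend separator that leaves 10% of the accompaniment amplitude behind.
estimate = voice + 0.1 * accompaniment

gain = nsdr(voice, estimate, mixture)
print(f"NSDR: {gain:.2f} dB")
```

In this toy case the residual amplitude shrinks by a factor of 10, so the NSDR is about 20 dB. An estimate identical to the mixture would score 0 dB, which is why NSDR reads as the improvement a separator brings over doing nothing.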
