Neural Cascade Architecture with Triple-domain Loss for Speech Enhancement.

Researchers

Journal

IEEE/ACM transactions on audio, speech, and language processing

Modalities

Models

Abstract

This paper proposes a neural cascade architecture to address the monaural speech enhancement problem. The cascade architecture is composed of three modules which optimize in turn enhanced speech with respect to the magnitude spectrogram, the time-domain signal and the complex spectrogram. Each module takes as input the noisy speech and the output obtained from the previous module, and generates a prediction of the respective target. Our model is trained in an end-to-end manner, using a triple-domain loss function that accounts for three domains of signal representation. Experimental results on the WSJ0 SI-84 corpus show that the proposed model outperforms other strong speech enhancement baselines in terms of objective speech quality and intelligibility.

Show Full Text

Neural Cascade Architecture with Triple-domain Loss for Speech Enhancement.

Researchers

Journal

Modalities

Models

Abstract

Real-time counting of wheezing events from lung sounds using deep learning algorithms: Implications for disease prediction and early intervention.

The Seismo-Performer: A Novel Machine Learning Approach for General and Efficient Seismic Phase Recognition from Local Earthquakes in Real Time.

Biomedical informatics and machine learning for clinical genomics.

Deep learning and pathomics analyses reveal cell nuclei as important features for mutation prediction of BRAF-mutated melanomas.

AI in the Loop: functionalizing fold performance disagreement to monitor automated medical image segmentation workflows.

Leave a Reply Cancel reply

Researchers

Journal

Modalities

Models

Abstract

Similar Posts

Leave a Reply Cancel reply