As an important problem in speech enhancement, source separation seeks to separate independent source signals from mixture signals based on spatial cues, temporal-spectral cues, or the statistical characteristics of the sources. For semi-blind source separation, a free-field wave propagation model is assumed to facilitate a two-stage procedure of source localization and separation using an array.

In this paper, a multichannel learning-based network is proposed for sound source separation in a reverberant field. The network can be divided into two parts according to the training strategies. In the first stage, time-dilated convolutional blocks are trained to estimate the array weights for beamforming the multichannel microphone signals. Next, the output of the network is processed by a weight-and-sum operation that is reformulated to handle real-valued data in the frequency domain. In the second stage, a U-net model is concatenated to the beamforming network to serve as a non-linear mapping filter for joint separation and dereverberation. The scale-invariant mean square error (SI-MSE), a frequency-domain modification of the scale-invariant signal-to-noise ratio (SI-SNR), is used as the objective function for training. Furthermore, the combined network is trained with speech segments filtered by a wide variety of room impulse responses.

Simulations are conducted for comprehensive multisource scenarios with various subtending angles between sources and various reverberation times. The proposed network is compared with several baseline approaches in terms of objective evaluation metrics. The results demonstrate the excellent performance of the proposed network in dereverberation and separation, as compared to the baseline methods.
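The weight-and-sum beamforming step can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the shapes, function names, and the real/imaginary decomposition shown here are assumptions used to show how a complex per-frequency weight-and-sum over microphones can be recast as purely real-valued arithmetic, which is how a real-valued network can realize the same operation.

```python
import numpy as np

def weight_and_sum(W, X):
    """Complex weight-and-sum beamforming.

    W: (M, F) complex beamforming weights (M mics, F frequency bins).
    X: (M, F, T) complex multichannel STFT (T time frames).
    Returns the beamformed single-channel STFT of shape (F, T).
    """
    # Sum conj(W[m, f]) * X[m, f, t] over the microphone axis.
    return np.einsum('mf,mft->ft', W.conj(), X)

def weight_and_sum_real(Wr, Wi, Xr, Xi):
    """Equivalent operation on stacked real/imaginary parts.

    Expands conj(W) * X = (Wr - j*Wi)(Xr + j*Xi) term by term, so the
    whole beamformer runs on real-valued tensors only.
    """
    Yr = np.einsum('mf,mft->ft', Wr, Xr) + np.einsum('mf,mft->ft', Wi, Xi)
    Yi = np.einsum('mf,mft->ft', Wr, Xi) - np.einsum('mf,mft->ft', Wi, Xr)
    return Yr, Yi
```

The two functions produce identical results; the second form only rearranges the complex product so that every tensor involved is real-valued.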
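The SI-MSE objective is described as a frequency-domain modification of the SI-SNR. As background, the standard time-domain SI-SNR can be sketched as below; this is a generic textbook formulation, not the paper's SI-MSE variant, and the function name and epsilon guard are illustrative choices.

```python
import numpy as np

def si_snr(est, ref, eps=1e-8):
    """Scale-invariant signal-to-noise ratio (dB) between two 1-D signals.

    Projects the zero-mean estimate onto the zero-mean reference, so the
    score is unchanged when the estimate is rescaled by any factor.
    """
    est = est - est.mean()
    ref = ref - ref.mean()
    # Orthogonal projection removes the scale ambiguity.
    s_target = (np.dot(est, ref) / (np.dot(ref, ref) + eps)) * ref
    e_noise = est - s_target
    return 10 * np.log10(np.dot(s_target, s_target) / (np.dot(e_noise, e_noise) + eps))
```

Because the loss is invariant to the overall scale of the estimate, the network is free to output the separated source at an arbitrary gain.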