Blind Audiovisual Separation Based on Redundant Representations
In this work we present a method for complete audiovisual source separation without any prior information. The method is based on the assumption that sounds are caused by moving structures. An efficient representation of the audio and video sequences thus makes it possible to relate synchronous structures in the two modalities. A robust clustering algorithm groups video structures that exhibit strong correlation with the audio, so that sources can be counted and located in the image. Using this information and exploiting audio-video correlation, the activity of each audio source is determined. \emph{Spectral} GMMs are then learnt in time slots where only one source is active, making it possible to separate the sources when the audio is a mixture. Audio source separation performance is rigorously evaluated, clearly showing that the proposed algorithm is efficient and robust.
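The core synchrony idea, that video structures whose motion correlates with the audio energy envelope belong to a sounding source, can be illustrated with a minimal NumPy sketch. All names here are hypothetical, and the correlation measure is a plain Pearson coefficient standing in for the paper's redundant-representation-based criterion:

```python
import numpy as np

def audio_video_correlation(audio_energy, motion_features):
    """Correlate an audio energy envelope (shape (T,)) with the motion
    activity of N video structures (shape (N, T)); return the absolute
    Pearson correlation of each structure with the audio."""
    a = audio_energy - audio_energy.mean()
    m = motion_features - motion_features.mean(axis=1, keepdims=True)
    num = m @ a                                   # per-structure inner product
    den = np.linalg.norm(m, axis=1) * np.linalg.norm(a) + 1e-12
    return np.abs(num / den)

# Toy example: structure 0 moves in sync with the audio, structure 1 does not.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
audio = np.abs(np.sin(2 * np.pi * 3 * t))        # synthetic energy envelope
motion = np.vstack([
    audio + 0.1 * rng.standard_normal(200),      # synchronous structure
    rng.standard_normal(200),                    # unrelated structure
])
scores = audio_video_correlation(audio, motion)
```

A clustering step over such scores would then keep only the structures with high correlation, yielding the count and image location of the sources.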
ICASSP08.pdf