We consider the problem of estimating multiple filters from convolutive mixtures of several unknown sources. We propose to exploit both the time-frequency (TF) sparsity of the sources and the sparsity of the mixing filters. Our framework consists of two steps: a) a clustering step that groups, for each source, the TF points where only that source is active; b) a convex optimisation step that estimates the filters using TF cross-relations, which capture linear constraints satisfied by the unknown filters. Experiments demonstrate that the approach is well suited to the estimation of sufficiently sparse filters.
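
To make the notion of a cross-relation concrete, the following minimal sketch checks its time-domain analogue numerically for a single source observed on two channels. It is an illustration under stated assumptions, not the TF-domain method itself: the signal `s`, the sparse filters `a1` and `a2`, their lengths, and the helper `conv_matrix` are all hypothetical. If x1 = a1 * s and x2 = a2 * s, then commutativity of convolution gives x1 * a2 = x2 * a1, which is a linear constraint on the stacked filter vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single-source, two-channel setup (illustrative values only).
L = 16                                   # assumed filter length
s = rng.standard_normal(256)             # unknown source signal
a1 = np.zeros(L); a1[[0, 5, 11]] = [1.0, 0.6, -0.3]   # sparse filter, channel 1
a2 = np.zeros(L); a2[[2, 7]] = [0.8, 0.4]             # sparse filter, channel 2

# Convolutive observations of the same source on each channel.
x1 = np.convolve(a1, s)
x2 = np.convolve(a2, s)

def conv_matrix(x, n_taps):
    """Matrix C such that C @ a == np.convolve(x, a) for any a of length n_taps."""
    eye = np.eye(n_taps)
    return np.column_stack([np.convolve(x, eye[k]) for k in range(n_taps)])

# Cross-relation written as a linear system: [C(x2), -C(x1)] @ [a1; a2] = 0.
M = np.hstack([conv_matrix(x2, L), -conv_matrix(x1, L)])
h = np.concatenate([a1, a2])

# Largest absolute violation of the constraint (zero up to rounding error).
max_residual = np.max(np.abs(M @ h))
```

The system `M @ h = 0` alone does not determine the filters uniquely, which is one way to see why an additional prior such as filter sparsity is useful when solving for `h`.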