treeKL: A distance between high dimension empirical distributions

This paper offers a methodological contribution for computing the distance between two empirical distributions in a Euclidean space of very large dimension. We propose to use decision trees instead of relying on standard quantization of the feature space. Our contribution is twofold: we first define a new distance between empirical distributions, based on the Kullback-Leibler (KL) divergence between the distributions over the leaves of decision trees built for the two empirical distributions. Then, we propose a new procedure to build these unsupervised trees efficiently. The performance of this new metric is illustrated on image clustering and neuron classification. Results show that the tree-based method outperforms standard bag-of-features procedures.
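The idea in the abstract can be sketched in a few lines: partition the feature space with an unsupervised tree, histogram each sample over the leaves, and compare the two histograms with a (symmetrized) KL divergence. The sketch below is illustrative only and simplifies the paper's method: it grows a single tree on the pooled data using random-feature median splits, whereas the paper proposes its own efficient tree-building procedure and trees per distribution; `build_tree`, `leaf_index`, `tree_kl`, the depth, and the smoothing constant `eps` are all assumptions of this sketch.

```python
import numpy as np

def build_tree(X, depth, rng):
    # Unsupervised tree: at each node, pick a random feature and split at
    # its median. A node is (feature, threshold, left, right); leaves are None.
    if depth == 0 or len(X) < 2:
        return None
    f = int(rng.integers(X.shape[1]))
    t = np.median(X[:, f])
    left, right = X[X[:, f] <= t], X[X[:, f] > t]
    if len(left) == 0 or len(right) == 0:
        return None
    return (f, t, build_tree(left, depth - 1, rng), build_tree(right, depth - 1, rng))

def leaf_index(tree, x, code=0):
    # Route a point to its leaf; heap-style codes make leaf ids unique.
    if tree is None:
        return code
    f, t, L, R = tree
    return leaf_index(L, x, 2 * code + 1) if x[f] <= t else leaf_index(R, x, 2 * code + 2)

def tree_kl(X1, X2, depth=4, seed=0, eps=1e-3):
    # Symmetrized KL between the leaf-occupancy histograms of two samples,
    # smoothed by eps so all leaf probabilities stay strictly positive.
    rng = np.random.default_rng(seed)
    tree = build_tree(np.vstack([X1, X2]), depth, rng)
    c1 = [leaf_index(tree, x) for x in X1]
    c2 = [leaf_index(tree, x) for x in X2]
    leaves = sorted(set(c1) | set(c2))
    def hist(codes):
        counts = np.array([codes.count(l) for l in leaves], dtype=float) + eps
        return counts / counts.sum()
    p, q = hist(c1), hist(c2)
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))
```

On two well-separated Gaussian samples this distance is large, while two samples drawn from the same distribution score near zero, which is the behavior the clustering experiments in the paper rely on.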


Published in:
Pattern Recognition Letters, vol. 34, no. 2, pp. 140-145
Year:
2013
Publisher:
Elsevier Science BV, Amsterdam
ISSN:
0167-8655




 Record created 2013-12-19, last modified 2018-09-13
