Combinatorial Approach for Data Binarization

This paper addresses the problem of transforming arbitrary data into binary data, as a preprocessing step for a supervised classification task. Because a binary mapping compresses the total information in the dataset, the goal is to design a mapping that preserves most of the information relevant to the classification problem. Most existing approaches to this problem rely on correlation or entropy measures between an individual binary variable and the partition into classes. In contrast, the approach proposed here is based on a global study of the combinatorial properties of a set of binary variables.
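
As an illustration of the per-variable entropy measures mentioned above (the existing family of approaches, not the combinatorial method this paper proposes), the following minimal sketch scores a single binary variable by its information gain with respect to the class partition. The function names, the threshold-based binarization, and the toy data are assumptions made for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def binarize(values, threshold):
    """Hypothetical binary mapping: 1 if the value exceeds the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


def information_gain(bits, labels):
    """Reduction in class entropy obtained by splitting on one binary variable."""
    n = len(labels)
    gain = entropy(labels)
    for value in (0, 1):
        subset = [y for b, y in zip(bits, labels) if b == value]
        gain -= (len(subset) / n) * entropy(subset)
    return gain


# Toy data (made up for illustration): one numeric feature, two classes.
values = [1.0, 2.0, 3.0, 4.0]
labels = ["a", "a", "b", "b"]

gain_mid = information_gain(binarize(values, 2.5), labels)  # separates the classes
gain_low = information_gain(binarize(values, 1.5), labels)  # isolates one point only
```

A measure of this kind evaluates each binary variable in isolation, so two variables that carry the same information each receive a high score; this is the limitation that a global, set-level criterion is meant to address.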
