Infoscience

Report

The use of Boolean concepts in general classification contexts

This thesis examines how classification problems can be treated with techniques that rely on Boolean concepts. The value of Boolean-based models in classification is demonstrated using LAD (Logical Analysis of Data), a methodology for data analysis and classification. The models generated in that framework are based on conjunctions of Boolean facts associated with the attributes that describe the data. This approach yields classification models that are human-interpretable and can therefore provide a deeper understanding of each classification problem, in addition to determining the class of new data. Employing this kind of technique successfully in general classification contexts requires certain adaptations. This thesis studies these adaptations and covers the following topics: (i) how to efficiently transform input data, which may be given in arbitrary numerical format, into Boolean format without discarding significant information; (ii) how to use classification algorithms that generate binary decisions, and are thus suited to two-class problems, to solve classification problems with several classes; and (iii) how to extend the original LAD algorithm so that it still generates interpretable Boolean models but can also treat problems with several classes.
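As a rough illustration of topic (i) and of what a "conjunction of Boolean facts" can look like, the following Python sketch turns one numerical attribute into Boolean facts of the form "value >= cut point" and tests a conjunction of such facts against an observation. It is only a minimal sketch under stated assumptions, not the procedure developed in the thesis: the function names, the midpoint rule for choosing cut points, and the toy data are all hypothetical.

```python
def candidate_cut_points(values, labels):
    """Return midpoints between consecutive sorted values whose classes differ.

    Assumption: a simple midpoint heuristic; ties between equal values
    with different classes are ignored in this sketch.
    """
    pairs = sorted(zip(values, labels))
    cuts = []
    for (v1, c1), (v2, c2) in zip(pairs, pairs[1:]):
        if v1 != v2 and c1 != c2:
            cuts.append((v1 + v2) / 2)
    return cuts


def binarize(value, cuts):
    """Map a numerical value to one Boolean fact per cut point (value >= cut)."""
    return [value >= t for t in cuts]


def pattern_covers(bits, pattern):
    """Check a conjunction of Boolean facts.

    `pattern` maps a fact index to the truth value it requires; the
    conjunction holds only if every listed fact has the required value.
    """
    return all(bits[i] == required for i, required in pattern.items())


# Hypothetical toy data: one attribute, two classes.
values = [1.0, 2.0, 3.0, 4.0]
labels = ["neg", "neg", "pos", "pos"]

cuts = candidate_cut_points(values, labels)
bits = binarize(3.0, cuts)
print(cuts, bits, pattern_covers(bits, {0: True}))
```

A conjunction such as `{0: True}` reads directly as "the attribute is at least the first cut point", which is the kind of human-readable statement that makes Boolean models interpretable.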
