Infoscience

Thesis

Tractable Approaches to Learning and Planning in High Dimensions

Over the last decade, the continuous increase in both the amount of available data and the range of applications of machine learning has led to a demand for learning and planning algorithms capable of handling large-scale problems. Scalability has thus become an important characteristic of modern machine learning algorithms. In this thesis we concentrate on tractable approaches to both learning and planning that are capable of solving problems in high-dimensional spaces. We investigate multiple forms of high-dimensionality, where dimensionality can refer to the number of samples to be processed, the number of samples available for training, the dimensionality of the feature space, or even the dimensionality of the state-space. We present work on classifier cascades, developing a novel framework for interpreting, training, and employing such structures of classifiers. A Boosting algorithm developed in this framework highlights its ability to jointly train the classifiers at the cascade nodes. The work presented on goal-planning also results in a novel framework, this time for automatic macro-action discovery in imitation learning, specifically addressing problems with large state-spaces as well as high-dimensional observation signals. The proposed DPBoost algorithm is capable of solving very complex goal-planning tasks where previous state-of-the-art methods fail. To address computer hardware limitations, and memory limitations in particular, we propose a new learning framework, called reservoir learning, that attempts to account directly for limited memory during the training process. Specifically, we propose to consider learning in the presence of a reservoir in which the learner may store samples. Based on Boosting, we develop a novel strategy for populating this reservoir that avoids redundancy by taking into account the joint behavior of samples.
Finally, high-dimensionality is also addressed in its most typical setting, namely when the feature space is large. We propose novel feature selection algorithms in the context of classification. Based on information-theoretic tools, these algorithms attempt to maximize the joint mutual information between the selected continuous variables and the discrete class label variable.
