Abstract

In this work we study the binary transfer learning problem involving 10^2-10^3 sources. We focus on how to select sources from this large pool and how to combine them to achieve good performance on a target task. In particular, we consider the transfer learning setting where one does not have direct access to the source data, but instead uses the source hypotheses trained on it. Building on results on greedy algorithms, we propose an efficient algorithm that selects relevant source hypotheses and feature dimensions simultaneously. On three computer vision datasets we achieve state-of-the-art results, substantially outperforming both popular feature selection and transfer learning baselines when transferring in a small-sample setting. Our experiments involve up to 1000 classes, totalling 1.2 million examples, with only 11 to 20 training examples from the target domain. We corroborate our findings by showing theoretically that, under reasonable assumptions on the source hypotheses, our algorithm can learn effectively from few examples.
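
As a rough illustration of what selecting source hypotheses and feature dimensions simultaneously can look like, the sketch below runs a generic greedy forward selection with a ridge refit over the concatenation of target features and source-hypothesis outputs. It is not the algorithm proposed in this work. The function name greedy_select, the parameters k and lam, and the correlation-based scoring rule are all assumptions made for illustration.

```python
import numpy as np

def greedy_select(X, H, y, k, lam=1.0):
    """Hypothetical sketch: greedy forward selection over Z = [X, H].

    X   : (n, d) target features
    H   : (n, m) outputs of m source hypotheses on the same n target examples
    y   : (n,)   target labels, e.g. in {-1, +1}
    k   : number of columns to select (assumed budget)
    lam : ridge regularisation strength (assumed)
    Returns the selected column indices into Z and the ridge weights.
    """
    Z = np.hstack([X, H]).astype(float)
    y = np.asarray(y, dtype=float)
    n, p = Z.shape
    used = np.zeros(p, dtype=bool)
    selected = []
    w = np.zeros(0)
    residual = y.copy()
    for _ in range(min(k, p)):
        # Score every unused column by its absolute correlation with the residual.
        scores = np.abs(Z.T @ residual)
        scores[used] = -np.inf
        j = int(np.argmax(scores))
        used[j] = True
        selected.append(j)
        # Refit ridge regression restricted to the selected columns.
        Zs = Z[:, selected]
        gram = Zs.T @ Zs + lam * np.eye(len(selected))
        w = np.linalg.solve(gram, Zs.T @ y)
        residual = y - Zs @ w
    return selected, w
```

In this sketch, an index below X.shape[1] refers to a feature dimension and any larger index refers to a source hypothesis, so a single budget k is shared between the two kinds of column.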
