Sparse approximations to Bayesian inference for nonparametric Gaussian process models scale linearly in the number of training points, which makes powerful kernel-based models applicable to large datasets. We present a general framework based on the informative vector machine (IVM) (Lawrence, 2002) and show how the complete Bayesian task, inference together with learning of the free hyperparameters, can be performed efficiently in practice. Our framework allows for arbitrary likelihood and kernel functions, so that a large number of elementary models can be treated in a unified way. We report experiments applying our method to binary classification and regression tasks. Models based on a single latent function can be combined to address more complicated setups; we demonstrate this approach for a multi-way classification model.
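
To make the linear-scaling claim concrete, the following is a minimal sketch of greedy active-set selection for GP regression. It is an illustration under stated assumptions, not the framework described above: it assumes a Gaussian likelihood (for which an entropy-reduction selection score reduces to choosing the point with the largest current posterior variance), a unit-variance RBF kernel, and fixed hyperparameters. The names `rbf_kernel`, `greedy_active_set_gp`, `noise_var`, and `lengthscale` are hypothetical. With an active set of size d and n training points, the loop costs O(n d^2) time and O(n d) memory, and the full n-by-n kernel matrix is never formed.

```python
import numpy as np


def rbf_kernel(A, B, lengthscale=1.0):
    """Unit-variance RBF kernel matrix between the rows of A and B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)


def greedy_active_set_gp(X, y, d, noise_var=0.1, lengthscale=1.0):
    """Greedily include d points, tracking posterior marginals at all n points.

    Assumes a Gaussian likelihood, so each inclusion is an exact rank-one
    update of the posterior conditioned on the points included so far.
    """
    n = X.shape[0]
    mean = np.zeros(n)       # posterior means at all training points
    var = np.ones(n)         # posterior variances (prior k(x, x) = 1)
    M = np.zeros((d, n))     # low-rank factor: posterior cov = K - M.T @ M
    active = []
    for k in range(d):
        # Selection score: largest posterior variance among unused points.
        score = var.copy()
        score[active] = -np.inf
        i = int(np.argmax(score))

        # Column i of the current posterior covariance, computed in O(n k).
        k_i = rbf_kernel(X, X[i:i + 1], lengthscale)[:, 0]
        a = k_i - M[:k].T @ M[:k, i]

        # Rank-one update after observing y[i] with Gaussian noise.
        denom = var[i] + noise_var
        mean += a * (y[i] - mean[i]) / denom
        var -= a**2 / denom
        M[k] = a / np.sqrt(denom)
        active.append(i)
    return active, mean, var
```

For a non-Gaussian likelihood, such as the binary classification case mentioned above, the exact rank-one update in this sketch would have to be replaced by an approximate one, which the sketch does not attempt.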