Recently, we introduced a simple variational bound on mutual information that resolves some of the difficulties in applying information theory to machine learning. Here we study a specific application to Gaussian channels. It is well known that PCA may be viewed as the solution to maximizing information transmission between a high-dimensional vector and its low-dimensional representation. However, such results rest on the assumption that the sources are Gaussian. In this paper, we show that our mutual information bound, when applied to this setting, yields PCA solutions without the need for the Gaussian assumption. Furthermore, it generalizes naturally to provide an objective function for Kernel PCA, enabling the principled selection of kernel parameters.
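
As a point of reference for the classical Gaussian result mentioned above, the following is a minimal numerical sketch, not the bound studied in this paper. It assumes a linear channel y = Wx + noise with zero-mean Gaussian source x, isotropic Gaussian noise, and W restricted to orthonormal rows; the function name `gaussian_channel_information` and all parameter values are illustrative choices. Under these assumptions the mutual information has the closed form (1/2) log det(I + W Σ Wᵀ / σ²), and the sketch checks that projecting onto the leading eigenvectors of Σ gives at least as much information as randomly drawn orthonormal projections.

```python
import numpy as np

def gaussian_channel_information(W, cov, noise_var=1.0):
    """Mutual information (nats) between x ~ N(0, cov) and y = W x + noise,
    with noise ~ N(0, noise_var * I). Illustrative helper, Gaussian case only."""
    k = W.shape[0]
    signal_cov = W @ cov @ W.T
    _, logdet = np.linalg.slogdet(np.eye(k) + signal_cov / noise_var)
    return 0.5 * logdet

rng = np.random.default_rng(0)
d, k = 5, 2  # source dimension and representation dimension (arbitrary)

# Random symmetric positive semi-definite source covariance.
A = rng.standard_normal((d, d))
cov = A @ A.T

# PCA projection: rows are the k leading eigenvectors of the covariance.
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
W_pca = eigvecs[:, -k:].T
info_pca = gaussian_channel_information(W_pca, cov)

# Baseline: random projections with orthonormal rows (via QR).
info_random = []
for _ in range(200):
    Q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    info_random.append(gaussian_channel_information(Q.T, cov))

print("PCA projection information:", info_pca)
print("Best random projection information:", max(info_random))
```

The orthonormality constraint matters here: without it, the information could be increased arbitrarily by scaling W, so the comparison is only meaningful among projections of equal norm.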