Automatic face analysis has to cope with pose and lighting variations. Especially pose variations are difficult to tackle and many face analysis methods require the use of sophisticated normalization and initialization procedures. We propose a data-driven face analysis approach that is not only capable of extracting features relevant to a given face analysis task, but is also more robust with regard to face location changes and scale variations when compared to classical methods such as e.g. MLPs. Our approach is based on convolutional neural networks that use multi-scale feature extractors, which allow for improved facial expression recognition results with faces subject to in-plane pose variations.