Automatic face analysis has to cope with pose and lighting variations. Pose variations are especially difficult to handle, and many face analysis methods require sophisticated normalization procedures. We propose a data-driven face analysis approach that not only extracts features relevant to a given face analysis task but is also robust to changes in face location and scale. This is achieved by deploying convolutional neural networks. We show that multi-scale feature extractors and whole-field feature map summing neurons improve facial expression recognition results, especially on test sets that feature scale or translation changes.
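
As an illustration of why summing a feature map over the whole field can give robustness to face location changes, the following minimal sketch applies convolution kernels of several sizes to an image, rectifies each feature map, and reduces it to a single summed value. It is not the network described here: the kernel sizes (3, 5, 7), the random weights, the rectification nonlinearity, and the function names `correlate2d_valid` and `multi_scale_summed_features` are all assumptions made for the example.

```python
import numpy as np


def correlate2d_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    # Accumulate one shifted, weighted copy of the image per kernel entry.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * image[i:i + out_h, j:j + out_w]
    return out


def multi_scale_summed_features(image, kernels):
    """Return one scalar per kernel: rectify the feature map, then sum it
    over the whole field (an assumed reading of 'whole-field summing')."""
    return np.array(
        [np.maximum(correlate2d_valid(image, k), 0.0).sum() for k in kernels]
    )


rng = np.random.default_rng(0)

# Hypothetical multi-scale filter bank: one random kernel per size.
kernels = [rng.standard_normal((s, s)) for s in (3, 5, 7)]

# The same 8x8 pattern placed at two different locations on an empty image.
patch = rng.random((8, 8))
image_a = np.zeros((32, 32))
image_a[10:18, 10:18] = patch
image_b = np.zeros((32, 32))
image_b[13:21, 6:14] = patch

features_a = multi_scale_summed_features(image_a, kernels)
features_b = multi_scale_summed_features(image_b, kernels)

# The summed responses agree up to floating-point rounding.
print(np.allclose(features_a, features_b))
```

In this sketch the summed responses match for both placements because translating the pattern only translates each feature map, and the sum over the whole map does not depend on where the response occurs, provided the response stays inside the field. The sketch says nothing about scale robustness, which in the abstract is attributed to the multi-scale feature extractors.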