Benefits of Max Pooling in Neural Networks: Theoretical and Experimental Evidence
When deep neural networks first became state-of-the-art image classifiers, numerous max pooling operations were an important component of their architectures. Modern computer vision networks, however, typically contain few, if any, max pooling operations. To assess whether this trend is justified, we develop a mathematical framework for analyzing ReLU-based approximations of max pooling, and we prove a precise sense in which max pooling cannot be replicated. We formulate and analyze a novel class of optimal approximations and find that the residual can be made exponentially small in the kernel size, but only with an approximation that is exponentially wide. This work gives a theoretical basis for understanding the reduced use of max pooling in newer architectures. It also enables us to establish an empirical observation about natural images: since max pooling does not appear to be necessary, the inputs on which max pooling behaves distinctively (those with a large gap between the maximum and the other values) are not prevalent.
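To fix ideas about what a "ReLU-based approximation of max pooling" means, consider the standard identity max(a, b) = b + ReLU(a − b), which is exact for a window of two values; nesting it recovers max pooling over larger kernels, but the cost grows with the kernel size, which is the regime the abstract's width/residual trade-off concerns. The sketch below is illustrative only; the function names and the nested construction are our assumptions, not the paper's optimal approximation class.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def max2_via_relu(a, b):
    # Exact for two values: max(a, b) = b + ReLU(a - b).
    return b + relu(a - b)

def max_pool_1d(x, k):
    """Reference max pooling over non-overlapping windows of size k."""
    x = np.asarray(x, dtype=float)
    n = (len(x) // k) * k
    return x[:n].reshape(-1, k).max(axis=1)

def max_pool_1d_relu(x, k):
    """Max pooling built by nesting the pairwise ReLU identity.
    Exact, but uses k - 1 sequential ReLUs per window rather than a
    single shallow layer, illustrating why cheap shallow ReLU
    replacements can only approximate max pooling."""
    x = np.asarray(x, dtype=float)
    n = (len(x) // k) * k
    windows = x[:n].reshape(-1, k)
    out = windows[:, 0]
    for j in range(1, k):
        out = windows[:, j] + relu(out - windows[:, j])  # max(out, col_j)
    return out

# Sanity check on random input: the two constructions agree.
x = np.random.randn(64)
assert np.allclose(max_pool_1d(x, 4), max_pool_1d_relu(x, 4))
```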