This paper presents a framework for coarse scene geometry estimation, based on sparse representations of omnidirectional images with geometrical basis functions. We introduce a correlation model that relates sparse components in different views through local geometrical transforms, under epipolar constraints. By combining selected pairs of features represented by sparse components, we estimate the disparity map between images, evaluate coarse depth information, and recover the relative camera pose. The proposed framework makes it possible to estimate the scene geometry, and hence the disparity between images, using only coarse approximations of multi-view images. Experimental results show that a few components are sufficient to estimate the disparity map and the camera pose. This property is beneficial for predictive multi-view compression schemes, where scene reconstruction relies on disparity mapping from low-resolution images in order to progressively decode the higher image resolutions.