It has recently been shown that deformable 3D surfaces can be recovered from single video streams. However, existing techniques either require a reference view in which the shape of the surface is known a priori, which is often unavailable, or require tracking points over long sequences, which is difficult to do reliably. In this paper, we overcome these limitations. To this end, we establish correspondences between pairs of frames in which the shape is different and unknown. We then estimate homographies between corresponding local planar patches in the two images. These yield approximate 3D reconstructions of the points within each patch up to a scale factor. Since the patches overlap, we can enforce consistency between them over the whole surface. Finally, a local deformation model is used to fit a triangulated mesh to the 3D point cloud, which makes the reconstruction robust to both noise and outliers in the image data.
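
To make the patch-wise homography step concrete, the following is a minimal sketch of how a homography between two corresponding patches could be estimated from point correspondences. It uses the standard direct linear transform (DLT) with NumPy, which is an assumption made for illustration and not necessarily the estimator used in this work. The function names `estimate_homography` and `apply_homography` are hypothetical, and the sketch assumes at least four correspondences in general position and a homography whose bottom-right entry is nonzero.

```python
import numpy as np


def estimate_homography(src, dst):
    """Estimate a 3x3 homography H such that dst ~ H @ src (in homogeneous
    coordinates) using the unnormalized direct linear transform.

    src, dst: arrays of shape (N, 2) with N >= 4 corresponding points.
    Illustrative only: no outlier rejection and no coordinate normalization.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on the
        # nine entries of H (stacked row by row).
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    A = np.array(rows)
    # The solution is the right singular vector associated with the
    # smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    # Fix the arbitrary scale (assumes H[2, 2] is not zero).
    return H / H[2, 2]


def apply_homography(H, pts):
    """Map (N, 2) points through H and return the (N, 2) result."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

In practice the correspondences within a patch would be noisy and may contain outliers, so a robust or normalized variant would likely be preferable. Recovering 3D structure from such a homography typically also requires the camera intrinsics, and that step is not sketched here.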