Accessing and organizing home videos present technical challenges due to their unrestricted content and lack of storyline. In this paper, we propose a spectral method to group video shots into scenes based on their visual similarity and temporal relations. Spectral methods exploit the eigenvector decomposition of a pairwise similarity matrix and can be effective in capturing perceptual organization features. In particular, we investigate the automatic selection of the number of clusters, which is currently an open research issue for spectral methods. We first analyze the behaviour of the algorithm with respect to variations in the number of clusters, and then propose measures to assess the validity of a grouping result. The methodology is used to group scenes from a six-hour home video database, and is assessed with respect to a ground truth generated by multiple humans. The results indicate the validity of the proposed approach, both in comparison with existing techniques and with respect to the human ground truth.
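For illustration only, the general idea of spectral grouping with an automatically chosen number of clusters can be sketched as follows. The sketch assumes a Gaussian affinity over hypothetical per-shot feature vectors and timestamps, and it picks the number of clusters with the common eigengap heuristic, which stands in for the validity measures proposed in the paper and is not the same thing. All names and parameters (`similarity_matrix`, `sigma_v`, `sigma_t`, `k_max`, `group_shots`) are assumptions introduced here.

```python
import numpy as np


def similarity_matrix(features, times, sigma_v=1.0, sigma_t=5.0):
    """Pairwise affinity from visual and temporal distance (assumed Gaussian form)."""
    dv = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=2)
    dt = np.abs(times[:, None] - times[None, :])
    return np.exp(-(dv / sigma_v) ** 2) * np.exp(-(dt / sigma_t) ** 2)


def normalized_spectrum(W):
    """Eigen-decomposition of D^{-1/2} W D^{-1/2}, eigenvalues in descending order."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L)  # eigh returns ascending eigenvalues
    return vals[::-1], vecs[:, ::-1]


def choose_k_by_eigengap(vals_desc, k_max):
    """Eigengap heuristic (a stand-in, not the paper's measures):
    pick k at the largest drop between consecutive eigenvalues."""
    gaps = vals_desc[:k_max] - vals_desc[1:k_max + 1]
    return int(np.argmax(gaps)) + 1


def kmeans(X, k, iters=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        sq = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def group_shots(features, times, k_max=4):
    """Cluster shots in the space of the top-k eigenvectors; returns (labels, k)."""
    W = similarity_matrix(features, times)
    vals, vecs = normalized_spectrum(W)
    k = choose_k_by_eigengap(vals, k_max)
    top = vecs[:, :k]
    # Row-normalize the embedding before clustering.
    top = top / np.linalg.norm(top, axis=1, keepdims=True)
    return kmeans(top, k), k


# Toy usage: six shots forming two visually and temporally separated groups.
toy_features = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                         [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
toy_times = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
toy_labels, toy_k = group_shots(toy_features, toy_times)
print(toy_k, toy_labels)
```

The eigengap rule is used here only because it is simple to state; any other criterion for choosing the number of clusters could be substituted in `choose_k_by_eigengap` without changing the rest of the sketch.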