Scene structuring is a video analysis task for which no common evaluation procedure has been fully adopted. In this paper, we present a methodology for evaluating this task in home videos. The methodology takes human judgement into account and comprises a representative corpus, a set of objective performance measures, and an evaluation protocol. The components of our approach are as follows. First, we describe the generation of a set of home video scene structures produced by multiple people. Second, we define similarity measures that model variations with respect to two factors: human perceptual organization and level of structure granularity. Third, we describe a protocol for evaluating automatic algorithms by comparing them to human performance. We illustrate our methodology by assessing the performance of two recently proposed methods: probabilistic hierarchical clustering and spectral clustering.