Objects are often the desired level of access for browsing and retrieval in video databases, but defining an object on-line requires constructing its model from only a few examples. In this paper, we present a probabilistic methodology for localizing objects that appear across video segments. It comprises three steps: video structuring, object definition, and localization within the video structure. Localization is formulated as a random sampling problem in a Metric Mixture Model framework, which jointly models a set of color appearance exemplars and their geometric transformations. To make the sampling process more efficient, candidate configurations are drawn from a prior distribution using importance sampling and evaluated using Bayes' rule. Experiments on a database extracted from home videos, in which real objects appear across video shots with variations in scale and pose, illustrate the performance of the method.
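The sampling scheme summarized above can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper, and everything in it is an assumption made for illustration: the toy 20x20 scene, the names `observe`, `mixture_likelihood`, `sample_prior`, and `localize`, the exponential kernel with parameter `lam`, the equal mixture weights, and the restriction of the geometric transformation to translation. We also assume that the prior itself serves as the proposal distribution, so that the importance weight of each candidate is proportional to its likelihood and the normalized weights approximate the posterior given by Bayes' rule.

```python
import math
import random

# Hypothetical toy scene: a 20x20 frame of gray background containing
# a 3x3 reddish object centred at (12, 7).
WIDTH, HEIGHT = 20, 20
BACKGROUND = (0.2, 0.2, 0.2)
OBJECT_COLOR = (0.9, 0.1, 0.1)
OBJECT_CENTER = (12, 7)


def observe(config):
    """Return the color observed at configuration (x, y)."""
    x, y = config
    inside = abs(x - OBJECT_CENTER[0]) <= 1 and abs(y - OBJECT_CENTER[1]) <= 1
    return OBJECT_COLOR if inside else BACKGROUND


def mixture_likelihood(feature, exemplars, lam=5.0):
    """Equal-weight mixture of exp(-lam * distance) kernels, one per exemplar.

    The value is unnormalized, which is sufficient because the
    importance weights are normalized afterwards.
    """
    total = 0.0
    for exemplar in exemplars:
        total += math.exp(-lam * math.dist(feature, exemplar))
    return total / len(exemplars)


def sample_prior(rng):
    """Assumed prior: translations uniformly distributed over the frame."""
    return (rng.randrange(WIDTH), rng.randrange(HEIGHT))


def localize(exemplars, n_samples=2000, seed=0):
    """Draw candidates from the prior and weight them by their likelihood."""
    rng = random.Random(seed)
    samples = [sample_prior(rng) for _ in range(n_samples)]
    weights = [mixture_likelihood(observe(s), exemplars) for s in samples]
    total = sum(weights)
    posterior = [w / total for w in weights]  # normalized importance weights
    best_index = max(range(n_samples), key=posterior.__getitem__)
    return samples[best_index], samples, posterior


# Two color appearance exemplars of the same object (illustrative values).
exemplars = [(0.9, 0.1, 0.1), (0.8, 0.2, 0.1)]
best, samples, posterior = localize(exemplars)
print("most probable configuration:", best)
```

In this sketch a more informative prior, for instance one concentrated around positions suggested by the video structure, would place more candidates near the object and reduce the number of samples needed; the uniform prior is used only to keep the example short.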