Video Text Recognition using Sequential Monte Carlo and Error Voting Methods

Authors: Chen, Datong; Odobez, Jean-Marc
Date: 2006-03-10
Year: 2005
DOI: 10.1016/j.patrec.2004.11.019
URL: https://infoscience.epfl.ch/handle/20.500.14299/228717

Abstract: This paper addresses the segmentation and recognition of text embedded in video sequences, starting from the text image sequence extracted by a text detection module. To this end, we propose a probabilistic algorithm based on Bayesian adaptive thresholding and Monte Carlo sampling. The algorithm approximates the posterior distribution of the segmentation thresholds of text pixels in an image by a set of weighted samples. The sample set is initialized by applying a classical segmentation algorithm to the first video frame and is then refined by random sampling under a temporal Bayesian framework. One important contribution of the paper is to show that, with the proposed methodology, the likelihood of a segmentation parameter sample can be estimated directly from the induced text recognition result, which is what matters for the task, rather than from a classification criterion or a visual quality criterion computed on the segmentation map. As a second contribution, we propose to align the text recognition results of high-confidence samples gathered over time and to compose a final result using an error voting technique (ROVER) at the character level. Experiments are conducted on a two-hour video database. Character recognition rates higher than 93% and word recognition rates higher than 90% are achieved, which are 4% and 3% higher, respectively, than state-of-the-art methods applied to the same database.

Keywords: vision
Type: text::journal::journal article::research article
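The two ideas in the abstract, weighted threshold samples refined over time and character-level voting, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `smc_step` and `vote_characters`, the representation of a sample as a (low, high) threshold pair, the Gaussian diffusion step, and the `likelihood` callback (which in the paper would be derived from the recognition result) are all assumptions made for illustration. The voting function also assumes the candidate strings are already aligned and of equal length, whereas ROVER performs the alignment itself.

```python
import random
from collections import Counter


def smc_step(samples, weights, likelihood, noise=5.0, rng=random):
    """One sequential Monte Carlo update over threshold samples.

    samples: list of (low, high) threshold pairs (assumed representation)
    weights: matching non-negative weights
    likelihood: hypothetical callable scoring a threshold pair on the current frame
    """
    n = len(samples)
    # Resample in proportion to the previous weights.
    picked = rng.choices(samples, weights=weights, k=n)
    # Diffuse each sample with Gaussian noise (a stand-in for the temporal model).
    moved = [(lo + rng.gauss(0, noise), hi + rng.gauss(0, noise)) for lo, hi in picked]
    # Reweight by the likelihood and normalise; fall back to uniform if all scores are zero.
    raw = [likelihood(s) for s in moved]
    total = sum(raw)
    new_weights = [w / total for w in raw] if total > 0 else [1.0 / n] * n
    return moved, new_weights


def vote_characters(aligned):
    """Majority vote per character position over pre-aligned, equal-length strings."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


# Three hypothetical recognition outputs for the same text line.
print(vote_characters(["HELLO", "HE1LO", "HELLO"]))  # prints HELLO
```

In this sketch a single misread character in one candidate is outvoted by the other two, which is the intuition behind combining results from several high-confidence samples.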