
Report

Transforming the feature vectors to improve HMM based cursive word recognition systems

Although many offline cursive word recognition systems are based on hidden Markov models (HMMs), no attention has been paid, to our knowledge, to the fact that the feature vectors are typically not in the most suitable form for modeling. Most of the time they are correlated and embedded in a space whose dimension is higher than their intrinsic dimension. This causes several modeling problems and degrades recognition performance. Applying suitable transforms can solve, or at least attenuate, these problems, yielding data that are easier to model and systems with a higher recognition rate. In this work, we use Principal Component Analysis (linear and nonlinear) and Independent Component Analysis. We show that the error rate is reduced by up to 30.3% on single-writer data and by up to 16.2% on multiple-writer data.
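
As a rough illustration of the decorrelation and dimension reduction described above, the following minimal sketch applies linear PCA to synthetic feature vectors with NumPy. It is not the authors' implementation: the data are randomly generated stand-ins for handwriting features, and the names `pca_transform`, `latent`, and `mixing` are hypothetical. Nonlinear PCA and Independent Component Analysis are not covered here.

```python
import numpy as np


def pca_transform(features, n_components):
    """Project feature vectors (one per row) onto their top principal components."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Covariance matrix of the features (columns are variables).
    cov = np.cov(centered, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Keep the directions with the largest variance.
    order = np.argsort(eigvals)[::-1][:n_components]
    basis = eigvecs[:, order]
    return centered @ basis, basis, mean


# Synthetic stand-in (assumption): 3-dimensional data embedded, with a little
# noise, in a 6-dimensional space, so the 6 observed features are correlated.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 6))
features = latent @ mixing + 0.01 * rng.normal(size=(500, 6))

projected, basis, mean = pca_transform(features, n_components=3)

# After the transform, the covariance of the projected vectors is diagonal
# up to floating-point error, i.e. the new features are uncorrelated.
projected_cov = np.cov(projected, rowvar=False)
off_diagonal = projected_cov - np.diag(np.diag(projected_cov))
```

In an HMM-based recognizer, a transform of this kind would be estimated on training feature vectors and then applied to every frame before modeling. Uncorrelated features are a better match for emission densities with diagonal covariance matrices, which is one plausible reason such transforms can help.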
