Within the field of pattern recognition, biometrics is the discipline concerned with the automatic recognition of a person based on his or her physiological or behavioral characteristics. Face recognition, a central area in biometrics, is a very challenging task and is still largely considered an open problem. It is worth noting, however, that most face recognition algorithms focus on feature extraction, while much less attention has been given to the classification stage.

In this dissertation, we introduce a novel measure of "distance" between faces that involves estimating the set of possible transformations between face images of the same person. The global transformation, assumed too complex for direct modeling, is approximated by a set of local transformations under a constraint that imposes consistency between neighboring local transformations. The proposed local transformations and neighboring constraints are embedded within the probabilistic framework of the two-dimensional hidden Markov model (2-D HMM) in the case of discrete states, and of the two-dimensional state-space model (2-D SSM) in the case of continuous states. To make the proposed face recognition approach practical, we also introduce novel efficient approximations of the intractable 2-D HMM and 2-D SSM: the turbo HMM and the turbo SSM, respectively. These consist of a set of inter-connected horizontal and vertical 1-D Markov chains that communicate through an iterative process.

Once a proper measure of distance has been defined, we turn to the problem of face image retrieval in large databases. To reduce the computational cost, the face space is partitioned through a clustering of the data. The main challenge we address is the computation of a cluster centroid that is consistent with the proposed measure of distance.

Finally, we consider the problem of identity verification, which requires a robust confidence measure.
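The turbo idea can be illustrated with a toy sketch: each row and each column of the state grid is treated as an ordinary 1-D chain, standard forward-backward inference is run along it, and the chains exchange their extrinsic state information as extra priors over a few iterations. This is a minimal sketch of the general message-passing scheme, not the dissertation's actual algorithm; all function names, the fixed iteration count, and the random toy parameters below are our own assumptions.

```python
import numpy as np
from scipy.special import logsumexp


def forward_backward(log_lik, log_trans):
    """Log-domain forward-backward along one 1-D chain.

    log_lik:   (T, S) per-position log-likelihoods (with any extra priors folded in)
    log_trans: (S, S) log transition matrix
    Returns (T, S) normalized log posterior state marginals.
    """
    T, S = log_lik.shape
    fwd = np.zeros((T, S))
    bwd = np.zeros((T, S))
    fwd[0] = log_lik[0]
    for t in range(1, T):
        fwd[t] = log_lik[t] + logsumexp(fwd[t - 1][:, None] + log_trans, axis=0)
    for t in range(T - 2, -1, -1):
        bwd[t] = logsumexp(log_trans + (log_lik[t + 1] + bwd[t + 1])[None, :], axis=1)
    post = fwd + bwd
    return post - logsumexp(post, axis=1, keepdims=True)


def turbo_posteriors(log_lik, log_trans_h, log_trans_v, n_iter=3):
    """Turbo-style approximation of 2-D HMM state posteriors (toy sketch).

    Horizontal and vertical 1-D chains alternate; each treats the other's
    extrinsic information as an additional prior on the grid states.

    log_lik: (H, W, S) log-likelihood of each of S states at each grid site.
    Returns (H, W, S) approximate state posteriors (probabilities).
    """
    H, W, S = log_lik.shape
    row_msg = np.zeros((H, W, S))  # extrinsic info produced by horizontal chains
    col_msg = np.zeros((H, W, S))  # extrinsic info produced by vertical chains
    for _ in range(n_iter):
        for h in range(H):  # horizontal pass: one 1-D chain per row
            local = log_lik[h] + col_msg[h]
            row_msg[h] = forward_backward(local, log_trans_h) - local
        for w in range(W):  # vertical pass: one 1-D chain per column
            local = log_lik[:, w] + row_msg[:, w]
            col_msg[:, w] = forward_backward(local, log_trans_v) - local
    belief = log_lik + row_msg + col_msg
    return np.exp(belief - logsumexp(belief, axis=2, keepdims=True))
```

The alternation between row-wise and column-wise passes is what makes the scheme tractable: each pass costs only a set of 1-D forward-backward recursions instead of the exponential-in-width inference required by an exact 2-D HMM.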
The key difficulty is the accurate modeling of wrongful claims. For a distance such as the one introduced in this dissertation, we can model either the set of possible transformations between face images of different persons, or the impostor distribution directly. We show that the latter approach leads to the better classification performance.
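The second approach can be sketched in a deliberately simplified form: fit a parametric model to the distances observed between face images of different persons, then convert a new distance into a confidence that the claim is genuine. The choice of a one-dimensional Gaussian and all function names below are our own illustrative assumptions, not the model actually developed in the dissertation.

```python
from math import erf, sqrt

import numpy as np


def fit_impostor_model(impostor_distances):
    """Fit a 1-D Gaussian to distances measured between face images
    of *different* persons (impostor pairs). Returns (mean, std)."""
    d = np.asarray(impostor_distances, dtype=float)
    return float(d.mean()), float(d.std())


def genuine_confidence(distance, mu, sigma):
    """Confidence that an identity claim is genuine: the probability
    that an impostor comparison would yield a distance *larger* than
    the observed one, under the fitted Gaussian impostor model."""
    z = (distance - mu) / (sigma * sqrt(2.0))
    return 0.5 * (1.0 - erf(z))  # 1 - Phi((distance - mu) / sigma)
```

A distance far below the impostor mean yields a confidence close to 1, a distance equal to the impostor mean yields 0.5, and larger distances yield confidences approaching 0, giving a score that can be thresholded for verification.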