This paper addresses the problem of estimating the quality of digitally coded video sequences. The topic is of great interest because many digital video products are about to be released, and robust methodologies for testing and evaluating the performance of such devices are therefore needed. The inherent difficulty is that human vision must be taken into account if the quality assigned to a sequence is to correlate well with human judgment. It is well known that the commonly used metric, the signal-to-noise ratio, correlates poorly with perceived quality. A metric for the assessment of video coding quality is presented. It is based on a multi-channel model of human spatio-temporal vision that has been parameterized for video coding applications by psychophysical experiments. The mechanisms of vision are simulated by a spatio-temporal filter bank. The resulting decomposition is then used to account for phenomena such as contrast sensitivity and masking. Once the amount of distortion actually perceived is known, quality can be assessed at various levels. The described metric is able to rate the overall quality of the decoded video sequence as well as the rendition of important features of the sequence, such as contours or textures.
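To illustrate why the signal-to-noise ratio can disagree with human judgment, the following minimal sketch computes the peak signal-to-noise ratio (PSNR) for two distorted versions of the same row of pixels. It is not the metric proposed in this paper; the function name `psnr`, the 8-bit peak value of 255, and the pixel values are all assumptions chosen for illustration. The same error energy is placed once in a flat region and once in a textured region. Masking would typically make the second distortion less visible, yet PSNR rates both identically because it depends only on the mean squared error.

```python
import math


def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(decoded):
        raise ValueError("sequences must have the same length")
    # Mean squared error over all pixels; position of the error plays no role.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 8-pixel luminance row: a flat region followed by a busy texture.
reference = [100, 100, 100, 100, 40, 200, 60, 180]

# The same error (+8 on four pixels), placed in the flat or the textured region.
error_in_flat = [108, 108, 108, 108, 40, 200, 60, 180]
error_in_texture = [100, 100, 100, 100, 48, 208, 68, 188]

psnr_flat = psnr(reference, error_in_flat)
psnr_texture = psnr(reference, error_in_texture)

# Both distortions have the same mean squared error, so PSNR cannot tell them apart.
print(psnr_flat == psnr_texture)
```

A perceptual metric of the kind described here would instead weight each error by its visibility, which depends on the local spatio-temporal content of the sequence.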