
Abstract

The rapid development of digital imaging and video has placed visual content at the heart of our lives. Digital multimedia spans a vast number of areas from business to leisure, including but not limited to education, medicine, accessibility, training, advertisement, entertainment and social networks. The dominance of visual multimedia has created an increasing need for broadcasters and service providers to present content of superior visual quality while keeping storage and transmission costs as low as possible. Before being presented to users, all content is processed for transmission, which reduces its quality to a degree that depends on the characteristics of the processes involved. Besides enhancement methods applied as preprocessing and post-processing, compression is the key step of content delivery. The image and video processing communities have been proposing improved solutions to the multimedia compression problem for decades, using mathematical transforms, augmenting them with human visual system responses, and, most recently, incorporating deep neural networks. What distinguishes the proposed solutions from each other is twofold: one aspect is the solution architecture, and the other is how the solution performs. The performance of image and video compression models can be measured objectively and subjectively, with the latter emphasizing the quality of the content as perceived by users. Both when developing and when employing compression technologies, providers need to assess the end quality of their product. How this quality is estimated and measured is of key importance. Standardized psychophysical experiments measure the subjective quality of images and video, but they require the participation of many human subjects. Objective quality assessment methods seek to provide a better alternative by incurring no human cost at computation time while still predicting quality with high accuracy when compared to viewers' opinions.
An efficient compression method ideally needs to employ a strong objective metric to measure the impact of degradations effectively, thereby maximizing algorithm performance by achieving an optimal rate-distortion trade-off. This work addresses the problem of constructing an end-to-end image compression system using an objective metric with high correlation to subjective ratings. First, the challenges of building an effective objective metric are discussed, and multiple learning-based solutions using convolutional neural networks are proposed. To that end, the construction of a comprehensive database is presented, which comprises mean opinion scores of compressed high-resolution images obtained via subjective quality assessment experiments. Afterwards, traditional transform-based codecs are investigated along with their recent improvements and their learning-based counterparts, leading to the construction of novel end-to-end compression models using convolutional neural networks. The proposed autoencoders initially employ state-of-the-art objective metrics in their cost function. As a final step, the overall loss of the compression model is modified to include the aforementioned learning-based objective metric, combining the compression and quality assessment solutions proposed in this work. The presented approaches provide improvements and novel insights to the state of the art in both image quality assessment and learning-based image compression.
