Towards Real-World Super-Resolution using Deep Neural Networks

Image super-resolution reconstructs a higher-resolution image from an observed low-resolution image. In recent years, machine learning models have been widely employed, and deep networks have achieved state-of-the-art super-resolution performance. Most of these methods, however, are trained and evaluated on simulated datasets that assume a simple, uniform degradation model. Deep learning models trained on such simulated datasets fail to generalize to practical applications because the degradation in real-world low-resolution images is much more complex. In this thesis, we propose several approaches to improve the robustness and generalization of deep super-resolution models.

The first technique reduces the accumulation of errors in the camera pipeline. We build a deep residual network that learns an end-to-end mapping from raw images to high-resolution images. Trained on high-quality samples, the proposed network reconstructs high-quality super-resolved images from low-resolution Bayer mosaics in a single step. Extensive experiments show that the proposed method outperforms state-of-the-art techniques, both qualitatively and quantitatively.

To resolve the mismatch between the blur kernels applied in synthetic datasets and real-world camera blur, we propose to incorporate blur-kernel modeling into training. We generate the super-resolution training dataset with a set of realistic blur kernels estimated from real low-resolution photographs: a generative adversarial network builds a pool of realistic blur kernels, and we then train a super-resolution network on low-resolution images constructed with the generated kernels. Our method reconstructs more visually plausible high-resolution images than state-of-the-art methods that rely on a simple degradation model.
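The degradation model underlying this kernel-based data synthesis can be illustrated with a minimal sketch: blur the high-resolution image with a kernel, then decimate. For simplicity the sketch substitutes an isotropic Gaussian kernel for the GAN-generated pool described above, and uses direct decimation for downsampling; both are assumptions made for illustration only.

```python
import numpy as np

def gaussian_kernel(size=9, sigma=1.5):
    # Isotropic Gaussian blur kernel, normalized to sum to 1.
    # A simple stand-in for the realistic kernels that, in the
    # thesis, are drawn from a GAN-trained pool.
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def degrade(hr, kernel, scale=2):
    # LR = downsample(HR * k): convolve the HR image with the blur
    # kernel (reflect padding), then keep every `scale`-th pixel.
    pad = kernel.shape[0] // 2
    padded = np.pad(hr, pad, mode="reflect")
    blurred = np.zeros_like(hr, dtype=float)
    h, w = hr.shape
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + kernel.shape[0], j:j + kernel.shape[1]]
            blurred[i, j] = (patch * kernel).sum()
    return blurred[::scale, ::scale]

hr = np.random.default_rng(0).random((32, 32))
lr = degrade(hr, gaussian_kernel(), scale=2)  # shape (16, 16)
```

Training pairs built this way expose the network to blur statistics closer to those of real cameras than the fixed bicubic kernel of standard benchmarks.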
To study the effect of noise on super-resolution, we collect a microscopy dataset containing pairs of noisy low-resolution images and the corresponding high-resolution images. We then benchmark combinations of denoising methods and super-resolution networks on this dataset. Our experimental results show that super-resolution networks are sensitive to noise and that applying the two methods in sequence suffers from error accumulation. The benchmark results also suggest that the networks can benefit from joint optimization, so we use a single network for joint denoising and super-resolution. Our network, trained with a novel texture loss, outperforms any combination of state-of-the-art deep denoising and super-resolution networks.

Finally, to take advantage of the multi-modal data available in certain applications, we propose a super-resolution system based on fusing information from multiple sources. For spectral image super-resolution, we use two downsampled versions of the same image during training to infer a better high-resolution image; we refer to these inputs as a multi-scale modality. Because color images are usually captured at a higher resolution than spectral images, we use color images as a second modality to improve the super-resolution network. We build a pipeline that learns to super-resolve multi-scale spectral inputs guided by a color image, combining both modalities. Our proposed method is economical in time and memory consumption yet achieves competitive results.
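The noisy low-resolution inputs used in the denoising/super-resolution benchmark can be emulated with a simple synthetic degradation: block-average downsampling followed by additive noise. This is only a sketch; the additive white Gaussian noise model is an assumption for illustration, since the actual microscopy noise in the collected dataset is more complex.

```python
import numpy as np

def noisy_lr(hr, scale=2, sigma=0.05, seed=0):
    # Downsample by averaging scale x scale blocks, then add white
    # Gaussian noise and clip to [0, 1]. A simplified stand-in for
    # the real microscopy acquisition (noise model is an assumption).
    rng = np.random.default_rng(seed)
    h, w = hr.shape
    hr = hr[:h - h % scale, :w - w % scale]
    lr = hr.reshape(hr.shape[0] // scale, scale,
                    hr.shape[1] // scale, scale).mean(axis=(1, 3))
    return np.clip(lr + rng.normal(0.0, sigma, lr.shape), 0.0, 1.0)

hr = np.linspace(0.0, 1.0, 64 * 64).reshape(64, 64)
lr = noisy_lr(hr, scale=2, sigma=0.05)  # shape (32, 32), values in [0, 1]
```

Pairs (lr, hr) generated this way let one compare a denoise-then-upscale pipeline against a single jointly trained network on identical inputs.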

Süsstrunk, Sabine
Lausanne, EPFL


 Record created 2020-08-03, last modified 2020-10-23

