Deep Image Restoration: Between Data Fidelity and Learned Priors
Image restoration reconstructs, as faithfully as possible, an original image from a potentially degraded version of it. Image degradations can be of various types, for instance haze, unwanted reflections, optical or spectral aberrations, or other physically induced artifacts. Among the most fundamental restoration tasks are additive denoising, image inpainting, and super-resolution. Denoising recovers an original image from an observed version containing a noise component added to the image signal. It has significant theoretical importance, as various problems can be reduced to a denoising problem or reformulated to use a denoising solution. It also has significant practical importance due to its widespread use in imaging pipelines. Inpainting recovers image areas that are completely lost. Super-resolution increases the resolution of an image; in other words, it reconstructs an image with an effectively higher sampling rate and a larger-cutoff acquisition low-pass filter. To this end, it requires both effective deblurring and interpolation operations when the problem is viewed from a spatial perspective.
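For reference, these tasks are commonly described through a degradation model; one standard formulation, given here only as an illustrative sketch rather than the exact setting of the thesis, writes the observation $y$ in terms of the latent image $x$ as
$$y = x + n \ \text{(denoising)}, \qquad y = m \odot x \ \text{(inpainting)}, \qquad y = (h * x)\!\downarrow_s + n \ \text{(super-resolution)},$$
where $n$ is additive noise, $m$ is a binary mask of the observed pixels, $h$ is the acquisition low-pass filter, and $\downarrow_s$ denotes downsampling by a factor $s$.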
The available methods for image restoration can be divided into two main categories: classic restoration methods and the more recent approaches based on deep neural networks. With the advancement of deep learning, neural networks have pushed the previous performance limits in image restoration, often at the expense of interpretability and reliability. Here, reliability means fidelity to the original image data. Classic image restoration is based in part on data fidelity and in part on manually designed priors, with the weighting between them also chosen manually. Even though the distinction is often lost in the final output, the hallucinations induced by the prior are generally controllable and can be analyzed intuitively. This is, however, no longer the case with deep neural networks. These networks implicitly learn a prior and learn to be faithful to the original data through the thousands or more of hidden weights they contain. Hence, control and interpretability over the contribution and nature of the data fidelity and prior components are lost.
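For concreteness, a common instance of this classic formulation, used here only as an illustrative sketch, is the regularized least-squares objective
$$\hat{x} = \arg\min_x \; \|y - Ax\|_2^2 + \lambda\, R(x),$$
where the first term enforces fidelity to the observed data $y$ under a known degradation operator $A$, $R$ is a hand-crafted prior such as total variation, and the scalar $\lambda$ is the manually chosen weight balancing the two terms.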
In this thesis, we analyze denoising and super-resolution networks in the frequency domain to gain a deeper understanding of how image components and their interrelations are learned and manipulated by deep networks. Based on the obtained insights, we present a stochastic masking approach that improves learning. We also present a theoretical framework to evaluate a network's performance in learning the statistically optimal data fidelity and the optimal prior in a designed experimental setup. This framework is then generalized to denoising real image data by incorporating internal noise-level estimation. Lastly, we present a framework that generalizes various families of classic restoration methods based on explicit optimizations and that can incorporate learned network priors. The framework also learns the fusion weights that balance data fidelity against the learned prior, rather than relying on a manually designed heuristic. As the framework enables us to disentangle these two components, the fusion weights are explicit and given structurally per pixel. These weights can benefit both interpretability and various downstream applications.
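As an illustration of what explicit per-pixel fusion can look like (a sketch under assumed notation, not necessarily the exact formulation of the thesis), the restored image can be written as the pixel-wise combination
$$\hat{x} = W \odot x_{\mathrm{fid}} + (1 - W) \odot x_{\mathrm{prior}},$$
where $x_{\mathrm{fid}}$ is a data-fidelity-driven estimate, $x_{\mathrm{prior}}$ is the estimate produced by the learned prior, and $W$ is a weight map with one value per pixel; inspecting $W$ directly reveals where the output follows the observed data and where it is filled in by the prior.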