
Deep Image Priors on Neural Networks with No Training


Late last year, researchers at Oxford University and the Skolkovo Institute of Science and Technology detailed their work on deep image priors. The idea behind a "deep image prior" is intuitive from the samples they provide: take a noisy or distorted image and restore it to something resembling the original. Their work goes a step further, however, doing so without any reference to the original image and without a trained network.

A great deal of image statistics are captured by the structure of a convolutional image generator rather than by any learned capability ... (and) ... no aspect of the network is learned from data; instead, the weights of the network are always randomly initialized, so that the only prior information is in the structure of the network itself.

The research focuses on super-resolution, denoising, image reconstruction, and inpainting. The team created and demonstrated a generator network that, with no pre-training and no image database, is capable of rendering original-quality images. Their results are comparable with the trained deep convolutional neural network (ConvNet) baselines referenced in their research paper. Researchers Ulyanov, Vedaldi, and Lempitsky assert that:

The structure of a generator network is sufficient to capture a great deal of low-level image statistics prior to any learning ... we show that a randomly-initialized neural network can be used as a handcrafted prior with excellent results in standard inverse problems such as denoising, super-resolution, and inpainting ... (and) bridges the gap between two very popular families of image restoration methods: learning-based methods using ConvNets, and learning-free methods based on handcrafted image priors such as self-similarity.
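Concretely, the method fixes a random input tensor and uses gradient descent to fit the weights of a randomly initialized generator so that its output matches the corrupted image, stopping early before the network also fits the noise. A minimal PyTorch sketch of that loop follows; the tiny conv stack is a stand-in for illustration, not the authors' encoder-decoder architecture:

```python
import torch
import torch.nn as nn

# Placeholder generator: the paper uses a deeper encoder-decoder network;
# a small conv stack is enough to illustrate the idea.
net = nn.Sequential(
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid(),
)

z = torch.randn(1, 32, 64, 64)       # fixed random input, never updated
x_noisy = torch.rand(1, 3, 64, 64)   # the corrupted observation
opt = torch.optim.Adam(net.parameters(), lr=0.01)

for step in range(100):              # early stopping acts as the regularizer
    opt.zero_grad()
    loss = ((net(z) - x_noisy) ** 2).mean()
    loss.backward()
    opt.step()

restored = net(z).detach()           # the network output is the restored image
```

The only "prior" at work is the network's convolutional structure: no weights are pre-trained, and nothing but the single corrupted image is observed.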

The team implemented the generator networks in PyTorch. They developed modules for processing noise, distortion, and interference in an image, resulting from things like salt-and-pepper or "TV noise", pixel scrambling, and image masking. Image inpainting is the process of reconstructing the regions of an image hidden by a mask. Masks can potentially be things like copyright watermarks on stock images, though generic image-masking demonstrations were used in the sample code. Output samples from PNG files processed by the neural network show that the network successfully identifies and removes the mask as if it were an overlay atop the original image.
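For inpainting, the reconstruction loss is computed only over the known pixels, so the randomly initialized network must fill in the masked region from its structural prior alone. A hedged sketch of such a masked objective (the names and shapes here are illustrative, not the repository's code):

```python
import torch

def masked_mse(output, target, mask):
    """MSE over observed pixels only; mask is 1 where the pixel is known,
    0 inside the region to be inpainted."""
    return ((output - target) ** 2 * mask).sum() / mask.sum()

target = torch.rand(1, 3, 8, 8)      # the masked image we can observe
mask = torch.ones(1, 1, 8, 8)
mask[..., 2:6, 2:6] = 0              # unknown square to be filled in
loss = masked_mse(torch.zeros_like(target), target, mask)
```

Because the hole contributes nothing to the loss, whatever the generator paints there is driven purely by the network architecture's preference for natural-looking structure.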

The network itself alternates filtering operations such as convolution, upsampling and non-linear activation ... the choice of network architecture has a major effect on how the solution space is searched by methods such as gradient descent. In particular, we show that the network resists "bad" solutions and descends much more quickly towards naturally-looking images.

Their findings could challenge the notion that ConvNets derive their success from the ability to learn realistic priors from data. The team noted that their "Swiss-army knife approach" is computationally intensive, requiring several minutes of GPU time for a single 512 x 512 pixel image. The Python code, including Jupyter notebooks and sample data, can be found on GitHub.
