Recent Progress on Single-Image Super-Resolution

Copyright © DeNA Co.,Ltd. All Rights Reserved.

Self Introduction
Hiroto Honda
@hirotomusiker
■ Image sensor and low-level CV engineer (-2015)
■ ETH Zurich CVL visiting researcher (2013-2014)
■ DeNA AI R&D engineer (2017-),
  mainly on object detection and recognition
  (OSS: https://github.com/DeNA/Chainer_Mask_R-CNN )
■ CVPR NTIRE Workshop Program Committee

Contents
■ Introduction to Single-Image Super-Resolution (SISR)
■ Simple SISR Networks
⁃ SRCNN, ESPCN, VDSR
⁃ Upsampling methods – deconv or pixelshuffle
■ The baseline: SRResNet
⁃ SRResNet, SRGAN, and EDSR
■ Super Resolution and Perception
⁃ Restoration results and loss functions
⁃ Perception – Distortion Tradeoff
■ Conclusion

What can Single-Image Super-Resolution do?
■ Input: Low-Resolution Image
■ Output: High-Resolution Image, restored from the input

SISR is easy to try!
■ Resize the original (HR) images to get the LR inputs, then train on the (LR, HR) pairs
■ No manual annotations are necessary (self-supervised learning)

Progress on SISR
PSNR* gain over bicubic [dB], Set5 dataset, x4 (PSNR data from: 5)

bicubic   0.0 (reference)
A+        +1.86
SRCNN     +2.06
ESPCN     +2.48
VDSR      +2.93
SRResNet  +3.63
EDSR      +4.20

(Chart: methods plotted on a 2014-2017 timeline; from: https://github.com/jbhuang0604/SelfExSR)
SISR is getting more & more accurate...
* PSNR = 10 log10(255^2 / MSE) when the max value is 255
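The PSNR definition in the footnote can be sketched in a few lines. This is a minimal illustration, assuming images are plain lists of rows with values in 0-255; the function names and the tiny `hr` / `sr` arrays are made up for the example.

```python
import math

def mse(a, b):
    """Mean squared error between two equally sized 2-D images (lists of rows)."""
    diffs = [(x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)

def psnr(a, b, max_value=255.0):
    """PSNR in dB: 10 * log10(max_value^2 / MSE)."""
    err = mse(a, b)
    if err == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / err)

hr = [[52, 55], [61, 59]]  # hypothetical ground-truth patch
sr = [[54, 55], [60, 57]]  # hypothetical restored patch
print(round(psnr(hr, sr), 2))  # MSE = 2.25 -> about 44.61 dB
```

Because of the log scale, the "+X dB over bicubic" figures in the chart are differences of such values computed on the same test images.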

Non-deep method: Dictionary-based algorithm
Baseline of non-deep methods: A+ (2014)
http://www.vision.ee.ethz.ch/~timofter/publications/Timofte-ACCV-2014.pdf
■ LR patch = 0 x atom + 0.8 x atom + 0.05 x atom + ... (optimize the coefficients over the learnt dictionary)
■ HR patch = 0 x atom + 0.8 x atom + 0.05 x atom + ... (same coefficients, applied to the HR atoms)

How to train a deep SISR network
■ Crop patches from the ground-truth images: HR
■ Down-sample them to generate the input images: LR = g(HR), e.g. bicubic down-sampling
■ Put them into a batch: {LR}, {HR}
■ Train the network f with a pixel-wise loss function: MSE(HR, f(LR))
■ ...that's it!
(Diagram: LR = g(HR) -> f -> f(LR), compared with HR by MSE)
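The data side of this recipe can be sketched without any deep learning framework. The sketch below makes several assumptions that are not in the slides: g is a box-average down-sampler (a stand-in for bicubic), `f_placeholder` is a nearest-neighbour upscaler standing in for the trainable network f, and the image is synthetic.

```python
import random

def downsample(hr, scale):
    """g(HR): box-average down-sampling (an assumed stand-in for bicubic)."""
    h, w = len(hr) // scale, len(hr[0]) // scale
    return [[sum(hr[y * scale + dy][x * scale + dx]
                 for dy in range(scale) for dx in range(scale)) / scale ** 2
             for x in range(w)] for y in range(h)]

def make_batch(images, patch_size, scale, batch_size, rng):
    """Crop random HR patches and pair each with its down-sampled LR patch."""
    lr_batch, hr_batch = [], []
    for _ in range(batch_size):
        img = rng.choice(images)
        top = rng.randint(0, len(img) - patch_size)
        left = rng.randint(0, len(img[0]) - patch_size)
        hr = [row[left:left + patch_size] for row in img[top:top + patch_size]]
        lr_batch.append(downsample(hr, scale))
        hr_batch.append(hr)
    return lr_batch, hr_batch

def f_placeholder(lr, scale):
    """Stand-in for the network f: nearest-neighbour upscaling, nothing is learnt."""
    return [[lr[y // scale][x // scale] for x in range(len(lr[0]) * scale)]
            for y in range(len(lr) * scale)]

def mse(a, b):
    """Pixel-wise loss MSE(HR, f(LR))."""
    diffs = [(p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)

rng = random.Random(0)
image = [[(3 * x + 5 * y) % 256 for x in range(32)] for y in range(32)]  # synthetic HR image
lr_batch, hr_batch = make_batch([image], patch_size=8, scale=2, batch_size=4, rng=rng)
loss = sum(mse(hr, f_placeholder(lr, 2)) for lr, hr in zip(lr_batch, hr_batch)) / len(hr_batch)
print(loss)  # the quantity a real training loop would minimise w.r.t. the weights of f
```

In an actual setup, f would be a CNN and the loss would be back-propagated by the framework; only the pair generation and the loss definition carry over from this sketch.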

■ Simple SISR networks
⁃ SRCNN, ESPCN, VDSR
⁃ Upsampling methods – deconv or pixelshuffle

The first CNN-based SISR – SRCNN
■ The input is first upscaled by bicubic interpolation (x2 in the figure)
■ Three conv layers, kernel size: 9 - 1 - 5, 9 - 3 - 5, or 9 - 5 - 5
■ Pretty simple and straightforward!
from: 1)

VDSR: Deeper Version of SRCNN
■ 3x3 conv layers, 64 ch, depth D = 5 to 20
from: 3)

Efficient sub-pixel CNN (ESPCN)
■ The CNN operates on LR feature maps -> minimum operations
from: 2)

Difference between SRCNN / VDSR and ESPCN
■ Pre-upsampling (SRCNN, VDSR): bicubic x2 on the input, then the CNN. Costly, but flexible for different degradation types
■ Post-upsampling (ESPCN): CNN on the LR input, then pixel shuffle x2 at the output. Efficient, but the magnification is fixed to integer factors (x2, x3, x4...)

Upscaling methods - Deconvolution or PixelShuffle?
■ Deconvolution
Checkerboard artifacts: the number of input pixels contributing to each output pixel varies with location
https://distill.pub/2016/deconv-checkerboard/
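The uneven overlap behind checkerboard artifacts can be counted directly. The sketch below is a 1-D illustration with no padding; the function name and the sizes are chosen for the example. Each input pixel of a transposed convolution writes to `kernel` consecutive output positions, starting at `i * stride`.

```python
def contribution_counts(n_in, kernel, stride):
    """Count how many input pixels write to each output pixel of a 1-D transposed conv."""
    n_out = (n_in - 1) * stride + kernel
    counts = [0] * n_out
    for i in range(n_in):
        for k in range(kernel):
            counts[i * stride + k] += 1
    return counts

# Kernel size not divisible by the stride: the overlap alternates (checkerboard pattern).
print(contribution_counts(4, 3, 2))  # [1, 1, 2, 1, 2, 1, 2, 1, 1]
# Kernel size divisible by the stride: the overlap is uniform away from the borders.
print(contribution_counts(4, 4, 2))  # [1, 1, 2, 2, 2, 2, 2, 2, 1, 1]
```

Even with a uniform overlap the network can still learn uneven weights, which is one reason the linked article looks at resize followed by convolution as an alternative.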

■ How about resize - convolution?
Resize (bilinear / bicubic): fewer artifacts
e.g. Resize -> Conv2D (256ch -> 3ch)

■ Sub-pixel convolution (a.k.a. PixelShuffle)
Tile the channel data at every location,
e.g. 9 channels -> 3x3 sub-pixels at one location
The upscaling step itself is non-convolutional
from: 2)
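The channel-to-space rearrangement can be written out explicitly. This is a minimal sketch on nested lists, assuming a (C*r*r, H, W) layout and the channel ordering used by common framework implementations (sub-pixel row first, then column); the function name is chosen for the example.

```python
def pixel_shuffle(features, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r)."""
    c_in, h, w = len(features), len(features[0]), len(features[0][0])
    assert c_in % (r * r) == 0, "channel count must be a multiple of r*r"
    c_out = c_in // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for y in range(h * r):
            for x in range(w * r):
                # (y % r, x % r) selects the sub-pixel, (y // r, x // r) the LR location
                src = c * r * r + (y % r) * r + (x % r)
                out[c][y][x] = features[src][y // r][x // r]
    return out

# 9 channels at one LR location -> 3x3 sub-pixels of a single output channel
features = [[[k]] for k in range(9)]
print(pixel_shuffle(features, 3))  # [[[0, 1, 2], [3, 4, 5], [6, 7, 8]]]
```

No multiplications are involved in this step; all learnable work happens in the convolution that produces the C*r*r channels at LR resolution.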

■ The baseline: SRResNet
⁃ SRResNet, SRGAN, and EDSR

SRResNet* and SRGAN – the new standard
■ 16 residual blocks, 64 ch
■ Skip connection
■ Pixel shuffle x2 upsampling stages
■ Losses: MSE loss, perceptual loss (trained VGG), discriminator loss
from: 4)
* The network is called ‘SRResNet’ when only MSE loss is used.

SRResNet and SRGAN – the network details
・resblocks with skip connection
・pixel shuffle upsampling
・3 types of loss functions
from: 4)

Enhanced Deep Super Resolution (EDSR) network
32 residual blocks, 256 ch, skip connection, x2 upsampling, l1 loss
・more resblocks
・no BN layers in resblocks
・l1 loss
from: 5)

PSNR and appearances
A 1 dB difference matters a lot!
from: 5)

■ Super Resolution and Perception
⁃ Restoration results and loss functions
⁃ Perception – Distortion Tradeoff

Which do you prefer?
Method: PSNR
Original: (reference)
bicubic: 21.59 dB
SRResNet: 25.53 dB
SRGAN: 21.15 dB
from: 4)

SRResNet and SRGAN – comparison of restored images
Losses used by the compared models (one ● per model):
MSE loss ● ●
Perceptual loss using VGG ●
Discriminator loss ● ●
The model trained with MSE loss only has the highest PSNR
from: 4)

3 types of loss functions
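① l1/l2 loss: pixel-wise difference between the generated image and the ground truth
② perceptual loss: multi-scale feature matching between the generated image and the ground truth, using VGG
③ GAN loss: a discriminator judges the generated image and the ground truth as real / fake
(Axis in the figure: Low Distortion ... Good Perception)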
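One way to see why pixel-wise losses favour low distortion rather than sharp detail is a toy example that is not from the slides: suppose a single output pixel has three equally plausible ground-truth values (the `candidates` list is made up). The value that minimises the expected l2 loss is their mean, and the l1 minimiser is their median.

```python
def expected_loss(value, candidates, power):
    """Average |value - c|^power over equally likely ground-truth candidates."""
    return sum(abs(value - c) ** power for c in candidates) / len(candidates)

candidates = [0, 0, 255]  # hypothetical plausible values for one pixel

best_l2 = min(range(256), key=lambda v: expected_loss(v, candidates, 2))
best_l1 = min(range(256), key=lambda v: expected_loss(v, candidates, 1))
print(best_l2, best_l1)  # 85 0  (mean and median of the candidates)
```

The l2-optimal value 85 matches none of the plausible answers, which is the averaging effect usually blamed for overly smooth MSE-trained outputs; perceptual and GAN losses trade some of that pixel accuracy for plausibility.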

Perception-Distortion Tradeoff
No method can achieve low distortion and good perceptual quality at the same time!
from: 8)

What is SISR for?
Accurate or Plausible?
It depends on your application!
from: 4)

■ Summary

Progress on SISR – accuracy and runtime
PSNR gain over bicubic [dB], Set5 dataset, x4 (PSNR data from: 5)

bicubic   0.0 (reference)
A+        +1.86
SRCNN     +2.06
ESPCN     +2.48
VDSR      +2.93
SRResNet  +3.63
EDSR      +4.20

(Chart: methods plotted on a 2014-2017 timeline)
Mega-multiplications per input pixel for x2 restoration, bar labels as extracted from the chart: 0.44, 0.04, 0.74, 1.33, 40.7

Benchmark Details from NTIRE 2017
Methods compared in the figure: EDSR, SRResNet, VDSR, ESPCN, SRCNN, A+
from: 9)

Summary
■ Single-Image Super-Resolution is getting more accurate,
  but more costly
■ Resblocks with skip connections and pixel-shuffle upsampling are
  the key components
■ The SRResNet-based network is the current baseline
■ 'Accurate' or 'Plausible': what do you want?

Appendix: Residual Dense Network for Super-Resolution
DenseNet-based SRResNet
from: 6)

Appendix: Deep Back-Projection Networks For Super-Resolution
(best PSNR in the NTIRE '18 x8 bicubic downsampling track)
Up / down projection with dense connection
from: 7)

Datasets
■ DIV2K dataset (train, val)
  https://data.vision.ee.ethz.ch/cvl/DIV2K/
■ Set5 dataset (test)
  http://people.rennes.inria.fr/Aline.Roumy/results/SR_BMVC12.html
■ B100 dataset (test)
  https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/
■ Urban100 dataset (test)
  https://sites.google.com/site/jbhuang0604/publications/struct_sr

Competitions
■ NTIRE2017:
  New Trends in Image Restoration and Enhancement workshop and challenge on image super-resolution, in conjunction with CVPR 2017
  http://www.vision.ee.ethz.ch/ntire17/
  report: http://www.vision.ee.ethz.ch/~timofter/publications/Timofte-CVPRW-2017.pdf
■ NTIRE2018:
  New Trends in Image Restoration and Enhancement workshop and challenge on super-resolution, dehazing, and spectral reconstruction, in conjunction with CVPR 2018
  http://www.vision.ee.ethz.ch/ntire18/
  report: http://openaccess.thecvf.com/content_cvpr_2018_workshops/papers/w13/Timofte_NTIRE_2018_Challenge_CVPR_2018_paper.pdf
■ PIRM2018:
  Workshop and Challenge on Perceptual Image Restoration and Manipulation, in conjunction with ECCV 2018
  https://www.pirm2018.org/

References
1) Dong et al., Image Super-Resolution Using Deep Convolutional Networks, https://arxiv.org/abs/1501.00092
2) Shi et al., Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network, https://arxiv.org/abs/1609.05158
3) Kim et al., Accurate Image Super-Resolution Using Very Deep Convolutional Networks, https://arxiv.org/pdf/1511.04587
4) Ledig et al., Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
5) Lim et al., Enhanced Deep Residual Networks for Single Image Super-Resolution
6) Zhang et al., Residual Dense Network for Image Super-Resolution
7) Haris et al., Deep Back-Projection Networks For Super-Resolution, https://arxiv.org/pdf/1803.02735.pdf
8) Blau et al., The Perception-Distortion Tradeoff, https://arxiv.org/abs/1711.06077
9) Timofte et al., NTIRE 2017 Challenge on Single Image Super-Resolution: Methods and Results, http://www.vision.ee.ethz.ch/~timofter/publications/Timofte-CVPRW-2017.pdf
10) Super-Resolution.Benckmark, https://github.com/huangzehao/Super-Resolution.Benckmark
