Can Exposure, Noise and Compression affect Image Recognition? An Assessment of the Impacts on State-of-the-art ConvNets

CAN EXPOSURE, NOISE AND COMPRESSION
AFFECT IMAGE RECOGNITION?
An Assessment of the Impacts
on State-of-the-art ConvNets
STEFFENS, Cristiano R.; MESSIAS, Lucas R. V.;
DREWS-JR, Paulo J. L.; BOTELHO, Silvia S. d. C.
cristianosteffens@furg.br

OUTLINE
Evaluating image recognition models beyond validation sets
• Perception / Vision is an important component of modern autonomous systems
• CNNs hold the state-of-the-art in image recognition
• Growing interest in reliability / robustness
• Comprehensive assessment
  • Clear methodology
  • State-of-the-art models
  • Several types of distortion
• Further directions
  • How can we build better models?
  • Can we prevent systems from operating on faulty data?
  • Can we build better pipelines?

MOTIVATION
Under-exposure conditions
• Weakly illuminated scenes
• Time constraints (i.e. the robot depends on the image acquisition/processing to make a decision)
• Scenes with high dynamic ranges
• Small aperture (hardware construction)
• Low quality/cost sensors
Figure panels: Properly exposed, Low Range, Gamma 2, Gamma 4, Gamma 8

MOTIVATION
Some common conditions

MOTIVATION
Over-Exposure
• Scene with high dynamic range
• Ill adjusted optics/gain
• Time constraints
• Reflective surfaces
• Low dynamic range sensors
Figure panels: Properly exposed, Low Range, Gamma 1/2, Gamma 1/4, Gamma 1/8
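
The gamma labels on the exposure slides can be illustrated with a small sketch. It assumes the labels refer to the exponent of a power-law curve applied to intensities normalized to [0, 1], so that gamma 2, 4, 8 darken the image and gamma 1/2, 1/4, 1/8 brighten it. The function name `apply_gamma` and the 8-bit input are assumptions for illustration, not the authors' implementation, and the "Low Range" panel is not covered.

```python
import numpy as np

def apply_gamma(image: np.ndarray, gamma: float) -> np.ndarray:
    """Apply out = in ** gamma to an 8-bit image (assumed convention).

    gamma > 1 darkens the image (simulated under-exposure),
    gamma < 1 brightens it (simulated over-exposure).
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    normalized = image.astype(np.float64) / 255.0   # map to [0, 1]
    adjusted = np.power(normalized, gamma)          # power-law curve
    return np.round(adjusted * 255.0).astype(np.uint8)

# Exponents matching the labels shown on the slides.
UNDER_EXPOSURE_GAMMAS = (2.0, 4.0, 8.0)
OVER_EXPOSURE_GAMMAS = (0.5, 0.25, 0.125)

# Toy input: a uniform mid-gray RGB patch.
gray = np.full((2, 2, 3), 128, dtype=np.uint8)
darker = apply_gamma(gray, 2.0)
brighter = apply_gamma(gray, 0.5)
```

Pure black and pure white pixels are left unchanged by this curve; only the mid-tones move.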

MOTIVATION
Lossy Compression, Poisson, Gaussian, Salt & Pepper and Speckle Noise
• Bandwidth limitation
• Storage limitation
• Sensor quality
• Dead pixels (always off or always on)
• Wear and tear
• Dust, damage on lens and sensors, noise
Figure panels: Over-compression, Poisson Noise, Gaussian Noise, Salt & Pepper, Speckle Noise
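
The four noise types named on this slide can be sketched with textbook formulations. The noise levels (`sigma`, `amount`, `peak`) and the function names are illustrative assumptions, since the slides do not state the parameters used in the study. Images are assumed to be float arrays in [0, 1]. Lossy compression is left out because it needs an image codec.

```python
import numpy as np

def add_gaussian(img, rng, sigma=0.05):
    # Additive, signal-independent noise.
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def add_poisson(img, rng, peak=255.0):
    # Signal-dependent shot noise: each pixel is a Poisson draw
    # whose mean is the scaled intensity.
    return np.clip(rng.poisson(img * peak) / peak, 0.0, 1.0)

def add_salt_and_pepper(img, rng, amount=0.05):
    # A fraction `amount` of pixels is forced to black or white,
    # similar to dead pixels that are always off or always on.
    out = img.copy()
    u = rng.random(img.shape[:2])
    out[u < amount / 2] = 0.0
    out[u > 1 - amount / 2] = 1.0
    return out

def add_speckle(img, rng, sigma=0.1):
    # Multiplicative noise: the perturbation scales with the signal.
    return np.clip(img + img * rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

rng = np.random.default_rng(0)
img = np.full((8, 8, 3), 0.5)
noisy = {
    "gaussian": add_gaussian(img, rng),
    "poisson": add_poisson(img, rng),
    "salt_and_pepper": add_salt_and_pepper(img, rng),
    "speckle": add_speckle(img, rng),
}
```

In these formulations, Poisson and speckle noise vanish on a black image while Gaussian noise does not, which is the signal-dependent versus signal-independent distinction.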

EVALUATED MODELS
ImageNet Large Scale Visual Recognition Challenge (ILSVRC) State-of-the-Art holders
Model                Year   Size     Parameters     Top-1 Accuracy   Input Size (px)
Inception-ResNet-v2  2017   215 MB    55,873,736    0.80             299
MobileNetV1          2017    16 MB     4,253,864    0.70             224
NASNetLarge          2018   343 MB    88,949,818    0.83             331
NASNetMobile         2018    23 MB     5,326,716    0.74             224
VGG16                2014   528 MB   138,357,544    0.71             224
Xception             2017    88 MB    22,910,480    0.79             299

PROCEDURE
A procedure that can be reproduced and used for any vision task
• We use pre-trained image recognition models
  • No fine-tuning
  • Exact same preprocessing as in the original implementation
• Official ImageNet validation set
  • 1000 classes
  • 50 images per class
• Inference on:
  • Original set (baseline, to rule out hardware-related, interpolation, and other biases)
  • 8 levels of mis-exposure
  • Over-compressed images
  • 4 types of typical noise
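
The procedure above can be sketched as a loop that runs the same frozen model over the original set and over each distorted copy. The names `top1_accuracy`, `evaluate`, `predict`, and `distortions` are hypothetical. In practice `predict` would wrap a pre-trained network together with its original preprocessing; here a toy stand-in keeps the sketch self-contained.

```python
from typing import Callable, Dict, Iterable, List, Tuple

def top1_accuracy(predict: Callable, samples: List[Tuple], distort: Callable) -> float:
    """Fraction of samples whose predicted label matches the ground truth."""
    correct = 0
    for image, label in samples:
        # The distortion is applied to the raw image, before the
        # model's own preprocessing (assumed to live inside `predict`).
        if predict(distort(image)) == label:
            correct += 1
    return correct / len(samples) if samples else 0.0

def evaluate(predict: Callable, samples: Iterable[Tuple],
             distortions: Dict[str, Callable]) -> Dict[str, float]:
    """Top-1 accuracy of one frozen model under each named distortion."""
    samples = list(samples)
    return {name: top1_accuracy(predict, samples, fn)
            for name, fn in distortions.items()}

# Toy stand-ins: "images" are integers and the "model" is x % 3.
toy_samples = [(0, 0), (1, 1), (2, 2), (3, 0)]
toy_predict = lambda x: x % 3
toy_distortions = {
    "original": lambda x: x,      # baseline run on untouched data
    "shifted": lambda x: x + 1,   # stands in for any distortion
}
report = evaluate(toy_predict, toy_samples, toy_distortions)
```

Keeping the "original" entry in the same loop means the baseline goes through the same code path as the distorted runs.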

RESULTS
Overview – Top-1 Accuracy

RESULTS - INCEPTION-RESNET-V2
Overall good performance. Robust towards mild mis-exposure, compression, Gaussian and Poisson noise
FNs are limited to 50 per class because the validation set has 50 images per class
There is no upper bound for FPs
Statistics are per class: a median of 10 means that 50% of the classes in the dataset had 10 or fewer false negatives
What is more important? Would you rather run over a person due to a FN, or stop in the middle of the road due to a FP?
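
The per-class statistics described above can be sketched as follows. This assumes single-label top-1 predictions, where each misclassified image counts as one false negative for its true class and one false positive for the predicted class. The function name `per_class_errors` and the toy labels are illustrative.

```python
from collections import Counter
from statistics import median

def per_class_errors(y_true, y_pred, classes):
    """Count false negatives and false positives for every class."""
    fn = Counter()
    fp = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth != pred:
            fn[truth] += 1   # the true class missed one of its images
            fp[pred] += 1    # the predicted class gained a wrong image
    # Report every class, including those with zero errors.
    return {c: fn[c] for c in classes}, {c: fp[c] for c in classes}

# Toy data with two images per class (the ILSVRC validation set has 50).
classes = ["cat", "dog", "fox"]
y_true = ["cat", "cat", "dog", "dog", "fox", "fox"]
y_pred = ["cat", "dog", "dog", "dog", "dog", "fox"]

fn_counts, fp_counts = per_class_errors(y_true, y_pred, classes)
median_fn = median(fn_counts.values())
median_fp = median(fp_counts.values())
```

A class can never have more false negatives than it has images, whereas false positives can pile up on a single class, as "dog" does in the toy data.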

RESULTS - MOBILENETV1
No robustness to S&P and Speckle Noise. Highly affected by moderate mis-exposure.

RESULTS - NASNET LARGE
Best accuracy, precision, and F1-Score among all models considered in this study

RESULTS - NASNET MOBILE
Significantly affected by severe mis-exposure conditions, S&P, and Speckle noise

RESULTS – VGG16
Slightly affected by mild mis-exposure and Poisson noise

RESULTS – XCEPTION
Robust towards moderate mis-exposure, over-compression, Gaussian and Poisson noise

CONCLUSION
New is Always better! Larger is better!
• Relevant
  • Autonomous systems
  • Robotics
  • Applications that rely on visual perception
• Comprehensive experiment
  • Broad set of classifiers
  • Based on standard ILSVRC validation set
  • Poor exposure
  • Heavy compression
  • Signal independent noise
  • Signal dependent noise
• Reproducible procedure
  • Objective evaluation
  • No human bias
• Most models are
  • Little affected by mild mis-exposure
  • Robust towards Poisson and Gaussian noise
  • Critically affected by moderate to severe mis-exposure
  • Critically affected by S&P and Speckle noise
• CNNs are evolving
  • Modern architectures, such as NASNet, Inception-ResNet-v2 and Xception, are more robust
  • VGG is among the least robust
• Large models are better
  • NOT you, VGG!!
  • NASNet Large performs significantly better than its Mobile version (although both share the same building blocks)
  • Mobile models are the most affected

ONGOING AND FUTURE WORK
We have a real issue! How can we solve it?
• Could the models’ accuracy be improved by adding these common distortions at training time?
😞 Preliminary results show only a small improvement
• Can we build image processing pipelines that protect the application from failing due to faulty data?
😃 Absolutely! Preliminary results are promising 👉
• Can we prevent ill exposure in mobile/outdoor robotics?
⏳ Future Work
• Can we improve classification models by putting more emphasis on image classes that are more prone to error?
⏳ Future Work
Figure panels: Original, Damaged, Restored
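
One simple way to keep a system from operating on faulty data is to flag badly exposed frames before they reach the classifier. The sketch below is an assumption for illustration only and is not the restoration pipeline hinted at on this slide. It measures the fraction of near-black and near-white pixels in an 8-bit image; the thresholds `low`, `high`, and `max_fraction` are hypothetical and would need tuning per camera.

```python
import numpy as np

def clipped_fractions(image: np.ndarray, low: int = 5, high: int = 250):
    """Return the fractions of near-black and near-white pixel values."""
    total = image.size
    dark = np.count_nonzero(image <= low) / total
    bright = np.count_nonzero(image >= high) / total
    return dark, bright

def looks_badly_exposed(image: np.ndarray, max_fraction: float = 0.5) -> bool:
    """Flag a frame when most of its pixels are crushed or blown out."""
    dark, bright = clipped_fractions(image)
    return dark > max_fraction or bright > max_fraction

# Toy frames: all black, mid-gray, and all white.
black = np.zeros((4, 4, 3), dtype=np.uint8)
mid_gray = np.full((4, 4, 3), 128, dtype=np.uint8)
white = np.full((4, 4, 3), 255, dtype=np.uint8)
flags = [looks_badly_exposed(f) for f in (black, mid_gray, white)]
```

A flagged frame could then be skipped, re-acquired with different exposure settings, or passed to a restoration step before classification.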

THANKS!
Cristiano Steffens
cristianosteffens@furg.br
researchgate.net/profile/Cristiano_Steffens
cristianosteffens@ieee.org.br
github.com/steffensbola
This study was financed in part by the Coordenação de Aperfeiçoamento de
Pessoal de Nível Superior – Brasil (CAPES) – Finance Code 001.
