DEEP LEARNING IN LIMITED
RESOURCE ENVIRONMENTS
OGUZ VURUSKANER
OVERVIEW
➢ Limited Resource Environments
➢ Training Improvements
➢ Self-Adversarial Training
➢ Architectural Improvements
➢ Model Quantization
➢ Depthwise Separable Convolutions
➢ References
LIMITED RESOURCE ENVIRONMENTS
➢ In practice, the supply of any resource is limited at every point in time.
➢ Virtually unlimited resources: capacity can be extended on demand. Training
environments mostly have virtually unlimited resources (e.g. data centers,
cloud services).
➢ Limited resources: not extendable (e.g. the Perseverance Mars rover,
embedded devices, mobile phones).
MODEL IMPROVEMENTS
➢ Training Improvements
➢ Architectural Improvements
FIRE DETECTION DATASET
• A benchmark dataset for our model experiments.
• It will be released publicly in the coming months.
• 4,200 training images, 672 validation images.
TRAINING IMPROVEMENTS
SELF-ADVERSARIAL TRAINING
• Adding small but intentional worst-case perturbations to the input causes the
model to output an incorrect answer with high confidence. [1]
• Even though deep learning models have complex non-linear computational
graphs, they can be deceived by a simple linear method called the Fast
Gradient Sign Method (FGSM).
• In our experiments, we used FGSM; a sketch follows the figure below.
FAST GRADIENT SIGN METHOD
(Figure: clean image + ε · sign(∇x J(θ, x, y)) = adversarial example)
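A minimal PyTorch sketch of the FGSM perturbation; the ε value and the loss function here are illustrative assumptions, not the exact settings from our experiments:

```python
import torch

def fgsm_perturb(model, x, y, loss_fn, epsilon=0.03):
    """Return x + epsilon * sign(grad_x J(theta, x, y)), the FGSM adversarial example."""
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), y)
    # Gradient of the loss w.r.t. the input only (leaves parameter grads untouched).
    grad = torch.autograd.grad(loss, x)[0]
    return (x + epsilon * grad.sign()).detach()
```

For self-adversarial training, each batch can be extended with its own perturbed copy (e.g. adding loss_fn(model(x_adv), y) to the clean loss) before the optimizer step.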
FIRE DETECTION RESULTS
MODEL                    | CORRECT ALARM | FALSE ALARM
ResNet-18 w/ Adversarial | 91.1%         | 2.9%
ResNet-18                | 91.0%         | 3.2%
CONCLUSION
• FGSM is a valid data augmentation strategy: it improved performance at a
considerably small training-time cost.
• One advantage of FGSM is that its perturbation vector depends strictly on the
current state of the trained model, making it a self-evolving data
augmentation strategy.
ARCHITECTURAL IMPROVEMENTS
MODEL QUANTIZATION
• Quantization maps a real value to an integer value; the reverse process is
called dequantization.
• Typically, quantization converts 32-bit floating point to 1-byte integers, a
4x memory saving!
Typical quantization scheme [2]: r = S(q − Z)
S is called the scale and Z the zero-point. Together they define an affine
transformation between real values and integer values.
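As a sketch, this affine scheme fits in a few lines of NumPy (the signed-byte range [-127, 127] matches the mapping on the next slide; this is an illustration, not any particular library's API):

```python
import numpy as np

def quantize(r, S, Z, qmin=-127, qmax=127):
    """Real -> integer: q = clamp(round(r / S) + Z, qmin, qmax)."""
    q = np.round(r / S) + Z
    return np.clip(q, qmin, qmax).astype(np.int8)

def dequantize(q, S, Z):
    """Integer -> real (approximate): r = S * (q - Z)."""
    return S * (q.astype(np.float32) - Z)
```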
MODEL QUANTIZATION
Quantization mapping between floating point and a signed byte, with scale
S = 0.024 and zero-point Z = 0:
Real value  | -3.048 | -2.0 | 0.0 | 3.0 | 4.0
Signed byte | -127   | -83  | 0   | 125 | 127 (clamped)
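Running the quantize sketch above on these values reproduces the mapping, with 4.0 falling outside the representable range and clamping to 127:

```python
r = np.array([-3.048, -2.0, 0.0, 3.0, 4.0])
q = quantize(r, S=0.024, Z=0)
print(q)                            # [-127  -83    0  125  127]
print(dequantize(q, S=0.024, Z=0))  # [-3.048 -1.992  0.  3.  3.048]
```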
QUANTIZATION AWARE TRAINING
• This technique snaps the floating-point weights to the nearest quantization
level within a given quantization interval [a, b] after every training step.
Quantization step: s(a, b, n) = (b − a) / (n − 1)
Quantization level: q(r; a, b, n) = round((clamp(r; a, b) − a) / s(a, b, n)) · s(a, b, n) + a
The clamp function, clamp(r; a, b) = min(max(r, a), b), translates the input
domain to the quantization interval.
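A PyTorch sketch of this fake-quantization step, following the step/level definitions above (n = 256 levels for 8 bits is an assumption):

```python
import torch

def fake_quantize(r, a, b, n=256):
    """Snap each value to the nearest of n evenly spaced levels in [a, b]."""
    r = torch.clamp(r, a, b)                  # translate domain to [a, b]
    s = (b - a) / (n - 1)                     # quantization step
    return torch.round((r - a) / s) * s + a   # nearest quantization level
```

In quantization-aware training this function is applied to the weights after every training step, so the network learns parameters that survive the precision loss.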
FIRE DETECTION RESULTS
MODEL         | CORRECT ALARM | FALSE ALARM
ResNet-18 QAT | 90.3%         | 2.8%
ResNet-18     | 91.0%         | 3.2%
RESOURCE USAGE
CONCLUSION
• In single-batch inference, the quantized model is roughly twice as fast. In
general, however, quantized inference times are inconsistent.
• Comparing overall inference behavior, standard FP32 inference still comes out
ahead: it has a higher average inference time but less deviation.
DEPTHWISE SEPARABLE
CONVOLUTIONS (MOBILENET[3])
(Figure: naïve convolution vs. depthwise separable convolution)
DEPTHWISE SEPARABLE
CONVOLUTIONS (MOBILENET)
• Naïve convolution complexity: D_K · D_K · M · N · D_F · D_F
• Depthwise separable convolution complexity:
D_K · D_K · M · D_F · D_F + M · N · D_F · D_F
(D_K: kernel size, D_F: feature map size, M: input channels, N: output
channels; the cost reduction is 1/N + 1/D_K², as derived in [3].)
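A minimal PyTorch sketch of the depthwise separable block from [3]: a 3x3 depthwise convolution (groups = in_ch) followed by a 1x1 pointwise convolution, each with BatchNorm and ReLU:

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """3x3 depthwise convolution followed by 1x1 pointwise convolution."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.block = nn.Sequential(
            # Depthwise: one 3x3 filter per input channel.
            nn.Conv2d(in_ch, in_ch, 3, stride=stride, padding=1,
                      groups=in_ch, bias=False),
            nn.BatchNorm2d(in_ch),
            nn.ReLU(inplace=True),
            # Pointwise: 1x1 convolution mixes information across channels.
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)
```

For in_ch = out_ch = 64, the convolution weights shrink from 3·3·64·64 = 36,864 to 3·3·64 + 64·64 = 4,672, matching the 1/N + 1/D_K² reduction above.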
USE CASE
• Unsupervised anomaly detection on real-time streams requires continuous
training of the deep learning model.
• To increase inference speed and reduce memory usage, we proposed using
depthwise separable convolutions.
• An hourglass model is trained on normal video frames and then tested on
anomalous video frames.
USE CASE
An example hourglass network architecture
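For illustration, a toy hourglass (encoder-decoder) can be assembled from the DepthwiseSeparableConv block above; the depth and channel widths here are assumptions, not our exact architecture:

```python
class Hourglass(nn.Module):
    """Toy autoencoder: compress frames, then reconstruct them.
    At test time, a high reconstruction error flags an anomalous frame."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            DepthwiseSeparableConv(3, 32, stride=2),    # downsample
            DepthwiseSeparableConv(32, 64, stride=2),   # bottleneck
        )
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2),
            DepthwiseSeparableConv(64, 32),
            nn.Upsample(scale_factor=2),
            nn.Conv2d(32, 3, 3, padding=1),             # back to RGB
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))
```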
RESULTS
MODEL             | PARAMETERS | AVERAGE INFERENCE TIME
Naïve convolution | 537K       | 0.106 s
DS convolution    | 93.8K      | 0.144 s
CONCLUSION
• Replacing each naïve convolution with a depthwise separable convolution added
2 extra layers. That is likely why inference became slower even though the DS
model has fewer parameters.
• Real-time anomaly detection with self-trained models is still an active
research field.
FUTURE WORK
• Student-Teacher Models
• Feature-Based Knowledge Distillation
• Response-Based Knowledge Distillation
• Pseudo Labels
• Confident Learning: dataset label improvement
• Pseudo labels combined with student-teacher models: Meta Pseudo Labels
REFERENCES
1. Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy.
"Explaining and harnessing adversarial examples." arXiv preprint
arXiv:1412.6572 (2014).
2. Jacob, Benoit, et al. "Quantization and training of neural networks for
efficient integer-arithmetic-only inference." Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition. 2018.
3. Howard, Andrew G., et al. "MobileNets: Efficient convolutional neural
networks for mobile vision applications." arXiv preprint
arXiv:1704.04861 (2017).