Improving Resource Availability in
Data Centers using Deep Learning
(Improving the Efficiency of Resource Utilization in Data Centers Using Deep Learning)
Kundjanasith Thonglek
Software Design & Analysis Laboratory
Outline
➢ Introduction
➢ Improving availability of computing resources
○ Methodology
○ Evaluation
➢ Improving availability of storage resources
○ Methodology
○ Evaluation
➢ Conclusion
Data Centers
Data centers are centralized facilities where computing and storage
hardware is aggregated to handle large amounts of data and computation.
Technical challenges
➢ System monitoring
➢ Energy management
➢ Continuous migration
➢ Availability improvement
Objective
I aim to improve the availability of computing and storage resources in
data centers by applying deep learning.
Resource utilization is paramount to many cloud providers as they need
to utilize their hardware resources efficiently to maximize profit.
Storage Resources
❖ Hard Disk
Computing Resources
❖ CPU, Memory
Users excessively request computing resources
➢ Users tend to request more computing resources than their applications actually need
○ Computing resources left unused by applications are wasted
○ Overall computing resource utilization in the data center degrades
Overview of Proposed Method
➢ Analyzing Cluster Usage: analyze Google’s cluster usage trace obtained from a production data center
➢ Designing Neural Network: design an LSTM-based model that predicts better resource allocations from historical data of resource usage and allocation
➢ Training LSTM Model: train the model using Google’s cluster usage trace
➢ Evaluation: evaluate the improvement in resource utilization using Google’s cluster scheduler simulator
Analyzing Cluster Usage
Google’s cluster usage trace is real workload data from a production Google data center.

Computing Resource | Requested Resource | Used Resource
CPU                | Requested CPU      | Used CPU
Memory             | Requested memory   | Used memory
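To make the analysis step concrete, here is a minimal sketch of computing per-job resource wastage from such a trace; the file name and column names are illustrative assumptions (the real trace uses its own schema with normalized units).

```python
# Hedged sketch of the trace analysis step; the CSV file and column names
# are hypothetical stand-ins for the Google cluster usage trace schema.
import pandas as pd

df = pd.read_csv("cluster_usage_trace.csv")

# Wasted resources = requested minus actually used, per job.
df["wasted_cpu"] = df["requested_cpu"] - df["used_cpu"]
df["wasted_memory"] = df["requested_memory"] - df["used_memory"]

print(df[["wasted_cpu", "wasted_memory"]].describe())
```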
Long Short-Term Memory
Recurrent Neural Network (RNN)
➢ Deep learning model for time-series forecasting
➢ Model size does not increase with the size of the input
➢ Weights are shared across time
Long Short-Term Memory (LSTM) introduces long-term memory into RNNs
➢ LSTM mitigates the vanishing gradient problem, where the neural
network stops learning because the updates to the weights become
smaller and smaller
➢ A memory cell replaces the hidden neurons used in traditional RNNs to
build a hidden layer
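As a standard textbook sketch (background, not from the slides) of why the gradient vanishes: backpropagating through $T$ time steps multiplies $T$ Jacobians, so if each factor has norm below one the gradient shrinks exponentially,

$$\frac{\partial \mathcal{L}}{\partial h_0} \;=\; \frac{\partial \mathcal{L}}{\partial h_T} \prod_{t=1}^{T} \frac{\partial h_t}{\partial h_{t-1}}, \qquad \left\lVert \frac{\partial h_t}{\partial h_{t-1}} \right\rVert < 1 \;\Rightarrow\; \text{the product decays exponentially in } T.$$

The LSTM memory cell adds a nearly additive update path through time, which avoids this repeated multiplication.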
Proposed neural network
Input: the requested and used CPU and memory resources
1st LSTM layer: finds the correlation between CPU and memory
2nd LSTM layer: finds the correlation between allocated and used resources
Fully connected layer: connects each neuron of the previous layer to the outputs
Output: the efficient CPU and memory allocation
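A minimal Keras sketch of this architecture, assuming illustrative layer widths, window length, and activations (none of these hyperparameters are given on the slides):

```python
# Sketch of the proposed two-LSTM network; hyperparameters are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

SEQ_LEN = 12     # length of the resource-usage history window (assumption)
N_FEATURES = 4   # requested CPU, requested memory, used CPU, used memory

model = models.Sequential([
    layers.Input(shape=(SEQ_LEN, N_FEATURES)),
    layers.LSTM(64, return_sequences=True),  # 1st LSTM: CPU vs. memory correlation
    layers.LSTM(64),                         # 2nd LSTM: allocated vs. used correlation
    layers.Dense(32, activation="relu"),     # fully connected layer
    layers.Dense(2, activation="sigmoid"),   # efficient CPU and memory allocation (%)
])
model.compile(optimizer="adam", loss="mse")
```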
Training LSTM Model
We improve resource utilization by training a Long Short-Term Memory
model on the requested CPU, requested memory, used CPU, and used memory.
[Diagram: allocated and used CPU and memory (%) time series flowing through the model]
Memory cell size
➔ 20 minutes
➔ 40 minutes
➔ 60 minutes
The memory cell size of the LSTM model determines how many step-wise
input-output pairs of each sequence the model memorizes.
Usage Simulation
We simulate resource utilization in the data center by applying the
resource allocations predicted by our time-series model to the actual
computing workload.
[Pipeline: Google’s cluster usage data (513,000 jobs) → training dataset (80%) and testing dataset (20%) → LSTM/RNN model → predicted resource allocation (CPU %, memory %) → Google’s cluster scheduler simulation]
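A sketch of the corresponding split-train-predict flow; the preprocessed array files are hypothetical, and `model` refers to the network sketched earlier. Note that 20% of 513,000 jobs is 102,600, which matches the inference workload reported below.

```python
# 80/20 split of the trace and training for 100 epochs (illustrative).
import numpy as np

X = np.load("trace_windows.npy")   # assumed shape: (513000, SEQ_LEN, N_FEATURES)
y = np.load("trace_targets.npy")   # assumed shape: (513000, 2)

split = int(0.8 * len(X))          # 80% training, 20% testing (102,600 jobs)
x_train, x_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

model.fit(x_train, y_train, epochs=100, batch_size=256,
          validation_data=(x_test, y_test))
predicted_alloc = model.predict(x_test)  # allocations fed to the cluster simulator
```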
Decreased Computing Resource Wastage
[Bar charts: decrease in wasted CPU and memory for each model configuration, ranging from -3% to -48%; the best configuration reduces CPU wastage by 11% and memory wastage by 48%]
Training time & Inference time
[Bar charts: training time (for 100 epochs) and inference time (for 102,600 jobs) of each model configuration; values shown: 408.93, 35.67, 130.82, 49.77, 35.13, 28.78]
ML models are becoming larger
ML model compression improves storage usage efficiency by reducing the
size of ML models, thereby increasing the availability of storage resources.
Model Name | Model Size | Application
GPT-3      | 700 GB     | Language Processing
VGG-16     | 528 MB     | Image Classification
Mask RCNN  | 256 MB     | Object Detection
Normally, ML model compression reduces the model size, but it also
decreases the accuracy.
Compressing models while maintaining accuracy
[Pipeline: Original Model → Quantization (decreases model size with some loss of accuracy) → Quantized Model → Retraining (increases model accuracy while keeping the model size) → Compressed Model]
Vector Quantization: lossy data compression
[Diagram: weight values are grouped into clusters (calculate clusters) and each cluster is replaced by its centroid (calculate centroids)]
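A minimal sketch of vector quantization applied to one weight tensor, using k-means to calculate the clusters and centroids; the centroid count is an illustrative assumption.

```python
# Vector quantization of a weight tensor via k-means (illustrative sketch).
import numpy as np
from sklearn.cluster import KMeans

def quantize_weights(weights, n_centroids=32):
    """Cluster the weight values; store one small centroid index per weight."""
    flat = weights.reshape(-1, 1)
    km = KMeans(n_clusters=n_centroids, n_init=10).fit(flat)
    codebook = km.cluster_centers_.flatten()  # n_centroids float values
    indices = km.labels_.astype(np.uint8)     # one byte (or less) per weight
    return codebook, indices, weights.shape

def dequantize(codebook, indices, shape):
    """Lossy reconstruction: every weight becomes its cluster centroid."""
    return codebook[indices].reshape(shape)

w = np.random.randn(64, 64).astype(np.float32)
codebook, idx, shape = quantize_weights(w)
w_approx = dequantize(codebook, idx, shape)
```

Storing small indices plus a short codebook in place of 32-bit floats is what shrinks the model; the reconstruction error is the accuracy loss that retraining then recovers.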
Retraining using unlabeled data
Most existing retraining methods require a labeled dataset. Retraining
with an unlabeled dataset is highly useful when the labeled dataset is
unavailable, for example due to privacy policies or license limitations.
[Diagram: a labeled dataset (data + labels) is often inaccessible to researchers/developers because of privacy policies and license limitations, while an unlabeled dataset (data only) remains available]
Proposed Retraining Method
[Diagram: the unlabeled dataset is fed to both the original model and the quantized model; a loss is computed between their two output vectors and used to update only the trainable layers of the quantized model, while the remaining layers stay non-trainable]
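A hedged TensorFlow sketch of one step of this retraining: the original model acts as a teacher, and the distance between the two output vectors serves as the loss. The squared-error loss and the freezing mechanism are illustrative assumptions.

```python
# One retraining step on unlabeled data (illustrative sketch).
import tensorflow as tf

optimizer = tf.keras.optimizers.Adam()

def retrain_step(original_model, quantized_model, batch):
    target = original_model(batch, training=False)         # teacher output vector
    with tf.GradientTape() as tape:
        output = quantized_model(batch, training=True)     # student output vector
        loss = tf.reduce_mean(tf.square(target - output))  # no labels required
    # Only layers left trainable are updated; the non-trainable layers
    # were frozen beforehand with layer.trainable = False.
    grads = tape.gradient(loss, quantized_model.trainable_variables)
    optimizer.apply_gradients(zip(grads, quantized_model.trainable_variables))
    return loss
```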
Case Study
VGG-16 ResNet-50
Case Study of VGG-16
[Figures: VGG-16 model architecture, bias value distribution, and weight value distribution]
Model Quantization
[Charts: size and accuracy of quantized VGG-16 models vs. the # of quantized layers]
Model Retraining
Retraining Quantized VGG-16 models
[Chart: accuracy of retrained quantized VGG-16 models vs. the # of centroids]
Quantizing the 14th and 15th layers using 32-256 centroids achieved
nearly the accuracy of the original model.
The best configuration for quantizing the VGG-16 model:
- Quantize the biases in all layers using 1 centroid
- Quantize the weights in the 14th and 15th layers using 32 centroids
This compresses the model to the smallest possible size without
significant accuracy loss, as sketched below.
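For illustration, this best configuration could be applied with the `quantize_weights` sketch from the vector-quantization section; the Keras accessors and the counting of weighted layers are assumptions.

```python
# Hypothetical application of the best VGG-16 configuration found above.
import tensorflow as tf

vgg16 = tf.keras.applications.VGG16(weights=None)  # architecture only

idx = 0
for layer in vgg16.layers:
    params = layer.get_weights()
    if len(params) != 2:
        continue                                   # count only weighted layers
    idx += 1
    w, b = params
    b_q = quantize_weights(b, n_centroids=1)       # biases: 1 centroid in all layers
    if idx in (14, 15):                            # the 14th and 15th weighted layers
        w_q = quantize_weights(w, n_centroids=32)  # weights: 32 centroids
```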
Case Study of ResNet-50
[Figures: ResNet-50 model architecture, bias value distribution, and weight value distribution]
Model Quantization
[Charts: size and accuracy of quantized ResNet-50 models vs. the # of quantized layers]
Model Retraining
Retraining Quantized ResNet-50 models
[Chart: accuracy of retrained quantized ResNet-50 models vs. the # of centroids]
Quantizing the 13th-49th layers using 128 or fewer centroids clearly
degrades the accuracy of the model.
The best configuration for quantizing the ResNet-50 model:
- Quantize the biases in all layers using 1 centroid
- Quantize the weights in the 13th-49th layers using 256 centroids
This compresses the model to the smallest possible size without
significant accuracy loss.
Conventional & Proposed Retraining
[Charts: accuracy of the quantized model after retraining (85% vs. 82%) and retraining time of the quantized model, comparing the conventional and proposed retraining methods]
*The conventional retraining method retrains all layers of the model
Conclusion
➢ Improving availability of computing resources
○ We proposed an LSTM-based prediction model that predicts efficient computing resource
allocations to improve computing resource availability
○ The proposed method improves the availability of CPU and memory resources by 11% and
48%, respectively
➢ Improving availability of storage resources
○ We proposed a method for reducing the size of neural network models without significant
accuracy loss to improve storage resource availability
○ The proposed method improves the storage resource availability of VGG-16 and
ResNet-50 by 81% and 52%, respectively
Future Work
➢ Improving availability of computing resources
○ The significant features that affect computing resource availability should be
investigated to make the method more efficient
○ We would like to apply other time-series forecasting techniques to improve the
availability of computing resources
➢ Improving availability of storage resources
○ The structure of other neural network models should be investigated to extend the
efficient retraining method
○ We would like to apply compression techniques other than quantization to reduce
the size of neural network models
Publications
➢ Improving availability of computing resources
○ Kundjanasith Thonglek, Kohei Ichikawa, Keichi Takahashi, Chawanat Nakasan, and Hajimu
Iida, “Improving Resource Utilization in Data Centers using an LSTM-based Prediction
Model”, Proceedings of the Workshop on Monitoring and Analysis for High Performance
Computing Systems Plus Applications (HPCMASPA 2019), September 2019.
➢ Improving availability of storage resources
○ Kundjanasith Thonglek, Keichi Takahashi, Kohei Ichikawa, Chawanat Nakasan, Hidemoto
Nakada, Ryousei Takano, and Hajimu Iida, “Retraining Quantized Neural Network Models
with Unlabeled Data”, Proceedings of the International Joint Conference on Neural Networks
(IJCNN 2020), July 2020.
Q&A
Thank you
Email: thonglek.kundjanasith.ti7@is.naist.jp