Improving Resource Availability in
Data Centers using Deep Learning
(Improving the Efficiency of Resource Utilization in Data Centers Using Deep Learning)
Kundjanasith Thonglek
Software Design & Analysis Laboratory
Outline
➢ Introduction
➢ Improving availability of computing resources
○ Methodology
○ Evaluation
➢ Improving availability of storage resources
○ Methodology
○ Evaluation
➢ Conclusion
Data Centers
Data centers are centralized facilities where computing and storage
hardware is aggregated to handle large amounts of data and computation.
Technical challenges
➢ System monitoring
➢ Energy management
➢ Continuous migration
➢ Availability improvement
Objective
I aim to improve the availability of computing and storage resources in
data centers by applying deep learning.
Resource utilization is paramount to many cloud providers as they need
to utilize their hardware resources efficiently to maximize profit.
Storage Resources
❖ Hard Disk
Computing Resources
❖ CPU, Memory
Users excessively request computing resources
➢ Users tend to request more computing resources than their applications actually need
○ Computing resources left unused by applications are wasted
○ Overall computing resource utilization in the data center degrades
Overview of Proposed Method
➢ Analyzing Cluster Usage: analyze Google’s cluster usage trace obtained from a production data center
➢ Designing Neural Network: design an LSTM-based model that predicts better resource allocations from historical data of resource usage and allocation
➢ Training LSTM Model: train the model using Google’s cluster usage trace
➢ Evaluation: evaluate the improvement in resource utilization using Google’s cluster scheduler simulator
Analyzing Cluster Usage
Google’s cluster usage trace is real workload data from a production Google data center.

Computing Resource | Requested Resource | Used Resource
CPU                | Requested CPU      | Used CPU
Memory             | Requested memory   | Used memory
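To make the analysis step concrete, here is a minimal sketch of computing per-job resource wastage from such a trace; the file name and column names are illustrative assumptions (the real trace uses its own schema with normalized units).

```python
# Hedged sketch of the trace analysis step; the CSV file and column names
# are hypothetical stand-ins for the Google cluster usage trace schema.
import pandas as pd

df = pd.read_csv("cluster_usage_trace.csv")

# Wasted resources = requested minus actually used, per job.
df["wasted_cpu"] = df["requested_cpu"] - df["used_cpu"]
df["wasted_memory"] = df["requested_memory"] - df["used_memory"]

print(df[["wasted_cpu", "wasted_memory"]].describe())
```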
Long Short-Term Memory
Recurrent Neural Network (RNN)
➢ Deep learning model for time-series forecasting
➢ Model size does not increase with the size of the input
➢ Weights are shared across time
Long Short-Term Memory (LSTM) introduces long-term memory into RNNs
➢ LSTM mitigates the vanishing gradient problem, where the neural
network stops learning because the updates to the weights become
smaller and smaller
➢ A memory cell replaces the hidden neurons used in traditional RNNs to
build a hidden layer
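As a standard textbook sketch (background, not from the slides) of why the gradient vanishes: backpropagating through $T$ time steps multiplies $T$ Jacobians, so if each factor has norm below one the gradient shrinks exponentially,

$$\frac{\partial \mathcal{L}}{\partial h_0} \;=\; \frac{\partial \mathcal{L}}{\partial h_T} \prod_{t=1}^{T} \frac{\partial h_t}{\partial h_{t-1}}, \qquad \left\lVert \frac{\partial h_t}{\partial h_{t-1}} \right\rVert < 1 \;\Rightarrow\; \text{the product decays exponentially in } T.$$

The LSTM memory cell adds a nearly additive update path through time, which avoids this repeated multiplication.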
Proposed neural network
Input: the requested and used CPU and memory resources
1st LSTM layer: finds the correlation between CPU and memory
2nd LSTM layer: finds the correlation between allocated and used resources
Fully connected layer: connects each neuron of the previous layer to the outputs
Output: the efficient CPU and memory allocation
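A minimal Keras sketch of this architecture, assuming illustrative layer widths, window length, and activations (none of these hyperparameters are given on the slides):

```python
# Sketch of the proposed two-LSTM network; hyperparameters are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

SEQ_LEN = 12     # length of the resource-usage history window (assumption)
N_FEATURES = 4   # requested CPU, requested memory, used CPU, used memory

model = models.Sequential([
    layers.Input(shape=(SEQ_LEN, N_FEATURES)),
    layers.LSTM(64, return_sequences=True),  # 1st LSTM: CPU vs. memory correlation
    layers.LSTM(64),                         # 2nd LSTM: allocated vs. used correlation
    layers.Dense(32, activation="relu"),     # fully connected layer
    layers.Dense(2, activation="sigmoid"),   # efficient CPU and memory allocation (%)
])
model.compile(optimizer="adam", loss="mse")
```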
Training LSTM Model
We improve resource utilization by training a Long Short-Term Memory
model on the requested CPU, requested memory, used CPU, and used memory.
[Diagram: allocated and used CPU and memory (%) time series flowing through the model]
Memory cell size
➔ 20 minutes
➔ 40 minutes
➔ 60 minutes
The memory cell size of the LSTM model determines how many step-wise
input-output pairs of each sequence the model memorizes.
Usage Simulation
We simulate resource utilization in the data center by applying the
resource allocations predicted by our time-series model to the actual
computing workload.
[Pipeline: Google’s cluster usage data (513,000 jobs) → training dataset (80%) and testing dataset (20%) → LSTM/RNN model → predicted resource allocation (CPU %, memory %) → Google’s cluster scheduler simulation]
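A sketch of the corresponding split-train-predict flow; the preprocessed array files are hypothetical, and `model` refers to the network sketched earlier. Note that 20% of 513,000 jobs is 102,600, which matches the inference workload reported below.

```python
# 80/20 split of the trace and training for 100 epochs (illustrative).
import numpy as np

X = np.load("trace_windows.npy")   # assumed shape: (513000, SEQ_LEN, N_FEATURES)
y = np.load("trace_targets.npy")   # assumed shape: (513000, 2)

split = int(0.8 * len(X))          # 80% training, 20% testing (102,600 jobs)
x_train, x_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

model.fit(x_train, y_train, epochs=100, batch_size=256,
          validation_data=(x_test, y_test))
predicted_alloc = model.predict(x_test)  # allocations fed to the cluster simulator
```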
Decreased Computing Resource Wastage
[Bar charts: decrease in wasted CPU and memory for each model configuration, ranging from -3% to -48%; the best configuration reduces CPU wastage by 11% and memory wastage by 48%]
Training time & Inference time
[Bar charts: training time (for 100 epochs) and inference time (for 102,600 jobs) of each model configuration; values shown: 408.93, 35.67, 130.82, 49.77, 35.13, 28.78]
ML models are becoming larger
ML model compression improves storage usage efficiency by reducing the
size of ML models, thereby increasing the availability of storage resources.
Model Name | Model Size | Application
GPT-3      | 700 GB     | Language Processing
VGG-16     | 528 MB     | Image Classification
Mask RCNN  | 256 MB     | Object Detection
Normally, ML model compression reduces the model size, but it also
decreases the accuracy.
Compressing models while maintaining accuracy
[Pipeline: Original Model → Quantization (decreases model size with some loss of accuracy) → Quantized Model → Retraining (increases model accuracy while keeping the model size) → Compressed Model]
Vector Quantization: lossy data compression
[Diagram: weight values are grouped into clusters (calculate clusters) and each cluster is replaced by its centroid (calculate centroids)]
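A minimal sketch of vector quantization applied to one weight tensor, using k-means to calculate the clusters and centroids; the centroid count is an illustrative assumption.

```python
# Vector quantization of a weight tensor via k-means (illustrative sketch).
import numpy as np
from sklearn.cluster import KMeans

def quantize_weights(weights, n_centroids=32):
    """Cluster the weight values; store one small centroid index per weight."""
    flat = weights.reshape(-1, 1)
    km = KMeans(n_clusters=n_centroids, n_init=10).fit(flat)
    codebook = km.cluster_centers_.flatten()  # n_centroids float values
    indices = km.labels_.astype(np.uint8)     # one byte (or less) per weight
    return codebook, indices, weights.shape

def dequantize(codebook, indices, shape):
    """Lossy reconstruction: every weight becomes its cluster centroid."""
    return codebook[indices].reshape(shape)

w = np.random.randn(64, 64).astype(np.float32)
codebook, idx, shape = quantize_weights(w)
w_approx = dequantize(codebook, idx, shape)
```

Storing small indices plus a short codebook in place of 32-bit floats is what shrinks the model; the reconstruction error is the accuracy loss that retraining then recovers.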
Retraining using unlabeled data
Most existing retraining methods require a labeled dataset. Retraining
with an unlabeled dataset is highly useful when the labeled dataset is
unavailable, for example due to privacy policies or license limitations.
[Diagram: a labeled dataset (data + labels) is often inaccessible to researchers/developers because of privacy policies and license limitations, while an unlabeled dataset (data only) remains available]
Proposed Retraining Method
[Diagram: the unlabeled dataset is fed to both the original model and the quantized model; a loss is computed between their two output vectors and used to update only the trainable layers of the quantized model, while the remaining layers stay non-trainable]
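A hedged TensorFlow sketch of one step of this retraining: the original model acts as a teacher, and the distance between the two output vectors serves as the loss. The squared-error loss and the freezing mechanism are illustrative assumptions.

```python
# One retraining step on unlabeled data (illustrative sketch).
import tensorflow as tf

optimizer = tf.keras.optimizers.Adam()

def retrain_step(original_model, quantized_model, batch):
    target = original_model(batch, training=False)         # teacher output vector
    with tf.GradientTape() as tape:
        output = quantized_model(batch, training=True)     # student output vector
        loss = tf.reduce_mean(tf.square(target - output))  # no labels required
    # Only layers left trainable are updated; the non-trainable layers
    # were frozen beforehand with layer.trainable = False.
    grads = tape.gradient(loss, quantized_model.trainable_variables)
    optimizer.apply_gradients(zip(grads, quantized_model.trainable_variables))
    return loss
```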
Case Study
VGG-16 ResNet-50
Case Study of VGG-16
[Figures: VGG-16 model architecture, bias value distribution, and weight value distribution]
Model Quantization
[Charts: size and accuracy of quantized VGG-16 models vs. the # of quantized layers]
Model Retraining
Retraining Quantized VGG-16 models
[Chart: accuracy of retrained quantized VGG-16 models vs. the # of centroids]
Quantizing the 14th and 15th layers using 32-256 centroids achieved
nearly the accuracy of the original model.
The best configuration for quantizing the VGG-16 model:
- Quantize the biases in all layers using 1 centroid
- Quantize the weights in the 14th and 15th layers using 32 centroids
This compresses the model to the smallest possible size without
significant accuracy loss, as sketched below.
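For illustration, this best configuration could be applied with the `quantize_weights` sketch from the vector-quantization section; the Keras accessors and the counting of weighted layers are assumptions.

```python
# Hypothetical application of the best VGG-16 configuration found above.
import tensorflow as tf

vgg16 = tf.keras.applications.VGG16(weights=None)  # architecture only

idx = 0
for layer in vgg16.layers:
    params = layer.get_weights()
    if len(params) != 2:
        continue                                   # count only weighted layers
    idx += 1
    w, b = params
    b_q = quantize_weights(b, n_centroids=1)       # biases: 1 centroid in all layers
    if idx in (14, 15):                            # the 14th and 15th weighted layers
        w_q = quantize_weights(w, n_centroids=32)  # weights: 32 centroids
```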
Case Study of ResNet-50
[Figures: ResNet-50 model architecture, bias value distribution, and weight value distribution]
Model Quantization
[Charts: size and accuracy of quantized ResNet-50 models vs. the # of quantized layers]
Model Retraining
Retraining Quantized ResNet-50 models
[Chart: accuracy of retrained quantized ResNet-50 models vs. the # of centroids]
Quantizing the 13th-49th layers using 128 or fewer centroids clearly
degrades the accuracy of the model.
The best configuration for quantizing the ResNet-50 model:
- Quantize the biases in all layers using 1 centroid
- Quantize the weights in the 13th-49th layers using 256 centroids
This compresses the model to the smallest possible size without
significant accuracy loss.
Conventional & Proposed Retraining
[Charts: accuracy of the quantized model after retraining (85% vs. 82%) and retraining time of the quantized model, comparing the conventional and proposed retraining methods]
*The conventional retraining method retrains all layers of the model
Conclusion
➢ Improving availability of computing resources
○ We proposed an LSTM-based prediction model that predicts efficient computing resource
allocations to improve computing resource availability
○ The proposed method improves the availability of CPU and memory resources by 11% and
48%, respectively
➢ Improving availability of storage resources
○ We proposed a method for reducing the size of neural network models without significant
accuracy loss to improve storage resource availability
○ The proposed method improves the storage resource availability of VGG-16 and
ResNet-50 by 81% and 52%, respectively
Future Work
➢ Improving availability of computing resources
○ The significant features that affect computing resource availability should be
investigated to make the method more efficient
○ We would like to apply other time-series forecasting techniques to improve the
availability of computing resources
➢ Improving availability of storage resources
○ The structure of other neural network models should be investigated to extend the
efficient retraining method
○ We would like to apply compression techniques other than quantization to reduce
the size of neural network models
Publications
➢ Improving availability of computing resources
○ Kundjanasith Thonglek, Kohei Ichikawa, Keichi Takahashi, Chawanat Nakasan, and Hajimu
Iida, “Improving Resource Utilization in Data Centers using an LSTM-based Prediction
Model”, Proceedings of the Workshop on Monitoring and Analysis for High Performance
Computing Systems Plus Applications (HPCMASPA 2019), September 2019.
➢ Improving availability of storage resources
○ Kundjanasith Thonglek, Keichi Takahashi, Kohei Ichikawa, Chawanat Nakasan, Hidemoto
Nakada, Ryousei Takano, and Hajimu Iida, “Retraining Quantized Neural Network Models
with Unlabeled Data”, Proceedings of the International Joint Conference on Neural Networks
(IJCNN 2020), July 2020.
Q&A
Thank you
Email: thonglek.kundjanasith.ti7@is.naist.jp