International Journal of Electrical and Computer Engineering (IJECE)
Vol. 13, No. 6, December 2023, pp. 6378~6387
ISSN: 2088-8708, DOI: 10.11591/ijece.v13i6.pp6378-6387
Journal homepage: http://ijece.iaescore.com
Optimal coding unit decision for early termination in high
efficiency video coding using enhanced whale optimization
algorithm
Suhas Shankarnahalli Krishnegowda, Hosanna Princye Periapandi
Department of Electronics and Communication Engineering, S.E.A. College of Engineering and Technology, Visvesvaraya
Technological University, Belagavi, India
Article Info ABSTRACT
Article history:
Received Jan 28, 2023
Revised Jul 8, 2023
Accepted Jul 17, 2023
Video compression is an active research topic in the field of block-based
video encoders. Owing to advances in video coding technologies, high
efficiency video coding (HEVC) delivers superior coding performance. However,
the HEVC improves the rate-distortion (RD) performance at the cost of
increased encoding complexity. In video compression, large coding units (CUs)
have higher encoding complexity. Therefore, the computational encoding cost
and complexity remain vital concerns, which need to be treated as an
optimization task. In this manuscript, an enhanced whale optimization
algorithm (EWOA) is implemented to reduce the computational time and
complexity of the HEVC. In the EWOA, a cosine function is incorporated
into the control parameter A, and two correlation factors are included in
the WOA to control the position of the whales and regulate the movement
of the search mechanism during the optimization and search processes. The bit
streams in the luma coding tree block are selected using the EWOA, which defines
the CU neighbors used in the HEVC. The results indicate that the
EWOA achieves the best bit rate (BR), time saving, and peak signal to noise ratio
(PSNR). The EWOA showed 0.006-0.012 dB higher PSNR than the existing
models on the real-time videos.
Keywords:
Coding units
Discrete cosine transform
Faster encoding
High efficiency video coding
Prediction units
Whale optimization algorithm
This is an open access article under the CC BY-SA license.
Corresponding Author:
Suhas Shankarnahalli Krishnegowda
Department of Electronics and Communication Engineering, S.E.A. College of Engineering and Technology,
Visvesvaraya Technological University
Belagavi-590018, India
Email: suhashoodi@gmail.com
1. INTRODUCTION
In recent times, the demand for higher-definition video services has increased in applications such as
digital broadcast and internet streaming [1], [2]. To meet the need for transmission and storage of higher
resolution videos, a new video coding standard named high efficiency video coding (HEVC) was developed
[3]–[5]. HEVC is more efficient than earlier video coding standards such as moving
picture experts group (MPEG)-2, H.264, and MPEG-4 part 2 [6], [7]. In HEVC, the compression improvement
is mainly based on new encoding methodologies such as asymmetric motion partitioning, additional
intra-prediction modes, and so on. Among the available methodologies, the flexible quad-tree partitioning
of the coding tree unit (CTU) is particularly efficient [8]. When the size of the CTU is 64 × 64, the size of a coding
unit (CU) is 64 × 64, 32 × 32, 16 × 16, or 8 × 8, corresponding to a depth of 0, 1, 2, or 3, respectively. A CU
at a given depth is partitioned into 4 blocks of the next depth. In HEVC, the optimum CU partition is selected according
to the rate distortion (RD) costs [9], [10]. In HEVC, a CU is further partitioned into several prediction units
(PUs). Hence, the optimal PU modes are those with the minimal RD cost among the several inter
and intra PU modes. In higher resolution videos, the mode decision process ensures compression
efficiency. However, searching the large space of candidate PU and CU decisions results in high computational
time and complexity, which limits the usage of HEVC encoders in real-time applications [11]–[13].
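The selection rule described above, choosing the candidate with the minimal RD cost, can be illustrated with a short sketch. It assumes a Lagrangian cost of the form SSE + λ·R, which matches the form given later in (12); the function names and the numeric values are hypothetical and serve only to show the rule, not the encoder's actual implementation.

```python
def rd_cost(sse, rate_bits, lam):
    """Lagrangian rate-distortion cost J = SSE + lambda * R."""
    return sse + lam * rate_bits


def best_mode(candidates, lam):
    """Return the name of the candidate with the smallest RD cost.

    `candidates` maps a mode name to a (sse, rate_bits) pair.
    """
    return min(candidates, key=lambda m: rd_cost(*candidates[m], lam))


# Hypothetical numbers, chosen only to illustrate the selection rule.
modes = {
    "2Nx2N": (1200.0, 40),
    "NxN": (900.0, 95),
    "skip": (1500.0, 2),
}
print(best_mode(modes, lam=10.0))
print(best_mode(modes, lam=1.0))
```

A larger λ penalizes the rate more heavily, so the cheapest-to-signal candidate wins; a smaller λ favors the candidate with the lowest distortion. Evaluating every candidate in this way is what makes the exhaustive search expensive.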
Some of the conventional methods used for video compression are
convolutional neural networks (CNN) [14] and adaptive switching neural networks [15]. Duvar et al. [16]
presented an effective decision algorithm for reducing the encoding time of the HEVC. Initially, the intra-block
similarity was evaluated at the PU level by using integral images. In addition, at the CU level, an early
termination mode was developed. Hence, the developed fast inter mode decision algorithm
bypasses PU modes in the PU phase, and also removes unnecessary checks in the CU phase at a lower
depth. In this study, the efficacy of the developed algorithm was investigated by means of bit rate
and peak signal to noise ratio (PSNR). The experimental outcomes demonstrated that the developed algorithm
significantly improves the coding efficacy with low system complexity. In addition, the developed algorithm
delivers a good trade-off between time savings and coding efficacy compared to the earlier approaches. However,
negative side lobes and edge artifacts were generated by the developed algorithm, which affect the system
performance. Cen et al. [17] developed a new fast CU depth decision framework for decreasing the
computational complexity of the HEVC. The developed framework includes a CU depth range determination
and a new CU depth comparison algorithm. In this study, the CU depth range was identified based on the
distribution of CU depths in similar sequences. The experimental results confirmed that the computational
complexity of the developed framework was low compared to the existing works. However, the developed algorithm
fails to retain high video quality at the receiver side.
Jiang and Nooshabadi [18] presented a series of optimization methods for multi-view HEVC at
different levels of coding abstraction. In this work, optimized resource-scheduled wavefront
parallel processing and quantization-parameter-based early termination of the CTU were performed for
disparity estimation and parallel motion estimation. In the experimental investigation, the developed
optimization methods achieved better results than the previous research work in terms of
PSNR and bit error rate. The developed algorithm effectively reduces the system complexity, but it did not
address the major issue of poor video resolution. Bouaafia et al. [19] presented a deep convolutional
neural network (DCNN) and a support vector machine (SVM) in the inter-mode HEVC for optimizing the
complexity allocation at the CU level. Initially, the SVM-based fast CU model was developed for decreasing
the HEVC complexity, and further, the DCNN model was utilized for predicting the CU partition. The
experimental outcome indicates that the developed online SVM and DCNN models achieved better results in
terms of time saving and bit rate. However, the developed algorithm reduces the importance of the color
components in the compressed videos.
Ma et al. [20] introduced a new fast intra-coding algorithm for speeding up the encoding
process. At first, a fast CU size decision model was implemented for selecting different depth decision
algorithms for every coding unit. Then, a fast directional mode decision technique was employed, which
compares the directional modes of the parent units. The best directional mode of the parent units and the RD
cost of the first directional mode were combined for efficiently selecting the best directional mode for the current
unit. The experimental outcome shows that the developed algorithm attained good performance in
video encoding in terms of Bjontegaard delta bit rate (BDBR) and time saving. However, the developed algorithm
was not able to handle massive workloads at higher speeds. Kuanar et al. [21] implemented a new CNN model
for effective CU mode selection in the HEVC. The extensive experimental investigation showed that
the developed CNN model significantly decreased the encoding time compared to other state-of-the-art
machine learning models, but it was computationally expensive.
Hassan et al. [22] developed a surgical telemonitoring system based on HEVC by implementing a
shallow CNN model. The experimental investigation confirmed that the shallow CNN model maintains higher
visual quality with a better bit rate. Compared to the state-of-the-art models, the developed shallow CNN
model was effective and efficient for surgical telemonitoring systems. He et al. [23] developed a new fuzzy-based
SVM classifier for improving the compression efficiency of the HEVC. In addition, the fuzzy-based SVM
classifier was improved by utilizing the information entropy measure for handling outliers and the negative
impact of data noise. However, the CNN model and the fuzzy-based SVM classifier were
computationally complex and required high-end systems. Imen et al. [24] integrated a modified
AlexNet and a modified LeNet-5 for predicting the HEVC's CU partition. The experimental analysis states that the
developed model was computationally complex. The key contributions of this research paper are given below:
a. Proposed an enhanced whale optimization algorithm (EWOA) to decrease the computational time and
complexity of the HEVC, which selects the bit streams in the luma coding tree block for effectively
determining the CU neighbors. The EWOA is more effective in optimization problems than
conventional optimization algorithms such as the puzzle optimization algorithm [25] and the stochastic Komodo
algorithm [26].
b. Implemented the discrete cosine transform (DCT) for transforming the residuals, which are generated by
subtracting the prediction values from the input values. The efficacy of the EWOA is analyzed in light of ∆BR,
time saving, and ∆PSNR.
The remainder of this paper is organized as follows: the methodology, the results and discussion, and the
conclusion of the EWOA are presented in sections 2 to 4, respectively.
2. RESEARCH METHOD
In this research, the efficacy of the EWOA is tested on seventeen online videos: PeopleOnStreet, Traffic,
Kimono, ParkScene, Cactus, BQTerrace, FourPeople, PartyScene, BasketballDrive, Johnny, BasketballDrill,
BQMall, RaceHorses, BasketballPass, BQSquare, BlowingBubbles, and KristenAndSara. The sample video
frames are shown in Figure 1. The proposed framework includes three major steps: optimal
bit-stream selection in HEVC using EWOA, inter and intra prediction in HEVC, and data transformation by
DCT. The workflow of the proposed framework is represented in Figure 2.
Figure 1. Sample collected video frames
Figure 2. Workflow of the proposed framework
Initially, the frames are extracted from the videos and given to the HEVC for
predicting the motion in the video sequences. The basic design of the HEVC is very similar to that of H.264.
Here, the block-based coding approach exploits both spatial and temporal statistical
dependencies. Generally, the HEVC utilizes flexible and adaptive quad-tree coding block partitions for
effective coding, transformation, and prediction. The basic information about HEVC is given as follows.
2.1. Prediction structure
Generally, the quad-tree block partition is based on the CTU structure, which is analogous to
the macroblock of earlier standards. A video is a sequence of frames, and in HEVC, every coded video frame is
divided into slices and CTUs. Further, the CTUs are subdivided into square regions named CUs.
In the HEVC, a CU is predicted using inter- or intra-prediction, and the 1st frame of the video sequence at
each random access point is coded utilizing intra-prediction. The remaining video frames are coded by
performing inter-prediction, and the prediction residuals are then transformed in transform units (TUs) by
implementing the DCT algorithm. Usually, a CTU is made up of a luma coding tree block (CTB), two chroma CTBs,
and the quad-tree syntax, where every chroma CTB has a block size of (𝑁/2) × (𝑁/2) and the luma
CTB has a block size of 𝑁 × 𝑁. A CTB may be the same size as a coding block (CB) or be split into several CBs,
and each CU is associated with TUs and PUs. The inter prediction, intra prediction, and coding
modes are selected at the CU level, where 𝑁, referred to here as the bit-stream size, takes the value 64, 32, 16, or 8.
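The size relations described above can be sketched in a few lines. This is a minimal illustration, assuming 4:2:0 chroma sampling (which is consistent with the (𝑁/2) × (𝑁/2) chroma CTB size stated above) and a CTU size of 64; the function names are hypothetical.

```python
def cu_size_at_depth(ctu_size, depth):
    """Each quad-tree split halves the block side: size = ctu_size / 2**depth."""
    return ctu_size >> depth


def ctb_sizes(n):
    """Luma and chroma CTB dimensions for an N x N luma block (4:2:0 assumed)."""
    return (n, n), (n // 2, n // 2)


for depth in range(4):
    print(depth, cu_size_at_depth(64, depth))
print(ctb_sizes(64))
```

With a 64 × 64 CTU, depths 0 to 3 give CU sides of 64, 32, 16, and 8, which are the candidate sizes among which the partition decision has to choose.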
In this research manuscript, the bit-streams are chosen by implementing an effective meta-heuristic
algorithm named EWOA, which follows the hunting behavior of humpback whales; here, the error rate is
considered as the objective function. After creating an initial population, the humpback whale improves its
location based on the encircling method, which is mathematically defined in (1) and (2) [27], [28], where
𝑡 indicates the iteration number, 𝐷 indicates the distance between the prey 𝑃′(𝑡) and the humpback whale's
position 𝑃(𝑡), and 𝐴 and 𝐵 are coefficient vectors, which are determined in (3) and (4).
𝐷 = |𝐵 ⊙ 𝑃′(𝑡) − 𝑃(𝑡)| (1)
𝑃(𝑡 + 1) = |𝑃′(𝑡) − 𝐴 ⊙ 𝐷| (2)
𝐴 = 2𝑙 ⊙ 𝑟 − 𝑙 (3)
𝐵 = 2𝑟 (4)
where 𝑟 represents a random vector whose elements range between 0 and 1, and 𝑙 represents the linearity value,
which ranges between 0 and 2. On the other hand, the bubble-net method is accomplished based on shrinking
encircling and spiral position updating, as shown in (5) and (6), where ⊙ indicates element-by-element
multiplication, 𝑏 represents a constant that determines the shape of the logarithmic spiral,
𝑎 denotes a random value in the range [-1, 1], and 𝐷́ = |𝑃′(𝑡) − 𝑃(𝑡)| represents the distance between
the humpback whale and the prey.
$$P(t+1) = \acute{D} \odot e^{ba} \odot \cos(2\pi a) + P'(t) \tag{5}$$
$$P(t+1) = \begin{cases} P'(t) - A \odot D & \text{if } p \geq 0.5 \\ \acute{D} \odot e^{ba} \odot \cos(2\pi a) + P'(t) & \text{if } p < 0.5 \end{cases} \tag{6}$$
where 𝑝 ∈ [0,1] indicates the probability of choosing the shrinking encircling method or the spiral method to adjust
the whales' position. The humpback whales search for their prey in the exploration phase. Here, the position of a
humpback whale is updated with respect to a randomly chosen search agent rather than the best search agent.
This process is mathematically indicated in (7) and (8) [29], [30].
𝐷 = |𝐵 ⊙ 𝑃𝑟𝑎𝑛𝑑 − 𝑃(𝑡)| (7)
𝑃(𝑡 + 1) = |𝑃𝑟𝑎𝑛𝑑 − 𝐴 ⊙ 𝐷| (8)
where 𝑃𝑟𝑎𝑛𝑑 indicates a random position, which is selected from the current population. Due to the lack
of prior knowledge, the position updates of the search agents tend to become trapped in local optima in the existing
WOA. Therefore, a novel cosine function is incorporated into the control parameter 𝐴 for controlling the whale's
position. The inclusion of the cosine function in the control parameter provides a better balance between exploitation and
exploration, and it is mathematically indicated in (9).
$$A = 1 + 0.5 \times \cos\left(\pi \frac{t}{Iter_{max}}\right) \tag{9}$$
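To make the behavior of (9) concrete, the following minimal sketch evaluates the control parameter over the course of a run. The function name is hypothetical, and the maximum iteration count of 100 is taken from the parameter settings reported for the EWOA.

```python
import math


def control_parameter(t, iter_max):
    """Cosine-modulated control parameter A from (9)."""
    return 1.0 + 0.5 * math.cos(math.pi * t / iter_max)


iter_max = 100
for t in (0, 25, 50, 75, 100):
    print(t, round(control_parameter(t, iter_max), 4))
```

The parameter starts at 1.5, passes through 1.0 at the midpoint, and ends at 0.5, so the step taken toward or away from the reference position shrinks smoothly as the iterations progress.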
During the search process, two correlation factors 𝐶𝐹1 and 𝐶𝐹2 are used for regulating the movement
of the search agents. Accordingly, (7) and (8) are updated as in (10) and (11). The assumed parameters of the
EWOA are as follows: the number of agents is 100, 𝑡 indicates the current iteration, 𝐶𝐹1 = 2.5, 𝐶𝐹2 = 1.5, and
𝐼𝑡𝑒𝑟𝑚𝑎𝑥 = 100 represents the maximum number of iterations. Once the maximum iteration is reached, the EWOA
terminates.
𝐷 = |𝐵 ⊙ 𝑃𝑟𝑎𝑛𝑑 − 𝑃(𝑡)|/𝐶𝐹1 (10)
𝑃(𝑡 + 1) = |𝑃𝑟𝑎𝑛𝑑 − 𝐴 ⊙ 𝐷|/𝐶𝐹2 (11)
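The update rules in (1) to (11) can be combined into one loop, as in the sketch below. It is only an illustration under several assumptions that are not stated in the text: the switch between exploration and exploitation uses the standard WOA convention (explore when 𝐴 ≥ 1), positions are clipped to the search bounds, the spiral constant 𝑏 is set to 1, and a simple sphere function stands in for the error-rate objective. The function and argument names are hypothetical.

```python
import math
import random


def ewoa(objective, dim, lower, upper, n_agents=100, iter_max=100,
         cf1=2.5, cf2=1.5, b=1.0, seed=0):
    """Minimise `objective` with a sketch of the EWOA update rules (1)-(11)."""
    rng = random.Random(seed)

    def clip(v):
        # Assumption: keep every coordinate inside the search bounds.
        return min(max(v, lower), upper)

    agents = [[rng.uniform(lower, upper) for _ in range(dim)]
              for _ in range(n_agents)]
    best = min(agents, key=objective)[:]
    best_fit = objective(best)
    history = [best_fit]

    for t in range(iter_max):
        A = 1.0 + 0.5 * math.cos(math.pi * t / iter_max)          # (9)
        snapshot = [agent[:] for agent in agents]
        for i, pos in enumerate(snapshot):
            p = rng.random()
            a = rng.uniform(-1.0, 1.0)
            rand = snapshot[rng.randrange(n_agents)]
            new = []
            for j in range(dim):
                B = 2.0 * rng.random()                            # (4)
                if A >= 1.0:
                    # Exploration around a random agent, (10) and (11).
                    D = abs(B * rand[j] - pos[j]) / cf1
                    x = abs(rand[j] - A * D) / cf2
                elif p >= 0.5:
                    # Shrinking encircling around the best agent, (1) and (6).
                    D = abs(B * best[j] - pos[j])
                    x = best[j] - A * D
                else:
                    # Spiral update around the best agent, (5).
                    d_prime = abs(best[j] - pos[j])
                    x = (d_prime * math.exp(b * a)
                         * math.cos(2.0 * math.pi * a) + best[j])
                new.append(clip(x))
            agents[i] = new
            fit = objective(new)
            if fit < best_fit:
                best, best_fit = new[:], fit
        history.append(best_fit)
    return best, best_fit, history


def sphere(x):
    """Placeholder objective; the paper uses the error rate instead."""
    return sum(v * v for v in x)


best, best_fit, history = ewoa(sphere, dim=4, lower=-5.0, upper=5.0,
                               n_agents=20, iter_max=30)
print(best_fit)
```

Because the best solution found so far is tracked separately, the recorded history can never increase, whatever the random draws are. Since 𝐴 in (9) is at least 1 only during the first half of the run, this switching assumption spends the early iterations on exploration and the later ones on exploitation.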
2.2. Inter-prediction in HEVC
In the HEVC, inter-prediction supports more prediction block (PB) partition shapes than
intra-prediction. Generally, the inter-coded PUs have numerous motion parameters that include reference image
indexes, usage flags, motion vectors, and reference image lists. When a CU is coded in skip mode, it is
represented as one PU that has no significant transform coefficients and no explicitly coded motion parameters;
its motion parameters are obtained through the merge mode. For every inter-coded PU, the encoder utilizes either
explicit transmission or the merge mode to signal the motion parameters. Hence, the merge mode is employed for both skip-coded and other inter-coded PUs.
In the HEVC, the merge mode derives the motion parameters from neighboring inter-coded PUs. In the
HEVC inter-prediction, the motion vectors are specified in units of one-quarter of the distance between luma
samples and one-eighth of the distance between chroma samples.
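The fractional motion vector units mentioned above can be illustrated by splitting a stored vector component into its integer and fractional parts. The sketch assumes the common convention that the same stored value is read in quarter-sample units for luma and in one-eighth-sample units for 4:2:0 chroma; the function name and the example value are hypothetical.

```python
def split_mv(mv, frac_bits):
    """Split a motion-vector component into integer and fractional parts.

    `mv` is stored in units of 1 / 2**frac_bits samples.
    """
    return mv >> frac_bits, mv & ((1 << frac_bits) - 1)


mv = 13                      # hypothetical component, in quarter luma samples
print(split_mv(mv, 2))       # luma: quarter-sample units
print(split_mv(mv, 3))       # chroma (4:2:0 assumed): one-eighth-sample units
```

A non-zero fractional part means that the prediction sample lies between integer positions and has to be produced by interpolation filtering.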
2.3. Intra-prediction in HEVC
In the HEVC, intra-prediction exploits the spatial correlation between a PU and its neighboring image
pixels for effective prediction. New features such as the TU, PU, CU, and CTU are defined in the HEVC for
achieving higher compression and removing spatial redundancy. In addition, rate-distortion
optimization is carried out to identify the best prediction mode of every CU. The RD cost function of
intra-prediction in HEVC is defined in (12).
𝑅𝐷 = 𝑆𝑆𝐸 + 𝜆 × 𝑅 (12)
where 𝑅 indicates the bit-rate, 𝑆𝑆𝐸 represents the sum of squared errors between the original and reconstructed
pixels, and 𝜆 denotes the Lagrangian multiplier, which depends on the quantization parameter. Additionally, the HEVC uses a recursive
quad-tree structure for CU splitting. Every CU is partitioned into PUs (up to four), and the intra-prediction is carried out for
every PU. The CU size ranges from 8 × 8 to 64 × 64 pixels and the PU size ranges from 4 × 4 to 64 × 64
pixels. Accordingly, the HEVC performs intra block prediction for blocks of 4 × 4 to 64 × 64 pixels. The
HEVC supports 35 intra-prediction modes, which include 33 angular modes. From the reconstructed PUs,
two reference array sets are used for intra-prediction in the HEVC. The present image pixel 𝐶𝑥,𝑦 is projected
onto the reference image pixels with a fixed displacement parameter 𝑑, which defines the angularity
of the vertical and horizontal prediction modes. Once the reference samples 𝑅𝑖 and 𝑅𝑖+1 are determined, the
interpolation is carried out at an accuracy of 1/32, as mathematically represented in (13).
𝐶𝑥,𝑦 = ((32 − 𝑑) × 𝑅𝑖 + 𝑑 × 𝑅𝑖+1 + 16) ≫ 5 (13)
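The interpolation in (13) is a two-tap weighted average computed in integer arithmetic, where adding 16 before the right shift by 5 rounds the result to the nearest integer. The sketch below treats 𝑑 as the fractional weight in 1/32 units, which is an interpretation assumed here; the function name and sample values are hypothetical.

```python
def angular_sample(r_i, r_i1, d):
    """Interpolate between two reference samples with weight d/32, as in (13)."""
    if not 0 <= d < 32:
        raise ValueError("d must lie in [0, 32)")
    return ((32 - d) * r_i + d * r_i1 + 16) >> 5


print(angular_sample(100, 200, 0))    # weight entirely on r_i
print(angular_sample(100, 200, 16))   # midway between the two samples
```

With 𝑑 = 0 the predicted pixel equals the first reference sample, and with 𝑑 = 16 it lies halfway between the two reference samples.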
In HEVC, the angular prediction modes deliver effective intra-prediction when many edges
are present. The DC prediction is extensively used for predicting flat surfaces. In the planar prediction, the
block prediction is generated by a weighted average of four reference samples, as determined in (14).
𝑃ℎ (𝑥,𝑦) = 𝑑 × (𝑥 + 1) + 𝑏 × (𝑁 − (𝑥 + 1))
𝑃𝑣 (𝑥,𝑦) = 𝑎 × (𝑦 + 1) + 𝑐 × (𝑁 − (𝑦 + 1))
𝑃𝑃𝐿 (𝑥,𝑦) = (𝑃ℎ (𝑥,𝑦) + 𝑃𝑣 (𝑥,𝑦) + 𝑁) ≫ (𝑙𝑜𝑔2 𝑁 + 1) (14)
where a and d indicate the bottom-left and top-right samples, and b and c are the left and top neighboring
reference samples of the current position. In the HEVC, the filtering of the reference samples is controlled by the
TB size and the intra-prediction mode. The neighborhood samples are not filtered when the DC intra-prediction
mode is chosen. The bi-linear filter is enabled when the distance between the horizontal/vertical mode and
the intra-prediction mode is higher than a threshold value. The 35 intra-prediction modes are
represented in Figure 3.
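The planar prediction in (14) can be sketched as follows. The mapping of 𝑏 and 𝑐 to the left and top neighboring samples is an assumption of this sketch (only 𝑎 and 𝑑 are defined explicitly in the text), the block size 𝑁 is assumed to be a power of two, and the function name is hypothetical.

```python
def planar_predict(top, left, top_right, bottom_left):
    """Planar prediction of an N x N block following (14).

    top[x] and left[y] play the roles of c and b, top_right is d and
    bottom_left is a (the b and c mapping is an assumption of this sketch).
    """
    n = len(top)
    shift = n.bit_length()            # equals log2(n) + 1 for a power of two
    block = []
    for y in range(n):
        row = []
        for x in range(n):
            p_h = top_right * (x + 1) + left[y] * (n - (x + 1))
            p_v = bottom_left * (y + 1) + top[x] * (n - (y + 1))
            row.append((p_h + p_v + n) >> shift)
        block.append(row)
    return block


flat = planar_predict([100] * 4, [100] * 4, 100, 100)
print(flat[0])
```

When all reference samples are equal, every predicted sample takes that same value, which is the expected behavior of a smooth-surface predictor.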
2.4. Transformation
In the transformation procedure, the residuals are transformed into TU utilizing DCT. In video
compression, the DCT is an extensively utilized transformation technique, which is effective in energy
compaction, computation efficiency, and correlation reduction. The DCT includes 16 members, and the one-
dimensional DCT of 1 × 𝑁 vector 𝑥(𝑛) is determined in (15) and (16).
$$Y[k] = C[k] \sum_{n=0}^{N-1} x[n] \cos\left[\frac{(2n+1)k\pi}{2N}\right] \tag{15}$$
where 𝑘 = 0,1,2, … 𝑁 − 1.
$$C[k] = \begin{cases} \sqrt{\dfrac{1}{N}} & \text{for } k = 0 \\ \sqrt{\dfrac{2}{N}} & \text{for } k = 1, 2, \ldots, N-1 \end{cases} \tag{16}$$
Figure 3. Quad-tree structure of the CUs
The original feature vectors 𝑥(𝑛) are re-constructed from the DCT coefficients 𝑌[𝑘] utilizing the
Inverse DCT operation, and it is mathematically denoted in (17). Then, the DCT is extended to the
transformation of the image, which is achieved by computing the individual rows and columns of the two-
dimensional image. The mathematical equation of two dimensional DCT is indicated in (18) and (19).
$$x[n] = \sum_{k=0}^{N-1} C[k]\, Y[k] \cos\left[\frac{(2n+1)k\pi}{2N}\right] \tag{17}$$
where 𝑛 = 0, 1, 2, …, 𝑁 − 1.
$$Y[j,k] = C[j]\, C[k] \sum_{m=0}^{N-1} \sum_{n=0}^{N-1} x[m,n] \cos\left(\frac{(2m+1)j\pi}{2N}\right) \cos\left(\frac{(2n+1)k\pi}{2N}\right) \tag{18}$$
where the image is represented as 𝑥[𝑚, 𝑛], and 𝑗, 𝑘, 𝑚, 𝑛 = 0, 1, 2, …, 𝑁 − 1.
$$C[j],\ C[k] = \begin{cases} \sqrt{\dfrac{1}{N}} & \text{for } j, k = 0 \\ \sqrt{\dfrac{2}{N}} & \text{for } j, k = 1, 2, \ldots, N-1 \end{cases} \tag{19}$$
Correspondingly, the two-dimensional inverse DCT is determined in (20).
$$x[m,n] = \sum_{j=0}^{N-1} \sum_{k=0}^{N-1} C[j]\, C[k]\, Y[j,k] \cos\left(\frac{(2m+1)j\pi}{2N}\right) \cos\left(\frac{(2n+1)k\pi}{2N}\right) \tag{20}$$
The DCT pair represented in (18) and (20) is orthonormal and reconstructs the samples perfectly
under infinite precision. Finally, the reconstructed samples are obtained from the inverse transformation,
and the reconstructed CTUs are arranged to construct the final image. The experimental results of the EWOA
are presented in section 3.
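The transform pair in (15) to (20) can be checked numerically with a direct, unoptimized sketch. It assumes the orthonormal scaling 𝐶[0] = √(1/𝑁) and 𝐶[𝑘] = √(2/𝑁) for 𝑘 ≥ 1, and it computes the two-dimensional transform by applying the one-dimensional transform to the rows and then to the columns. The function names are hypothetical, and practical encoders use fast integer approximations rather than this floating-point form.

```python
import math


def _scale(k, n):
    """Orthonormal DCT scaling factor C[k] (assumed form)."""
    return math.sqrt((1.0 if k == 0 else 2.0) / n)


def dct1d(x):
    """Forward 1-D DCT, as in (15)."""
    n = len(x)
    return [_scale(k, n) * sum(x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                               for i in range(n))
            for k in range(n)]


def idct1d(y):
    """Inverse 1-D DCT, as in (17)."""
    n = len(y)
    return [sum(_scale(k, n) * y[k] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                for k in range(n))
            for i in range(n)]


def _apply_2d(block, transform):
    """Apply a 1-D transform to every row, then to every column."""
    rows = [transform(r) for r in block]
    cols = [transform(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def dct2d(block):
    """Forward 2-D DCT, as in (18)."""
    return _apply_2d(block, dct1d)


def idct2d(coeffs):
    """Inverse 2-D DCT, as in (20)."""
    return _apply_2d(coeffs, idct1d)


block = [[52, 55, 61, 66], [70, 61, 64, 73], [63, 59, 55, 90], [67, 61, 68, 104]]
restored = idct2d(dct2d(block))
print(max(abs(a - b) for ra, rb in zip(block, restored) for a, b in zip(ra, rb)))
```

The printed value is the largest reconstruction error over the block; it is limited only by floating-point rounding, which illustrates the perfect-reconstruction property of the orthonormal pair. The sample block values are arbitrary.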
3. RESULTS AND DISCUSSION
In this research, the EWOA is implemented using MATLAB R2020a software. The simulation is
performed on a system with an i7 processor, 8 GB of random access memory, and a 1 TB hard disk. This research
study mainly uses HEVC/H.265 for motion estimation. The performance of the EWOA is analyzed in light of
∆BR, time saving, and ∆PSNR. Additionally, the effectiveness of the EWOA is compared to the prior research
model: online SVM+DCNN [19]. The most crucial performance measures of fast encoding, the time saving ∆𝑇
and ∆𝐵𝑅, are mathematically denoted in (21) and (22).
$$\Delta T = \frac{T_p - T_O}{T_O} \times 100 \tag{21}$$
$$\Delta BR = \frac{R_p - R_O}{R_O} \times 100 \tag{22}$$
where 𝑇𝑝 and 𝑅𝑝 denote the computational time and bit rate of the EWOA, and 𝑇𝑂 and 𝑅𝑂 denote the
computational time and bit rate of the existing models. Similarly, PSNR is utilized for measuring the quality
of the compressed frames with respect to the original frames, and the PSNR difference is mathematically
denoted in (23), where the PSNR value of the
EWOA is indicated as 𝑃𝑆𝑁𝑅𝑝 and the PSNR value of the existing model is specified as 𝑃𝑆𝑁𝑅𝑜.
∆𝑃𝑆𝑁𝑅 = 𝑃𝑆𝑁𝑅𝑝 − 𝑃𝑆𝑁𝑅𝑜 (𝑑𝐵) (23)
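The three measures in (21) to (23) reduce to a relative change in percent and a difference in dB, as sketched below. The PSNR helper uses the standard definition 10·log10(peak²/MSE) with an 8-bit peak of 255, which is an assumption here because the text does not give the PSNR formula; all function names and input values are hypothetical.

```python
import math


def relative_change(proposed, reference):
    """Percentage change relative to the reference, as in (21) and (22)."""
    return (proposed - reference) / reference * 100.0


def psnr(original, reconstructed, peak=255.0):
    """PSNR in dB between two equally sized sample sequences (8-bit peak assumed)."""
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)


def delta_psnr(psnr_proposed, psnr_reference):
    """Difference in dB, as in (23)."""
    return psnr_proposed - psnr_reference


print(relative_change(50.0, 100.0))   # half the reference time
print(delta_psnr(36.0, 36.5))
```

A negative relative change means that the proposed method needs less time or fewer bits than the reference, and a negative ∆PSNR means a quality loss relative to the reference.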
3.1. Quantitative analysis
In Table 1, the effectiveness of the EWOA is compared with existing models such as
deep CNN [19], online SVM [19], and the conventional WOA by means of 𝛥𝐵𝑅. Here, the EWOA's performance
is evaluated on seventeen real-time videos. From the experimental investigation, the overall performance shows
that the EWOA outperforms the online SVM [19], deep CNN [19], and WOA in terms of 𝛥𝐵𝑅.
This implies that the EWOA is more robust in diminishing the complexity of the inter-mode HEVC compared
to the other models: online SVM [19], deep CNN [19], and WOA. Correspondingly, in Table 2, the experimental
investigation of the EWOA is done in terms of the ∆PSNR value. From the inspection, the ∆PSNR value of the EWOA
is higher than that of the prior models: deep CNN [19], online SVM [19], and WOA. In this scenario, the EWOA
showed approximately 0.006-0.012 dB higher values than the existing models on the real-time videos.
Table 1. Performance investigation of the EWOA and the existing models in light of ΔBR (%)

| Videos | Deep CNN [19] | Online SVM [19] | WOA | EWOA |
|---|---|---|---|---|
| PeopleOnStreet | 0.57200 | 2.23500 | 0.55734 | 0.51824 |
| Traffic | -0.57100 | 2.02200 | -0.58390 | -0.71332 |
| Kimono | 0.72800 | 0.44900 | 0.71962 | 0.62093 |
| ParkScene | -0.40100 | 0.79000 | -0.41199 | -0.87320 |
| Cactus | 0.41200 | 0.71700 | 0.40141 | 0.29312 |
| BQTerrace | -2.60600 | 0.32800 | -2.61793 | -2.87302 |
| BasketballDrive | 1.13000 | 0.58300 | 1.11834 | 1.00923 |
| BasketballDrill | -0.04400 | 1.44000 | -0.04866 | -0.08392 |
| BQMall | 1.01900 | 3.31300 | 1.00660 | -0.30293 |
| PartyScene | -0.70900 | 2.41500 | -0.72106 | -0.89203 |
| RaceHorses | 0.52900 | 2.65000 | 0.52264 | 0.33025 |
| BasketballPass | 0.69000 | 3.16700 | 0.68472 | 0.62039 |
| BQSquare | -2.64700 | 3.69200 | -2.65648 | -2.77823 |
| BlowingBubbles | -0.32700 | 2.20700 | -0.32899 | -0.30977 |
| FourPeople | -0.65900 | 1.01700 | -0.66884 | -0.67898 |
| Johnny | -0.36100 | 1.72900 | -0.37648 | -1.09797 |
| KristenAndSara | -1.02600 | 1.35700 | -1.04073 | -1.20393 |
Table 2. Performance investigation of the EWOA and the existing models in light of ∆PSNR (dB)

| Videos | Deep CNN [19] | Online SVM [19] | WOA | EWOA |
|---|---|---|---|---|
| PeopleOnStreet | -0.03000 | -0.05500 | -0.03703 | -0.02883 |
| Traffic | -0.06100 | -0.04780 | -0.06254 | -0.05009 |
| Kimono | -0.02500 | -0.02000 | -0.03453 | -0.02456 |
| ParkScene | -0.05200 | -0.02300 | -0.05741 | -0.04565 |
| Cactus | -0.05200 | -0.01900 | -0.05880 | -0.04446 |
| BQTerrace | -0.07100 | -0.02200 | -0.07137 | -0.06754 |
| BasketballDrive | -0.03100 | -0.02300 | -0.03909 | -0.02157 |
| BasketballDrill | -0.05600 | -0.04700 | -0.06349 | -0.04876 |
| BQMall | -0.05100 | -0.05800 | -0.05220 | -0.04267 |
| PartyScene | -0.08500 | -0.06200 | -0.09025 | -0.07066 |
| RaceHorses | -0.04000 | -0.05300 | -0.04326 | -0.03355 |
| BasketballPass | -0.05400 | -0.06700 | -0.05946 | -0.04953 |
| BQSquare | -0.16200 | -0.08600 | -0.16599 | -0.12577 |
| BlowingBubbles | -0.07100 | -0.06700 | -0.07515 | -0.05518 |
| FourPeople | -0.06200 | -0.01400 | -0.06455 | -0.05459 |
| Johnny | -0.12400 | -0.03600 | -0.12421 | -0.10467 |
| KristenAndSara | -0.07000 | -0.02800 | -0.07924 | -0.06965 |
As represented in Table 3, seventeen online real-time videos are utilized for investigating the
effectiveness of the EWOA. In Table 3, the EWOA's efficacy is validated in light of the time saving ΔT. From the
inspection, the EWOA achieved better results compared to the online SVM [19], deep CNN [19], and WOA in
light of the time saving ΔT on the real-time videos, which addresses the major problem highlighted in the literature section.
Table 3. Performance investigation of the EWOA and the existing models in light of ∆T (%)

| Videos | Deep CNN [19] | Online SVM [19] | WOA | EWOA |
|---|---|---|---|---|
| PeopleOnStreet | -50.67000 | -56.56000 | -51.64113 | -49.73822 |
| Traffic | -57.90000 | -58.07000 | -59.64715 | -56.20333 |
| Kimono | -43.26000 | -44.18000 | -44.21259 | -42.48243 |
| ParkScene | -64.14000 | -52.60000 | -65.91336 | -63.93794 |
| Cactus | -52.57000 | -41.38000 | -53.87029 | -51.08473 |
| BQTerrace | -58.43000 | -41.45000 | -59.64306 | -57.46738 |
| BasketballDrive | -51.30000 | -51.17000 | -52.69093 | -51.84837 |
| BasketballDrill | -53.54000 | -55.87000 | -54.24158 | -52.46364 |
| BQMall | -52.25000 | -55.96000 | -53.06197 | -52.84949 |
| PartyScene | -51.54000 | -52.93000 | -52.63542 | -50.63434 |
| RaceHorses | -42.22000 | -51.25000 | -42.91538 | -40.93940 |
| BasketballPass | -52.42000 | -55.49000 | -53.54240 | -51.54234 |
| BQSquare | -52.79000 | -55.92000 | -54.45821 | -51.44434 |
| BlowingBubbles | -46.55000 | -51.87000 | -47.48938 | -44.48394 |
| FourPeople | -67.54000 | -51.90000 | -68.56404 | -66.56344 |
| Johnny | -69.66000 | -60.12000 | -70.43801 | -68.43444 |
| KristenAndSara | -67.20000 | -53.57000 | -68.15833 | -65.15849 |
3.2. Discussion
In the present decade, the HEVC has achieved better coding efficiency because of the rapid growth of video
coding technology. However, the encoding complexity of the HEVC increases as the RD performance
improves. In addition, the HEVC uses new coding structures, which are characterized by the TU, PU, and CU.
These structures enhance the coding efficiency, but increase the computational complexity of deciding the optimal
TU, PU, and CU sizes. Computational complexity remains a vital problem, and it should be considered in the
optimization task. As discussed in the previous sections, we proposed an EWOA with quad-tree coding and DCT
for fast CU partitioning that decreases the complexity of HEVC at inter-mode. The proposed framework
achieved a good trade-off between the RD performance and complexity reduction. The EWOA with quad-tree
coding and DCT not only predicts the HEVC CU partition at inter-mode but also reduces the HEVC complexity
with minimal error value. In this manuscript, seventeen online real-time videos are used for analyzing
the effectiveness of the proposed framework in light of PSNR, time saving ∆𝑇, and ∆𝐵𝑅.
4. CONCLUSION
In this study, efficient video compression is achieved by implementing HEVC with an optimization
algorithm: EWOA. The EWOA is utilized for estimating the motion in the video sequences. In the proposed
framework, the quad-tree coding block is employed for partitioning the structures, and DCT is applied to the
extracted video frames for improving the coding efficiency. The performance of the EWOA is investigated by
comparing the input video sequences with the decompressed video sequences in terms of ΔBR, ΔT, and ΔPSNR.
The simulation analysis concluded that the EWOA attained better performance in video compression, and
it showed 0.006-0.012 dB higher PSNR than the existing models on real-time videos such as BasketballPass,
BQTerrace, BasketballDrive, RaceHorses, BQMall, BlowingBubbles, Cactus, FourPeople, PartyScene,
PeopleOnStreet, Johnny, Kimono, KristenAndSara, Traffic, ParkScene, BasketballDrill, and BQSquare. The
proposed framework is particularly suited to surveillance or conversational videos, where it largely reduces the
bandwidth without degrading the visual quality. Future studies will focus on video coding optimization or
perceptual-based medical images. Additionally, a novel algorithm can be developed for fast mode selection
based on the pattern directions of the neighboring PU.
AUTHOR CONTRIBUTIONS
The paper conceptualization, methodology, software, validation, formal analysis, investigation,
resources, data curation, writing (original draft preparation), writing (review and editing), and visualization have
been done by the 1st author. The supervision and project administration have been done by the 2nd author.
REFERENCES
[1] W. El-Shafai, I. M. Almomani, and A. Alkhayer, “Optical bit-plane-based 3D-JST cryptography algorithm with cascaded 2D-FrFT
encryption for efficient and secure HEVC communication,” IEEE Access, vol. 9, pp. 35004–35026, 2021, doi:
10.1109/ACCESS.2021.3062403.
[2] Y. Zhang, C. Zhang, R. Fan, S. Ma, Z. Chen, and C.-C. J. Kuo, “Recent advances on HEVC inter-frame coding: From optimization
to implementation and beyond,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 11, pp. 4321–4339,
Nov. 2020, doi: 10.1109/TCSVT.2019.2954474.
[3] A. A. Elrowayati, M. A. Alrshah, M. F. L. Abdullah, and R. Latip, “HEVC watermarking techniques for authentication and
copyright applications: Challenges and opportunities,” IEEE Access, vol. 8, pp. 114172–114189, 2020, doi:
10.1109/ACCESS.2020.3004049.
[4] W. Zhu, Y. Yi, H. Zhang, P. Chen, and H. Zhang, “Fast mode decision algorithm for HEVC intra coding based on texture
partition and direction,” Journal of Real-Time Image Processing, vol. 17, no. 2, pp. 275–292, Apr. 2020, doi: 10.1007/s11554-018-0766-z.
[5] J.-K. Lee, N. Kim, S. Cho, and J.-W. Kang, “Deep video prediction network-based inter-frame coding in HEVC,” IEEE Access,
vol. 8, pp. 95906–95917, 2020, doi: 10.1109/ACCESS.2020.2993566.
[6] D. Xu, “Commutative encryption and data hiding in HEVC video compression,” IEEE Access, vol. 7, pp. 66028–66041, 2019, doi:
10.1109/ACCESS.2019.2916484.
[7] X. Sun, H. Ma, W. Zuo, and M. Liu, “Perceptual-based HEVC intra coding optimization using deep convolution networks,” IEEE
Access, vol. 7, pp. 56308–56316, 2019, doi: 10.1109/ACCESS.2019.2910245.
[8] H. Huang, I. Schiopu, and A. Munteanu, “Frame-wise CNN-based filtering for intra-frame quality enhancement of HEVC videos,”
IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 6, pp. 2100–2113, Jun. 2021, doi:
10.1109/TCSVT.2020.3018230.
[9] A. Mercat, A. Makinen, J. Sainio, A. Lemmetti, M. Viitanen, and J. Vanne, “Comparative rate-distortion-complexity analysis of
VVC and HEVC video codecs,” IEEE Access, vol. 9, pp. 67813–67828, 2021, doi: 10.1109/ACCESS.2021.3077116.
[10] M. Saldanha, G. Sanchez, C. Marcon, and L. Agostini, “Fast 3D-HEVC depth map encoding using machine learning,” IEEE
Transactions on Circuits and Systems for Video Technology, vol. 30, no. 3, pp. 850–861, Mar. 2020, doi:
10.1109/TCSVT.2019.2898122.
[11] X. Sun, X. Yang, S. Wang, and M. Liu, “Content-aware rate control scheme for HEVC based on static and dynamic saliency
detection,” Neurocomputing, vol. 411, pp. 393–405, Oct. 2020, doi: 10.1016/j.neucom.2020.06.003.
[12] J. Lin, D. Liu, H. Yang, H. Li, and F. Wu, “Convolutional neural network-based block up-sampling for HEVC,” IEEE Transactions
on Circuits and Systems for Video Technology, vol. 29, no. 12, pp. 3701–3715, Dec. 2019, doi: 10.1109/TCSVT.2018.2884203.
[13] M. Zhou et al., “SSIM-based global optimization for CTU-level rate control in HEVC,” IEEE Transactions on Multimedia, vol. 21,
no. 8, pp. 1921–1933, Aug. 2019, doi: 10.1109/TMM.2019.2895281.
[14] Y. Zhang, T. Shen, X. Ji, Y. Zhang, R. Xiong, and Q. Dai, “Residual highway convolutional neural networks for in-loop filtering
in HEVC,” IEEE Transactions on Image Processing, vol. 27, no. 8, pp. 3827–3841, Aug. 2018, doi: 10.1109/TIP.2018.2815841.
[15] W. Lin et al., “Partition-aware adaptive switching neural networks for post-processing in HEVC,” IEEE Transactions on
Multimedia, vol. 22, no. 11, pp. 2749–2763, Nov. 2020, doi: 10.1109/TMM.2019.2962310.
[16] R. Duvar, O. Akbulut, and O. Urhan, “Fast inter mode decision exploiting intra-block similarity in HEVC,” Signal Processing:
Image Communication, vol. 78, pp. 503–510, Oct. 2019, doi: 10.1016/j.image.2019.08.010.
[17] Y.-F. Cen, W.-L. Wang, and X.-W. Yao, “A fast CU depth decision mechanism for HEVC,” Information Processing Letters,
vol. 115, no. 9, pp. 719–724, Sep. 2015, doi: 10.1016/j.ipl.2015.04.001.
[18] C. Jiang and S. Nooshabadi, “Multi-level complexity reduction for HEVC multiview coding,” Journal of Real-Time Image
Processing, vol. 17, no. 2, pp. 197–213, Apr. 2020, doi: 10.1007/s11554-018-0757-0.
[19] S. Bouaafia, R. Khemiri, F. E. Sayadi, and M. Atri, “Fast CU partition-based machine learning approach for reducing HEVC
complexity,” Journal of Real-Time Image Processing, vol. 17, no. 1, pp. 185–196, Feb. 2020, doi: 10.1007/s11554-019-00936-0.
[20] Y. Ma, Z. Liu, X. Wang, and S. Cao, “Fast intra coding based on CU size decision and direction mode decision for HEVC,”
Multimedia Tools and Applications, vol. 77, no. 12, pp. 14907–14929, Jun. 2018, doi: 10.1007/s11042-017-5074-2.
[21] S. Kuanar, K. R. Rao, M. Bilas, and J. Bredow, “Adaptive CU mode selection in HEVC intra prediction: A deep learning approach,”
Circuits, Systems, and Signal Processing, vol. 38, no. 11, pp. 5081–5102, Nov. 2019, doi: 10.1007/s00034-019-01110-4.
[22] A. Hassan, M. Ghafoor, S. A. Tariq, T. Zia, and W. Ahmad, “High efficiency video coding (HEVC)–based surgical telementoring
system using shallow convolutional neural network,” Journal of Digital Imaging, vol. 32, no. 6, pp. 1027–1043, Dec. 2019, doi:
10.1007/s10278-019-00206-2.
[23] S. He, Z. Deng, and C. Shi, “Fast decision algorithm of CU size for HEVC intra-prediction based on a kernel fuzzy SVM classifier,”
Electronics, vol. 11, no. 17, Sep. 2022, doi: 10.3390/electronics11172791.
[24] W. Imen, M. Amna, B. Fatma, S. F. Ezahra, and N. Masmoudi, “Fast HEVC intra-CU decision partition algorithm with modified
LeNet-5 and AlexNet,” Signal, Image and Video Processing, vol. 16, no. 7, pp. 1811–1819, Oct. 2022,
doi: 10.1007/s11760-022-02139-w.
[25] F. A. Zeidabadi and M. Dehghani, “POA: Puzzle optimization algorithm,” International Journal of Intelligent Engineering and
Systems, vol. 15, no. 1, pp. 273–280, Feb. 2022, doi: 10.22266/ijies2022.0228.25.
[26] P. D. Kusuma and M. Kallista, “Stochastic komodo algorithm,” International Journal of Intelligent Engineering and Systems,
vol. 15, no. 4, pp. 156–166, Aug. 2022, doi: 10.22266/ijies2022.0831.15.
[27] N. Rana, M. S. A. Latiff, S. M. Abdulhamid, and H. Chiroma, “Whale optimization algorithm: a systematic review of contemporary
applications, modifications and developments,” Neural Computing and Applications, vol. 32, no. 20, pp. 16245–16277, Oct. 2020,
doi: 10.1007/s00521-020-04849-z.
[28] S. Chakraborty, A. K. Saha, R. Chakraborty, and M. Saha, “An enhanced whale optimization algorithm for large scale optimization
problems,” Knowledge-Based Systems, vol. 233, Dec. 2021, doi: 10.1016/j.knosys.2021.107543.
[29] Q.-V. Pham, S. Mirjalili, N. Kumar, M. Alazab, and W.-J. Hwang, “Whale optimization algorithm with applications to resource
allocation in wireless networks,” IEEE Transactions on Vehicular Technology, vol. 69, no. 4, pp. 4285–4297, Apr. 2020, doi:
10.1109/TVT.2020.2973294.
[30] Z. Yan, J. Zhang, J. Zeng, and J. Tang, “Nature-inspired approach: An enhanced whale optimization algorithm for global
optimization,” Mathematics and Computers in Simulation, vol. 185, pp. 17–46, Jul. 2021, doi: 10.1016/j.matcom.2020.12.008.
BIOGRAPHIES OF AUTHORS
Suhas Shankarnahalli Krishnegowda is a research scholar at VTU, Belgaum. He
received his BE (Electronics and Communication Engineering) from SEA College of
Engineering and Technology, Visvesvaraya Technological University, Belgaum, Karnataka,
India, in 2011 and completed his MTech (Digital Electronics) at Srinivas Institute
of Technology, Mangalore, affiliated to Visvesvaraya Technological University, Belgaum,
Karnataka, India, in 2013. Since then he has been actively involved in teaching and
research and has ten years of teaching experience. He is pursuing a PhD (ECE) at VTU.
At present, he is working as an Assistant Professor at South East Asian College of Engineering
and Technology, Bangalore, affiliated to Visvesvaraya Technological University. His areas of
interest are signal processing, biomedical signal processing, and communication
systems. He can be contacted at email: suhashoodi@gmail.com.
Hosanna Princye Periapandi received her BE (E&I) from Sapthagiri College of
Engineering, Periyar University, Tamilnadu, India, in 2002 and completed her
Master's in Engineering at Anna University in 2004. Since then she has been actively
involved in teaching and research and has thirteen years of teaching experience. She
obtained her PhD in Information and Communication Engineering from Anna
University in 2018. At present, she is working as an Associate Professor at SEA
College of Engineering and Technology, Bangalore, affiliated to Visvesvaraya Technological
University. Her areas of interest are medical image processing, signal processing, and
VLSI. She can be contacted at email: hprincye@gmail.com.


Optimal coding unit decision for early termination in high efficiency video coding using enhanced whale optimization algorithm

International Journal of Electrical and Computer Engineering (IJECE)
Vol. 13, No. 6, December 2023, pp. 6378~6387
ISSN: 2088-8708, DOI: 10.11591/ijece.v13i6.pp6378-6387
Journal homepage: http://ijece.iaescore.com
Optimal coding unit decision for early termination in high efficiency video coding using enhanced whale optimization algorithm
Suhas Shankarnahalli Krishnegowda, Hosanna Princye Periapandi
Department of Electronics and Communication Engineering, S.E.A. College of Engineering and Technology, Visvesvaraya Technological University, Belagavi, India
Article Info
Article history: Received Jan 28, 2023; Revised Jul 8, 2023; Accepted Jul 17, 2023
ABSTRACT
Video compression is an emerging research topic in the field of block based video encoders. Due to the growth of video coding technologies, high efficiency video coding (HEVC) delivers superior coding performance. With the increased encoding complexity, the HEVC enhances the rate-distortion (RD) performance. In the video compression, the out-sized coding units (CUs) have higher encoding complexity. Therefore, the computational encoding cost and complexity remain vital concerns, which need to be considered as an optimization task. In this manuscript, an enhanced whale optimization algorithm (EWOA) is implemented to reduce the computational time and complexity of the HEVC. In the EWOA, a cosine function is incorporated with the controlling parameter A and two correlation factors are included in the WOA for controlling the position of whales and regulating the movement of search mechanism during the optimization and search processes. The bit streams in the Luma-coding tree block are selected using EWOA that defines the CU neighbors and is used in the HEVC. The results indicate that the EWOA achieves best bit rate (BR), time saving, and peak signal to noise ratio (PSNR). The EWOA showed 0.006-0.012 dB higher PSNR than the existing models in the real-time videos.
Keywords: Coding units; Discrete cosine transform; Faster encoding; High efficiency video coding; Prediction units; Whale optimization algorithm
This is an open access article under the CC BY-SA license.
Corresponding Author: Suhas Shankarnahalli Krishnegowda, Department of Electronics and Communication Engineering, S.E.A. College of Engineering and Technology, Visvesvaraya Technological University, Belagavi-590018, India. Email: suhashoodi@gmail.com
1. INTRODUCTION
In recent times, the demand for higher-definition video services has increased in applications such as digital broadcast and internet streaming [1], [2]. To meet the need for transmission and storage of higher-resolution videos, a new video coding standard named high efficiency video coding (HEVC) was developed [3]–[5]. HEVC is effective compared with earlier video coding standards such as moving picture experts group (MPEG)-2, H.264, and MPEG-4 part 2 [6], [7]. In HEVC, the compression improvement is usually based on the implementation of new encoding methodologies such as asymmetric motion partition, intra-prediction modes, and so on. Among the available methodologies, the flexible quad-tree partitioning of the coding tree unit (CTU) is efficient [8]. When the size of the CTU is 64 × 64, the size of a coding unit (CU) is 64 × 64, 32 × 32, 16 × 16, or 8 × 8, corresponding to depths 0, 1, 2, and 3, respectively. The CUs in the CTUs are partitioned into 4 blocks based on the depth range. In HEVC, the optimum CU partition is selected according to the rate distortion (RD) costs [9], [10]. In HEVC, a CU is further partitioned into several prediction units
(PUs). Hence, the optimal PU modes are the modes with the minimum RD cost among the several inter and intra PU modes. In higher-resolution videos, the mode decision process ensures compression efficiency. Because the search space in HEVC is large, the search for the best PU and CU decisions results in high computational time and complexity, which limits the use of HEVC encoders in real-time applications [11]–[13]. Conventional methods used for video compression include convolutional neural networks (CNN) [14] and adaptive switching neural networks [15].
Duvar et al. [16] presented an effective decision algorithm for reducing the encoding time of HEVC. Initially, intra-block similarity was evaluated at the PU level by using integral images. In addition, an early termination mode was developed at the CU level. Hence, the developed fast inter mode decision algorithm bypasses PU modes in the PU phase and also removes unnecessary controls in the CU phase at a lower depth. The efficacy of the developed algorithm was investigated by means of bit rate and peak signal to noise ratio (PSNR). The experimental outcomes demonstrated that the developed algorithm improves the coding efficacy with low system complexity and delivers a good trade-off between time savings and coding efficacy compared with earlier approaches. However, negative side lobes and edge artifacts were generated in the developed algorithm, which affect the system performance.
Cen et al. [17] developed a new fast CU depth decision framework for decreasing the computational complexity of HEVC. The developed framework includes a CU depth range determination and a new CU depth comparison algorithm.
In this study, the CU depth range was identified based on the distribution of CU depths in similar sequences. The experimental results confirmed that the computational complexity of the developed framework was low compared with the existing works. However, the developed algorithm falls short in retaining higher-quality videos at the receiver side.
Jiang and Nooshabadi [18] presented a series of optimization methods for multi-view HEVC at different levels of coding abstraction. In this work, optimized resource-scheduled wavefront parallel processing and quantization-parameter-based early termination of the CTU were performed for disparity estimation and parallel motion estimation. In the experimental investigation, the developed optimization methods achieved better results than previous research work in terms of PSNR and bit error rate. The developed algorithm effectively reduces the system complexity, but it did not address the major issue of poor video resolution.
Bouaafia et al. [19] presented a deep convolutional neural network (DCNN) and a support vector machine (SVM) in inter-mode HEVC for optimizing the complexity allocation at the CU level. Initially, the SVM-based fast CU model was developed for decreasing the HEVC complexity, and the DCNN model was then utilized for predicting the CU partition. The experimental outcome indicates that the developed online SVM and DCNN models achieved better results in terms of time saving and bit rate. In contrast, the developed algorithm reduces the importance of the color components in the compressed videos.
Ma et al. [20] introduced a new faster intra-coding algorithm for speeding up the encoding mechanism. At first, a faster CU size decision model was implemented for selecting different depth decision algorithms for every coding unit. Then, a faster directional mode decision technique was employed, which compares the directional modes of the parent units.
The best directional mode of the parent units and the RD cost of the first directional mode were integrated for selecting the best directional mode for the current unit efficiently. The experimental outcome shows that the developed algorithm attained good video encoding performance in terms of Bjontegaard delta bit rate (BDBR) and time saving. However, the developed algorithm was not able to handle massive workloads at higher speeds.
Kuanar et al. [21] implemented a new CNN model for effective CU mode selection in HEVC. The extensive experimental investigation showed that the developed CNN model significantly decreased the encoding time compared with other state-of-the-art machine learning models, but it was computationally expensive.
Hassan et al. [22] developed a surgical telementoring system based on HEVC by implementing a shallow CNN model. The experimental investigation confirmed that the shallow CNN model maintains higher visual quality with a better bit rate. Compared with the state-of-the-art models, the developed shallow CNN model was effective and efficient for surgical telementoring systems.
He et al. [23] developed a new fuzzy-based SVM classifier for improving the compression efficiency of HEVC. In addition, the fuzzy-based SVM classifier was improved by utilizing the information entropy measure to address outliers and the negative impact of data noise. However, the CNN model and the fuzzy-based SVM classifier were computationally complex and needed high-end systems.
Imen et al. [24] integrated a modified AlexNet and a modified LeNet-5 for predicting the CU partition in HEVC. The experimental analysis states that the developed model was computationally complex.
The key contributions of this research paper are given below:
a. Proposed an enhanced whale optimization algorithm (EWOA) to decrease the computational time and complexity of HEVC, which selects the bit streams in the luma coding tree block for effectively determining the CU neighbors. The EWOA is effective in optimization problems compared with conventional optimization algorithms such as the puzzle optimization algorithm [25] and the stochastic Komodo algorithm [26].
b. Implemented the discrete cosine transform (DCT) for generating the residuals by subtracting the prediction values from the input values.
The efficacy of the EWOA is analyzed in terms of ∆BR, time saving, and ∆PSNR. The rest of this paper is organized as follows: the methodology, the results and discussion, and the conclusion are presented in sections 2 to 4, respectively.
2. RESEARCH METHOD
In this research, the efficacy of the EWOA is tested on a few online videos: PeopleOnStreet, Traffic, Kimono, ParkScene, Cactus, BQTerrace, FourPeople, PartyScene, BasketballDrive, Johnny, BasketballDrill, BQMall, RaceHorses, BasketballPass, BQSquare, BlowingBubbles, and KristenAndSara. Sample video frames are shown in Figure 1. The proposed framework includes three major steps: optimal bit-stream selection in HEVC using EWOA, inter- and intra-prediction in HEVC, and data transformation by DCT. The workflow of the proposed framework is represented in Figure 2.
Figure 1. Sample collected video frames
Figure 2. Workflow of the proposed framework
Initially, the frames are extracted from the videos and the separated frames are given to HEVC for predicting the motion in the video sequences. The basic design of HEVC is very similar to that of H.264. In this scenario, the block-based coding approach exploits both spatial and temporal statistical dependencies. Generally, HEVC utilizes flexible and adaptive quad-tree coding block partitions for effective coding, transformation, and prediction. The basic information about HEVC is given as follows.
2.1. Prediction structure
Generally, the quad-tree block partition works based on the CTU structure, which is similar to the macro-block. A video is a sequence of frames, and in HEVC, every coded video frame is divided into slices and CTUs.
Further, the CTUs are subdivided into square regions named CUs.
In HEVC, each CU is predicted using inter- or intra-prediction, and the first frame of the video sequence at each random access point is coded using intra-prediction. The remaining video frames are coded using inter-prediction, and the residual frames are then transformed into transform units (TUs) by the DCT algorithm. Usually, a CTU is made up of a luma coding tree block (CTB), two chroma CTBs, and quad-tree syntax, where the luma CTB has a block size of $N \times N$ and each chroma CTB has a block size of $(N/2) \times (N/2)$. The CTB size is the same as the size of the coding blocks (CBs), where the CTB has more CUs and is associated with the TUs and PUs. The inter-prediction, intra-prediction, and coding modes are selected at the CU level, where $N$ represents the bit-streams and can be 64, 32, 16, or 8 bits. In this research manuscript, the bit-streams are chosen by implementing an effective meta-heuristic algorithm named EWOA, which follows the behavior of humpback whales; here, the error rate is considered as the objective function. After creating an initial population, the humpback whale improves its location based on the encircling method, which is mathematically defined in (1) and (2) [27], [28], where $t$ indicates the iteration number, $D$ indicates the distance between the prey $P'(t)$ and the humpback whale's position $P(t)$, and $A$ and $B$ are coefficient values, which are determined in (3) and (4).
$$D = |B \odot P'(t) - P(t)| \tag{1}$$
$$P(t+1) = |P'(t) - A \odot D| \tag{2}$$
$$A = 2l \odot r - l \tag{3}$$
$$B = 2r \tag{4}$$
where $r$ represents a random vector in the range 0 to 1, and $l$ represents the linearity values, which range between 0 and 2. On the other hand, the bubble-net method is accomplished based on shrinking encircling and spiral position updating, as shown in (5) and (6).
Here, $\odot$ indicates element-by-element multiplication, $b$ is a constant that determines the shape of the logarithmic spiral, $a$ denotes a random value in the range [-1, 1], and $D' = |P'(t) - P(t)|$ represents the distance between the humpback whale and the prey.
$$P(t+1) = D' \odot e^{ba} \odot \cos(2\pi a) + P'(t) \tag{5}$$
$$P(t+1) = \begin{cases} P'(t) - A \odot D & \text{if } p \geq 0.5 \\ D' \odot e^{ba} \odot \cos(2\pi a) + P'(t) & \text{if } p < 0.5 \end{cases} \tag{6}$$
where $p \in [0, 1]$ indicates the probability of choosing the shrinking encircling method or the spiral method to adjust the whale's position. The humpback whales search for their prey in the exploration phase. The position of the humpback whales is updated by computing random search agents and then finding the best search agents. This process is mathematically indicated in (7) and (8) [29], [30].
$$D = |B \odot P_{rand} - P(t)| \tag{7}$$
$$P(t+1) = |P_{rand} - A \odot D| \tag{8}$$
where $P_{rand}$ indicates a random position, which is determined based on the current population. In the existing WOA, because of the lack of prior knowledge, the position update of the search agents can become trapped in local optima. Therefore, a novel cosine function is added to the control parameter $A$ for controlling the whale's position. The inclusion of the cosine function in the control parameter provides a better balance of exploitation and exploration, and it is mathematically indicated in (9).
$$A = 1 + 0.5 \times \cos\left(\pi \frac{t}{Iter_{max}}\right) \tag{9}$$
During the search process, two correlation factors $CF_1$ and $CF_2$ are used for regulating the movement of the search agents. Accordingly, (7) and (8) are updated as in (10) and (11). The assumed parameters of the EWOA are as follows: the number of agents is 100, $t$ indicates the current iteration, $CF_1 = 2.5$, $CF_2 = 1.5$, and $Iter_{max} = 100$ represents the maximum number of iterations. Once the maximum iteration is reached, the EWOA automatically terminates.
$$D = |B \odot P_{rand} - P(t)| / CF_1 \tag{10}$$
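The update rules in (1) to (10), together with the position update (11) that completes the exploration step, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, positions are plain lists of floats, $A$ is treated as a scalar as in (9), and the rule for switching between the three updates is left to the caller because the text only fixes the probability test in (6).

```python
import math


def control_parameter(t, iter_max):
    """Cosine-modulated control parameter A from (9)."""
    return 1.0 + 0.5 * math.cos(math.pi * t / iter_max)


def encircle(pos, prey, A, r):
    """Shrinking-encircling update, (1), (2) and (4), applied per element."""
    new_pos = []
    for x, x_prey, r_i in zip(pos, prey, r):
        B = 2.0 * r_i                        # (4): B = 2r
        D = abs(B * x_prey - x)              # (1)
        new_pos.append(abs(x_prey - A * D))  # (2), absolute value as in the text
    return new_pos


def spiral(pos, prey, a, b=1.0):
    """Spiral update (5); b = 1.0 is an assumed spiral-shape constant."""
    factor = math.exp(b * a) * math.cos(2.0 * math.pi * a)
    return [abs(x_prey - x) * factor + x_prey for x, x_prey in zip(pos, prey)]


def explore(pos, rand_pos, A, r, cf1=2.5, cf2=1.5):
    """Exploration update with the correlation factors, (10) and (11)."""
    new_pos = []
    for x, x_rand, r_i in zip(pos, rand_pos, r):
        B = 2.0 * r_i
        D = abs(B * x_rand - x) / cf1              # (10)
        new_pos.append(abs(x_rand - A * D) / cf2)  # (11)
    return new_pos
```

With $Iter_{max} = 100$, `control_parameter` moves smoothly from 1.5 at the first iteration to 0.5 at the last one, which is the gradual shift from exploration towards exploitation that the cosine term is meant to provide.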
$$P(t+1) = |P_{rand} - A \odot D| / CF_2 \tag{11}$$
2.2. Inter-prediction in HEVC
In HEVC, inter-prediction supports more ways of dividing prediction blocks (PBs) than intra-prediction. Generally, the inter-coded PUs have numerous motion parameters that include reference image indexes, usage flags, motion vectors, and reference image lists. When a CU is coded in skip mode, it is represented as one PU that has no significant transform coefficients, and its motion parameters are obtained through the merge mode. The encoder uses either merge mode or explicit transmission of motion parameters for every inter-coded PU. Hence, merge mode is used both in skip mode and for inter-coded PUs. In HEVC, merge mode is utilized for identifying the neighboring inter-coded PUs. The inter-prediction in HEVC has motion vectors with units of one-eighth and one-quarter of the distance between chroma samples and luma samples, respectively.
2.3. Intra-prediction in HEVC
In HEVC, the intra-units generally exploit the spatial correlation between a PU and its neighboring image pixels for effective prediction. New features such as the TU, PU, CU, and CTU are defined in HEVC for achieving higher compression and removing spatial redundancy. In addition, rate-distortion optimization is carried out to identify the best prediction mode of every CU. The RD cost function of intra-prediction in HEVC is defined in (12).
$$RD = SSE + \lambda \times R \tag{12}$$
where $R$ indicates the bit rate, $SSE$ represents the sum of squared errors between the original and reconstructed pixels, and $\lambda$ denotes the Lagrange multiplier, which is derived from the quantization parameter. Additionally, HEVC uses a recursive quad-tree structure for CU splitting. Every CU is divided into 4 PUs, and the intra-prediction is then carried out for every PU.
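As a hedged illustration of (12), the sketch below computes the RD cost for a few candidate prediction modes and picks the cheapest one. The mode labels, sample values, rates, and the `best_mode` helper are hypothetical and chosen only to show how $\lambda$ shifts the decision; they do not come from the paper's experiments or from an HEVC encoder.

```python
def sse(original, reconstructed):
    """Sum of squared errors between original and reconstructed samples."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed))


def rd_cost(original, reconstructed, rate_bits, lam):
    """RD cost from (12): SSE + lambda * R."""
    return sse(original, reconstructed) + lam * rate_bits


def best_mode(original, candidates, lam):
    """Return the candidate mode with the minimum RD cost.

    candidates maps a mode label to (reconstructed_samples, rate_bits).
    """
    return min(
        candidates,
        key=lambda mode: rd_cost(original, candidates[mode][0],
                                 candidates[mode][1], lam),
    )


# Hypothetical 4-sample block with two candidate modes.
block = [10, 12, 14, 16]
modes = {
    "dc": ([13, 13, 13, 13], 4),        # coarse prediction, few bits
    "angular": ([10, 12, 14, 15], 12),  # accurate prediction, more bits
}
low_lambda_choice = best_mode(block, modes, lam=1.0)
high_lambda_choice = best_mode(block, modes, lam=5.0)
```

A small $\lambda$ favors the low-distortion candidate, while a large $\lambda$ penalizes rate more heavily and favors the cheaper mode, which is why the same block can end up with different modes at different quantization settings.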
The CU size ranges from 8 × 8 to 64 × 64 pixels and the PU size ranges from 4 × 4 to 64 × 64 pixels. Accordingly, HEVC performs intra block prediction for blocks of 4 × 4 to 64 × 64 pixels. HEVC supports 35 intra-prediction modes, of which 33 are angular modes. Two reference sample arrays taken from the reconstructed PUs are used for intra-prediction in HEVC. The current pixel $C_{x,y}$ is projected onto the reference pixels with a fixed displacement parameter $d$, which defines the angularity of the vertical and horizontal prediction modes. Once the reference samples $R_i$ and $R_{i+1}$ are determined, the interpolation is carried out with an accuracy of 1/32, as given in (13).

$$C_{x,y} = ((32 - d) \times R_i + d \times R_{i+1} + 16) \gg 5 \tag{13}$$

In HEVC, the angular modes deliver effective intra-prediction in regions that contain many edges, whereas DC prediction is widely used for flat surfaces. In planar prediction, the block prediction is generated as a weighted average of four reference samples, as given in (14).

$$\begin{aligned} P^{h}(x,y) &= d \times (x+1) + b \times (N - (x+1)) \\ P^{v}(x,y) &= a \times (y+1) + c \times (N - (y+1)) \\ P^{PL}(x,y) &= (P^{h}(x,y) + P^{v}(x,y) + N) \gg (\log_2 N + 1) \end{aligned} \tag{14}$$

where $a$ and $d$ are the bottom-left and top-right samples, respectively. In HEVC, the filtering of the reference samples is governed by the TB size and the intra-prediction mode. The neighboring samples are not filtered when the DC intra-prediction mode is chosen. The bi-linear filter is enabled when the distance between the intra-prediction mode and the horizontal/vertical mode exceeds a threshold value. The 35 intra-prediction modes are represented in Figure 3.

2.4. Transformation
In the transformation procedure, the residuals in each TU are transformed using the DCT. In video compression, the DCT is a widely used transformation technique that is effective in energy compaction, computational efficiency, and correlation reduction.
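The energy-compaction property mentioned above can be illustrated with a short sketch of an orthonormal one-dimensional DCT-II and its inverse. The sketch assumes the standard orthonormal scaling, $\sqrt{1/N}$ for $k = 0$ and $\sqrt{2/N}$ otherwise, and uses a hypothetical 8-sample ramp as a stand-in for a smooth residual row; the function names are illustrative only.

```python
import math


def _scale(k, n):
    """Orthonormal DCT-II scaling factor C[k] for a length-n transform."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct_1d(x):
    """Orthonormal 1-D DCT-II of a list of floats."""
    n = len(x)
    return [
        _scale(k, n)
        * sum(x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n)) for i in range(n))
        for k in range(n)
    ]


def idct_1d(y):
    """Inverse of dct_1d (orthonormal 1-D DCT-III)."""
    n = len(y)
    return [
        sum(
            _scale(k, n) * y[k] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for k in range(n)
        )
        for i in range(n)
    ]


# Hypothetical smooth input: a linear ramp of eight samples.
ramp = [float(v) for v in range(8)]
coeffs = dct_1d(ramp)
restored = idct_1d(coeffs)

total_energy = sum(c * c for c in coeffs)
low_freq_energy = coeffs[0] ** 2 + coeffs[1] ** 2
compaction_ratio = low_freq_energy / total_energy
```

For this ramp, the first two coefficients carry more than 99% of the signal energy, and `restored` matches `ramp` up to floating-point rounding. Because the transform is orthonormal, `total_energy` also equals the energy of the input samples.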
The DCT family includes 16 members, and the one-dimensional DCT of a $1 \times N$ vector $x[n]$ is given in (15) and (16).

$$Y[k] = C[k] \sum_{n=0}^{N-1} x[n] \cos\left[\frac{(2n+1)k\pi}{2N}\right] \tag{15}$$

where $k = 0, 1, 2, \ldots, N-1$.

$$C[k] = \begin{cases} \sqrt{\dfrac{1}{N}} & \text{for } k = 0 \\ \sqrt{\dfrac{2}{N}} & \text{for } k = 1, 2, \ldots, N-1 \end{cases} \tag{16}$$

Figure 3. Quad-tree structure of the CUs

The original vector $x[n]$ is reconstructed from the DCT coefficients $Y[k]$ using the inverse DCT operation, as given in (17). The DCT is then extended to the transformation of an image by applying it to the individual rows and columns of the two-dimensional image. The two-dimensional DCT is given in (18) and (19).

$$x[n] = \sum_{k=0}^{N-1} C[k] Y[k] \cos\left[\frac{(2n+1)k\pi}{2N}\right] \tag{17}$$

where $n = 0, 1, 2, \ldots, N-1$.

$$Y[j,k] = C[j] C[k] \sum_{m=0}^{N-1} \sum_{n=0}^{N-1} x[m,n] \cos\left(\frac{(2m+1)j\pi}{2N}\right) \cos\left(\frac{(2n+1)k\pi}{2N}\right) \tag{18}$$

where $x[m,n]$ denotes the $N \times N$ image block and $j, k, m, n = 0, 1, 2, \ldots, N-1$.

$$C[j], C[k] = \begin{cases} \sqrt{\dfrac{1}{N}} & \text{for } j, k = 0 \\ \sqrt{\dfrac{2}{N}} & \text{for } j, k = 1, 2, \ldots, N-1 \end{cases} \tag{19}$$

Correspondingly, the two-dimensional inverse DCT is given in (20).

$$x[m,n] = \sum_{j=0}^{N-1} \sum_{k=0}^{N-1} C[j] C[k] Y[j,k] \cos\left(\frac{(2m+1)j\pi}{2N}\right) \cos\left(\frac{(2n+1)k\pi}{2N}\right) \tag{20}$$

The transforms in (18) and (20) are orthonormal and give perfect reconstruction under infinite precision. Finally, the reconstructed samples are obtained from the inverse transformation, and the reconstructed CTUs are arranged to construct the final image. The experimental results of the EWOA are presented in section 3.

3. RESULTS AND DISCUSSION
In this research, the EWOA is implemented in MATLAB R2020a. The simulation is performed on a system with an i7 processor, 8 GB of random access memory, and a 1 TB hard disk. This study mainly uses HEVC/H.265 for motion estimation. The performance of the EWOA is analyzed in terms of ∆BR, time saving, and ∆PSNR. Additionally, the effectiveness of the EWOA is compared with the prior research model: online SVM+DCNN [19].
The most important performance measures for fast encoding, the time saving $\Delta T$ and the bit-rate change $\Delta BR$, are defined in (21) and (22).

$$\Delta T = \frac{T_p - T_o}{T_o} \times 100 \tag{21}$$

$$\Delta BR = \frac{R_p - R_o}{R_o} \times 100 \tag{22}$$

where $T_p$ and $R_p$ are the computational time and bit rate of the EWOA, and $T_o$ and $R_o$ are the computational time and bit rate of the existing models. Similarly, the PSNR is used to measure the quality of the compressed frames relative to the original frames, and the PSNR difference is defined in (23), where $PSNR_p$ is the PSNR value of the EWOA and $PSNR_o$ is the PSNR value of the existing model.

$$\Delta PSNR = PSNR_p - PSNR_o \ (\text{dB}) \tag{23}$$

3.1. Quantitative analysis
In Table 1, the effectiveness of the EWOA is validated against existing models, namely deep CNN [19], online SVM [19], and the conventional WOA, in terms of ΔBR. Here, the performance of the EWOA is evaluated on seventeen real-time videos. The overall results in Table 1 show that the EWOA outperforms the online SVM [19], deep CNN [19], and WOA in terms of ΔBR. This implies that the EWOA is more robust in reducing the complexity of inter-mode HEVC than the online SVM [19], deep CNN [19], and WOA models. Correspondingly, Table 2 reports the experimental investigation of the EWOA in terms of ∆PSNR. The ∆PSNR value of the EWOA is higher than that of the prior models: deep CNN [19], online SVM [19], and WOA. In this scenario, the EWOA showed approximately 0.006-0.012 dB higher PSNR than the existing models on the real-time videos.

Table 1. Performance investigation of the EWOA and the existing models in terms of ΔBR (%)

| Videos | Deep CNN [19] | Online SVM [19] | WOA | EWOA |
|---|---|---|---|---|
| PeopleOnStreet | 0.57200 | 2.23500 | 0.55734 | 0.51824 |
| Traffic | -0.57100 | 2.02200 | -0.58390 | -0.71332 |
| Kimono | 0.72800 | 0.44900 | 0.71962 | 0.62093 |
| ParkScene | -0.40100 | 0.79000 | -0.41199 | -0.87320 |
| Cactus | 0.41200 | 0.71700 | 0.40141 | 0.29312 |
| BQTerrace | -2.60600 | 0.32800 | -2.61793 | -2.87302 |
| BasketballDrive | 1.13000 | 0.58300 | 1.11834 | 1.00923 |
| BasketballDrill | -0.04400 | 1.44000 | -0.04866 | -0.08392 |
| BQMall | 1.01900 | 3.31300 | 1.00660 | -0.30293 |
| PartyScene | -0.70900 | 2.41500 | -0.72106 | -0.89203 |
| RaceHorses | 0.52900 | 2.65000 | 0.52264 | 0.33025 |
| BasketballPass | 0.69000 | 3.16700 | 0.68472 | 0.62039 |
| BQSquare | -2.64700 | 3.69200 | -2.65648 | -2.77823 |
| BlowingBubbles | -0.32700 | 2.20700 | -0.32899 | -0.30977 |
| FourPeople | -0.65900 | 1.01700 | -0.66884 | -0.67898 |
| Johnny | -0.36100 | 1.72900 | -0.37648 | -1.09797 |
| KristenAndSara | -1.02600 | 1.35700 | -1.04073 | -1.20393 |

Table 2. Performance investigation of the EWOA and the existing models in terms of ∆PSNR (dB)

| Videos | Deep CNN [19] | Online SVM [19] | WOA | EWOA |
|---|---|---|---|---|
| PeopleOnStreet | -0.03000 | -0.05500 | -0.03703 | -0.02883 |
| Traffic | -0.06100 | -0.04780 | -0.06254 | -0.05009 |
| Kimono | -0.02500 | -0.02000 | -0.03453 | -0.02456 |
| ParkScene | -0.05200 | -0.02300 | -0.05741 | -0.04565 |
| Cactus | -0.05200 | -0.01900 | -0.05880 | -0.04446 |
| BQTerrace | -0.07100 | -0.02200 | -0.07137 | -0.06754 |
| BasketballDrive | -0.03100 | -0.02300 | -0.03909 | -0.02157 |
| BasketballDrill | -0.05600 | -0.04700 | -0.06349 | -0.04876 |
| BQMall | -0.05100 | -0.05800 | -0.05220 | -0.04267 |
| PartyScene | -0.08500 | -0.06200 | -0.09025 | -0.07066 |
| RaceHorses | -0.04000 | -0.05300 | -0.04326 | -0.03355 |
| BasketballPass | -0.05400 | -0.06700 | -0.05946 | -0.04953 |
| BQSquare | -0.16200 | -0.08600 | -0.16599 | -0.12577 |
| BlowingBubbles | -0.07100 | -0.06700 | -0.07515 | -0.05518 |
| FourPeople | -0.06200 | -0.01400 | -0.06455 | -0.05459 |
| Johnny | -0.12400 | -0.03600 | -0.12421 | -0.10467 |
| KristenAndSara | -0.07000 | -0.02800 | -0.07924 | -0.06965 |
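The measures in (21) to (23) are simple relative and absolute differences, and a minimal sketch may help the reader reproduce them. The helper names and the numeric inputs below are hypothetical and are not taken from the tables. The `psnr_from_mse` helper assumes 8-bit samples with a peak value of 255, which is a common convention rather than something stated in this paper.

```python
import math


def delta_percent(proposed, reference):
    """Relative change in percent, as used for delta T in (21) and delta BR in (22)."""
    return (proposed - reference) / reference * 100.0


def delta_psnr(psnr_proposed, psnr_reference):
    """PSNR difference in dB, as in (23)."""
    return psnr_proposed - psnr_reference


def psnr_from_mse(mse, peak=255.0):
    """PSNR in dB from a mean squared error (assumes 8-bit samples by default)."""
    return 10.0 * math.log10(peak * peak / mse)


# Hypothetical measurements for one sequence.
time_reference, time_proposed = 200.0, 100.0    # encoding time in seconds
rate_reference, rate_proposed = 1000.0, 1005.0  # bit rate in kbit/s

time_saving = delta_percent(time_proposed, time_reference)     # negative means faster
bit_rate_change = delta_percent(rate_proposed, rate_reference)
quality_change = delta_psnr(psnr_from_mse(10.5), psnr_from_mse(10.0))
```

Under this sign convention, a negative $\Delta T$ indicates a reduction in encoding time, a negative $\Delta BR$ indicates a bit-rate saving, and a negative $\Delta PSNR$ indicates a quality loss relative to the reference encoder.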
As shown in Table 3, seventeen real-time videos are used to investigate the effectiveness of the EWOA in terms of time saving ΔT. The results indicate that the EWOA achieved better results than the online SVM [19], deep CNN [19], and WOA in terms of time saving ΔT on the real-time videos, which is the major problem highlighted in the literature section.

Table 3. Performance investigation of the EWOA and the existing models in terms of ∆T (%)

| Videos | Deep CNN [19] | Online SVM [19] | WOA | EWOA |
|---|---|---|---|---|
| PeopleOnStreet | -50.67000 | -56.56000 | -51.64113 | -49.73822 |
| Traffic | -57.90000 | -58.07000 | -59.64715 | -56.20333 |
| Kimono | -43.26000 | -44.18000 | -44.21259 | -42.48243 |
| ParkScene | -64.14000 | -52.60000 | -65.91336 | -63.93794 |
| Cactus | -52.57000 | -41.38000 | -53.87029 | -51.08473 |
| BQTerrace | -58.43000 | -41.45000 | -59.64306 | -57.46738 |
| BasketballDrive | -51.30000 | -51.17000 | -52.69093 | -51.84837 |
| BasketballDrill | -53.54000 | -55.87000 | -54.24158 | -52.46364 |
| BQMall | -52.25000 | -55.96000 | -53.06197 | -52.84949 |
| PartyScene | -51.54000 | -52.93000 | -52.63542 | -50.63434 |
| RaceHorses | -42.22000 | -51.25000 | -42.91538 | -40.93940 |
| BasketballPass | -52.42000 | -55.49000 | -53.54240 | -51.54234 |
| BQSquare | -52.79000 | -55.92000 | -54.45821 | -51.44434 |
| BlowingBubbles | -46.55000 | -51.87000 | -47.48938 | -44.48394 |
| FourPeople | -67.54000 | -51.90000 | -68.56404 | -66.56344 |
| Johnny | -69.66000 | -60.12000 | -70.43801 | -68.43444 |
| KristenAndSara | -67.20000 | -53.57000 | -68.15833 | -65.15849 |

3.2. Discussion
In the present decade, HEVC has achieved better coding efficiency owing to the rapid growth of video coding technology. HEVC improves the RD performance at the cost of increased encoding complexity. In addition, HEVC uses new coding structures, which are characterized by the TU, PU, and CU.
This considerably enhances the coding efficiency, but increases the computational complexity of deciding the optimal TU, PU, and CU sizes. Computational complexity remains a vital problem, and it should be treated as an optimization task. As discussed in the previous sections, we proposed an EWOA with quad-tree coding and DCT for fast CU partitioning, which considerably decreases the complexity of HEVC in inter-mode. The proposed framework achieved a good trade-off between RD performance and complexity reduction. The EWOA with quad-tree coding and DCT not only predicts the HEVC CU partition in inter-mode but also reduces the HEVC complexity with a minimal error value. In this manuscript, seventeen real-time videos are used to analyze the effectiveness of the proposed framework in terms of PSNR, time saving ∆T, and ∆BR.

4. CONCLUSION
In this study, efficient video compression is achieved by implementing HEVC with an optimization algorithm, the EWOA. The EWOA is used to estimate the motion in the video sequences. In the proposed framework, the quad-tree coding block is employed for partitioning the structures, and the DCT is applied to the extracted video frames to improve the coding efficiency. The performance of the EWOA is investigated by comparing the input video sequences with the decompressed video sequences in terms of ΔBR, ΔT, and ΔPSNR. The simulation analysis showed that the EWOA attained better performance in video compression, with 0.006-0.012 dB higher PSNR than the existing models on real-time videos such as BasketballPass, BQTerrace, BasketballDrive, RaceHorses, BQMall, BlowingBubbles, Cactus, FourPeople, PartyScene, PeopleOnStreet, Johnny, Kimono, KristenAndSara, Traffic, ParkScene, BasketballDrill, and BQSquare. The proposed framework is specifically suited to surveillance and conversational videos, for which it largely reduces the bandwidth without degrading the visual quality. Future studies will focus on video coding optimization or perceptual-based medical imaging. Additionally, a novel algorithm can be developed for fast mode selection based on the pattern directions of the neighboring PUs.

AUTHOR CONTRIBUTIONS
The conceptualization, methodology, software, validation, formal analysis, investigation, resources, data curation, writing (original draft preparation), writing (review and editing), and visualization were done by the first author. The supervision and project administration were done by the second author.
REFERENCES
[1] W. El-Shafai, I. M. Almomani, and A. Alkhayer, “Optical bit-plane-based 3D-JST cryptography algorithm with cascaded 2D-FrFT encryption for efficient and secure HEVC communication,” IEEE Access, vol. 9, pp. 35004–35026, 2021, doi: 10.1109/ACCESS.2021.3062403.
[2] Y. Zhang, C. Zhang, R. Fan, S. Ma, Z. Chen, and C.-C. J. Kuo, “Recent advances on HEVC inter-frame coding: From optimization to implementation and beyond,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 11, pp. 4321–4339, Nov. 2020, doi: 10.1109/TCSVT.2019.2954474.
[3] A. A. Elrowayati, M. A. Alrshah, M. F. L. Abdullah, and R. Latip, “HEVC watermarking techniques for authentication and copyright applications: Challenges and opportunities,” IEEE Access, vol. 8, pp. 114172–114189, 2020, doi: 10.1109/ACCESS.2020.3004049.
[4] W. Zhu, Y. Yi, H. Zhang, P. Chen, and H. Zhang, “Fast mode decision algorithm for HEVC intra coding based on texture partition and direction,” Journal of Real-Time Image Processing, vol. 17, no. 2, pp. 275–292, Apr. 2020, doi: 10.1007/s11554-018-0766-z.
[5] J.-K. Lee, N. Kim, S. Cho, and J.-W. Kang, “Deep video prediction network-based inter-frame coding in HEVC,” IEEE Access, vol. 8, pp. 95906–95917, 2020, doi: 10.1109/ACCESS.2020.2993566.
[6] D. Xu, “Commutative encryption and data hiding in HEVC video compression,” IEEE Access, vol. 7, pp. 66028–66041, 2019, doi: 10.1109/ACCESS.2019.2916484.
[7] X. Sun, H. Ma, W. Zuo, and M. Liu, “Perceptual-based HEVC intra coding optimization using deep convolution networks,” IEEE Access, vol. 7, pp. 56308–56316, 2019, doi: 10.1109/ACCESS.2019.2910245.
[8] H. Huang, I. Schiopu, and A. Munteanu, “Frame-wise CNN-based filtering for intra-frame quality enhancement of HEVC videos,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 6, pp. 2100–2113, Jun. 2021, doi: 10.1109/TCSVT.2020.3018230.
[9] A. Mercat, A. Makinen, J. Sainio, A. Lemmetti, M. Viitanen, and J. Vanne, “Comparative rate-distortion-complexity analysis of VVC and HEVC video codecs,” IEEE Access, vol. 9, pp. 67813–67828, 2021, doi: 10.1109/ACCESS.2021.3077116.
[10] M. Saldanha, G. Sanchez, C. Marcon, and L. Agostini, “Fast 3D-HEVC depth map encoding using machine learning,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 3, pp. 850–861, Mar. 2020, doi: 10.1109/TCSVT.2019.2898122.
[11] X. Sun, X. Yang, S. Wang, and M. Liu, “Content-aware rate control scheme for HEVC based on static and dynamic saliency detection,” Neurocomputing, vol. 411, pp. 393–405, Oct. 2020, doi: 10.1016/j.neucom.2020.06.003.
[12] J. Lin, D. Liu, H. Yang, H. Li, and F. Wu, “Convolutional neural network-based block up-sampling for HEVC,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 29, no. 12, pp. 3701–3715, Dec. 2019, doi: 10.1109/TCSVT.2018.2884203.
[13] M. Zhou et al., “SSIM-based global optimization for CTU-level rate control in HEVC,” IEEE Transactions on Multimedia, vol. 21, no. 8, pp. 1921–1933, Aug. 2019, doi: 10.1109/TMM.2019.2895281.
[14] Y. Zhang, T. Shen, X. Ji, Y. Zhang, R. Xiong, and Q. Dai, “Residual highway convolutional neural networks for in-loop filtering in HEVC,” IEEE Transactions on Image Processing, vol. 27, no. 8, pp. 3827–3841, Aug. 2018, doi: 10.1109/TIP.2018.2815841.
[15] W. Lin et al., “Partition-aware adaptive switching neural networks for post-processing in HEVC,” IEEE Transactions on Multimedia, vol. 22, no. 11, pp. 2749–2763, Nov. 2020, doi: 10.1109/TMM.2019.2962310.
[16] R. Duvar, O. Akbulut, and O. Urhan, “Fast inter mode decision exploiting intra-block similarity in HEVC,” Signal Processing: Image Communication, vol. 78, pp. 503–510, Oct. 2019, doi: 10.1016/j.image.2019.08.010.
[17] Y.-F. Cen, W.-L. Wang, and X.-W. Yao, “A fast CU depth decision mechanism for HEVC,” Information Processing Letters, vol. 115, no. 9, pp. 719–724, Sep. 2015, doi: 10.1016/j.ipl.2015.04.001.
[18] C. Jiang and S. Nooshabadi, “Multi-level complexity reduction for HEVC multiview coding,” Journal of Real-Time Image Processing, vol. 17, no. 2, pp. 197–213, Apr. 2020, doi: 10.1007/s11554-018-0757-0.
[19] S. Bouaafia, R. Khemiri, F. E. Sayadi, and M. Atri, “Fast CU partition-based machine learning approach for reducing HEVC complexity,” Journal of Real-Time Image Processing, vol. 17, no. 1, pp. 185–196, Feb. 2020, doi: 10.1007/s11554-019-00936-0.
[20] Y. Ma, Z. Liu, X. Wang, and S. Cao, “Fast intra coding based on CU size decision and direction mode decision for HEVC,” Multimedia Tools and Applications, vol. 77, no. 12, pp. 14907–14929, Jun. 2018, doi: 10.1007/s11042-017-5074-2.
[21] S. Kuanar, K. R. Rao, M. Bilas, and J. Bredow, “Adaptive CU mode selection in HEVC intra prediction: A deep learning approach,” Circuits, Systems, and Signal Processing, vol. 38, no. 11, pp. 5081–5102, Nov. 2019, doi: 10.1007/s00034-019-01110-4.
[22] A. Hassan, M. Ghafoor, S. A. Tariq, T. Zia, and W. Ahmad, “High efficiency video coding (HEVC)–based surgical telementoring system using shallow convolutional neural network,” Journal of Digital Imaging, vol. 32, no. 6, pp. 1027–1043, Dec. 2019, doi: 10.1007/s10278-019-00206-2.
[23] S. He, Z. Deng, and C. Shi, “Fast decision algorithm of CU size for HEVC intra-prediction based on a kernel fuzzy SVM classifier,” Electronics, vol. 11, no. 17, Sep. 2022, doi: 10.3390/electronics11172791.
[24] W. Imen, M. Amna, B. Fatma, S. F. Ezahra, and N. Masmoudi, “Fast HEVC intra-CU decision partition algorithm with modified LeNet-5 and AlexNet,” Signal, Image and Video Processing, vol. 16, no. 7, pp. 1811–1819, Oct. 2022, doi: 10.1007/s11760-022-02139-w.
[25] F. A. Zeidabadi and M. Dehghani, “POA: Puzzle optimization algorithm,” International Journal of Intelligent Engineering and Systems, vol. 15, no. 1, pp. 273–280, Feb. 2022, doi: 10.22266/ijies2022.0228.25.
[26] P. D. Kusuma and M. Kallista, “Stochastic komodo algorithm,” International Journal of Intelligent Engineering and Systems, vol. 15, no. 4, pp. 156–166, Aug. 2022, doi: 10.22266/ijies2022.0831.15.
[27] N. Rana, M. S. A. Latiff, S. M. Abdulhamid, and H. Chiroma, “Whale optimization algorithm: a systematic review of contemporary applications, modifications and developments,” Neural Computing and Applications, vol. 32, no. 20, pp. 16245–16277, Oct. 2020, doi: 10.1007/s00521-020-04849-z.
[28] S. Chakraborty, A. K. Saha, R. Chakraborty, and M. Saha, “An enhanced whale optimization algorithm for large scale optimization problems,” Knowledge-Based Systems, vol. 233, Dec. 2021, doi: 10.1016/j.knosys.2021.107543.
[29] Q.-V. Pham, S. Mirjalili, N. Kumar, M. Alazab, and W.-J. Hwang, “Whale optimization algorithm with applications to resource allocation in wireless networks,” IEEE Transactions on Vehicular Technology, vol. 69, no. 4, pp. 4285–4297, Apr. 2020, doi: 10.1109/TVT.2020.2973294.
[30] Z. Yan, J. Zhang, J. Zeng, and J. Tang, “Nature-inspired approach: An enhanced whale optimization algorithm for global optimization,” Mathematics and Computers in Simulation, vol. 185, pp. 17–46, Jul. 2021, doi: 10.1016/j.matcom.2020.12.008.
BIOGRAPHIES OF AUTHORS
Suhas Shankarnahalli Krishnegowda is a research scholar at VTU, Belgaum. He received his BE (Electronics and Communication Engineering) from SEA College of Engineering and Technology, Visvesvaraya Technological University, Belgaum, Karnataka, India, in 2011 and completed his MTech (Digital Electronics) at Srinivas Institute of Technology, Mangalore, affiliated to Visvesvaraya Technological University, Belgaum, Karnataka, India, in 2013. Since then, he has been actively involved in teaching and research and has ten years of teaching experience. He is pursuing a PhD (ECE) at VTU. At present, he is working as an Assistant Professor at South East Asian College of Engineering and Technology, Bangalore, affiliated to Visvesvaraya Technological University. His areas of interest are signal processing, biomedical signal processing, and communication systems. He can be contacted at email: suhashoodi@gmail.com.

Hosanna Princye Periapandi received the BE (E&I) from Sapthagiri College of Engineering, Periyar University, Tamil Nadu, India, in 2002 and completed a Master's in Engineering at Anna University in 2004. Since then, she has been actively involved in teaching and research and has thirteen years of teaching experience. She obtained a PhD in Information and Communication Engineering from Anna University in 2018. At present, she is working as an Associate Professor at SEA College of Engineering and Technology, Bangalore, affiliated to Visvesvaraya Technological University. Her areas of interest are medical image processing, signal processing, and VLSI. She can be contacted at email: hprincye@gmail.com.