OPSE: Online Per-Scene Encoding for Adaptive HTTP Live Streaming
Vignesh V Menon¹, Hadi Amirpour¹, Christian Feldmann², Mohammad Ghanbari¹,³, and Christian Timmerer¹
¹ Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität, Klagenfurt, Austria
² Bitmovin, Klagenfurt, Austria
³ School of Computer Science and Electronic Engineering, University of Essex, UK
21 July 2022
Vignesh V Menon OPSE: Online Per-Scene Encoding for Adaptive HTTP Live Streaming 1
Outline
1 Introduction
2 OPSE
3 Evaluation
4 Q & A
Introduction
Motivation
Per-scene encoding schemes build on the observation that, within a scene, each resolution outperforms the others over a particular bitrate range, and that these ranges depend on the video complexity.
The goal is to increase the Quality of Experience (QoE) or decrease the bitrate of the representations, as introduced for VoD services.¹
Figure: The bitrate ladder prediction envisioned using OPSE.
¹ J. De Cock et al. “Complexity-based consistent-quality encoding in the cloud”. In: 2016 IEEE International Conference on Image Processing (ICIP). 2016, pp. 1484–1488. doi: 10.1109/ICIP.2016.7532605.
Why not in live streaming yet?
Although per-title encoding schemes² enhance the quality of video delivery, determining the convex hull is computationally expensive, which makes these schemes suitable only for VoD streaming applications.
Some methods pre-analyze the video content.³
Katsenou et al.⁴ introduced a content-gnostic method that employs machine learning to find, for each resolution, the bitrate range in which it outperforms the other resolutions.
Bhat et al.⁵ proposed a Random Forest (RF) classifier to decide the encoding resolution best suited to different quality ranges and studied machine-learning-based adaptive resolution prediction.
However, these approaches still incur latency much higher than what is acceptable in live streaming.
² De Cock et al., “Complexity-based consistent-quality encoding in the cloud”; Hadi Amirpour et al. “PSTR: Per-Title Encoding Using Spatio-Temporal Resolutions”. In: 2021 IEEE International Conference on Multimedia and Expo (ICME). 2021, pp. 1–6. doi: 10.1109/ICME51207.2021.9428247.
³ https://bitmovin.com/whitepapers/Bitmovin-Per-Title.pdf, last access: May 10, 2022.
⁴ A. V. Katsenou et al. “Content-gnostic Bitrate Ladder Prediction for Adaptive Video Streaming”. In: 2019 Picture Coding Symposium (PCS). 2019. doi: 10.1109/PCS48520.2019.8954529.
⁵ Madhukar Bhat et al. “Combining Video Quality Metrics To Select Perceptually Accurate Resolution In A Wide Quality Range: A Case Study”. In: 2021 IEEE International Conference on Image Processing (ICIP). 2021, pp. 2164–2168. doi: 10.1109/ICIP42928.2021.9506310.
OPSE
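Figure: OPSE architecture. Video complexity feature extraction on the input video yields (E, h, ϵ) for scene detection and (E, h) for resolution prediction; resolution prediction takes the detected scenes together with the resolutions (R) and bitrates (B) and passes (r̂, b) pairs to per-scene encoding.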
The E, h, and ϵ features are extracted using VCA, an open-source video complexity analyzer.⁶
⁶ Vignesh V Menon et al. “VCA: Video Complexity Analyzer”. In: Proceedings of the 13th ACM Multimedia Systems Conference. 2022. isbn: 9781450392839. doi: 10.1145/3524273.3532896. url: https://doi.org/10.1145/3524273.3532896.
Phase 1: Feature Extraction
Compute texture energy per block
A DCT-based energy function is used to determine the block-wise texture feature of each frame, defined as:
$$ H_{p,k} = \sum_{i=0}^{w-1} \sum_{j=0}^{w-1} e^{\left|\left(\frac{ij}{wh}\right)^{2} - 1\right|} \, |DCT(i, j)| \tag{1} $$
where k is the block index in frame p, w × w is the size of the block, and DCT(i, j) is the (i, j)th DCT component when i + j > 0, and 0 otherwise.
The energy values of the blocks in a frame are averaged to determine the energy per frame:⁷
$$ E = \sum_{k=0}^{C-1} \frac{H_{p,k}}{C \cdot w^{2}} \tag{2} $$
⁷ Michael King et al. “A New Energy Function for Segmentation and Compression”. In: 2007 IEEE International Conference on Multimedia and Expo. 2007, pp. 1647–1650. doi: 10.1109/ICME.2007.4284983.
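Equations (1) and (2) can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the VCA implementation: the orthonormal DCT-II normalization, the reading of the denominator wh as w · w for a square block, and the function names `dct2`, `block_energy`, and `frame_energy` are all assumptions made for the example.

```python
import math

def dct2(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    w = len(block)

    def alpha(u):
        return math.sqrt(1.0 / w) if u == 0 else math.sqrt(2.0 / w)

    # 1-D cosine basis: basis[u][x] = alpha(u) * cos((2x + 1) * u * pi / (2w))
    basis = [[alpha(u) * math.cos((2 * x + 1) * u * math.pi / (2 * w))
              for x in range(w)] for u in range(w)]
    return [[sum(basis[i][x] * basis[j][y] * block[x][y]
                 for x in range(w) for y in range(w))
             for j in range(w)] for i in range(w)]

def block_energy(block):
    """Eq. (1): weighted sum of the absolute AC DCT coefficients of one block."""
    w = len(block)
    coeff = dct2(block)
    energy = 0.0
    for i in range(w):
        for j in range(w):
            if i + j == 0:
                continue  # the DC component is defined as 0
            # Assumption: wh in the exponent is taken as w * w (square block).
            weight = math.exp(abs((i * j / (w * w)) ** 2 - 1.0))
            energy += weight * abs(coeff[i][j])
    return energy

def frame_energy(luma, w):
    """Eq. (2): frame energy E plus the list of per-block energies.

    luma is a list of rows whose dimensions are multiples of w.
    """
    rows, cols = len(luma), len(luma[0])
    energies = [block_energy([row[c:c + w] for row in luma[r:r + w]])
                for r in range(0, rows, w) for c in range(0, cols, w)]
    C = len(energies)  # number of blocks in the frame
    return sum(energies) / (C * w * w), energies
```

A flat frame has no AC content, so its E is (numerically) zero, while a checkerboard pattern gives a clearly positive E. The per-block list is returned as well because the temporal feature h reuses it.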
$h_p$: sum of absolute differences (SAD) between the block-level energy values of frame p and those of the previous frame p − 1:
$$ h_p = \sum_{k=0}^{C-1} \frac{|H_{p,k} - H_{p-1,k}|}{C \cdot w^{2}} \tag{3} $$
where C denotes the number of blocks in frame p.
The gradient of h per frame p, $\epsilon_p$, is also defined:
$$ \epsilon_p = \frac{h_{p-1} - h_p}{h_{p-1}} \tag{4} $$
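As a small worked example of Eqs. (3) and (4), the sketch below computes h and ϵ from lists of per-block energies. The function names and the handling of a zero denominator are assumptions for illustration; the slides do not specify that corner case.

```python
def temporal_feature(H_cur, H_prev, w):
    """Eq. (3): SAD of co-located block energies, normalized by C * w^2."""
    C = len(H_cur)  # number of blocks in the frame
    return sum(abs(a - b) for a, b in zip(H_cur, H_prev)) / (C * w * w)

def gradient(h_prev, h_cur):
    """Eq. (4): relative change of h between frame p - 1 and frame p."""
    if h_prev == 0:
        return 0.0  # assumption: treat an undefined ratio as "no change"
    return (h_prev - h_cur) / h_prev
```

For instance, with w = 4 and block energies [16, 32, 16, 16] against [16, 16, 16, 16] in the previous frame, the SAD is 16 and h = 16 / (4 · 16) = 0.25; going from h = 4.0 to h = 1.0 gives ϵ = 0.75.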
Latency
Speed of feature extraction: 1480 fps for Full HD (1080p) video with 8 CPU threads and x86 SIMD optimization.
Phase 2: Scene Detection
Objective:
Detect the first picture of each shot and encode it as an Instantaneous Decoder Refresh (IDR) frame.
Encode the subsequent frames of the new shot based on the first one via motion compensation and prediction.
Shot transitions come in two forms:
hard shot-cuts
gradual shot transitions
Detecting gradual transitions is much more difficult because the change in visual information is hard to quantify.
Notation:
T1, T2: maximum and minimum thresholds for ϵk
f: video fps
Q: set of frames where T1 ≥ ϵ > T2 and ∆h > T3
q0: current frame number in the set Q
q−1: previous frame number in the set Q
q1: next frame number in the set Q
Step 1: while parsing all video frames do
    if ϵk > T1 then
        k ← IDR-frame, a new shot.
    else if ϵk ≤ T2 then
        k ← P-frame or B-frame, not a new shot.
Step 2: while parsing Q do
    if q0 − q−1 > f and q1 − q0 > f then
        q0 ← IDR-frame, a new shot.
    Eliminate q0 from Q.
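The two-step procedure above can be sketched as follows. Several details are assumptions made for the example rather than statements from the slides: ∆h is passed in as a per-frame list because its definition is not given, the neighbours q−1 and q1 are taken from the candidate set as built in Step 1, and a candidate with no previous or next neighbour is treated as satisfying that side of the test. The function name `detect_shots` is hypothetical.

```python
def detect_shots(eps, delta_h, T1, T2, T3, fps):
    """Return the sorted frame indices to encode as IDR frames."""
    idr = set()
    Q = []  # ambiguous frames: T1 >= eps > T2 and delta_h > T3

    # Step 1: classify every frame by its gradient.
    for k, e in enumerate(eps):
        if e > T1:
            idr.add(k)                      # clear shot change
        elif e > T2 and delta_h[k] > T3:
            Q.append(k)                     # candidate, resolved in Step 2
        # otherwise the frame stays a P- or B-frame

    # Step 2: promote a candidate only if both neighbouring candidates
    # are more than fps frames (one second of video) away.
    for n, q0 in enumerate(Q):
        far_prev = n == 0 or q0 - Q[n - 1] > fps
        far_next = n == len(Q) - 1 or Q[n + 1] - q0 > fps
        if far_prev and far_next:
            idr.add(q0)
    return sorted(idr)
```

With hypothetical thresholds T1 = 0.5, T2 = 0.1, T3 = 0.5 at 30 fps, a frame whose ϵ exceeds T1 is marked immediately, two candidates five frames apart are both rejected, and an isolated candidate is promoted to an IDR frame.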
Phase 3: Resolution Prediction
For each detected scene, the optimized bitrate ladder is predicted using the E and h features of the first GOP of the scene and the sets R and B. The optimized resolution r̂ is predicted for each target bitrate b ∈ B. The resolution scaling factor s is defined as:
$$ s = \left(\frac{r}{r_{max}}\right); \quad r \in R \tag{5} $$
where $r_{max}$ is the maximum resolution in R.
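To make Eq. (5) concrete, the sketch below computes s for a set of resolutions and maps a predicted factor back to a resolution. Two assumptions are made for illustration: resolutions are measured by their height in lines (for example 720 for 720p), and a predicted factor is snapped to the resolution whose s is closest, which the slides do not state. The function names are hypothetical.

```python
def scaling_factors(resolutions):
    """Eq. (5): s = r / r_max for every r in R."""
    r_max = max(resolutions)
    return {r: r / r_max for r in resolutions}

def nearest_resolution(s_hat, resolutions):
    """Assumed mapping: pick the r in R whose scaling factor is closest to s_hat."""
    s = scaling_factors(resolutions)
    return min(resolutions, key=lambda r: abs(s[r] - s_hat))
```

For R = {360p, 432p, 540p, 720p, 1080p} the factors are roughly 0.33, 0.4, 0.5, 0.67, and 1.0, so a predicted ŝ of 0.62 would map to 720p under this rule.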
Figure: Neural network structure to predict the optimized resolution scaling factor ŝ for a maximum resolution $r_{max}$ and framerate f. The input layer (∈ ℝ³) takes E, h, and log(b); it is followed by two hidden layers (∈ ℝ⁴ each) and an output layer (∈ ℝ¹) that yields ŝ.
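The layer sizes in the figure (3 → 4 → 4 → 1) correspond to a very small fully connected network. The forward pass below is only a sketch of that shape: the ReLU hidden activations, the sigmoid output, the natural logarithm for log(b), and the randomly initialized placeholder weights are all assumptions, since the slides give neither the activations nor the trained parameters.

```python
import math
import random

def relu(v):
    return max(0.0, v)

def sigmoid(v):
    # Numerically stable form for large negative inputs.
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    return math.exp(v) / (1.0 + math.exp(v))

def dense(x, W, b, act):
    """One fully connected layer: act(W x + b)."""
    return [act(sum(wij * xj for wij, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def init_layer(n_out, n_in, rng):
    """Placeholder weights only; a real model would load trained values."""
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def predict_s_hat(E, h, bitrate, layers):
    """Forward pass 3 -> 4 -> 4 -> 1 on the inputs (E, h, log(b))."""
    (W1, b1), (W2, b2), (W3, b3) = layers
    x = [E, h, math.log(bitrate)]
    x = dense(x, W1, b1, relu)
    x = dense(x, W2, b2, relu)
    return dense(x, W3, b3, sigmoid)[0]

rng = random.Random(0)
layers = [init_layer(4, 3, rng), init_layer(4, 4, rng), init_layer(1, 4, rng)]
s_hat = predict_s_hat(31.96, 11.12, 1600, layers)
```

Because the weights are untrained placeholders, the value of `s_hat` carries no meaning here; the point is only the data flow from (E, h, log(b)) to a single scaling factor, evaluated once per target bitrate b ∈ B.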
Evaluation
R = {360p, 432p, 540p, 720p, 1080p}
B = {145, 300, 600, 900, 1600, 2400, 3400, 4500, 5800, 8100} kbps
Figure: BDR_V results for scenes characterized by various average E and h.
BDR_V: the Bjøntegaard delta rate⁸ refers to the average increase in bitrate of the representations compared with that of the fixed bitrate ladder encoding to maintain the same VMAF.
⁸ G. Bjontegaard. “Calculation of average PSNR differences between RD-curves”. In: VCEG-M33 (2001).
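The idea behind a delta-rate figure such as BDR_V can be sketched as the average gap between two rate-quality curves in the log-bitrate domain, taken over their common quality range. The sketch below deliberately swaps the curve fitting for plain piecewise-linear interpolation to stay dependency-free, so it is a simplification and will not reproduce published BD-rate numbers exactly. The function names and the sample points in the usage note are hypothetical.

```python
import math

def _interp(x, xs, ys):
    """Linear interpolation of y at x; xs must be strictly increasing."""
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the curve")

def _avg_log_rate(quality, log_rate, lo, hi):
    """Mean of the interpolated log-rate curve over [lo, hi]."""
    knots = sorted({lo, hi, *[q for q in quality if lo < q < hi]})
    vals = [_interp(q, quality, log_rate) for q in knots]
    # The trapezoid rule is exact for a piecewise-linear curve.
    area = sum((q1 - q0) * (v0 + v1) / 2.0
               for q0, q1, v0, v1 in zip(knots, knots[1:], vals, vals[1:]))
    return area / (hi - lo)

def bd_rate(ref, test):
    """ref, test: lists of (bitrate, quality) points, e.g. (kbps, VMAF).

    Returns the average fractional bitrate change of test versus ref at
    equal quality; a negative value means test needs less bitrate.
    """
    def prep(points):
        pts = sorted(points, key=lambda p: p[1])
        return [q for _, q in pts], [math.log10(r) for r, _ in pts]

    q_ref, lr_ref = prep(ref)
    q_test, lr_test = prep(test)
    lo, hi = max(q_ref[0], q_test[0]), min(q_ref[-1], q_test[-1])
    if lo >= hi:
        raise ValueError("curves have no overlapping quality range")
    diff = (_avg_log_rate(q_test, lr_test, lo, hi)
            - _avg_log_rate(q_ref, lr_ref, lo, hi))
    return 10.0 ** diff - 1.0
```

As a sanity check with made-up points: if the test curve reaches every quality level of the reference at exactly half the bitrate, `bd_rate` returns −0.5, that is, a 50 % bitrate reduction at the same VMAF.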
Figure: Comparison of RD curves for encoding two sample scenes, (a) Scene1 (E = 31.96, h = 11.12) and (b) Scene2 (E = 67.96, h = 5.12), using the fixed bitrate ladder and OPSE.
Q & A
Thank you for your attention!
Vignesh V Menon (vignesh.menon@aau.at)
