All rights reserved. ©2020
All rights reserved. ©2021
Video complexity analyzer (VCA) for streaming applications
December 14, 2021
Presenters: Vignesh V Menon & Hadi Amirpour
About Us
Vignesh V Menon and Hadi Amirpour
Researchers at the Christian Doppler Laboratory ATHENA, University of Klagenfurt, Austria
● Motivation for VCA
● Features
● Experimental Results
● Applications
● Future Roadmap
Motivation
● We aim to develop online prediction systems tailor-made for live streaming applications.
● The state-of-the-art spatial and temporal complexity features are SI and TI (spatial information and temporal information), referred to here as SI-TI.
Fig. 1: Correlation of the SI feature with the number of bits (in kb) per frame in IDR encoding with x265 at QP 27, for 24 test sequences from the MCML [1] and SJTU [2] datasets.
[1] M. Cheon and J.-S. Lee, “Subjective and Objective Quality Assessment of Compressed 4K UHD Videos for Immersive Experience,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 28, no. 7, pp. 1467–1480, 2018.
[2] L. Song, X. Tang, W. Zhang, X. Yang, and P. Xia, “The SJTU 4K Video Sequence Dataset,” Fifth International Workshop on Quality of Multimedia Experience (QoMEX 2013), Jul. 2013.
The Pearson correlation coefficient (PCC) of SI with bits per frame is ~0.79.
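As a reminder of what the PCC measures, here is a minimal sketch of the computation in plain Python. The function name `pearson` and the sample values are hypothetical and chosen only for illustration; they are not the data behind Fig. 1.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (un-normalized; the 1/n factors cancel out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-frame values, for illustration only (not the slide's data).
si_values = [20.0, 35.0, 50.0, 65.0, 80.0]
bits_kb = [110.0, 190.0, 240.0, 400.0, 420.0]
print(f"PCC = {pearson(si_values, bits_kb):.3f}")
```

A value close to 1 means the feature tracks the bit cost almost linearly, which is why a higher PCC makes a feature more useful for predicting encoder behaviour.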
● Time taken to compute SI-TI features is very high!
➢ ~0.05 seconds per frame for 1080p, ~0.2 seconds per frame for 2160p
➢ Not suitable for live applications
➢ Higher computational cost in VoD applications
Fig. 2: Time taken to compute SI-TI features [3] on an Intel Xeon Gold 5218R.
[3] SITI source code: https://github.com/Telecommunication-Telemedia-Assessment/SITI
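To see why these per-frame times rule out live use, the sketch below converts them to throughput and compares against a source frame rate. The 30 fps and 60 fps targets are assumptions for illustration; the slides do not state a target frame rate, and the helper names are hypothetical.

```python
def frames_per_second(seconds_per_frame):
    """Throughput achievable when each frame costs the given time."""
    return 1.0 / seconds_per_frame

def keeps_up(seconds_per_frame, source_fps):
    """True if the analysis is at least as fast as the incoming video."""
    return frames_per_second(seconds_per_frame) >= source_fps

# Per-frame SI-TI costs quoted on the slide (approximate values).
siti_cost = {"1080p": 0.05, "2160p": 0.2}

for resolution, cost in siti_cost.items():
    rate = frames_per_second(cost)
    for target in (30, 60):  # assumed live frame rates, not from the slides
        verdict = "ok" if keeps_up(cost, target) else "too slow"
        print(f"{resolution}: {rate:.0f} fps vs {target} fps source -> {verdict}")
```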
Video Complexity Analyzer (VCA) can be realized as a fast preprocessor that determines the spatial and temporal complexity of videos (segments) to aid the encoding process.
Fig. 3: The proposed framework for streaming applications.
Features
Spatial complexity feature
k is the block address in the p-th frame, w × w pixels is the size of the block, and DCT(i, j) is the (i, j)-th DCT component when i + j > 1, and 0 otherwise.
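The slide's equation is not present in this transcript, so the following is only a sketch of a DCT-based block texture measure built from the definitions above. Two things are assumptions: the exponential weight exp(|(ij/w²)² − 1|), which is based on the authors' related publications and should be checked against the original equation, and 0-based coefficient indices, under which the condition i + j > 1 skips the DC coefficient and its two nearest neighbours. The names `dct2` and `block_texture` are hypothetical.

```python
import math

def dct2(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    w = len(block)
    scale = [math.sqrt(1.0 / w)] + [math.sqrt(2.0 / w)] * (w - 1)
    # Precompute the cosine basis: cos_table[u][x] = cos((2x + 1) u pi / 2w).
    cos_table = [[math.cos((2 * x + 1) * u * math.pi / (2 * w))
                  for x in range(w)] for u in range(w)]
    out = [[0.0] * w for _ in range(w)]
    for u in range(w):
        for v in range(w):
            total = 0.0
            for x in range(w):
                for y in range(w):
                    total += block[x][y] * cos_table[u][x] * cos_table[v][y]
            out[u][v] = scale[u] * scale[v] * total
    return out

def block_texture(block):
    """Weighted sum of absolute DCT coefficients (assumed form of H)."""
    w = len(block)
    coeffs = dct2(block)
    energy = 0.0
    for i in range(w):
        for j in range(w):
            if i + j > 1:  # slide's condition, read with 0-based indices
                # Assumed weighting; not recoverable from the transcript.
                weight = math.exp(abs((i * j / w ** 2) ** 2 - 1.0))
                energy += weight * abs(coeffs[i][j])
    return energy

flat = [[128] * 8 for _ in range(8)]
checker = [[255 if (x + y) % 2 else 0 for y in range(8)] for x in range(8)]
print(block_texture(flat), block_texture(checker))
```

A flat block has no AC energy, so its texture value is essentially zero, while the checkerboard yields a clearly positive value. A production implementation would use a fast DCT rather than this direct O(w⁴) loop.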
C represents the number of blocks per frame.
Fig. 4: Number of bits (in kb) per frame and the E feature for the Wood sequence from the SJTU dataset.
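A per-frame value E can then be formed by averaging the block textures over the C blocks. The sketch below assumes a normalization by C · w², which is not recoverable from this transcript, and a hypothetical 32 × 32 block size; partial blocks at the frame border are simply ignored here.

```python
def blocks_per_frame(width, height, block_size):
    """Number of whole blocks C in a frame (partial border blocks ignored)."""
    return (width // block_size) * (height // block_size)

def frame_spatial_energy(block_textures, block_size):
    """Average per-block texture values into one per-frame value E."""
    num_blocks = len(block_textures)  # C on the slide
    if num_blocks == 0:
        raise ValueError("frame has no blocks")
    # Assumed normalization: divide by C * w^2.
    return sum(block_textures) / (num_blocks * block_size ** 2)

# Hypothetical block size of 32; two made-up block texture values.
print(blocks_per_frame(1920, 1080, 32))
print(frame_spatial_energy([1024.0, 3072.0], 32))
```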
Temporal complexity feature
The block-wise sum of absolute differences (SAD) of the texture energy of each frame (p) compared to its previous frame (p-1) is computed.
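A minimal sketch of this step, assuming the per-block texture values of both frames are already available as lists in the same block order. The normalization by C · w² mirrors the assumption made for E and is not taken from the slide; the function name is hypothetical.

```python
def frame_temporal_energy(curr_textures, prev_textures, block_size):
    """Block-wise SAD of texture values between frame p and frame p-1."""
    if len(curr_textures) != len(prev_textures):
        raise ValueError("both frames must have the same number of blocks")
    if not curr_textures:
        raise ValueError("frame has no blocks")
    # Sum of absolute differences over co-located blocks.
    sad = sum(abs(c - p) for c, p in zip(curr_textures, prev_textures))
    # Assumed normalization: divide by C * w^2.
    return sad / (len(curr_textures) * block_size ** 2)

# Made-up texture values for two consecutive frames with 2x2 blocks.
print(frame_temporal_energy([10.0, 20.0], [14.0, 12.0], 2))
```

Identical consecutive frames give a value of zero, and the value grows as the texture of co-located blocks changes.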
Experimental Results
PCC(SI, Bits per frame) = 0.787
PCC(E, Bits per frame) = 0.856
Fig. 5: Correlation of SI and E features with number of bits (in kb) per frame.
Fig. 6: Average time to compute E-h for various resolutions (with x86 SIMD).
Note: Presently, E-h computation is 5 times faster than SI-TI computation.
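As a rough back-of-the-envelope reading of this note, and assuming (this is not stated on the slide) that the factor of 5 applies uniformly to the SI-TI timings of ~0.05 s and ~0.2 s per frame quoted in the Motivation section, the implied E-h cost would be:

```latex
\begin{align*}
t_{E\text{-}h}^{1080p} &\approx \frac{0.05\ \text{s}}{5} = 0.01\ \text{s per frame} \;\Rightarrow\; 100\ \text{frames/s},\\
t_{E\text{-}h}^{2160p} &\approx \frac{0.2\ \text{s}}{5} = 0.04\ \text{s per frame} \;\Rightarrow\; 25\ \text{frames/s}.
\end{align*}
```

These are implied estimates only; the measured values are those plotted in Fig. 6.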
Applications
Shot Detection
We define the gradient of h per frame p as ε (epsilon).
Fig. 7: Epsilon values for the ToS sequence. Shot transitions happen at frames 107, 110, 238, 338, 465, 531, 596, 778, 850, 917, 1018, 1361, and 1437.
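The gradient equation itself is missing from this transcript. One plausible form, assumed here purely for illustration, is the relative change of h between consecutive frames; the authors' definition may differ, including in its sign convention. The single-threshold detector below is a simple stand-in and is not the successive elimination algorithm from the paper. All names, values, and the threshold are hypothetical.

```python
def relative_gradient(h_values, floor=1e-9):
    """Assumed gradient: (h_p - h_{p-1}) / h_{p-1}, with 0.0 for frame 0."""
    grads = [0.0]
    for prev, curr in zip(h_values, h_values[1:]):
        # The floor guards against division by zero on static content.
        grads.append((curr - prev) / max(prev, floor))
    return grads

def candidate_cuts(h_values, threshold):
    """Frames whose assumed gradient exceeds a fixed threshold."""
    return [p for p, g in enumerate(relative_gradient(h_values))
            if g > threshold]

# Made-up temporal feature values with two abrupt jumps.
h = [1.0, 1.1, 0.9, 6.0, 1.2, 1.0, 1.1, 7.5, 1.3]
print(candidate_cuts(h, threshold=2.0))
```

A relative measure is convenient because the same threshold can be used for calm and busy content, whereas an absolute jump in h would depend on the overall motion level.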
The algorithm consists of two steps:
● Feature extraction
● Successive Elimination Algorithm
Source: V. V. Menon, H. Amirpour, M. Ghanbari, and C. Timmerer, “Efficient Content-Adaptive Feature-Based Shot Detection for HTTP Adaptive Streaming,” in 2021 IEEE International Conference on Image Processing (ICIP), 2021, pp. 2174–2178.
The benchmark is the default shot detection algorithm of x265.
Future Roadmap
● The initial version will be released before March 1, 2022.
● Adding multi-threading support
○ H computation for blocks in each frame can be realized concurrently.
○ ~6x speedup expected with 8 threads.
● Adding CUDA/OpenCL support
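The slides do not say how the ~6x estimate was obtained. As one way to interpret it, the sketch below applies Amdahl's law (an assumed model, not the authors' stated method) to find what fraction of the work would have to run in parallel for 8 threads to yield a 6x speedup. The function names are hypothetical.

```python
def amdahl_speedup(parallel_fraction, threads):
    """Speedup predicted by Amdahl's law for a given parallel fraction."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / threads)

def parallel_fraction_for(speedup, threads):
    """Invert Amdahl's law: 1/S = 1 - f * (1 - 1/n), solved for f."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / threads)

f = parallel_fraction_for(6.0, 8)
print(f"parallel fraction needed: {f:.3f}")
print(f"check: speedup with 8 threads = {amdahl_speedup(f, 8):.2f}")
```

Under this model roughly 95% of the per-frame work must be parallelizable, which is consistent with the block-level computation dominating the runtime.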
Thanks for your attention!
Vignesh V Menon (vignesh.menon@aau.at)
Hadi Amirpour (hadi.amirpour@aau.at)
