How to Run Audio and Vision AI Algorithms at Ultra-Low Power
Presenter: Deepak Mital, Sr. Director, Architecture, Synaptics Incorporated

Problem statement
• Many IoT applications do not require “continuous maximum” compute
• Continuous monitoring at full power drains the battery
• Examples:
  • Security camera: Turn on main processing for actual detection only when confirmed necessary
  • Human presence detection (HPD) and identification to turn the device on: Run the HPD and identification algorithm only when a “potential” presence is detected
  • Predictive maintenance: Enable advanced detection only when initial metrics are met
  • Shoplifting prevention: Enable detailed analytics only when a “potential” threat is detected
© 2024 Synaptics Inc

Solution
• Multistage hardware capable of running audio and video AI algorithms
• Highly efficient AI models with different KPIs for each stage
• Tight orchestration of software to invoke each stage
[Figure: chip block diagram. Labels: Always-on domain; High performance; High efficiency; Power management; System memories; Security; USB / serial / MIPI; U55 NPU; Cortex-M55; μNPU; Cortex-M4; Vision AI pipeline; JPEG; Audio; VAD; ISP, encoders; Sensing logic; Deep sleep: GPIO (Wake), internal clock; Reset]

Solution – Stage 1
• Ultra-low power: microwatt-level hardware, always on
  • Sound detection
  • Image change detection
• Critical model requirement: very few false negatives
  • False negatives will render the device unresponsive
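As an illustration of the Stage 1 trade-off, the sketch below flags an image change from the mean absolute pixel difference between two grayscale frames. The slide describes Stage 1 as dedicated microwatt hardware, so this Python only models the decision logic. The function names and the threshold value are assumptions made for the example, not Synaptics parameters.

```python
from typing import Sequence


def mean_abs_diff(prev: Sequence[int], curr: Sequence[int]) -> float:
    """Mean absolute difference between two equally sized grayscale frames."""
    if len(prev) != len(curr) or not prev:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(a - b) for a, b in zip(prev, curr)) / len(prev)


def frame_changed(prev: Sequence[int], curr: Sequence[int],
                  threshold: float = 2.0) -> bool:
    """Return True when the frames differ enough to wake the next stage.

    The threshold is a hypothetical value. Lowering it reduces false
    negatives (missed events) but causes more wake-ups of the next stage.
    """
    return mean_abs_diff(prev, curr) >= threshold


# Two tiny 2x2 frames, flattened row by row (illustrative data only).
static_scene = [10, 10, 10, 10]
moved_scene = [10, 10, 60, 10]

print(frame_changed(static_scene, static_scene))
print(frame_changed(static_scene, moved_scene))
```

A deliberately low threshold matches the stated requirement: a missed event leaves the device unresponsive, while a spurious wake-up only costs a short run of the next stage.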

Solution – Stage 2
• Mid- to low power: tens of microwatts, activated by Stage 1 via software
• AI algorithms (examples):
  • Wake-word detection
  • Human presence detection
• Critical model requirement: very few false negatives and false positives
  • False negatives will render the device unresponsive
  • False positives will increase power consumption
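The cost of false positives can be estimated with a simple duty-cycle model: the always-on stage runs continuously, and each later stage contributes its power multiplied by the fraction of time it is awake. This is a minimal sketch. The function name and every number in the example are placeholders chosen for illustration, and the Stage 3 power in particular is not given in the slides.

```python
def average_power_uw(p_stage1_uw: float, p_stage2_uw: float,
                     p_stage3_uw: float, duty_stage2: float,
                     duty_stage3: float) -> float:
    """Average power in microwatts for a three-stage cascade.

    Stage 1 is always on. Stage 2 and Stage 3 are active only for the
    given fractions of time (0.0 to 1.0).
    """
    for duty in (duty_stage2, duty_stage3):
        if not 0.0 <= duty <= 1.0:
            raise ValueError("duty cycles must be between 0 and 1")
    return (p_stage1_uw
            + duty_stage2 * p_stage2_uw
            + duty_stage3 * p_stage3_uw)


# Hypothetical figures: 5 uW always-on, 50 uW for Stage 2, 20 mW for Stage 3.
baseline = average_power_uw(5.0, 50.0, 20_000.0,
                            duty_stage2=0.05, duty_stage3=0.001)

# Stage 2 false positives that double how often Stage 3 is woken.
with_false_positives = average_power_uw(5.0, 50.0, 20_000.0,
                                        duty_stage2=0.05, duty_stage3=0.002)

print(f"baseline: {baseline:.1f} uW")
print(f"with extra false positives: {with_false_positives:.1f} uW")
```

Because the last stage draws far more power than the earlier ones in this example, even a small increase in its duty cycle dominates the average.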

Solution – Stage 3
• High performance, activated by Stage 2 via software
• AI algorithms (examples):
  • Person identification
  • Object detection
• Critical model requirement: very high performance at low power
  • Slow run times will increase power consumption
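The three stages form a cascade in which each stage is woken by a positive result from the one before it. The sketch below models that orchestration as a small state machine. The names `Stage`, `next_stage`, and `run_cascade` are hypothetical, and the fallback policy (return to Stage 1 on any negative result, stay in Stage 3 while detections continue) is an assumption, since the slides do not specify it.

```python
from enum import Enum
from typing import Iterable, List


class Stage(Enum):
    STAGE1 = 1  # always on
    STAGE2 = 2  # woken by Stage 1
    STAGE3 = 3  # woken by Stage 2


def next_stage(current: Stage, detected: bool) -> Stage:
    """Escalate one stage on a positive result, else fall back to Stage 1."""
    if not detected:
        return Stage.STAGE1
    if current is Stage.STAGE1:
        return Stage.STAGE2
    # From Stage 2 move up to Stage 3; in Stage 3 stay there (assumed policy).
    return Stage.STAGE3


def run_cascade(results: Iterable[bool],
                start: Stage = Stage.STAGE1) -> List[Stage]:
    """Return the sequence of stages visited for a series of detections."""
    trace = [start]
    stage = start
    for detected in results:
        stage = next_stage(stage, detected)
        trace.append(stage)
    return trace


# Stage 1 fires, Stage 2 confirms, Stage 3 finds nothing and the chip idles.
trace = run_cascade([True, True, False])
print([s.name for s in trace])
```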

AI models
• Each stage places different requirements on its AI models
• Models must be optimized for different KPIs: accuracy, performance, and size
• Neural architecture search (NAS)-based model generation, in which the models are purpose-built for the constrained silicon
• Primary factors affecting inference KPIs:
  • Model architecture design
  • Model quantization
• Approach: Jointly optimize model architecture and quantization under memory constraints
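To see why quantization and architecture interact under a memory constraint, the sketch below computes weight storage for per-layer bit widths and checks it against a budget. The layer sizes, the budget, and the function names are made up for the example. The sketch covers only weight storage and ignores activations and runtime buffers.

```python
from typing import Sequence


def weight_bytes(layer_params: Sequence[int],
                 layer_bits: Sequence[int]) -> int:
    """Total weight storage in bytes, with one bit width per layer."""
    if len(layer_params) != len(layer_bits):
        raise ValueError("need exactly one bit width per layer")
    total_bits = sum(n * b for n, b in zip(layer_params, layer_bits))
    return (total_bits + 7) // 8  # round up to whole bytes


def fits_budget(layer_params: Sequence[int], layer_bits: Sequence[int],
                budget_bytes: int) -> bool:
    """True when the quantized weights fit in the memory budget."""
    return weight_bytes(layer_params, layer_bits) <= budget_bytes


# Hypothetical three-layer model and a 5000-byte weight budget.
params = [1000, 4000, 2000]
uniform_8bit = [8, 8, 8]
mixed = [8, 4, 6]

print(weight_bytes(params, uniform_8bit), fits_budget(params, uniform_8bit, 5000))
print(weight_bytes(params, mixed), fits_budget(params, mixed, 5000))
```

In this toy case the same architecture misses the budget at uniform 8-bit precision but fits once the largest layer drops to 4 bits, which is the kind of trade-off a joint search can exploit.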

Multi-precision NAS search range for classification
• Resolution – [28x28 – 32x32]
• Kernel size – [3x3, 5x5, 7x7]
• Depth – [2, 3, 4]
• Width (channel expansion factor) – [2, 3, 4]
• Mixed-precision quantization parameters – [4 bit, 6 bit, 8 bit]
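The listed ranges can be written down as a search space and enumerated, which gives a feel for its size. This is a simplified sketch: it assumes one kernel size, width, and bit width for the whole network (a real NAS typically chooses these per block or per layer), and it samples the resolution range only at its two endpoints. The dictionary keys and the `candidates` helper are illustrative names.

```python
from itertools import product
from typing import Dict, Iterator, List

SEARCH_SPACE: Dict[str, List[int]] = {
    "resolution": [28, 32],    # endpoints of the 28x28 to 32x32 range
    "kernel_size": [3, 5, 7],  # 3x3, 5x5, 7x7
    "depth": [2, 3, 4],
    "width": [2, 3, 4],        # channel expansion factor
    "bits": [4, 6, 8],         # quantization precision
}


def candidates(space: Dict[str, List[int]]) -> Iterator[Dict[str, int]]:
    """Yield every combination of the search-space values as a dict."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


all_configs = list(candidates(SEARCH_SPACE))
print(len(all_configs))  # 2 * 3 * 3 * 3 * 3 = 162
print(all_configs[0])
```

Even this coarse global version has 162 candidates. Per-layer choices grow the space multiplicatively, which is why the search is automated rather than hand-tuned.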
CIFAR-10 classification – Mixed vs 8- or 4-bit precision
CIFAR-10 classification comparison

Object detection dataset
• Resolution – [320x240 – 640x480]
• Kernel size – [3x3, 5x5, 7x7]
• Depth – [2, 3, 4]
• Width (channel expansion factor) – [2, 3, 4]
• Mixed-precision quantization parameters – [4 bit, 6 bit, 8 bit]
COCO person detection – Mixed vs 8- or 4-bit precision
COCO person detection comparison

Segmentation run on Stage 3
• Model development stage KPIs:
  • COCO instance mask mAP: 0.636
  • Latency: 92.19 ms
  • Resolution: 480x640 (VGA)
  • Weights: 1.57 M parameters
• Model run on hardware:
  • Inference time: 96 ms
  • Total frame time: 120 ms
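A few derived figures follow directly from the numbers on this slide. The short calculation below works out the frame rate, the per-frame time not spent in inference, and the weight storage at two precisions. The slide does not state the precision of the deployed weights, so the 8-bit and 4-bit cases are assumptions shown for comparison only.

```python
inference_ms = 96.0   # inference time on hardware (from the slide)
frame_ms = 120.0      # total frame time on hardware (from the slide)
n_params = 1.57e6     # weight count (from the slide)

fps = 1000.0 / frame_ms                  # frames per second
overhead_ms = frame_ms - inference_ms    # per-frame time outside inference
inference_share = inference_ms / frame_ms


def weight_megabytes(params: float, bits: int) -> float:
    """Weight storage in megabytes (10^6 bytes) at a uniform bit width."""
    return params * bits / 8 / 1e6


print(f"frame rate: {fps:.2f} fps")
print(f"non-inference time per frame: {overhead_ms:.0f} ms")
print(f"inference share of frame time: {inference_share:.0%}")
print(f"weights at 8 bit (assumed): {weight_megabytes(n_params, 8):.2f} MB")
print(f"weights at 4 bit (assumed): {weight_megabytes(n_params, 4):.3f} MB")
```

The slide does not break down the remaining 24 ms per frame, so it should not be attributed to any particular step.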

Summary
• Building full applications that run at ultra-low power requires a high level of hardware and software integration
• Multiple levels of processing are needed to wake up silicon components only as needed
• Stage 2 and Stage 3 come out of deep sleep based on results from the previous stage
• The low-power orchestration demands tight software integration
• Each stage requires AI models with different KPIs for accuracy, model size, and speed
• NAS-based model generation/training software is needed to enable the complete solution
• The solution enables battery-powered, AI-capable devices that can run for many months or years

Resources
Synaptics Astra embedded processors
https://www.synaptics.com/products/embedded-processors
Synaptics Astra evaluation kit
https://synacsm.atlassian.net/servicedesk/customer/portal/543/group/563/create/6387
Synaptics Astra software
https://github.com/synaptics-astra

