© 2019 FotoNation
An Ultra-low-power Multi-core
Engine for Inference on
Encrypted DNNs
Petronel Bigioi
XPERI
May 2019
Company Overview
FotoNation – XPERI's Trusted Brand
Portfolio of Trusted Brands
- Licensing: semiconductor intellectual property
- Imaging and computer vision silicon IP cores and solutions
- Audio technology solutions
- Automotive audio, data, and digital radio broadcast solutions
- Semiconductor and interconnect packaging technology & solutions
3.4+ B Devices · 70+ M Cars · 1+ B Devices · 2+ B Devices · 100+ B Devices
Imaging and Inference at the Edge
Always-on inference: AI operates while the device is "off" (e.g. ultra-low-power face detection as an enabler)
Battery-powered head-mounted displays for AR or MR (e.g. iris recognition for device access, gaze tracking, etc.)
Smart appliances: from TVs to receivers to toasters to… (e.g. ultra-low-power people detection as an enabler)
Driver and in-cabin monitoring for autonomous driving (e.g. always-on occupancy assistant)
Edge Challenges
Edge Inference Challenges
- Ultra-low-power operation for battery-powered devices
- Consumer privacy protected through local-only processing
- Quality and performance comparable to cloud solutions
- Scalable and flexible engines
- True parallel processing, depending on the application
Ingredients for Successful AI-based Edge Solutions
Many years of investment …
Computer Vision R&D Infrastructure: computer-generated and real images – ground-truth data sets for effective NN training, testing and validation
Core Imaging & ML R&D: research into various image processing and core ML methods and architectures for differentiation
Imaging & Inference Engines: ultra-low-power, high-performance and scalable engines & development tools
Vision Testing Infrastructure: product testing with more than 40 million images, marked and annotated for various features
A Sample of Our Acquisition Systems Investment
Reverse Engineering – the Danger
- In the edge processing model, neural networks sit in the device's permanent storage, widely exposed to various types of reverse engineering
- Network representation patterns can be identified and localized in the storage contents
- Once the network representation is known, the architecture and weight values can be obtained
- Networks can then be remapped and run on alternative architectures
- Many years of investment… gone!
Edge Inference Solution
IPU – Image Processing Unit
Preprocessing Cores
- Stream conditioning & statistics for analytics
- Frame to frame registration
- Image enhancement or analytics
Dedicated Cores
- Face detection engine
- People detection engine
- Image enhancement engine
- Image resampling engine
Programmable CNN Cores
- PCNN 1.2 (small, 72 OPS/cycle) and/or PCNN 2.1 (large, 1024 OPS/cycle)
- PCNN cluster engine for scalable AI (supporting up to four small and/or large PCNNs)
- Multiple clusters supported
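For a sense of scale, the OPS/cycle figures translate into peak throughput once a clock is assumed. The deck quotes no clock frequency, so the 400 MHz below is purely a hypothetical illustration:

```python
# Back-of-envelope peak throughput for the two PCNN variants.
# OPS/cycle figures are from the slide; the clock frequency is an
# assumption for illustration only (the deck does not quote one).
ASSUMED_CLOCK_HZ = 400e6  # hypothetical 400 MHz

def gops(ops_per_cycle, clock_hz=ASSUMED_CLOCK_HZ):
    """Peak throughput in billions of operations per second."""
    return ops_per_cycle * clock_hz / 1e9

pcnn_small = gops(72)     # PCNN 1.2
pcnn_large = gops(1024)   # PCNN 2.1

print(f"PCNN 1.2: {pcnn_small:.1f} GOPS, PCNN 2.1: {pcnn_large:.1f} GOPS")
# A cluster of four large cores would peak at four times that figure.
print(f"4x PCNN 2.1 cluster: {4 * pcnn_large:.1f} GOPS")
```

At the assumed clock this works out to roughly 29 GOPS for the small core and 410 GOPS for the large one; real figures scale linearly with the actual clock.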
[SoC block diagram: camera sensor (MIPI) feeding an ISP and the IPU; CPU, GPU, display/LTM, COMMS, DDR controller and Flash I/F on the system bus; the IPU comprises preprocessing cores, dedicated cores and programmable cores, with FotoNation IP alongside third-party blocks]
IPU Highlights
- Focus on ultra-low-power imaging AI
- Maximizes quality and performance of imaging solutions
- Flexible deployment enabling market-specific, game-changing use cases
- Enables concurrent processing
- Secure deployment
- Use-case driven: each IPU deployment is unique to the addressed use case and device
Concurrent Execution Comparison (NPU1 vs IPU2)
[Charts: per-network frame rates (fps) and core utilization for the NPU vs the IPU while concurrently running Face Detection (NN01), Object Detection (NN02), Face Recognition (NN03) and Object Classification (NN04)]

POWER (mW), per scenario:
Scenario             NPU (per NN = total)          IPU (per NN = total)        IPU advantage
NN01                 512 = 512                     28 = 28                     18x
NN(01+02)            256 + 365 = 621               28 + 40 = 68                9x
NN(01+02+03)         192 + 274 + 266 = 732         28 + 40 + 155 = 223         3x
NN(01+02+03+04)      163 + 234 + 266 + 114 = 777   28 + 40 + 155 + 61 = 284    2.7x
1) NPU – Neural Processing Unit (competitor)
2) IPU – Image Processing Unit (FotoNation)
At the same utilization, the IPU delivers more than three times the performance and is three times more power efficient than the NPU…
… in other words, at the same performance the IPU is more than 9 times more power efficient …
… and at the same power consumption the NPU is more than 9 times slower than the IPU …
… the IPU is simply better!
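The headline ratios can be recomputed directly from the per-scenario power totals on the slide (the 3x figure for three networks is a coarse rounding of ~3.3x):

```python
# Recompute the IPU power advantage from the slide's totals (mW),
# for the cumulative concurrency scenarios NN01 .. NN01+02+03+04.
npu_mw = {"NN01": 512, "NN01+02": 621, "NN01+02+03": 732, "NN01+02+03+04": 777}
ipu_mw = {"NN01": 28,  "NN01+02": 68,  "NN01+02+03": 223, "NN01+02+03+04": 284}

for scenario in npu_mw:
    ratio = npu_mw[scenario] / ipu_mw[scenario]
    print(f"{scenario}: NPU {npu_mw[scenario]} mW vs IPU {ipu_mw[scenario]} mW "
          f"-> {ratio:.1f}x advantage")
```

This reproduces the quoted 18x, 9x, ~3x and 2.7x figures from the raw numbers.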
PCNN – Programmable CNN Engine
Image pre-processing
engine as ‘layer 0’
72 OPS/cycle or
1024 OPS/cycle
Support for compression, quantization and on-the-fly decryption
16-bit floating-point
internal operation
supporting weights as
small as 2 bits
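The PCNN's exact weight format is not disclosed; a generic symmetric uniform quantizer (a hypothetical sketch, not the PCNN's actual scheme) illustrates what 2-bit weights imply for storage versus precision:

```python
def quantize_weights(weights, bits=2):
    """Symmetric uniform quantization of weights to `bits`-bit signed codes.

    Illustration only: 2-bit codes cut weight storage 8x versus FP16,
    at the cost of representing each weight with one of very few levels.
    """
    qmax = 2 ** (bits - 1) - 1                       # 1 for 2 bits, 7 for 4 bits
    scale = max(abs(w) for w in weights) / qmax or 1.0
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    # Reconstruct approximate weights for the FP16 internal math.
    return [c * scale for c in codes]

w = [0.9, -0.4, 0.05, -1.2, 0.7]                     # toy weight vector
codes, scale = quantize_weights(w, bits=2)
w_hat = dequantize(codes, scale)
```

In practice low-bit schemes are usually applied per-channel with the network retrained or fine-tuned to absorb the quantization error.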
PCNN Highlights
Scalable and flexible, configurable
SRAM size to address specific use
cases
Low power consumption
(22 nm FDSOI tech)
• 18 mW for PCNN 1.2
• 120 mW for PCNN 2.1
Built-in real-time, on-the-fly NN
decryption engine
Separate memory channels for network
fetch and intermediate layers/input
Separate cache for code/net and
intermediate layers
[PCNN core block diagram: the PCNN engine with its register bank; MAP RD, MAP WR and CODE RD channels to the system bus (AXI) toward DDR and Flash; IRQ line and APB control bus to the CPU]
NN Protection Engineering Solution
SOFTWARE, privately run by the NN designer:
- Generate the NN secret random key Nsec and publish Npub = base * Nsec
- Obtain Cpub from the chip manufacturer and compute Kstream = Cpub * Nsec
- Stream-cipher the neural network; ship the NN encrypted with Kstream

SOFTWARE, run by the chip manufacturer:
- Generate the chip secret random key Csec and publish Cpub = base * Csec
- Burn Csec into on-chip fuses

HARDWARE (FotoNation's IP and the customer's NN):
- On chip, recompute Kstream = Npub * Csec from the fused Csec and the supplied Npub
- Stream-decipher the encrypted NN on the fly and feed it to the inference engine
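The key agreement has the classic Diffie–Hellman shape: Kstream = Npub * Csec = Cpub * Nsec. As a toy sketch (substituting modular exponentiation over a prime field for the curve25519 scalar multiplication the design actually uses), both sides derive the same Kstream without ever exchanging a secret:

```python
import secrets

# Toy Diffie-Hellman sketch of the Kstream derivation. The real design
# performs "base * secret" as scalar multiplication on curve25519; here
# modular exponentiation plays the same role purely for illustration.
P = 2**127 - 1      # a Mersenne prime; fine for a toy demonstration
BASE = 5

def keypair():
    sec = secrets.randbelow(P - 2) + 1   # secret random key
    pub = pow(BASE, sec, P)              # public key = "base * secret"
    return sec, pub

n_sec, n_pub = keypair()   # NN designer:       Nsec, Npub = base * Nsec
c_sec, c_pub = keypair()   # chip manufacturer: Csec, Cpub = base * Csec
                           # (Csec is burned into on-chip fuses)

k_designer = pow(c_pub, n_sec, P)   # offline: Kstream = Cpub * Nsec
k_chip     = pow(n_pub, c_sec, P)   # on-chip: Kstream = Npub * Csec
assert k_designer == k_chip         # same Kstream on both sides
```

Only public values (Npub, Cpub) ever cross between the designer and the chip; the stream key exists in the clear only inside the hardware.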
Decryption Features
Encryption based on two secret 255-bit keys:
• Manufacturer key and
• NN owner key
Secret keys processed offline by proprietary software, based on the public-domain curve25519
Secret manufacturer key stored in on-chip fuses
NN designer's public key uploaded on-chip after power-up
• Plain message processed offline with proprietary software
• Data encrypted/decrypted with the Trivium stream cipher
• Data decryption is implemented in hardware
• 128-bit plain data is generated on the fly from the 128-bit encrypted data
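Trivium itself is a public eSTREAM cipher with an 80-bit key and IV, well suited to tiny hardware. A bit-level software model (an illustration only, not the chip's hardware implementation) shows why decryption is the same cheap XOR as encryption:

```python
def trivium_keystream(key_bits, iv_bits, nbits):
    """Bit-level software model of the Trivium stream cipher
    (80-bit key and IV per the eSTREAM specification)."""
    assert len(key_bits) == 80 and len(iv_bits) == 80
    s = [0] * 288                      # state s1..s288, 0-indexed here
    s[0:80] = key_bits                 # key loaded into s1..s80
    s[93:173] = iv_bits                # IV loaded into s94..s173
    s[285:288] = [1, 1, 1]             # s286..s288 set to one
    out = []
    for i in range(4 * 288 + nbits):   # 1152 warm-up rounds, then output
        t1 = s[65] ^ s[92]
        t2 = s[161] ^ s[176]
        t3 = s[242] ^ s[287]
        if i >= 4 * 288:
            out.append(t1 ^ t2 ^ t3)   # keystream bit
        t1 ^= (s[90] & s[91]) ^ s[170]
        t2 ^= (s[174] & s[175]) ^ s[263]
        t3 ^= (s[285] & s[286]) ^ s[68]
        # Shift the three registers, feeding each with the next one's tap.
        s = [t3] + s[0:92] + [t1] + s[93:176] + [t2] + s[177:287]
    return out

def xor_bits(data, keystream):
    # Encryption and decryption are the same XOR with the keystream.
    return [d ^ k for d, k in zip(data, keystream)]

key = [1, 0] * 40
iv = [0, 1] * 40
plain = [1, 0, 1, 1] * 32              # 128 plaintext bits
ks = trivium_keystream(key, iv, len(plain))
cipher = xor_bits(plain, ks)
assert xor_bits(cipher, ks) == plain   # round trip recovers the plaintext
```

In hardware this maps to three shift registers and a handful of gates, which is why on-the-fly decryption adds so little power.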
PCNN Clusters
Clusters can mix PCNN 1.2 and PCNN 2.1 cores.
A cluster accommodates up to 4 PCNNs executing the same network or individual networks.
For more processing power, several clusters can be connected.
PCNN configuration is flexible (e.g. 1 PCNN executing network 1 and 3 PCNNs executing network 2);
true concurrent network execution.
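The flexible split can be pictured as a mapping from cores to networks. The sketch below is a toy software model (core and network names are illustrative, and the real dispatch is done by the cluster's controller, not Python threads):

```python
from concurrent.futures import ThreadPoolExecutor

# Toy model of flexible PCNN-to-network assignment in a 4-core cluster:
# one core runs network 1 while three cores share network 2, all
# concurrently. Names are hypothetical illustrations.
assignment = {
    "pcnn0": "network1",
    "pcnn1": "network2",
    "pcnn2": "network2",
    "pcnn3": "network2",
}

def run_inference(core, network):
    # Stand-in for launching a network on one PCNN core.
    return f"{core} ran {network}"

with ThreadPoolExecutor(max_workers=4) as pool:   # concurrency model
    futures = {core: pool.submit(run_inference, core, net)
               for core, net in assignment.items()}
results = {core: f.result() for core, f in futures.items()}
```

Changing the assignment dict is all it takes to remap cores between networks, which mirrors the "1 + 3" flexibility described above.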
PCNN - Cluster Architecture
Scalable and flexible to address
specific market needs:
- Mobile
- Automotive
- Home
Ultra low power
Secure
[PCNN cluster block diagram: host CPU and system memory (DDR, Flash) on the system bus (AXI); the PCNN-CLUSTER core contains a RISC controller, mailbox, configuration bus (AHB), four PCNNs, an SRAM controller with shared SRAM, and an arbiter linking to/from other PCNN clusters]
IPU – PCNN Development Tools
NN design and training in standard frameworks (Caffe, TensorFlow, Torch, Theano, MatConvNet, etc.) produces the NN structure and weights.
A Converter feeds these into the PCNN Configuration Tool, which emits a performance report and the PCNN binary.
The input image is normalised and converted to FP16, yielding a normalised 8-bit or FP16 input map.
Both the PCNN IP and a bit-exact SW model consume the binary and input map and produce the PCNN results.
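The FP16 conversion step can be sketched with Python's half-precision struct code. The [0, 255] → [0, 1] mapping below is an assumed normalisation for illustration, not the tool's documented scheme:

```python
import struct

def to_fp16(x):
    """Round a Python float to IEEE 754 half precision, the format the
    input-map conversion step targets."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

def normalise_u8(pixel):
    # One plausible 8-bit normalisation: map [0, 255] to [0.0, 1.0].
    # The tool's exact normalisation is not specified in the deck.
    return to_fp16(pixel / 255.0)

row = [0, 64, 128, 255]                 # toy pixel row
fp16_row = [normalise_u8(p) for p in row]
```

Round-tripping through the 'e' format makes the half-precision rounding explicit, which is exactly what a bit-exact SW model must reproduce to match the hardware.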
IPU – Dedicated Detectors Tool
Dedicated detector tool – training mode
Inputs:
- Database and ground-truth data
- Neural network settings
- Pre-trained networks
Output: binaries loadable by both the simulation tool and the HW accelerator on FPGA/ASIC

Dedicated detector tool – simulator mode
Inputs:
- Input images
Results:
- Detections
- Input images with bounding boxes overlaid
Conclusions
IPU is an ultra-low-power multi-core engine optimized for imaging AI at the edge
IPU prevents neural network reverse engineering and intellectual property
theft by supporting on-the-fly decryption and inference on encrypted DNNs
IPU supports true multi-tasking networks via the “cluster” concept
IPU is scalable to ANY market via two dimensions:
• Cluster of multiple programmable CNN engines
• Cluster of clusters
Resources
IPU: https://www.fotonation.com/products/optimize/
Encryption/decryption: https://en.wikipedia.org/wiki/Curve25519
Thank you